Hello.

On Tue, Mar 14, 2023 at 10:07:40AM +0000, Daniel Dao wrote:
> IMO this violated the principle of cpuset and can be confusing for end users.
> I think I prefer Waiman's suggestion of allowing an implicit move to cpuset
> when enabling cpuset with subtree_control but not explicit moves such as when
> setting cpuset.cpus or writing the pids into cgroup.procs. It's easier to reason
> about and make the failure mode more explicit.
>
> What do you think ?

I think cpuset should top IO worker's affinity (like sched_setaffinity(2)).
Thus:
- modifying cpuset.cpus                  update task's affinity, for sure
- implicit migration (enabling cpuset)   update task's affinity, effective nop
- explicit migration (meh)               update task's affinity, ¯\_(ツ)_/¯

My understanding of PF_NO_SETAFFINITY is that it's for kernel threads that do
work that's functionally needed on a given CPU and thus they cannot be
migrated [1]. As said previously for io_uring workers, affinity is for
performance only.

Hence, I'd also suggest on top of 01e68ce08a30 ("io_uring/io-wq: stop setting
PF_NO_SETAFFINITY on io-wq workers"):

--- a/io_uring/sqpoll.c
+++ b/io_uring/sqpoll.c
@@ -233,7 +233,6 @@ static int io_sq_thread(void *data)
 		set_cpus_allowed_ptr(current, cpumask_of(sqd->sq_cpu));
 	else
 		set_cpus_allowed_ptr(current, cpu_online_mask);
-	current->flags |= PF_NO_SETAFFINITY;
 
 	mutex_lock(&sqd->lock);
 	while (1) {

After all, io_uring_setup(2) already mentions:
> When cgroup setting cpuset.cpus changes (typically in container
> environment), the bounded cpu set may be changed as well.

HTH,
Michal

[1] Ideally, those should always remain in the root cpuset cgroup.
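
P.S. For anyone who wants to poke at the sched_setaffinity(2) comparison from
userspace, here is a minimal sketch. It is only an illustration, not part of
the proposed patch: the helper names first_cpu() and pin_to_first_allowed_cpu()
are made up for this example, and it assumes a Linux/glibc system with no more
than CPU_SETSIZE CPUs. It narrows the calling thread to one CPU it is already
allowed to run on and reads the mask back. A thread that carries
PF_NO_SETAFFINITY would get EINVAL from the sched_setaffinity() call instead.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdio.h>

/* Return the lowest-numbered CPU in the set, or -1 if the set is empty. */
static int first_cpu(const cpu_set_t *set)
{
	for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
		if (CPU_ISSET(cpu, set))
			return cpu;
	return -1;
}

/*
 * Hypothetical helper: narrow the calling thread's affinity to one CPU it
 * is already allowed to run on, then read the mask back.
 * Returns that CPU number, or -1 on error.
 */
int pin_to_first_allowed_cpu(void)
{
	cpu_set_t allowed, wanted, now;

	/* pid 0 means "the calling thread". */
	if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
		return -1;

	int cpu = first_cpu(&allowed);
	if (cpu < 0)
		return -1;

	CPU_ZERO(&wanted);
	CPU_SET(cpu, &wanted);

	/* The kernel rejects this with EINVAL for PF_NO_SETAFFINITY threads. */
	if (sched_setaffinity(0, sizeof(wanted), &wanted) != 0) {
		perror("sched_setaffinity");
		return -1;
	}

	if (sched_getaffinity(0, sizeof(now), &now) != 0)
		return -1;

	/* The mask read back should contain exactly the CPU we asked for. */
	assert(CPU_COUNT(&now) == 1 && CPU_ISSET(cpu, &now));
	return cpu;
}
```

Calling pin_to_first_allowed_cpu() from an ordinary process should succeed and
return a CPU number. The requested mask is still subject to the task's cpuset,
which is the "cpuset tops affinity" ordering argued for above.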