public inbox for [email protected]
 help / color / mirror / Atom feed
From: Daniel Dao <[email protected]>
To: Jens Axboe <[email protected]>
Cc: io-uring <[email protected]>,
	Waiman Long <[email protected]>,
	[email protected]
Subject: Re: [PATCH] io_uring/io-wq: stop setting PF_NO_SETAFFINITY on io-wq workers
Date: Tue, 14 Mar 2023 10:07:40 +0000	[thread overview]
Message-ID: <CA+wXwBRGzfZB9tjKy5C2_pW1Z4yH2gNGxx79Fk-p3UsOWKGdqA@mail.gmail.com> (raw)
In-Reply-To: <[email protected]>

On Wed, Mar 8, 2023 at 2:27 PM Jens Axboe <[email protected]> wrote:
>
> Every now and then reports come in that are puzzled on why changing
> affinity on the io-wq workers fails with EINVAL. This happens because they
> set PF_NO_SETAFFINITY as part of their creation, as io-wq organizes
> workers into groups based on what CPU they are running on.
>
> However, this is purely an optimization and not a functional requirement.
> We can allow setting affinity, and just lazily update our worker to wqe
> mappings. If a given io-wq thread times out, it normally exits if there's
> no more work to do. The exception is if it's the last worker available.
> For the timeout case, check the affinity of the worker against group mask
> and exit even if it's the last worker. New workers should be created with
> the right mask and in the right location.

The patch resolved the bug around enabling cpuset for subtree_control for me.
However, it also doesn't prevent user from setting cpuset value that
is incompatible
with iou threads. For example, on a 2-numa 4-cpu node, new iou-wrks are bound to
2-3 while we can set cpuset.cpus to 1-2 successfully. The end result
is a mix of cpu
distribution such as:

  pid 533's current affinity list: 1,2 # process
  pid 720's current affinity list: 1,2 # iou-wrk-533
  pid 5236's current affinity list: 2,3 # iou-wrk-533, running outside of cpuset


IMO this violated the principle of cpuset and can be confusing for end users.
I think I prefer Waiman's suggestion of allowing an implicit move to cpuset
when enabling cpuset with subtree_control but not explicit moves such as when
setting cpuset.cpus or writing the pids into cgroup.procs. It's easier to reason
about and make the failure mode more explicit.

What do you think ?

Cheers,
Daniel.

  reply	other threads:[~2023-03-14 10:08 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-08 14:27 [PATCH] io_uring/io-wq: stop setting PF_NO_SETAFFINITY on io-wq workers Jens Axboe
2023-03-14 10:07 ` Daniel Dao [this message]
2023-03-14 16:25   ` Michal Koutný
2023-03-14 16:48     ` Jens Axboe
2023-03-14 18:17     ` Waiman Long

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CA+wXwBRGzfZB9tjKy5C2_pW1Z4yH2gNGxx79Fk-p3UsOWKGdqA@mail.gmail.com \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox