public inbox for io-uring@vger.kernel.org
 help / color / mirror / Atom feed
From: Fengnan Chang <changfengnan@bytedance.com>
To: Jens Axboe <axboe@kernel.dk>
Cc: asml.silence@gmail.com, io-uring@vger.kernel.org,
	 Diangang Li <lidiangang@bytedance.com>
Subject: Re: [External] Re: [RFC PATCH] io_uring: fix io worker thread that keeps creating and destroying
Date: Thu, 22 May 2025 20:01:15 +0800	[thread overview]
Message-ID: <CAPFOzZtxRYsCg1BVdpDUH=_bsLEQRvsp5+x-7Kpwow66poUVtA@mail.gmail.com> (raw)
In-Reply-To: <b8cd8947-76fa-4863-a1f6-119c6d086196@kernel.dk>

Jens Axboe <axboe@kernel.dk> 于2025年5月22日周四 19:35写道:
>
> On 5/22/25 3:09 AM, Fengnan Chang wrote:
> > When running fio with buffer io and stable iops, I observed that
> > part of io_worker threads keeps creating and destroying.
> > Using this command can reproduce:
> > fio --ioengine=io_uring --rw=randrw --bs=4k --direct=0 --size=100G
> > --iodepth=256 --filename=/data03/fio-rand-read --name=test
> > ps -L -p pid, you can see about 256 io_worker threads, and thread
> > id keeps changing.
> > And I do some debugging, most workers create happen in
> > create_worker_cb. In create_worker_cb, if all workers have gone to
> > sleep, and we have more work, we try to create new worker (let's
> > call it worker B) to handle it.  And when new work comes,
> > io_wq_enqueue will activate free worker (let's call it worker A) or
> > create new one. It may cause worker A and B compete for one work.
> > Since buffered write is hashed work, buffered write to a given file
> > is serialized, only one worker gets the work in the end, the other
> > worker goes to sleep. After repeating it many times, a lot of
> > io_worker threads created, handles a few works or even no work to
> > handle,and exit.
> > There are several solutions:
> > 1. Since all work is insert in io_wq_enqueue, io_wq_enqueue will
> > create worker too, remove create worker action in create_worker_cb
> > is fine, maybe affect performance?
> > 2. When wq->hash->map bit is set, insert hashed work item, new work
> > only put in wq->hash_tail, not link to work_list,
> > io_worker_handle_work need to check hash_tail after a whole dependent
> > link, io_acct_run_queue will return false when new work insert, no
> > new thread will be created either in io_wqe_dec_running.
> > 3. Check is there only one hash bucket in io_wqe_dec_running. If only
> > one hash bucket, don't create worker, io_wq_enqueue will handle it.
>
> Nice catch on this! Does indeed look like a problem. Not a huge fan of
> approach 3. Without having really looked into this yet, my initial idea
> would've been to do some variant of solution 1 above. io_wq_enqueue()
> checks if we need to create a worker, which basically boils down to "do
> we have a free worker right now". If we do not, we create one. But the
> question is really "do we need a new worker for this?", and if we're
> inserting hashed worked and we have existing hashed work for the SAME
> hash and it's busy, then the answer should be "no" as it'd be pointless
> to create that worker.

Agree

>
> Would it be feasible to augment the check in there such that
> io_wq_enqueue() doesn't create a new worker for that case? And I guess a
> followup question is, would that even be enough, do we always need to
> cover the io_wq_dec_running() running case as well as
> io_acct_run_queue() will return true as well since it doesn't know about
> this either?
Yes,It is feasible to avoid creating a worker by adding some checks in
io_wq_enqueue. But what I have observed so far is most workers are
created in io_wq_dec_running (why no worker create in io_wq_enqueue?
I didn't figure it out now), it seems no need to check this
in io_wq_enqueue.  And cover io_wq_dec_running is necessary.

>
> I'll take a closer look at this later today, but figured I'd shoot some
> questions your way first...
>
> --
> Jens Axboe

  reply	other threads:[~2025-05-22 12:01 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-22  9:09 [RFC PATCH] io_uring: fix io worker thread that keeps creating and destroying Fengnan Chang
2025-05-22 11:34 ` Jens Axboe
2025-05-22 12:01   ` Fengnan Chang [this message]
2025-05-22 14:29     ` [External] " Jens Axboe
2025-05-23  7:57       ` Fengnan Chang
2025-05-23 15:20         ` Jens Axboe
2025-05-26 11:14           ` Fengnan Chang
2025-05-28 13:39             ` Jens Axboe
2025-05-28 13:56               ` Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAPFOzZtxRYsCg1BVdpDUH=_bsLEQRvsp5+x-7Kpwow66poUVtA@mail.gmail.com' \
    --to=changfengnan@bytedance.com \
    --cc=asml.silence@gmail.com \
    --cc=axboe@kernel.dk \
    --cc=io-uring@vger.kernel.org \
    --cc=lidiangang@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox