From: Jens Axboe <axboe@kernel.dk>
To: Fengnan Chang <fengnanchang@gmail.com>,
	asml.silence@gmail.com, io-uring@vger.kernel.org
Cc: Fengnan Chang <changfengnan@bytedance.com>,
	Diangang Li <lidiangang@bytedance.com>
Subject: Re: [RFC PATCH 2/2] io_uring: fix io may accumulation in poll mode
Date: Wed, 10 Dec 2025 19:15:57 -0700
Message-ID: <ca81eb74-2ded-44dd-8d6b-42a131c89550@kernel.dk>
In-Reply-To: <20251210085501.84261-3-changfengnan@bytedance.com>

On 12/10/25 1:55 AM, Fengnan Chang wrote:
> In the io_do_iopoll function, when the poll loop of iopoll_list ends, it
> is considered that the current req is the actual completed request.
> This may be reasonable for multi-queue ctx, but is problematic for
> single-queue ctx because the current request may not be done when the
> poll gets to the result. In this case, the completed io needs to wait
> for the first io on the chain to complete before notifying the user,
> which may cause io accumulation in the list.
> Our modification plan is as follows: change io_wq_work_list to normal
> list so that the iopoll_list list in it can be removed and put into the
> comp_reqs list when the request is completed. This way each io is
> handled independently and all gets processed in time.
>
> After modification, test with:
>
> ./t/io_uring -p1 -d128 -b4096 -s32 -c32 -F1 -B1 -R1 -X1 -n1 -P1
> /dev/nvme6n1
>
> base IOPS is 725K, patch IOPS is 782K.
>
> ./t/io_uring -p1 -d128 -b4096 -s32 -c1 -F1 -B1 -R1 -X1 -n1 -P1
> /dev/nvme6n1
>
> Base IOPS is 880k, patch IOPS is 895K.

A few notes on this:

1) Manipulating the list in io_complete_rw_iopoll() I don't think is
   necessarily safe. Yes generally this is invoked from the
   owning/polling task, but that's not guaranteed.

2) The patch doesn't apply to the current tree, must be an older
   version?

3) When hand-applied, it still throws a compile warning about an unused
   variable. Please don't send untested stuff...

4) Don't just blatantly bloat the io_kiocb. When you change from a
   singly to a doubly linked list, you're growing the io_kiocb size. You
   should be able to use a union with struct io_task_work for example.
   That's already 16b in size - win/win as you don't need to slow down
   the cache management as that can keep using the linkage it currently
   is using, and you're not bloating the io_kiocb.

5) The already mentioned point about the cache free list now being
   doubly linked. This is generally a _bad_ idea as removing and adding
   entries now need to touch other entries too. That's not very cache
   friendly.

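To make the size argument in 4) concrete, here is a minimal userspace sketch. The struct names below (single_node, double_node, task_work, req_before, req_bloated, req_union) are hypothetical stand-ins and not the kernel definitions; they only mirror the shapes being discussed: a singly linked node, a doubly linked node, and a 16-byte task-work item on a 64-bit build. The overlay assumes a request is never queued for task work and on the poll list at the same time, which would need to be verified against the real code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only, not kernel types. */
struct single_node { struct single_node *next; };         /* singly linked */
struct double_node { struct double_node *next, *prev; };  /* doubly linked */
struct task_work {                                         /* two pointers */
	struct single_node node;
	void (*func)(void *);
};

/* Baseline: one single link plus the task-work item. */
struct req_before {
	struct single_node comp_list;
	struct task_work tw;
};

/* Converting the shared link to a double link grows the request. */
struct req_bloated {
	struct double_node comp_list;
	struct task_work tw;
};

/* Keeping the single link and overlaying the double link on the
 * task-work storage leaves the size unchanged. */
struct req_union {
	struct single_node comp_list;
	union {
		struct task_work tw;
		struct double_node iopoll_node;
	};
};

static_assert(sizeof(struct req_bloated) ==
	      sizeof(struct req_before) + sizeof(void *),
	      "double link costs one extra pointer");
static_assert(sizeof(struct req_union) == sizeof(struct req_before),
	      "union overlay does not grow the request");
```

The point of the overlay is that the free-list/cache path keeps its existing single link, and only the poll path pays for the second pointer, out of space that is already there.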
#1 is kind of the big one, as it means you'll need to re-think how you
do this. I do agree that the current approach isn't necessarily ideal as
we don't process completions as quickly as we could, so I think there's
merit in continuing this work.
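
As a rough illustration of the "not as quickly as we could" part, the following is a simplified userspace model, not the io_do_iopoll() code. The names model_req, reap_in_order() and reap_any() are made up for this sketch. It assumes reaping walks the list in submission order and stops at the first request that has not completed, and compares that with reaping every completed entry independently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a polled request; not struct io_kiocb. */
struct model_req {
	bool completed;
};

/* In-order reap: stop at the first request that is still pending,
 * so completed requests queued behind it are left waiting. */
size_t reap_in_order(const struct model_req *reqs, size_t n)
{
	size_t reaped = 0;

	assert(reqs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (!reqs[i].completed)
			break;
		reaped++;
	}
	return reaped;
}

/* Independent reap: count every completed request, wherever it sits. */
size_t reap_any(const struct model_req *reqs, size_t n)
{
	size_t reaped = 0;

	assert(reqs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (reqs[i].completed)
			reaped++;
	}
	return reaped;
}
```

With four requests where only the first is still pending, reap_in_order() returns 0 while reap_any() returns 3. That gap is the accumulation being described; how to close it safely is what #1 is about.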
--
Jens Axboe