From: jrun <[email protected]>
To: Pavel Begunkov <[email protected]>
Cc: io-uring <[email protected]>
Subject: Re: happy io_uring_prep_accept_direct() submissions go hiding!
Date: Thu, 9 Dec 2021 12:56:36 -0500
Message-ID: <20211209175636.oq6npmqf24h5hthi@p51>
In-Reply-To: <[email protected]>
On Thu, Dec 09, 2021 at 03:02:12PM +0000, Pavel Begunkov wrote:
> Don't see how a CQE may get missing, so let me ask a bunch of questions:
>
> First, let's try out my understanding of your problem. At the beginning you
> submit MAX_CONNECTIONS/2 accept requests and _all_ of them complete.
correct.
> In the main loop you add another bunch of accepts, but you're not getting CQEs
> from them. Right ?
yes, io_uring_prep_accept_direct() submissions made before entering the main
loop complete. any io_uring_prep_accept_direct() submitted from within the main
loop goes missing.
> 1) Anything in dmesg? Please when it got stuck (or what the symptoms are),
> don't kill it but wait for 3 minutes and check dmesg again.
>
nothing in dmesg!
> Or you to reduce the waiting time:
> "echo 10 > /proc/sys/kernel/hung_task_timeout_secs"
oh, my kernel[mek] is missing that; rebuilding right now with
`CONFIG_DETECT_HUNG_TASK=y`; will report back after reboot.
btw, enabled CONFIG_WQ_WATCHDOG=y for workqueue.watchdog_thresh; don't know if
that would help too. let me know.
also any magic with bpftrace you would suggest?
> And then should if anything wrong it should appear in dmesg max in 20-30 secs
>
> 2) What kernel version are you running?
[mek]: Linux 5.15.6-gentoo-p51 #5 SMP PREEMPT x86_64 i7-7700HQ
> 3) Have you tried normal accept (non-direct)?
no, will try, but accept_direct worked for me before introducing pthread into
the code. don't know if it matters.
> 4) Can try increase the max number io-wq workers exceeds the max number
> of inflight requests? Increase RLIMIT_NPROC, E.g. set it to
> RLIMIT_NPROC = nr_threads + max inflight requests.
i only have 1 thread atm but will try this with the new kernel and report back.
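for reference, a minimal sketch of what raising the limit per suggestion 4 could look like. ensure_nproc_limit() is a hypothetical helper, not something from my code, and the value to pass (nr_threads + max inflight requests, e.g. 1 + MAX_CONNECTIONS) is just the formula quoted above. it only touches the soft limit and caps it at the hard limit, so it should not need privileges:

```c
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: make sure the soft RLIMIT_NPROC is at least `wanted`,
 * capped at the hard limit. Returns 0 on success, -1 on failure. */
int ensure_nproc_limit(rlim_t wanted)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NPROC, &rl) != 0) {
        perror("getrlimit");
        return -1;
    }

    /* Nothing to do if the soft limit is already high enough. */
    if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur >= wanted)
        return 0;

    /* An unprivileged process may raise the soft limit only up to the hard limit. */
    if (rl.rlim_max == RLIM_INFINITY || wanted < rl.rlim_max)
        rl.rlim_cur = wanted;
    else
        rl.rlim_cur = rl.rlim_max;

    if (setrlimit(RLIMIT_NPROC, &rl) != 0) {
        perror("setrlimit");
        return -1;
    }
    return 0;
}
```

it would be called once at startup, before the ring is set up, e.g. ensure_nproc_limit(1 + MAX_CONNECTIONS).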
> 5) Do you get CQEs when you shutdown listening sockets?
yes! CQEs for the io_uring_prep_close_direct() call (there is only one, inside
dq_msg()) come in on subsequent connect() requests from the client.
tested with and without IOSQE_ASYNC set.
> 6) Do you check return values of io_uring_submit()?
>
> 7) Any variability during execution? E.g. a different number of
> sockets get accepted.
with IORING_SETUP_SQPOLL, i was getting different numbers for:
pending, = io_uring_sq_ready(ring); vs
submitted, = io_uring_submit(ring); according to the commented block at the
beginning of the event loop. don't know if that's the way to check what you're
asking. let me know please.
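roughly, the check compares the two numbers like the sketch below. check_submit() and the enum are hypothetical names for illustration, not the actual code; `pending` stands for the value of io_uring_sq_ready(ring) taken just before submitting and `ret` for the return value of io_uring_submit(ring), which is the number of SQEs submitted on success or a negative errno on failure:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical result codes for illustration only. */
enum submit_status { SUBMIT_OK, SUBMIT_MISMATCH, SUBMIT_ERROR };

/* `pending`: value of io_uring_sq_ready(ring) read before submitting.
 * `ret`:     return value of io_uring_submit(ring). */
enum submit_status check_submit(unsigned pending, int ret)
{
    if (ret < 0) {
        /* liburing reports failures as a negative errno value. */
        fprintf(stderr, "io_uring_submit: %s\n", strerror(-ret));
        return SUBMIT_ERROR;
    }
    if ((unsigned)ret != pending) {
        /* The count differs from what was queued; worth logging. */
        fprintf(stderr, "submitted %d, but %u SQEs were pending\n", ret, pending);
        return SUBMIT_MISMATCH;
    }
    return SUBMIT_OK;
}
```

in the event loop this would be used as check_submit(io_uring_sq_ready(ring), io_uring_submit(ring)), with the first value read into a variable before the submit call so the order of evaluation is not left to the compiler.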
thanks for the help,
- jrun