From: Pavel Begunkov <[email protected]>
To: Jens Axboe <[email protected]>, Victor Stewart <[email protected]>
Cc: io-uring <[email protected]>
Subject: Re: bug with fastpoll accept and sqpoll + IOSQE_FIXED_FILE
Date: Fri, 5 Feb 2021 12:46:11 +0000 [thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
On 04/02/2021 16:50, Jens Axboe wrote:
> On 2/3/21 4:49 AM, Pavel Begunkov wrote:
>> On 02/02/2021 20:56, Pavel Begunkov wrote:
>>> On 02/02/2021 20:48, Jens Axboe wrote:
>>>> On 2/2/21 1:34 PM, Pavel Begunkov wrote:
>>>>> On 02/02/2021 17:41, Pavel Begunkov wrote:
>>>>>> On 02/02/2021 17:24, Jens Axboe wrote:
>>>>>>> On 2/2/21 10:10 AM, Victor Stewart wrote:
>>>>>>>>> Can you send the updated test app?
>>>>>>>>
>>>>>>>> https://gist.github.com/victorstewart/98814b65ed702c33480487c05b40eb56
>>>>>>>>
>>>>>>>> same link i just updated the same gist
>>>>>>>
>>>>>>> And how are you running it?
>>>>>>
>>>>>> with SQPOLL with FIXED FLAG -> FAILURE: failed with error = ???
>>>>>> -> io_uring_wait_cqe_timeout() strangely returns -1, (-EPERM??)
>>>>>
>>>>> Ok, _io_uring_get_cqe() is just screwed twice
>>>>>
>>>>> TL;DR
>>>>> we enter into it with submit=0, do an iteration, which decrements it,
>>>>> then a second iteration passes submit=-1, which is returned back by
>>>>> the kernel as a result and propagated back from liburing...
>>>>
>>>> Yep, that's what I came up with too. We really just need a clear way
>>>> of knowing when to break out, and when to keep going. Eg if we've
>>>> done a loop and don't end up calling the system call, then there's
>>>> no point in continuing.
>>>
>>> We can bodge something up (and forget about it), and do much cleaner
>>> for IORING_FEAT_EXT_ARG, because we don't have LIBURING_UDATA_TIMEOUT
>>> reqs for it and so can remove peek and so on.
>>
>> This version looks reasonably simple, and even passes tests and all
>> issues found by Victor's test. Didn't test it yet, but should behave
>> similarly in regard of internal timeouts (pre IORING_FEAT_EXT_ARG).
>>
>> static int _io_uring_get_cqe(struct io_uring *ring, struct io_uring_cqe **cqe_ptr,
>> struct get_data *data)
>> {
>> struct io_uring_cqe *cqe = NULL;
>> int ret = 0, err;
>>
>> do {
>> unsigned flags = 0;
>> unsigned nr_available;
>> bool enter = false;
>>
>> err = __io_uring_peek_cqe(ring, &cqe, &nr_available);
>> if (err)
>> break;
>>
>> /* IOPOLL won't proceed when there're not reaped CQEs */
>> if (cqe && (ring->flags & IORING_SETUP_IOPOLL))
>> data->wait_nr = 0;
>>
>> if (data->wait_nr > nr_available || cq_ring_needs_flush(ring)) {
>> flags = IORING_ENTER_GETEVENTS | data->get_flags;
>> enter = true;
>> }
>> if (data->submit) {
>> sq_ring_needs_enter(ring, &flags);
>> enter = true;
>> }
>> if (!enter)
>> break;
>>
>> ret = __sys_io_uring_enter2(ring->ring_fd, data->submit,
>> data->wait_nr, flags, data->arg,
>> data->sz);
>> if (ret < 0) {
>> err = -errno;
>> break;
>> }
>> data->submit -= ret;
>> } while (1);
>>
>> *cqe_ptr = cqe;
>> return err;
>> }
>
> So here's my take on this - any rewrite of _io_uring_get_cqe() is going
> to end up adding special cases, that's unfortunately just the nature of
> the game. And since we're going to be doing a new liburing release very
> shortly, this isn't a great time to add a rewrite of it. It'll certainly
> introduce more bugs than it solves, and hence regressions, no matter how
> careful we are.
>
> Hence my suggestion is to just patch this in a trivial kind of fashion,
> even if it doesn't really make the function any prettier. But it'll be
> safer for a release, and then we can rework the function after.
>
> With that in mind, here's my suggestion. The premise is if we go through
> the loop and don't do io_uring_enter(), then there's no point in
> continuing. That's the trivial fix.
Your idea but imho cleaner below.
+1 comment inline
> diff --git a/src/queue.c b/src/queue.c
> index 94f791e..4161aa7 100644
> --- a/src/queue.c
> +++ b/src/queue.c
> @@ -89,12 +89,13 @@ static int _io_uring_get_cqe(struct io_uring *ring, struct io_uring_cqe **cqe_pt
> {
> struct io_uring_cqe *cqe = NULL;
> const int to_wait = data->wait_nr;
> - int ret = 0, err;
> + int err;
>
> do {
> bool cq_overflow_flush = false;
> unsigned flags = 0;
> unsigned nr_available;
> + int ret = -2;
>
> err = __io_uring_peek_cqe(ring, &cqe, &nr_available);
> if (err)
> @@ -117,7 +118,9 @@ static int _io_uring_get_cqe(struct io_uring *ring, struct io_uring_cqe **cqe_pt
> ret = __sys_io_uring_enter2(ring->ring_fd, data->submit,
> data->wait_nr, flags, data->arg,
> data->sz);
> - if (ret < 0) {
> + if (ret == -2) {
> + break;
peek/wait_cqe expect that cqe_ptr is filled on return=0. Looks we need
to return an error or hack up those functions.
> + } else if (ret < 0) {
> err = -errno;
> } else if (ret == (int)data->submit) {
> data->submit = 0;
>
diff --git a/src/queue.c b/src/queue.c
index 94f791e..7d6f31d 100644
--- a/src/queue.c
+++ b/src/queue.c
@@ -112,11 +112,15 @@ static int _io_uring_get_cqe(struct io_uring *ring, struct io_uring_cqe **cqe_pt
flags = IORING_ENTER_GETEVENTS | data->get_flags;
if (data->submit)
sq_ring_needs_enter(ring, &flags);
- if (data->wait_nr > nr_available || data->submit ||
- cq_overflow_flush)
- ret = __sys_io_uring_enter2(ring->ring_fd, data->submit,
- data->wait_nr, flags, data->arg,
- data->sz);
+
+ if (data->wait_nr <= nr_available && !data->submit &&
+ !cq_overflow_flush) {
+ err = ?;
+ break;
+ }
+ ret = __sys_io_uring_enter2(ring->ring_fd, data->submit,
+ data->wait_nr, flags, data->arg,
+ data->sz);
if (ret < 0) {
err = -errno;
} else if (ret == (int)data->submit) {
--
Pavel Begunkov
next prev parent reply other threads:[~2021-02-05 22:56 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-02-02 5:36 bug with fastpoll accept and sqpoll + IOSQE_FIXED_FILE Victor Stewart
2021-02-02 11:05 ` Pavel Begunkov
2021-02-02 11:23 ` Pavel Begunkov
2021-02-02 16:18 ` Victor Stewart
2021-02-02 16:30 ` Victor Stewart
2021-02-02 16:44 ` Jens Axboe
2021-02-02 17:10 ` Victor Stewart
2021-02-02 17:24 ` Jens Axboe
2021-02-02 17:41 ` Pavel Begunkov
2021-02-02 20:34 ` Pavel Begunkov
2021-02-02 20:48 ` Jens Axboe
2021-02-02 20:56 ` Pavel Begunkov
2021-02-03 11:49 ` Pavel Begunkov
2021-02-04 16:50 ` Jens Axboe
2021-02-05 12:46 ` Pavel Begunkov [this message]
2021-02-05 14:42 ` Jens Axboe
2021-02-05 14:49 ` Pavel Begunkov
2021-02-02 17:46 ` Jens Axboe
2021-02-02 17:50 ` Victor Stewart
2021-02-02 17:57 ` Jens Axboe
2021-02-02 17:46 ` Victor Stewart
2021-02-02 17:21 ` Pavel Begunkov
2021-02-02 17:30 ` Victor Stewart
2021-02-02 17:45 ` Victor Stewart
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox