public inbox for [email protected]
From: Dylan Yudaken <[email protected]>
To: "[email protected]" <[email protected]>,
	"[email protected]" <[email protected]>
Cc: Kernel Team <[email protected]>,
	"[email protected]" <[email protected]>
Subject: Re: [PATCH for-next 1/4] io_uring: if a linked request has REQ_F_FORCE_ASYNC then run it async
Date: Mon, 30 Jan 2023 10:45:50 +0000
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>

[-- Attachment #1: Type: text/plain, Size: 3187 bytes --]

On Sun, 2023-01-29 at 16:17 -0700, Jens Axboe wrote:
> On 1/29/23 3:57 PM, Jens Axboe wrote:
> > On 1/27/23 6:52 AM, Dylan Yudaken wrote:
> > > REQ_F_FORCE_ASYNC was being ignored for re-queueing linked
> > > requests. Instead obey that flag.
> > > 
> > > Signed-off-by: Dylan Yudaken <[email protected]>
> > > ---
> > >  io_uring/io_uring.c | 8 +++++---
> > >  1 file changed, 5 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
> > > index db623b3185c8..980ba4fda101 100644
> > > --- a/io_uring/io_uring.c
> > > +++ b/io_uring/io_uring.c
> > > @@ -1365,10 +1365,12 @@ void io_req_task_submit(struct io_kiocb *req, bool *locked)
> > >  {
> > >         io_tw_lock(req->ctx, locked);
> > >         /* req->task == current here, checking PF_EXITING is safe */
> > > -       if (likely(!(req->task->flags & PF_EXITING)))
> > > -               io_queue_sqe(req);
> > > -       else
> > > +       if (unlikely(req->task->flags & PF_EXITING))
> > >                 io_req_defer_failed(req, -EFAULT);
> > > +       else if (req->flags & REQ_F_FORCE_ASYNC)
> > > +               io_queue_iowq(req, locked);
> > > +       else
> > > +               io_queue_sqe(req);
> > >  }
> > >  
> > >  void io_req_task_queue_fail(struct io_kiocb *req, int ret)
> > 
> > This one causes a failure for me with test/multicqes_drain.t, which
> > doesn't quite make sense to me (just yet), but it is a reliable
> > timeout.
> 
> OK, quick look and I think this is a bad assumption in the test case.
> It's assuming that a POLL_ADD already succeeded, and hence that a
> subsequent POLL_REMOVE will succeed. But now it's getting ENOENT as
> we can't find it just yet, which means the cancelation itself isn't
> being done. So we just end up waiting for something that doesn't
> happen.
> 
> Or could be an internal race with lookup/issue. In any case, it's
> definitely being exposed by this patch.
> 

That is a bit of an unpleasant test.
Essentially it triggers a pipe, and reads from the pipe immediately
after. The test expects to see a CQE for that trigger, however if
anything ran asynchronously then there is a race between the read and
the poll logic running.
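
The race can be illustrated without io_uring at all. The sketch below is a
plain poll(2) illustration, not the liburing test: readable_now() and
poll_after_write() are hypothetical helper names, and the drain_first flag
is an assumption that stands in for read_pipe() winning the race against a
poll that is issued asynchronously.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/*
 * Zero-timeout poll on fd.
 * Returns 1 if POLLIN is reported, 0 if not, -1 on error.
 */
static int readable_now(int fd)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int ret = poll(&pfd, 1, 0);

	if (ret < 0)
		return -1;
	return ret > 0 && (pfd.revents & POLLIN);
}

/*
 * Write "foo" to a fresh pipe, then poll the read end.
 * drain_first != 0 models the reader consuming the data before the
 * poll gets a chance to run (hypothetical parameter, illustration only).
 * Returns the result of readable_now(), or -1 on error.
 */
static int poll_after_write(int drain_first)
{
	char buf[3];
	int p[2];
	int ready = -1;

	if (pipe(p) < 0)
		return -1;
	if (write(p[1], "foo", 3) != 3)
		goto out;
	if (drain_first && read(p[0], buf, sizeof(buf)) != 3)
		goto out;
	ready = readable_now(p[0]);
out:
	close(p[0]);
	close(p[1]);
	return ready;
}
```

Under these assumptions poll_after_write(0) reports the data as ready, while
poll_after_write(1) reports nothing to read, which corresponds to the poll
request never producing the CQE the test waits for.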

The attached patch fixes the test, but the reason my patches trigger it
is a bit weird.

This occurs on the second loop of the test, after the initial drain.
Essentially ctx->drain_active is still true when the second set of
polls are added, since drain_active is only cleared inside the next
io_drain_req. So then the first poll will have REQ_F_FORCE_ASYNC set.

Previously those FORCE_ASYNC flags were being ignored, but now with
"io_uring: if a linked request has REQ_F_FORCE_ASYNC then run it async"
the requests get sent to the io-wq worker thread, which causes the race.

I wonder if drain_active should actually be cleared earlier, perhaps
before setting the REQ_F_FORCE_ASYNC flag?
The drain logic is pretty complex though, so I am not terribly keen to
start changing it if it's not generally useful.
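
To make the interaction concrete, here is a toy userspace model of the
behaviour described above. It is not kernel code: every toy_* name is
hypothetical, the "stale drain_active forces new requests async" step is
taken from the description in this mail rather than from the source, and
only the branch order of toy_task_submit() mirrors the patched
io_req_task_submit().

```c
#include <assert.h>
#include <stdbool.h>

/* Where a request ends up (hypothetical labels for the three branches). */
enum toy_route { TOY_FAIL, TOY_IOWQ, TOY_INLINE };

struct toy_ctx {
	bool drain_active;	/* stays set until the next drain check */
};

struct toy_req {
	bool force_async;	/* stands in for REQ_F_FORCE_ASYNC */
	bool task_exiting;	/* stands in for PF_EXITING on req->task */
};

/* Assumption from the mail: while drain_active is set, new requests are forced async. */
static void toy_init_req(const struct toy_ctx *ctx, struct toy_req *req)
{
	req->task_exiting = false;
	req->force_async = ctx->drain_active;
}

/* Old behaviour: the force-async flag is ignored on this path. */
static enum toy_route toy_task_submit_old(const struct toy_req *req)
{
	return req->task_exiting ? TOY_FAIL : TOY_INLINE;
}

/* New behaviour: same branch order as the patched io_req_task_submit(). */
static enum toy_route toy_task_submit(const struct toy_req *req)
{
	if (req->task_exiting)
		return TOY_FAIL;
	if (req->force_async)
		return TOY_IOWQ;
	return TOY_INLINE;
}
```

With drain_active left over from the first loop, toy_task_submit_old() still
routes the request inline, whereas toy_task_submit() routes it to the worker
path, which is where the test's ordering assumption breaks.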



[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch; name="patch.diff", Size: 1863 bytes --]

commit d362fb231310a52a79c8b9f72165a708bfd8aa44
Author: Dylan Yudaken <[email protected]>
Date:   Mon Jan 30 01:49:57 2023 -0800

    multicqes_drain: make trigger event wait before reading
    
    trigger_event is used to generate CQEs on the poll requests. However there
    is a race if that poll request is running asynchronously, where the
    read_pipe will complete before the poll is run, and the poll result will
    be that there is no data ready.
    
    Instead sleep and force an io_uring_get_events in order to give the poll a
    chance to run before reading from the pipe.
    
    Signed-off-by: Dylan Yudaken <[email protected]>

diff --git a/test/multicqes_drain.c b/test/multicqes_drain.c
index 3755beec42c7..6c4d5f2ba887 100644
--- a/test/multicqes_drain.c
+++ b/test/multicqes_drain.c
@@ -71,13 +71,15 @@ static void read_pipe(int pipe)
 		perror("read");
 }
 
-static int trigger_event(int p[])
+static int trigger_event(struct io_uring *ring, int p[])
 {
 	int ret;
 	if ((ret = write_pipe(p[1], "foo")) != 3) {
 		fprintf(stderr, "bad write return %d\n", ret);
 		return 1;
 	}
+	usleep(1000);
+	io_uring_get_events(ring);
 	read_pipe(p[0]);
 	return 0;
 }
@@ -236,10 +238,8 @@ static int test_generic_drain(struct io_uring *ring)
 		if (si[i].op != multi && si[i].op != single)
 			continue;
 
-		if (trigger_event(pipes[i]))
+		if (trigger_event(ring, pipes[i]))
 			goto err;
-
-		io_uring_get_events(ring);
 	}
 	sleep(1);
 	i = 0;
@@ -317,13 +317,11 @@ static int test_simple_drain(struct io_uring *ring)
 	}
 
 	for (i = 0; i < 2; i++) {
-		if (trigger_event(pipe1))
+		if (trigger_event(ring, pipe1))
 			goto err;
-		io_uring_get_events(ring);
 	}
-	if (trigger_event(pipe2))
-			goto err;
-	io_uring_get_events(ring);
+	if (trigger_event(ring, pipe2))
+		goto err;
 
 	for (i = 0; i < 2; i++) {
 		sqe[i] = io_uring_get_sqe(ring);

Thread overview: 11+ messages
2023-01-27 13:52 [PATCH for-next 0/4] io_uring: force async only ops to go async Dylan Yudaken
2023-01-27 13:52 ` [PATCH for-next 1/4] io_uring: if a linked request has REQ_F_FORCE_ASYNC then run it async Dylan Yudaken
2023-01-29 22:57   ` Jens Axboe
2023-01-29 23:17     ` Jens Axboe
2023-01-30 10:45       ` Dylan Yudaken [this message]
2023-01-30 15:53         ` Jens Axboe
2023-01-30 16:21           ` Pavel Begunkov
2023-01-27 13:52 ` [PATCH for-next 2/4] io_uring: for requests that require async, force it Dylan Yudaken
2023-01-27 13:52 ` [PATCH for-next 3/4] io_uring: always go async for unsupported fadvise flags Dylan Yudaken
2023-01-27 13:52 ` [PATCH for-next 4/4] io_uring: always go async for unsupported open flags Dylan Yudaken
2023-01-29 22:20 ` [PATCH for-next 0/4] io_uring: force async only ops to go async Jens Axboe
