public inbox for [email protected]
 help / color / mirror / Atom feed
From: Jens Axboe <[email protected]>
To: Jann Horn <[email protected]>
Cc: Peter Zijlstra <[email protected]>,
	io-uring <[email protected]>,
	stable <[email protected]>, Josef <[email protected]>,
	Oleg Nesterov <[email protected]>
Subject: Re: [PATCH 2/2] io_uring: use TWA_SIGNAL for task_work if the task isn't running
Date: Mon, 10 Aug 2020 15:06:49 -0600	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <CAG48ez0+=+Q0tjdFxjbbZbZJNkimYL9Bd5odr0T9oWwty6qgoQ@mail.gmail.com>

On 8/10/20 2:35 PM, Jann Horn wrote:
> On Mon, Aug 10, 2020 at 10:25 PM Jens Axboe <[email protected]> wrote:
>> On 8/10/20 2:13 PM, Jens Axboe wrote:
>>>> Would it be clearer to write it like so perhaps?
>>>>
>>>>      /*
>>>>       * Optimization; when the task is RUNNING we can do with a
>>>>       * cheaper TWA_RESUME notification because,... <reason goes
>>>>       * here>. Otherwise do the more expensive, but always correct
>>>>       * TWA_SIGNAL.
>>>>       */
>>>>      if (READ_ONCE(tsk->state) == TASK_RUNNING) {
>>>>              __task_work_notify(tsk, TWA_RESUME);
>>>>              if (READ_ONCE(tsk->state) == TASK_RUNNING)
>>>>                      return;
>>>>      }
>>>>      __task_work_notify(tsk, TWA_SIGNAL);
>>>>      wake_up_process(tsk);
>>>
>>> Yeah that is easier to read, wasn't a huge fan of the loop since it's
>>> only a single retry kind of condition. I'll adopt this suggestion,
>>> thanks!
>>
>> Re-write it a bit on top of that, just turning it into two separate
>> READ_ONCE, and added appropriate comments. For the SQPOLL case, the
>> wake_up_process() is enough, so we can clean up that if/else.
>>
>> https://git.kernel.dk/cgit/linux-block/commit/?h=io_uring-5.9&id=49bc5c16483945982cf81b0109d7da7cd9ee55ed
> 
> I think I'm starting to understand the overall picture here, and I
> think if my understanding is correct, your solution isn't going to
> work properly.
> 
> My understanding of the scenario you're trying to address is:
> 
>  - task A starts up io_uring
>  - task A tells io_uring to bump the counter of an eventfd E when work
> has been completed
>  - task A submits some work ("read a byte from file descriptor X", or
> something like that)
>  - io_uring internally starts an asynchronous I/O operation, with a callback C
>  - task A calls read(E, &counter, sizeof(counter)) to wait for events
> to be processed
>  - the async I/O operation finishes, C is invoked, and C schedules
> task_work for task A
> 
> And here you run into a deadlock, because the task_work will only run
> when task A returns from the syscall, but the syscall will only return
> once the task_work is executing and has finished the I/O operation.
> 
> 
> If that is the scenario you're trying to solve here (where you're
> trying to force a task that's in the middle of some syscall that's
> completely unrelated to io_uring to return back to syscall context), I
> don't think this will work: It might well be that the task has e.g.
> just started entering the read() syscall, and is *about to* block, but
> is currently still running.

Your understanding of the scenario appears to be correct, and as far as
I can tell, also your analysis of why the existing approach doesn't
fully close it. You're right in that the task could currently be on its
way to blocking, but still running. And for that case, TWA_RESUME isn't
going to cut it.

Ugh. This basically means I have to use TWA_SIGNAL regardless of state,
unless SQPOLL is used. That's not optimal.

Alternatively:

if (tsk->state != TASK_RUNNING || task_in_kernel(tsk))
	notify = TWA_SIGNAL;
else
	notify = TWA_RESUME;

should work as far as I can tell, but I don't even know if there's a
reliable way to do task_in_kernel(). But I suspect this kind of check
would still save the day, as we're not really expecting the common case
to be that the task is in the kernel on the way to blocking. And it'd be
kind of annoying to have to cater to that scenario by slowing down the
fast path.

Suggestions?

-- 
Jens Axboe


  reply	other threads:[~2020-08-10 21:06 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-08 18:34 [PATCHSET 0/2] io_uring: use TWA_SIGNAL more carefully Jens Axboe
2020-08-08 18:34 ` [PATCH 1/2] kernel: split task_work_add() into two separate helpers Jens Axboe
2020-08-10 11:37   ` peterz
2020-08-10 15:01     ` Jens Axboe
2020-08-10 15:28       ` peterz
2020-08-10 17:51       ` Jens Axboe
2020-08-10 19:53         ` Peter Zijlstra
2020-08-08 18:34 ` [PATCH 2/2] io_uring: use TWA_SIGNAL for task_work if the task isn't running Jens Axboe
2020-08-10 11:42   ` peterz
2020-08-10 15:02     ` Jens Axboe
2020-08-10 19:21       ` Jens Axboe
2020-08-10 20:12         ` Peter Zijlstra
2020-08-10 20:13           ` Jens Axboe
2020-08-10 20:25             ` Jens Axboe
2020-08-10 20:32               ` Peter Zijlstra
2020-08-10 20:35                 ` Jens Axboe
2020-08-10 20:35               ` Jann Horn
2020-08-10 21:06                 ` Jens Axboe [this message]
2020-08-10 21:10                   ` Peter Zijlstra
2020-08-10 21:12                     ` Jens Axboe
2020-08-10 21:26                       ` Jann Horn
2020-08-10 21:28                         ` Jens Axboe
2020-08-10 22:01                           ` Jens Axboe
2020-08-10 22:41                             ` Jann Horn
2020-08-11  1:25                               ` Jens Axboe
2020-08-11  6:45                               ` Oleg Nesterov
2020-08-11  6:56                                 ` Peter Zijlstra
2020-08-11  7:14                                   ` Oleg Nesterov
2020-08-11  7:26                                     ` Oleg Nesterov
2020-08-11  7:49                                       ` Peter Zijlstra
2020-08-11  7:45                                     ` Peter Zijlstra
2020-08-11  8:10                                       ` Oleg Nesterov
2020-08-11 13:06                                         ` Jens Axboe
2020-08-11 14:05                                           ` Oleg Nesterov
2020-08-11 14:12                                             ` Jens Axboe
2020-08-10 21:27                       ` Jens Axboe
2020-08-10 20:16           ` Jens Axboe
2020-08-13 16:25   ` Sasha Levin
2020-08-19 23:57   ` Sasha Levin
2020-08-19 23:59     ` Jens Axboe
2020-08-20  0:02       ` Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox