Re: [PATCH for-5.15] io_uring: fix lacking of protection for compl_nr

From: Jens Axboe <[email protected]>
To: Pavel Begunkov <[email protected]>,
	Hao Xu <[email protected]>
Cc: [email protected], Joseph Qi <[email protected]>
Subject: Re: [PATCH for-5.15] io_uring: fix lacking of protection for compl_nr
Date: Fri, 20 Aug 2021 21:10:25 -0600
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>

On 8/20/21 4:59 PM, Pavel Begunkov wrote:
> On 8/20/21 11:46 PM, Jens Axboe wrote:
>> On 8/20/21 4:41 PM, Pavel Begunkov wrote:
>>> On 8/20/21 11:30 PM, Jens Axboe wrote:
>>>> On 8/20/21 4:28 PM, Pavel Begunkov wrote:
>>>>> On 8/20/21 11:09 PM, Jens Axboe wrote:
>>>>>> On 8/20/21 3:32 PM, Pavel Begunkov wrote:
>>>>>>> On 8/20/21 9:39 PM, Hao Xu wrote:
>>>>>>>> On 2021/8/21 2:59 AM, Pavel Begunkov wrote:
>>>>>>>>> On 8/20/21 7:40 PM, Hao Xu wrote:
>>>>>>>>>> compl_nr in ctx_flush_and_put() is not protected by uring_lock, which
>>>>>>>>>> may cause problems when it is accessed concurrently.
>>>>>>>>>
>>>>>>>>> Did you hit any problem? It sounds like it should be fine as is:
>>>>>>>>>
>>>>>>>>> The trick is that it's only responsible for flushing requests added
>>>>>>>>> during execution of the current call to tctx_task_work(), and those
>>>>>>>>> are naturally synchronised with the current task. All other potentially
>>>>>>>>> enqueued requests will be someone else's responsibility.
>>>>>>>>>
>>>>>>>>> So, if nobody flushed requests, we're perfectly in sync. If we see
>>>>>>>>> 0 there, but actually enqueued a request, it means someone
>>>>>>>>> actually flushed it after the request had been added.
>>>>>>>>>
>>>>>>>>> It probably needs a more formal explanation with happens-before
>>>>>>>>> and so on.
>>>>>>>> I should put more detail in the commit message. The thing is:
>>>>>>>> say compl_nr > 0
>>>>>>>>
>>>>>>>>   ctx_flush_and_put                  other context
>>>>>>>>    if (compl_nr)                      get mutex
>>>>>>>>                                       compl_nr > 0
>>>>>>>>                                       do flush
>>>>>>>>                                           compl_nr = 0
>>>>>>>>                                       release mutex
>>>>>>>>         get mutex
>>>>>>>>            do flush (*)
>>>>>>>>         release mutex
>>>>>>>>
>>>>>>>> at the (*) place, we do a bunch of unnecessary work; moreover, we
>>>>>>>
>>>>>>> I wouldn't care about overhead, that shouldn't be much
>>>>>>>
>>>>>>>> call io_cqring_ev_posted() which I think we shouldn't.
>>>>>>>
>>>>>>> IMHO, users should expect spurious io_cqring_ev_posted(),
>>>>>>> though there were some eventfd users complaining before, so
>>>>>>> for them we can do
>>>>>>
>>>>>> It does sometimes cause issues, see:
>>>>>
>>>>> I'm used to locking possibly ending up in spurious wakeups. It may be
>>>>> different for eventfd, but considering that we do batch
>>>>> completions and so might be calling it only once per multiple
>>>>> CQEs, it shouldn't be.
>>>>
>>>> The wakeups are fine, it's the ev increment that's causing some issues.
>>>
>>> If userspace doesn't expect that eventfd may diverge from the
>>> number of posted CQEs, we need something like below. The weird part
>>> is that it looks like nobody complained about this one, even though it
>>> should be happening pretty often.
>>
>> That wasn't the issue we ran into, it was more the fact that eventfd
>> would indicate that something had been posted, when nothing had.
>> We don't need eventfd notifications to be == number of posted events,
>> just that if the eventfd notification is incremented, there should be new
>> events there.
> 
> It's just so commonly mentioned, that for me expecting spurious
> events/wakeups is a default. Do we have it documented anywhere?

Not documented to my knowledge, and I wasn't really aware of this being
a problem until it was reported and that above referenced commit was
done to fix it. Might be worthwhile to put a comment in ev_posted() to
detail this, I'll do that for 5.15.

-- 
Jens Axboe
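
The interleaving discussed in the quoted thread, and the property described
above (an eventfd increment should imply new events), can be sketched as a
small userspace model. This is only an illustration under stated assumptions:
struct model_ctx, MODEL_CTX_INIT, model_flush_locked() and
model_flush_and_signal() are hypothetical names, the pthread mutex stands in
for uring_lock, and ev_count stands in for the eventfd counter. It is not the
kernel code and not the fix that was applied.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the ring context; NOT the kernel's struct. */
struct model_ctx {
	pthread_mutex_t lock;      /* stands in for uring_lock */
	unsigned int compl_nr;     /* completions batched, not yet flushed */
	unsigned int cqes_posted;  /* completions made visible so far */
	uint64_t ev_count;         /* stands in for the eventfd counter */
};

#define MODEL_CTX_INIT { PTHREAD_MUTEX_INITIALIZER, 0, 0, 0 }

/* Caller must hold ctx->lock. Returns how many completions were flushed. */
static unsigned int model_flush_locked(struct model_ctx *ctx)
{
	unsigned int n;

	assert(ctx != NULL);
	n = ctx->compl_nr;
	ctx->cqes_posted += n;
	ctx->compl_nr = 0;
	return n;
}

/*
 * An unlocked "if (compl_nr)" check made before calling this may be stale:
 * another context can flush in between. The count is therefore re-read
 * under the lock, and the event counter is bumped only if this call
 * actually posted something.
 */
static unsigned int model_flush_and_signal(struct model_ctx *ctx)
{
	unsigned int n;

	pthread_mutex_lock(&ctx->lock);
	n = model_flush_locked(ctx);
	if (n)
		ctx->ev_count++;   /* signal only when new events exist */
	pthread_mutex_unlock(&ctx->lock);
	return n;
}
```

In this model, a second caller that acted on a stale non-zero compl_nr finds
nothing to flush once it holds the lock, so it returns 0 and leaves ev_count
untouched. The number of increments can still be lower than the number of
posted completions, which matches the requirement stated in the thread.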

Thread overview: 13+ messages
2021-08-20 18:40 [PATCH for-5.15] io_uring: fix lacking of protection for compl_nr Hao Xu
2021-08-20 18:59 ` Pavel Begunkov
2021-08-20 20:39   ` Hao Xu
2021-08-20 21:32     ` Pavel Begunkov
2021-08-20 22:07       ` Hao Xu
2021-08-20 22:09       ` Jens Axboe
2021-08-20 22:21         ` Hao Xu
2021-08-20 22:28         ` Pavel Begunkov
2021-08-20 22:30           ` Jens Axboe
2021-08-20 22:41             ` Pavel Begunkov
2021-08-20 22:46               ` Jens Axboe
2021-08-20 22:59                 ` Pavel Begunkov
2021-08-21  3:10                   ` Jens Axboe [this message]
