public inbox for [email protected]
 help / color / mirror / Atom feed
From: Jens Axboe <[email protected]>
To: "J. Hanne" <[email protected]>, [email protected]
Subject: Re: Memory ordering description in io_uring.pdf
Date: Wed, 21 Sep 2022 19:54:30 -0600	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>

On 9/18/22 10:56 AM, J. Hanne wrote:
> Hi,
> 
> I have a couple of questions regarding the necessity of including memory
> barriers when using io_uring, as outlined in
> https://kernel.dk/io_uring.pdf. I'm fine with using liburing, but still I
> do want to understand what is going on behind the scenes, so any comment
> would be appreciated.

In terms of the barriers, that doc is somewhat outdated...

> Firstly, I wonder why memory barriers are required at all, when NOT using
> polled mode. Because requiring them in non-polled mode somehow implies that:
> - Memory re-ordering occurs across system-call boundaries (i.e. when
>   submitting, the tail write could happen after the io_uring_enter
>   syscall?!)
> - CPU data dependency checks do not work
> So, are memory barriers really required when just using a simple
> loop around io_uring_enter with completely synchronous processing?

No, I don't beleive that they are. The exception is SQPOLL, as you mention,
as there's not necessarily a syscall involved with that.

> Secondly, the examples in io_uring.pdf suggest that checking completion
> entries requires a read_barrier and a write_barrier and submitting entries
> requires *two* write_barriers. Really?
> 
> My expectation would be, just as with "normal" inter-thread userspace ipc,
> that plain store-release and load-acquire semantics are sufficient, e.g.: 
> - For reading completion entries:
> -- first read the CQ ring head (without any ordering enforcement)
> -- then use __atomic_load(__ATOMIC_ACQUIRE) to read the CQ ring tail
> -- then use __atomic_store(__ATOMIC_RELEASE) to update the CQ ring head
> - For submitting entries:
> -- first read the SQ ring tail (without any ordering enforcement)
> -- then use __atomic_load(__ATOMIC_ACQUIRE) to read the SQ ring head
> -- then use __atomic_store(__ATOMIC_RELEASE) to update the SQ ring tail
> Wouldn't these be sufficient?!

Please check liburing to see what that does. Would be interested in
your feedback (and patches!). Largely x86 not caring too much about
these have meant that I think we've erred on the side of caution
on that front.

> Thirdly, io_uring.pdf and
> https://github.com/torvalds/linux/blob/master/io_uring/io_uring.c seem a
> little contradicting, at least from my reading:
> 
> io_uring.pdf, in the completion entry example:
> - Includes a read_barrier() **BEFORE** it reads the CQ ring tail
> - Include a write_barrier() **AFTER** updating CQ head
> 
> io_uring.c says on completion entries:
> - **AFTER** the application reads the CQ ring tail, it must use an appropriate
>   smp_rmb() [...].
> - It also needs a smp_mb() **BEFORE** updating CQ head [...].
> 
> io_uring.pdf, in the submission entry example:
> - Includes a write_barrier() **BEFORE** updating the SQ tail
> - Includes a write_barrier() **AFTER** updating the SQ tail
> 
> io_uring.c says on submission entries:
> - [...] the application must use an appropriate smp_wmb() **BEFORE**
>   writing the SQ tail
>   (this matches io_uring.pdf)
> - And it needs a barrier ordering the SQ head load before writing new
>   SQ entries
>   
> I know, io_uring.pdf does mention that the memory ordering description
> is simplified. So maybe this is the whole explanation for my confusion?

The canonical resource at this point is the kernel code, as some of
the revamping of the memory ordering happened way later than when
that doc was written. Would be nice to get it updated at some point.

-- 
Jens Axboe



  reply	other threads:[~2022-09-22  1:54 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-18 16:56 Memory ordering description in io_uring.pdf J. Hanne
2022-09-22  1:54 ` Jens Axboe [this message]
2022-09-25 10:34   ` J. Hanne
2022-09-25 12:03     ` J. Hanne

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox