public inbox for [email protected]
From: Pavel Begunkov <[email protected]>
To: Jens Axboe <[email protected]>, [email protected]
Subject: Re: [PATCH 1/3] io_uring: move to using private ring references
Date: Tue, 15 Aug 2023 23:50:40 +0100
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>

On 8/15/23 22:45, Jens Axboe wrote:
> On 8/15/23 11:45 AM, Pavel Begunkov wrote:
>> On 8/11/23 18:12, Jens Axboe wrote:
>>> io_uring currently uses percpu refcounts for the ring reference. This
>>> works fine, but exiting a ring requires an RCU grace period to lapse
>>> and this slows down ring exit quite a lot.
>>>
>>> Add a basic per-cpu counter for our references instead, and use that.
>>> This is in preparation for doing a sync wait on any request (notably
>>> file) references on ring exit. As we're going to be waiting on ctx refs
>>> going away as well with that, the RCU grace period wait becomes a
>>> noticeable slowdown.
>>
>> How does it work?
>>
>> - What prevents io_ring_ref_maybe_done() from miscalculating and either
>> 1) firing while there are refs or
>> 2) not triggering when we put down all refs?
>> E.g. percpu_ref relies on atomic counting after switching from
>> percpu mode.
> 
> I'm open to critique of it, do you have any specific worries? The
> counters are per-cpu, and whenever the REF_DEAD_BIT is set, we sum on
> that drop. We should not be grabbing references post that, and any drop

Well, my worry is concurrent modifications and CPU caches:

CPU0                   | CPU1
queue tw // task 1     |
                       | close(ring_fd); // task 2
                       | exit_work() -> kill_refs();
execute tw             |
   handle_tw_list()    |
     get_ref()         |

Sounds like this will try to grab a ref after REF_DEAD_BIT is set.
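
To make the ordering concrete, here is a rough userspace sketch that
replays the diagram in program order. It is not the patch code: struct
ring_ref, ref_get(), ref_kill() and the starting count of one ref held
by the request are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 2

/* Hypothetical stand-in for the ring ref; not the layout from the patch. */
struct ring_ref {
	long count[NR_CPUS];  /* per-CPU counters, summed only once dead */
	bool dead;            /* models REF_DEAD_BIT */
	int gets_after_dead;  /* bookkeeping for this sketch only */
};

static void ref_get(struct ring_ref *r, int cpu)
{
	if (r->dead)
		r->gets_after_dead++;  /* the case that "should not" happen */
	r->count[cpu]++;
}

static void ref_kill(struct ring_ref *r)
{
	r->dead = true;
}

/* Replays the diagram: queue tw, kill the refs, then run the tw. */
static int replay_tw_after_kill(void)
{
	/* Assumption: the queued request already holds one ref, taken on CPU0. */
	struct ring_ref r = { .count = { 1, 0 }, .dead = false };

	/* CPU0: queue tw (task 1); the ref is not touched at this point */
	ref_kill(&r);    /* CPU1: close(ring_fd) -> exit_work() -> kill_refs() */
	ref_get(&r, 0);  /* CPU0: execute tw -> handle_tw_list() -> get_ref() */

	assert(r.count[0] == 2);
	return r.gets_after_dead;
}
```

In this model replay_tw_after_kill() reports one get that landed after
the dead bit was set, which is the case the scheme assumes away.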

> will just sum the counters.

CPU0 (io-wq)                | CPU1
                            | exit_work() -> kill
io_req_complete_post()      | cancel request
   put_ref()                |   put_ref()

This one seems possible as well. Then let's say those 2
refs we're putting are the last ones. Both CPUs decrement their
own counter, but each still sums the counters to 1 because it
doesn't see the other CPU's decrement yet => nobody ever
frees the ring.
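
A small deterministic model of that interleaving, as a sketch only. It
assumes one ref left on each CPU and models "because of caches" by
handing each put a stale snapshot of the other CPU's counter; racy_put()
and struct put_result are made-up names, not anything from the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 2

/* What one CPU computes when it drops a ref after the dead bit is set. */
struct put_result {
	long sum;    /* sum of the counters as seen by this CPU */
	bool frees;  /* true if this CPU believes it dropped the last ref */
};

/*
 * Model of a put with no ordering between CPUs: the CPU updates its own
 * counter, but reads the other counter from a stale snapshot.
 */
static struct put_result racy_put(long *own, long stale_other)
{
	struct put_result res;

	(*own)--;
	res.sum = *own + stale_other;
	res.frees = (res.sum == 0);
	return res;
}

/* Returns how many CPUs decided to free the ring. */
static int count_frees_racy(void)
{
	long count[NR_CPUS] = { 1, 1 };  /* assumption: last two refs, one per CPU */
	long seen_by_cpu0 = count[1];    /* CPU0's stale view of CPU1's counter */
	long seen_by_cpu1 = count[0];    /* CPU1's stale view of CPU0's counter */

	struct put_result a = racy_put(&count[0], seen_by_cpu0);
	struct put_result b = racy_put(&count[1], seen_by_cpu1);

	assert(count[0] + count[1] == 0);  /* the true total is zero ... */
	assert(a.sum == 1 && b.sum == 1);  /* ... but both CPUs computed 1 */
	return (int)a.frees + (int)b.frees;
}
```

Here count_frees_racy() returns 0: the real total hits zero, yet neither
side concludes it dropped the last ref, so nothing would free the ring.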

I also think, if we combine these 2 scenarios, we get a
concurrent put and get, which might result in a use-after-free.

>> - What contexts can it be used from? Task context only? I'll argue we
>> want to use it in [soft]irq for the likes of *task_work_add().
> 
> We don't manipulate ctx refs from non-task context right now, or from
> hard/soft IRQ. On the task_work side, the request already has a
> reference to the ctx. Not sure why you'd want to add more. In any case,
> I prefer not to deal with hypotheticals, just the code we have now.

The request's reference is not enough to protect the ctx, see [1].
Yes, I optimised it later with [2] (which is a bit ugly and
confusing), but it's not a hypothetical.

[1] commit 9ffa13ff78a0a55df968a72d6f0ebffccee5c9f4
     io_uring: pin context while queueing deferred tw
[2] commit d73a572df24661851465c821d33c03e70e4b68e5
     io_uring: optimize local tw add ctx pinning


-- 
Pavel Begunkov


Thread overview: 7+ messages
2023-08-11 17:12 [PATCHSET 0/3] Ensure file refs are dropped on io_uring fd release Jens Axboe
2023-08-11 17:12 ` [PATCH 1/3] io_uring: move to using private ring references Jens Axboe
2023-08-15 17:45   ` Pavel Begunkov
2023-08-15 21:45     ` Jens Axboe
2023-08-15 22:50       ` Pavel Begunkov [this message]
2023-08-11 17:12 ` [PATCH 2/3] io_uring: consider ring dead once the ref is marked dying Jens Axboe
2023-08-11 17:12 ` [PATCH 3/3] io_uring: wait for cancelations on final ring put Jens Axboe
