public inbox for io-uring@vger.kernel.org
From: Jens Axboe <axboe@kernel.dk>
To: Yuhao Jiang <danisjiang@gmail.com>
Cc: Pavel Begunkov <asml.silence@gmail.com>,
	io-uring@vger.kernel.org, linux-kernel@vger.kernel.org,
	stable@vger.kernel.org
Subject: Re: [PATCH v2] io_uring/rsrc: fix RLIMIT_MEMLOCK bypass by removing cross-buffer accounting
Date: Tue, 20 Jan 2026 05:04:10 -0700
Message-ID: <c019c249-ae7c-4034-9d1a-e4b9e200453a@kernel.dk>
In-Reply-To: <CAHYQsXQK4nKu+fcni71__=V241RN=QxUHrvNQMQtPMzeL_z=BA@mail.gmail.com>

On 1/20/26 12:05 AM, Yuhao Jiang wrote:
> Hi Jens,
> 
> On Mon, Jan 19, 2026 at 5:40 PM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> On 1/19/26 4:34 PM, Yuhao Jiang wrote:
>>> On Mon, Jan 19, 2026 at 11:03 AM Jens Axboe <axboe@kernel.dk> wrote:
>>>>
>>>> On 1/19/26 12:10 AM, Yuhao Jiang wrote:
>>>>> The trade-off is that memory accounting may be overestimated when
>>>>> multiple buffers share compound pages, but this is safe and prevents
>>>>> the security issue.
>>>>
>>>> I'd be worried that this would break existing setups. We obviously need
>>>> to get the unmap accounting correct, but in terms of practicality, any
>>>> user of registered buffers will have had to bump distro limits manually
>>>> anyway, and in that case it's usually just set very high. Otherwise
>>>> there's very little you can do with it.
>>>>
>>>> How about something else entirely - just track the accounted pages on
>>>> the side. If we ref those, then we can ensure that if a huge page is
>>>> accounted, it's only unaccounted when all existing "users" of it have
>>>> gone away. That means if you drop parts of it, it'll remain accounted.
>>>>
>>>> Something totally untested like the below... Yes it's not a trivial
>>>> amount of code, but it is actually fairly trivial code.
>>>
>>> Thanks, this approach makes sense. I'll send a v3 based on this.
>>
>> Great, thanks! I think the key is tracking this on the side, and then
>> a ref to tell when it's safe to unaccount it. The rest is just
>> implementation details.
>>
>> --
>> Jens Axboe
>>
> 
> I've been implementing the xarray-based ref tracking approach for v3.
> While working on it, I discovered an issue with buffer cloning.
> 
> If ctx1 has two buffers sharing a huge page, ctx1->hpage_acct[page] = 2.
> Clone to ctx2, now both have a refcount of 2. On cleanup both hit zero
> and unaccount, so we double-unaccount and user->locked_vm goes negative.
> 
> The per-context xarray can't coordinate across clones - each context
> tracks its own refcount independently. I think we either need a global
> xarray (shared across all contexts), or just go back to v2. What do
> you think?

Ah right, yes that is obviously true. Honestly having a shared xarray
for this is probably even better, rather than one per ctx. Should not
change the code very much over the existing test patch. And it won't
consume memory on a per-ring basis. Downside is of course the need
to synchronize updates, but should not be a big deal as accounting
isn't a fast path. IMHO, just go that route.
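
[Illustrative sketch, not part of the posted patch: a user-space C model of
the shared-table idea discussed above. One table is shared by all rings, a
huge page is charged when its refcount goes 0 -> 1 and uncharged when it goes
1 -> 0, and every update happens under a single lock. All names here
(hpage_ref, hpage_account, hpage_unaccount, the fixed-size array, the
atomic_flag spinlock) are hypothetical stand-ins; the real code would
presumably use an xarray and a kernel lock. It also assumes that each cloned
buffer takes its own reference on the page.]

```c
#include <stdatomic.h>
#include <stddef.h>

#define MAX_HPAGES 64   /* arbitrary capacity for this model */

/* One entry per accounted huge page, shared by every ring (hypothetical). */
struct hpage_ref {
    unsigned long pfn;    /* key: head page frame number */
    unsigned int refs;    /* registered buffers currently using the page */
};

static struct hpage_ref hpage_table[MAX_HPAGES];
static atomic_flag hpage_lock = ATOMIC_FLAG_INIT; /* stand-in for a real lock */
static long locked_vm;    /* stand-in for user->locked_vm, in pages */

static void table_lock(void)
{
    while (atomic_flag_test_and_set_explicit(&hpage_lock, memory_order_acquire))
        ;   /* spin; accounting is not a fast path */
}

static void table_unlock(void)
{
    atomic_flag_clear_explicit(&hpage_lock, memory_order_release);
}

/* Caller must hold the table lock. */
static struct hpage_ref *hpage_find(unsigned long pfn)
{
    for (size_t i = 0; i < MAX_HPAGES; i++)
        if (hpage_table[i].refs && hpage_table[i].pfn == pfn)
            return &hpage_table[i];
    return NULL;
}

/* Take a reference; charge nr_pages only on the 0 -> 1 transition.
 * Returns 0 on success, -1 if the table is full. */
int hpage_account(unsigned long pfn, long nr_pages)
{
    int ret = 0;

    table_lock();
    struct hpage_ref *e = hpage_find(pfn);
    if (e) {
        e->refs++;              /* already charged, just add a user */
    } else {
        ret = -1;
        for (size_t i = 0; i < MAX_HPAGES; i++) {
            if (!hpage_table[i].refs) {
                hpage_table[i].pfn = pfn;
                hpage_table[i].refs = 1;
                locked_vm += nr_pages;
                ret = 0;
                break;
            }
        }
    }
    table_unlock();
    return ret;
}

/* Drop a reference; uncharge only on the 1 -> 0 transition. */
void hpage_unaccount(unsigned long pfn, long nr_pages)
{
    table_lock();
    struct hpage_ref *e = hpage_find(pfn);
    if (e && --e->refs == 0)
        locked_vm -= nr_pages;
    table_unlock();
}

/* Read the modelled locked_vm counter under the lock. */
long hpage_locked_vm(void)
{
    table_lock();
    long v = locked_vm;
    table_unlock();
    return v;
}
```

[In this model the clone case from the quoted message works out as follows,
assuming a 2 MiB huge page made of 512 4 KiB pages: ctx1 registers two
buffers on the page (refs = 2, 512 pages charged once), the clone into ctx2
takes two more references (refs = 4, still 512 pages charged), and the charge
is dropped exactly once, when the fourth reference goes away. The counter
cannot go negative because there is only one refcount per page.]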

-- 
Jens Axboe



Thread overview: 22+ messages
2026-01-19  7:10 [PATCH v2] io_uring/rsrc: fix RLIMIT_MEMLOCK bypass by removing cross-buffer accounting Yuhao Jiang
2026-01-19 17:03 ` Jens Axboe
2026-01-19 23:34   ` Yuhao Jiang
2026-01-19 23:40     ` Jens Axboe
2026-01-20  7:05       ` Yuhao Jiang
2026-01-20 12:04         ` Jens Axboe [this message]
2026-01-20 12:05         ` Pavel Begunkov
2026-01-20 17:03           ` Jens Axboe
2026-01-20 21:45             ` Pavel Begunkov
2026-01-21 14:58               ` Jens Axboe
2026-01-22 11:43                 ` Pavel Begunkov
2026-01-22 17:47                   ` Jens Axboe
2026-01-22 21:51                     ` Pavel Begunkov
2026-01-23 14:26                       ` Pavel Begunkov
2026-01-23 14:50                         ` Jens Axboe
2026-01-23 15:04                           ` Jens Axboe
2026-01-23 16:52                             ` Jens Axboe
2026-01-24 11:04                               ` Pavel Begunkov
2026-01-24 15:14                                 ` Jens Axboe
2026-01-24 15:55                                   ` Jens Axboe
2026-01-24 16:30                                     ` Pavel Begunkov
2026-01-24 18:44                                     ` Jens Axboe
