From: Pavel Begunkov <asml.silence@gmail.com>
To: Yuhao Jiang <danisjiang@gmail.com>, Jens Axboe <axboe@kernel.dk>
Cc: io-uring@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] io_uring/rsrc: fix RLIMIT_MEMLOCK bypass via compound page accounting
Date: Wed, 14 Jan 2026 14:10:42 +0000
Message-ID: <0adb508f-480d-4bfc-b861-3cf42e87bee1@gmail.com>
In-Reply-To: <afe7d084-a254-46a3-889b-a136dc8f4fbd@gmail.com>
On 1/13/26 19:44, Pavel Begunkov wrote:
> On 1/9/26 03:02, Yuhao Jiang wrote:
>> Hi Jens, Pavel, and all,
>>
>> Just a gentle follow-up on this patch below.
>> Please let me know if there are any concerns or if changes are needed.
>
> I'm pretty sure this will break with buffer sharing / cloning. I'd
> be tempted to remove all this cross-buffer accounting logic
> and overestimate it, the current accounting is not sane.
> Otherwise, it'll likely need some proxy object shared b/w
> buffers or some other overcomplicated solution.

Another way would be to double account cloned buffers and then
have your patch, which combines overaccounting with the ugliness
of full buffer table walks.
>> On Wed, Dec 17, 2025 at 9:00 PM Yuhao Jiang <danisjiang@gmail.com> wrote:
>>>
>>> When multiple registered buffers share the same compound page, only the
>>> first buffer accounts for the memory via io_buffer_account_pin(). The
>>> subsequent buffers skip accounting since headpage_already_acct() returns
>>> true.
>>>
>>> When the first buffer is unregistered, the accounting is decremented,
>>> but the compound page remains pinned by the remaining buffers. This
>>> creates a state where pinned memory is not properly accounted against
>>> RLIMIT_MEMLOCK.
>>>
>>> On systems with HugeTLB pages pre-allocated, an unprivileged user can
>>> exploit this to pin memory beyond RLIMIT_MEMLOCK by cycling buffer
>>> registrations. The bypass amount is proportional to the number of
>>> available huge pages, potentially allowing gigabytes of memory to be
>>> pinned while the kernel accounting shows near-zero.
>>>
>>> Fix this by recalculating the actual pages to unaccount when unmapping
>>> a buffer. For regular pages, always unaccount. For compound pages, only
>>> unaccount if no other registered buffer references the same compound
>>> page. This ensures the accounting persists until the last buffer
>>> referencing the compound page is released.
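A minimal userspace sketch of the accounting described above, not
kernel code. All toy_* names are hypothetical, each buffer is assumed
to sit on exactly one compound page, and HPAGE_NR assumes a 2 MiB huge
page made of 4 KiB base pages. Buffer cloning is not modelled, and
head_in_use() is the full table walk that the fix relies on.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_BUFS 8
#define HPAGE_NR 512UL /* assumption: 2 MiB huge page / 4 KiB base pages */

/* Hypothetical stand-in for a registered buffer on one compound page. */
struct toy_buf {
	bool in_use;
	int head;           /* id of the compound (head) page it pins */
	unsigned long acct; /* pages this buffer charged at registration */
};

struct toy_ctx {
	struct toy_buf bufs[MAX_BUFS];
	unsigned long locked; /* models the counter checked against RLIMIT_MEMLOCK */
};

/* Walk the whole table: does any buffer other than 'skip' pin 'head'? */
static bool head_in_use(const struct toy_ctx *ctx, int head, int skip)
{
	for (int i = 0; i < MAX_BUFS; i++)
		if (i != skip && ctx->bufs[i].in_use && ctx->bufs[i].head == head)
			return true;
	return false;
}

/* Registration: only the first buffer on a head page is charged. */
static void toy_register(struct toy_ctx *ctx, int slot, int head)
{
	assert(slot >= 0 && slot < MAX_BUFS);
	struct toy_buf *b = &ctx->bufs[slot];

	b->acct = head_in_use(ctx, head, slot) ? 0 : HPAGE_NR;
	b->head = head;
	b->in_use = true;
	ctx->locked += b->acct;
}

/* Behaviour described as buggy: drop whatever this buffer charged. */
static void toy_unregister_old(struct toy_ctx *ctx, int slot)
{
	struct toy_buf *b = &ctx->bufs[slot];

	ctx->locked -= b->acct;
	b->in_use = false;
}

/* Proposed behaviour: uncharge only when no other buffer pins the page. */
static void toy_unregister_new(struct toy_ctx *ctx, int slot)
{
	struct toy_buf *b = &ctx->bufs[slot];

	if (!head_in_use(ctx, b->head, slot))
		ctx->locked -= HPAGE_NR;
	b->in_use = false;
}

/* Two buffers on head page 42, then unregister the first one.
 * Returns the pages still accounted while slot 1 keeps the page pinned. */
unsigned long toy_scenario(bool fixed)
{
	struct toy_ctx ctx = {0};

	toy_register(&ctx, 0, 42);
	toy_register(&ctx, 1, 42);
	if (fixed)
		toy_unregister_new(&ctx, 0);
	else
		toy_unregister_old(&ctx, 0);
	return ctx.locked;
}
```

In this model toy_scenario(false) leaves nothing accounted although
slot 1 still pins the page, while toy_scenario(true) keeps the full
HPAGE_NR charge until the last buffer on that page goes away.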
>>>
>>> Reported-by: Yuhao Jiang <danisjiang@gmail.com>
>>> Fixes: 57bebf807e2a ("io_uring/rsrc: optimise registered huge pages")
>
> That's not the right commit, the accounting is ancient, should
> get blamed somewhere around first commits that added registered
> buffers.
Turns out it came just a bit later:

commit de2939388be564836b06f0f06b3787bdedaed822
Author: Jens Axboe <axboe@kernel.dk>
Date:   Thu Sep 17 16:19:16 2020 -0600

    io_uring: improve registered buffer accounting for huge pages

--
Pavel Begunkov