From: Yuhao Jiang <danisjiang@gmail.com>
To: Jens Axboe <axboe@kernel.dk>, Pavel Begunkov <asml.silence@gmail.com>
Cc: io-uring@vger.kernel.org, linux-kernel@vger.kernel.org,
	stable@vger.kernel.org, Yuhao Jiang <danisjiang@gmail.com>
Subject: [PATCH v2] io_uring/rsrc: fix RLIMIT_MEMLOCK bypass by removing cross-buffer accounting
Date: Mon, 19 Jan 2026 01:10:39 -0600
Message-ID: <20260119071039.2113739-1-danisjiang@gmail.com>

When multiple registered buffers share the same compound page, only the
first buffer accounts for the memory via io_buffer_account_pin().
Subsequent buffers skip accounting because headpage_already_acct()
returns true.

When the first buffer is unregistered, its accounting is released, but
the compound page remains pinned by the remaining buffers. The result is
pinned memory that is no longer charged against RLIMIT_MEMLOCK.
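The sequence above can be sketched as a small userspace model. This is only
an illustration under stated assumptions, not the kernel code: every sim_*
name is hypothetical, each buffer is assumed to pin exactly one compound
page, and SIM_HPAGE_PAGES assumes a 2 MiB huge page made of 4 KiB base pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SIM_MAX_BUFS	8
#define SIM_HPAGE_PAGES	512UL	/* assumption: 2 MiB huge page, 4 KiB base pages */

/* Hypothetical stand-in for a registered buffer pinning one compound page. */
struct sim_buf {
	bool registered;
	int hpage_id;			/* identifies the compound head */
	unsigned long acct_pages;	/* pages this buffer charged at registration */
};

struct sim_ctx {
	struct sim_buf bufs[SIM_MAX_BUFS];
	unsigned long accounted;	/* pages currently charged against the limit */
};

/* Models the removed check: is this head already held by a registered buffer? */
static bool sim_already_acct(const struct sim_ctx *ctx, int hpage_id)
{
	for (size_t i = 0; i < SIM_MAX_BUFS; i++)
		if (ctx->bufs[i].registered && ctx->bufs[i].hpage_id == hpage_id)
			return true;
	return false;
}

static void sim_register(struct sim_ctx *ctx, size_t slot, int hpage_id)
{
	assert(slot < SIM_MAX_BUFS && !ctx->bufs[slot].registered);
	struct sim_buf *b = &ctx->bufs[slot];

	/* Only the first buffer on a compound page is charged. */
	b->acct_pages = sim_already_acct(ctx, hpage_id) ? 0 : SIM_HPAGE_PAGES;
	b->hpage_id = hpage_id;
	b->registered = true;
	ctx->accounted += b->acct_pages;
}

static void sim_unregister(struct sim_ctx *ctx, size_t slot)
{
	assert(slot < SIM_MAX_BUFS && ctx->bufs[slot].registered);
	/* The charge follows the buffer that made it, not the page. */
	ctx->accounted -= ctx->bufs[slot].acct_pages;
	ctx->bufs[slot].registered = false;
}

/* Pages really pinned: one compound page per distinct head still in use. */
static unsigned long sim_pinned(const struct sim_ctx *ctx)
{
	unsigned long pinned = 0;

	for (size_t i = 0; i < SIM_MAX_BUFS; i++) {
		bool seen = false;

		if (!ctx->bufs[i].registered)
			continue;
		for (size_t j = 0; j < i; j++)
			if (ctx->bufs[j].registered &&
			    ctx->bufs[j].hpage_id == ctx->bufs[i].hpage_id)
				seen = true;
		if (!seen)
			pinned += SIM_HPAGE_PAGES;
	}
	return pinned;
}

/* Returns the number of pinned pages that are not charged after the sequence. */
static unsigned long sim_bypass_demo(void)
{
	struct sim_ctx ctx = {0};

	sim_register(&ctx, 0, 1);	/* first buffer: charged 512 pages */
	sim_register(&ctx, 1, 1);	/* same compound page: charged 0 */
	sim_unregister(&ctx, 0);	/* charge drops to 0, page stays pinned */
	return sim_pinned(&ctx) - ctx.accounted;
}
```

In this model sim_bypass_demo() returns 512: one full huge page remains
pinned by the second buffer while nothing is charged for it.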

On systems with pre-allocated HugeTLB pages, an unprivileged user can
exploit this to pin memory beyond RLIMIT_MEMLOCK by cycling buffer
registrations and unregistrations. The amount that can escape the limit
scales with the number of available huge pages, potentially allowing
gigabytes of memory to be pinned while the kernel accounting shows
near-zero usage.

Fix this by removing the cross-buffer accounting optimization entirely.
Each buffer now independently accounts for its pinned pages, even if
the same compound pages are referenced by other buffers. This keeps
pinned memory from going unaccounted when buffers are unregistered in
arbitrary order.

The trade-off is that accounted memory may be overestimated when
multiple buffers share compound pages, but overestimating is safe and
closes the security issue.

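The trade-off can be sketched with a minimal userspace model of per-buffer
accounting. This is an illustration under assumptions, not the kernel
implementation: the fix_* names are hypothetical, each buffer is assumed to
pin exactly one compound page, and FIX_HPAGE_PAGES assumes a 2 MiB huge page
with 4 KiB base pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FIX_MAX_BUFS	8
#define FIX_HPAGE_PAGES	512UL	/* assumption: 2 MiB huge page, 4 KiB base pages */

struct fix_ctx {
	bool registered[FIX_MAX_BUFS];
	int hpage_id[FIX_MAX_BUFS];	/* compound head pinned by each buffer */
	unsigned long accounted;	/* pages currently charged against the limit */
};

/* Every buffer is charged for its compound page, shared or not. */
static void fix_register(struct fix_ctx *ctx, size_t slot, int hpage_id)
{
	assert(slot < FIX_MAX_BUFS && !ctx->registered[slot]);
	ctx->registered[slot] = true;
	ctx->hpage_id[slot] = hpage_id;
	ctx->accounted += FIX_HPAGE_PAGES;
}

static void fix_unregister(struct fix_ctx *ctx, size_t slot)
{
	assert(slot < FIX_MAX_BUFS && ctx->registered[slot]);
	ctx->registered[slot] = false;
	ctx->accounted -= FIX_HPAGE_PAGES;
}

/* Pages really pinned: one compound page per distinct head still in use. */
static unsigned long fix_pinned(const struct fix_ctx *ctx)
{
	unsigned long pinned = 0;

	for (size_t i = 0; i < FIX_MAX_BUFS; i++) {
		bool seen = false;

		if (!ctx->registered[i])
			continue;
		for (size_t j = 0; j < i; j++)
			if (ctx->registered[j] &&
			    ctx->hpage_id[j] == ctx->hpage_id[i])
				seen = true;
		if (!seen)
			pinned += FIX_HPAGE_PAGES;
	}
	return pinned;
}

/* Overcharge, in pages, while two buffers share one compound page. */
static unsigned long fix_overcharge_two_sharing(void)
{
	struct fix_ctx ctx = {0};

	fix_register(&ctx, 0, 1);
	fix_register(&ctx, 1, 1);
	return ctx.accounted - fix_pinned(&ctx);
}

/* Checks that the charge never drops below what is pinned. */
static bool fix_never_under_accounts(void)
{
	struct fix_ctx ctx = {0};
	bool ok = true;

	fix_register(&ctx, 0, 1);
	fix_register(&ctx, 1, 1);	/* charged 1024, pinned 512 */
	ok = ok && ctx.accounted >= fix_pinned(&ctx);
	fix_unregister(&ctx, 0);	/* charged 512, pinned 512 */
	ok = ok && ctx.accounted >= fix_pinned(&ctx);
	fix_unregister(&ctx, 1);	/* charged 0, pinned 0 */
	ok = ok && ctx.accounted == 0 && fix_pinned(&ctx) == 0;
	return ok;
}
```

In this model two buffers sharing one huge page are overcharged by 512 pages
while both are registered, and the charge never falls below the pinned amount
regardless of the order in which the buffers are unregistered.
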
Reported-by: Yuhao Jiang <danisjiang@gmail.com>
Suggested-by: Pavel Begunkov <asml.silence@gmail.com>
Fixes: de2939388be5 ("io_uring: improve registered buffer accounting for huge pages")
Cc: stable@vger.kernel.org
Signed-off-by: Yuhao Jiang <danisjiang@gmail.com>
---
Changes in v2:
- Remove cross-buffer accounting logic entirely
- Link to v1: https://lore.kernel.org/all/20251218025947.36115-1-danisjiang@gmail.com/

 io_uring/rsrc.c | 43 -------------------------------------------
 1 file changed, 43 deletions(-)

diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c
index 41c89f5c616d..f35652f36c57 100644
--- a/io_uring/rsrc.c
+++ b/io_uring/rsrc.c
@@ -619,47 +619,6 @@ int io_sqe_buffers_unregister(struct io_ring_ctx *ctx)
 	return 0;
 }
 
-/*
- * Not super efficient, but this is just a registration time. And we do cache
- * the last compound head, so generally we'll only do a full search if we don't
- * match that one.
- *
- * We check if the given compound head page has already been accounted, to
- * avoid double accounting it. This allows us to account the full size of the
- * page, not just the constituent pages of a huge page.
- */
-static bool headpage_already_acct(struct io_ring_ctx *ctx, struct page **pages,
-				  int nr_pages, struct page *hpage)
-{
-	int i, j;
-
-	/* check current page array */
-	for (i = 0; i < nr_pages; i++) {
-		if (!PageCompound(pages[i]))
-			continue;
-		if (compound_head(pages[i]) == hpage)
-			return true;
-	}
-
-	/* check previously registered pages */
-	for (i = 0; i < ctx->buf_table.nr; i++) {
-		struct io_rsrc_node *node = ctx->buf_table.nodes[i];
-		struct io_mapped_ubuf *imu;
-
-		if (!node)
-			continue;
-		imu = node->buf;
-		for (j = 0; j < imu->nr_bvecs; j++) {
-			if (!PageCompound(imu->bvec[j].bv_page))
-				continue;
-			if (compound_head(imu->bvec[j].bv_page) == hpage)
-				return true;
-		}
-	}
-
-	return false;
-}
-
 static int io_buffer_account_pin(struct io_ring_ctx *ctx, struct page **pages,
 				 int nr_pages, struct io_mapped_ubuf *imu,
 				 struct page **last_hpage)
@@ -677,8 +636,6 @@ static int io_buffer_account_pin(struct io_ring_ctx *ctx, struct page **pages,
 			if (hpage == *last_hpage)
 				continue;
 			*last_hpage = hpage;
-			if (headpage_already_acct(ctx, pages, i, hpage))
-				continue;
 			imu->acct_pages += page_size(hpage) >> PAGE_SHIFT;
 		}
 	}
--
2.34.1