* [PATCH AUTOSEL 7.0-6.18] io_uring/rsrc: unify nospec indexing for direct descriptors
[not found] <20260428104133.2858589-1-sashal@kernel.org>
@ 2026-04-28 10:40 ` Sasha Levin
2026-04-28 10:41 ` [PATCH AUTOSEL 7.0-6.12] io_uring: take page references for NOMMU pbuf_ring mmaps Sasha Levin
1 sibling, 0 replies; 2+ messages in thread
From: Sasha Levin @ 2026-04-28 10:40 UTC (permalink / raw)
To: patches, stable
Cc: Jens Axboe, Gabriel Krisman Bertazi, Sasha Levin, io-uring,
linux-kernel
From: Jens Axboe <axboe@kernel.dk>
[ Upstream commit 53262c91f7b81f96495ff24e9d1fa8b1632e69c8 ]
For file updates, the node reset isn't capping the value via
array_index_nospec() like the other paths do. Ensure it's all sane and
have the update path do the proper capping as well.
Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
## PHASE 1: COMMIT MESSAGE FORENSICS
**Step 1.1: Subject line parse**
- Subsystem: `io_uring/rsrc`
- Action verb: "unify" (cleanup-style verb that often masks a fix)
- Record: `io_uring/rsrc` `unify` - bringing nospec indexing into one
place across direct descriptor paths
**Step 1.2: Tag analysis**
- `Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de>` - reviewed by
an external developer
- `Signed-off-by: Jens Axboe <axboe@kernel.dk>` - io_uring maintainer
(he wrote the patch)
- NO `Fixes:` tag
- NO `Cc: stable@vger.kernel.org`
- NO `Reported-by:`, `Link:`
- Record: Bare-bones tag set; reviewed once by SUSE engineer; author is
the subsystem maintainer.
**Step 1.3: Body analysis**
- Bug description: "the node reset isn't capping the value via
array_index_nospec() like the other paths do"
- Failure mode: Spectre v1 (Bounds Check Bypass / CVE-2017-5753)
speculative side-channel
- Author explicitly contrasts the buggy file-update path with "the other
paths" that already use `array_index_nospec()` (i.e., the buffer
update path and `io_rsrc_node_lookup`)
- Record: This is missing Spectre v1 hardening on a user-reachable
register-files-update code path.
**Step 1.4: Hidden bug fix detection**
- "unify" is cleanup language but the substance is restoring missing
speculation protection on a user-controlled index. This is a real
defensive-security fix (similar to the pattern of `b7620121dc04e`,
`34bb77184123a`, `953c37e066f05`, and `29b95ac917927`, all of which
were Spectre v1 nospec fixes).
- Record: This IS a hidden bug fix - missing Spectre v1 protection.
## PHASE 2: DIFF ANALYSIS
**Step 2.1: Inventory**
- `io_uring/rsrc.c`: +3 lines in `__io_sqe_files_update()`
- `io_uring/rsrc.h`: +7/-2 lines in `io_reset_rsrc_node()` inline
- Total: 10 insertions, 2 deletions across 2 files
- Scope: single-file-pair, single subsystem, surgical
- Record: ~10 line surgical change in one helper + one caller.
**Step 2.2: Code flow change**
- Before in `__io_sqe_files_update`: `i = up->offset + done;
io_reset_rsrc_node(...)` - relies only on the upfront architectural
check at line 222 (`up->offset + nr_args > ctx->file_table.data.nr`)
- After: explicit `if (i >= ctx->file_table.data.nr) break;` then `i =
array_index_nospec(i, ...)` - speculation barrier
- Before in `io_reset_rsrc_node`: `data->nodes[index]` directly without
index hardening
- After: bounds-check-then-nospec-mask before dereferencing
`data->nodes[index]`
- Index parameter changed from `int` to `unsigned int` (same width, but
unsigned, which is safer for the comparison with unsigned `data->nr`)
- Record: Adds Spectre v1 mitigation in two places (caller-side and
helper-side, defense-in-depth).
**Step 2.3: Bug mechanism**
- Category: Memory safety / Spectre v1 (Bounds Check Bypass)
- Mechanism: User passes `up->offset` and `nr_args`. The upfront check
at line 222 is architecturally correct, but on speculation, a CPU
could mispredict the bounds branch and do a speculative
`data->nodes[i]` load with i out of bounds, leaving observable cache
state. `array_index_nospec()` is the canonical mitigation.
- Record: Spectre v1 / CVE-2017-5753 hardening on a user-reachable index
load.
**Step 2.4: Fix quality**
- Obviously correct - the pattern is identical to surrounding code
(`io_rsrc_node_lookup`, `__io_sqe_buffers_update`)
- No semantic change for non-malicious callers (architectural bounds
were already guaranteed)
- Zero regression risk: only adds an extra bounds-check + nospec mask on
an existing index
- Record: High-quality, low-risk hardening.
## PHASE 3: GIT HISTORY
**Step 3.1: Blame**
- The helper `io_reset_rsrc_node()` was added by `4007c3d8c22a2`
("io_uring/rsrc: add io_reset_rsrc_node() helper", Jens Axboe, Oct 29
2024) — first appears in v6.13.
- Before that refactor (v6.12), `__io_sqe_files_update` had `i =
array_index_nospec(up->offset + done, ctx->nr_user_files);` — verified
by `git show v6.12:io_uring/rsrc.c`. So v6.12 was protected.
- Record: Bug introduced in 4007c3d8c22a2 (v6.13) by inadvertently
dropping `array_index_nospec()` during the helper extraction.
**Step 3.2: Fixes: tag follow-through**
- No Fixes: tag in this commit. Logical Fixes target is `4007c3d8c22a2`,
present in v6.13 and later.
- Record: Bug regression introduced in v6.13; absent in v6.12 LTS.
**Step 3.3: Related changes / file history**
- `io_uring/rsrc.h` recently saw `82dadc8a49475` ("take unsigned index
in io_rsrc_node_lookup()", Jan 2026) — related index typing cleanup
- This commit takes the same step for `io_reset_rsrc_node`
- Record: Latest in a series of small index-safety improvements; no
prerequisites required.
**Step 3.4: Author**
- Jens Axboe is the io_uring maintainer; he both wrote 4007c3d8c22a2
(introduced the regression) and authors this fix.
- Record: Subsystem maintainer authored.
**Step 3.5: Dependencies**
- The patch uses only existing primitives (`array_index_nospec`, the
existing `data->nr` field, the existing helper signature). Standalone.
- Record: Standalone, no prerequisites.
## PHASE 4: MAILING LIST RESEARCH
**Step 4.1: Original submission**
- `b4 dig -c 53262c91f7b81` found patch 2/6 of "Various bug fixes"
series at lore.kernel.org/all/20260421135626.581917-3-axboe@kernel.dk
- Cover letter ("PATCHSET 0/6 Various bug fixes") explicitly describes
the patches:
- "Patch 2, spectre masking for file updates."
- Patch 6 is the only one with `Cc: stable@kernel.org` (a different
patch with a clear regression Fixes:)
- Record: Submitted as part of a 6-patch series; cover-letter labels
this one as "spectre masking" specifically (separate category from
"defensive cleanups").
**Step 4.2: Reviewers (b4 dig -w)**
- Original recipients: `Jens Axboe`, `io-uring@vger.kernel.org`
- Reply thread: Gabriel Krisman Bertazi (SUSE) gave Reviewed-by
- Record: Reviewed by external developer (SUSE).
**Step 4.3: Bug report**
- No Reported-by / Link tags. No bug report - this is proactive
hardening.
- Record: Proactive Spectre v1 mitigation, no specific user-triggered
report.
**Step 4.4: Series context**
- Series: 1/6 (defensive cleanup, not reachable), 2/6 (this - spectre
masking), 3/6 (defensive cleanup), 4/6 (defensive hardening), 5/6
(futex actual fix, has Fixes:), 6/6 (ring resize actual fix, has
Fixes: + Cc: stable)
- Record: Standalone within the series; doesn't depend on the others.
**Step 4.5: Stable list history**
- Not searched in detail. Note: the author chose NOT to Cc stable on
this specific patch.
- Record: No explicit stable nomination, but author historically doesn't
cc-stable Spectre hardening either (precedent: similar nospec fixes
953c37e066f05/29b95ac917927 went to stable via maintainer-tagged
Fixes:).
## PHASE 5: CODE SEMANTIC ANALYSIS
**Step 5.1: Modified functions**
- `__io_sqe_files_update()` - handles `IORING_REGISTER_FILES_UPDATE`
- `io_reset_rsrc_node()` - inline helper used in 4 places
**Step 5.2: Callers**
- `io_reset_rsrc_node()` callers (verified by Grep):
- `io_uring/rsrc.c:241` - in `__io_sqe_files_update()` (this fix's
site)
- `io_uring/rsrc.c:320` - in `__io_sqe_buffers_update()` (already
nospec'd at the caller)
- `io_uring/filetable.c:79` - in `io_install_fixed_file()` (called for
direct fd installs; bounds-checked at line 72)
- `io_uring/filetable.c:138` - in `io_fixed_fd_remove()` (bounds-
checked at line 132)
- All 4 are user-reachable via io_uring register/update operations.
- Record: 4 call sites; all reachable from userspace via io_uring
`register` syscall paths.
**Step 5.3: Callees**
- `io_reset_rsrc_node()` calls `io_put_rsrc_node()` and indexes
`data->nodes[index]`. The `array_index_nospec()` mask is now applied
before the indexed load.
**Step 5.4: Reachability**
- The path is reachable from userspace via
`io_uring_register(IORING_REGISTER_FILES_UPDATE, ...)`. Any process
with io_uring access can hit it.
- Record: User-reachable from a basic syscall path.
**Step 5.5: Similar patterns**
- `io_rsrc_node_lookup()` already does the same pattern (bounds check +
nospec mask)
- `__io_sqe_buffers_update()` already does the nospec mask at the caller
- This commit harmonizes the file-update path and the helper itself
- Past similar fixes: `b7620121dc04e` (2019), `34bb77184123a` (2022),
`953c37e066f05` (2023), `29b95ac917927` (2024) - all backported
- Record: Identical pattern to a long lineage of accepted Spectre v1
nospec fixes.
## PHASE 6: CROSS-REFERENCING / STABLE TREE
**Step 6.1: Buggy code in stable**
- `io_reset_rsrc_node()` introduced in `4007c3d8c22a2` (v6.13). Stable
trees v6.13.y onward inherit the missing nospec.
- v6.12.y LTS does NOT have this regression (the function itself doesn't
exist there).
- Record: Affected stable trees: v6.13.y - v6.19.y. v6.12 LTS
unaffected.
**Step 6.2: Backport difficulty**
- The diff context is small. The function shape has been stable since
v6.13 with only minor signature changes (e.g., `82dadc8a49475` made
`io_rsrc_node_lookup` index unsigned in Jan 2026). Backport should
apply nearly cleanly to active stable trees that have
`io_reset_rsrc_node`.
- Record: Likely clean apply on v6.13+ stable trees; v6.12 LTS not
applicable.
**Step 6.3: Related fixes already in stable**
- `953c37e066f05` and similar nospec fixes are already in older stable
kernels.
- Record: This is the latest in the series; no overlap.
## PHASE 7: SUBSYSTEM CONTEXT
**Step 7.1: Subsystem**
- `io_uring/` - heavily used core async I/O subsystem reachable by any
unprivileged process; security-sensitive.
- Criticality: IMPORTANT (used by many distros, databases, language
runtimes).
**Step 7.2: Activity**
- Highly active subsystem with frequent fixes. Spectre and registration-
path hardening is an ongoing theme.
## PHASE 8: IMPACT / RISK
**Step 8.1: Affected users**
- Any user of io_uring fixed-files (`IORING_REGISTER_FILES_UPDATE`) on a
kernel >= v6.13. That's a large population - any process able to call
`io_uring_setup`.
**Step 8.2: Trigger**
- Trigger: a userspace caller invokes `IORING_REGISTER_FILES_UPDATE`
with a manipulated offset to mistrain a CPU branch predictor for a
Spectre v1 attack. Architecturally bounded, but exposes a speculative-
load gadget to any unprivileged caller.
- Record: Unprivileged userspace can reach the path.
**Step 8.3: Failure mode**
- Pure architectural correctness is unaffected; the failure mode is
*information disclosure* via a Spectre v1 side channel. Severity within
the Spectre-hardening category: MEDIUM-HIGH (security hardening,
defense-in-depth, no crash but real CVE class).
**Step 8.4: Risk-Benefit**
- Benefit: closes a known speculative gadget on a user-reachable indexed
load - matches a long-standing pattern of accepted backports.
- Risk: ~10 lines, identical to widely-deployed pattern in adjacent
code, fully verifiable. Very low.
- Record: High benefit / very low risk.
## PHASE 9: SYNTHESIS
**Step 9.1: Evidence**
- FOR backporting:
- Spectre v1 (CVE-2017-5753 class) speculative-load gadget on a user-
reachable path.
- Restores protection that existed in v6.12 and was lost during the
v6.13 helper extraction (`4007c3d8c22a2`).
- 10-line surgical change identical in pattern to multiple historical
nospec fixes that DID go to stable (`b7620121dc04e`,
`34bb77184123a`, `953c37e066f05`, `29b95ac917927`).
- Defense-in-depth: hardens both the caller and the helper.
- Reviewed-by external developer. Maintainer-authored.
- AGAINST:
- No `Fixes:` tag, no `Cc: stable` (notable since other patches in the
same series do have them).
- Cover letter labels this as "spectre masking" rather than "actual
fix".
- Architectural bound check already exists at line 222-223 in
`__io_sqe_files_update`; the issue is purely speculative-execution.
- UNVERIFIED: Did not exhaustively check every active stable branch for
divergence; reliant on grep/log of mainline.
**Step 9.2: Stable rules**
1. Obviously correct and tested? YES (10 lines, well-known pattern,
reviewed).
2. Fixes a real bug? YES (Spectre v1 class info-leak gadget, regression
vs v6.12).
3. Important issue? YES (security hardening, CVE class, user-reachable).
4. Small and contained? YES (10 lines, 2 files, 1 subsystem).
5. No new features or APIs? YES.
6. Apply to stable trees? YES for v6.13.y+ (helper exists there).
**Step 9.3: Exception category**
- Falls under defensive security hardening with strong precedent
(multiple prior nospec fixes in io_uring backported).
**Step 9.4: Decision**
- Borderline by author's tagging choice but technically a Spectre v1
mitigation that follows a well-established pattern of stable-eligible
io_uring nospec fixes. Restores protection lost in v6.13.
## Verification
- [Phase 1] Read commit message via `git show
53262c91f7b81f96495ff24e9d1fa8b1632e69c8`: confirmed `Reviewed-by:
Gabriel Krisman Bertazi`, `Signed-off-by: Jens Axboe`, no `Fixes:`, no
`Cc: stable`.
- [Phase 2] Diff analysis: confirmed 3 added lines in
`io_uring/rsrc.c::__io_sqe_files_update()` and 7 added/2 removed in
`io_uring/rsrc.h::io_reset_rsrc_node()`. Index type changed from `int`
to `unsigned int`.
- [Phase 2] Read post-fix `io_uring/rsrc.c` lines 211-272 and
`io_uring/rsrc.h` lines 90-140 to verify the upfront bounds check at
line 222 and the resulting helper shape.
- [Phase 3] `git log -- io_uring/rsrc.c | head` and `git log
--grep="io_reset_rsrc_node"`: located helper introduction
`4007c3d8c22a2` (Oct 29 2024).
- [Phase 3] `git tag --contains 4007c3d8c22a2 | grep "^v"` (via prefix
match): no results means the tag is in v6.13+ (helper introduced for
v6.13).
- [Phase 3] `git show v6.12:io_uring/rsrc.c`: verified v6.12 had `i =
array_index_nospec(up->offset + done, ctx->nr_user_files);` so this is
a true regression introduced by the helper extraction.
- [Phase 4] `b4 dig -c 53262c91f7b81f96495ff24e9d1fa8b1632e69c8`:
matched lore URL
`https://lore.kernel.org/all/20260421135626.581917-3-axboe@kernel.dk/`.
- [Phase 4] `b4 dig -w`: original recipients limited to Jens Axboe and
io-uring list.
- [Phase 4] Saved mbox via `b4 dig -m`: cover-letter explicitly
classifies this patch as "spectre masking for file updates"; only
patch 6/6 has `Cc: stable@kernel.org`. Patch 5/6 has `Fixes:` for an
old commit but no Cc: stable; this one (patch 2/6) has neither.
- [Phase 4] Verified `Reviewed-by: Gabriel Krisman Bertazi` reply in the
saved mbox at line 1490.
- [Phase 5] `Grep "io_reset_rsrc_node"` across `io_uring/`: found 4
callers (rsrc.c:241, rsrc.c:320, filetable.c:79, filetable.c:138), all
in user-reachable register paths. Inspected `filetable.c:62-141` to
confirm both call sites are bounds-checked at the architectural level
but lacked a nospec mask before the helper.
- [Phase 5] Located historical precedents: `git log
--grep="^io_uring.*nospec"` returned `953c37e066f05`, `4cdd158be9d09`,
`34bb77184123a`, `b7620121dc04e`. Inspected `b7620121dc04e` and
`34bb77184123a` to confirm pattern (Spectre v1 nospec on indexed
register paths).
- [Phase 6] Stable affected: v6.13+ (since helper introduction); v6.12
LTS unaffected (different code structure with existing nospec).
- [Phase 8] Failure mode classified as Spectre v1 information disclosure
on a user-reachable indexed load; severity MEDIUM-HIGH (security CVE
class).
- UNVERIFIED: Did not check each individual active stable tree (v6.13.y
through v6.19.y) line-for-line for backport conflicts; assessment
based on the limited churn around the helper since v6.13.
- UNVERIFIED: Did not directly confirm whether
953c37e066f05/29b95ac917927 actually landed in every stable branch
(relied on the historical pattern of similar nospec fixes being
backported).
## Decision rationale
This is a small (~10 line) Spectre v1 mitigation that restores
`array_index_nospec()` protection on a user-reachable indexed-load
gadget that was inadvertently dropped when the `io_reset_rsrc_node()`
helper was extracted in v6.13. It applies the same idiom already used in
the buffer-update path (`__io_sqe_buffers_update`) and the lookup helper
(`io_rsrc_node_lookup`), and it directly mirrors a long line of similar
nospec fixes that have been backported (`b7620121dc04e`,
`34bb77184123a`, `953c37e066f05`, `29b95ac917927`). The patch is
reviewed, surgical, regression-free, maintainer-authored, and addresses
a genuine speculative-execution gadget reachable by any unprivileged
process that can use io_uring. Although the author did not add `Cc:
stable` or a `Fixes:` tag (a deliberate choice noted on the cover-letter
as "spectre masking" rather than "actual fix"), the technical merits
clearly satisfy the stable rules: real CVE-class bug, important, small,
contained, no new functionality, applies cleanly to the affected stable
trees (v6.13+).
**YES**
io_uring/rsrc.c | 3 +++
io_uring/rsrc.h | 9 +++++++--
2 files changed, 10 insertions(+), 2 deletions(-)
diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c
index 1b96ab5e98c99..7953f38af1eb0 100644
--- a/io_uring/rsrc.c
+++ b/io_uring/rsrc.c
@@ -238,6 +238,9 @@ static int __io_sqe_files_update(struct io_ring_ctx *ctx,
continue;
i = up->offset + done;
+ if (i >= ctx->file_table.data.nr)
+ break;
+ i = array_index_nospec(i, ctx->file_table.data.nr);
if (io_reset_rsrc_node(ctx, &ctx->file_table.data, i))
io_file_bitmap_clear(&ctx->file_table, i);
diff --git a/io_uring/rsrc.h b/io_uring/rsrc.h
index cff0f8834c353..44e3386f7c1ca 100644
--- a/io_uring/rsrc.h
+++ b/io_uring/rsrc.h
@@ -109,10 +109,15 @@ static inline void io_put_rsrc_node(struct io_ring_ctx *ctx, struct io_rsrc_node
}
static inline bool io_reset_rsrc_node(struct io_ring_ctx *ctx,
- struct io_rsrc_data *data, int index)
+ struct io_rsrc_data *data,
+ unsigned int index)
{
- struct io_rsrc_node *node = data->nodes[index];
+ struct io_rsrc_node *node;
+ if (index >= data->nr)
+ return false;
+ index = array_index_nospec(index, data->nr);
+ node = data->nodes[index];
if (!node)
return false;
io_put_rsrc_node(ctx, node);
--
2.53.0
* [PATCH AUTOSEL 7.0-6.12] io_uring: take page references for NOMMU pbuf_ring mmaps
[not found] <20260428104133.2858589-1-sashal@kernel.org>
2026-04-28 10:40 ` [PATCH AUTOSEL 7.0-6.18] io_uring/rsrc: unify nospec indexing for direct descriptors Sasha Levin
@ 2026-04-28 10:41 ` Sasha Levin
1 sibling, 0 replies; 2+ messages in thread
From: Sasha Levin @ 2026-04-28 10:41 UTC (permalink / raw)
To: patches, stable
Cc: Greg Kroah-Hartman, Jens Axboe, Sasha Levin, io-uring,
linux-kernel
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d0be8884f56b0b800cd8966e37ce23417cd5044e ]
Under !CONFIG_MMU, io_uring_get_unmapped_area() returns the kernel
virtual address of the io_mapped_region's backing pages directly;
the user's VMA aliases the kernel allocation. io_uring_mmap() then
just returns 0 -- it takes no page references.
The CONFIG_MMU path uses vm_insert_pages(), which takes a reference on
each inserted page. Those references are released when the VMA is torn
down (zap_pte_range -> put_page). io_free_region() -> release_pages()
drops the io_uring-side references, but the pages survive until munmap
drops the VMA-side references.
Under NOMMU there are no VMA-side references. io_unregister_pbuf_ring ->
io_put_bl -> io_free_region -> release_pages drops the only references
and the pages return to the buddy allocator while the user's VMA still
has vm_start pointing into them. The user can then write into whatever
the allocator hands out next.
Mirror the MMU lifetime: take get_page references in io_uring_mmap() and
release them via vm_ops->close. NOMMU's delete_vma() calls vma_close()
which runs ->close on munmap.
This also incidentally addresses the duplicate-vm_start case: two mmaps
of SQ_RING and CQ_RING resolve to the same ctx->ring_region pointer.
With page refs taken per mmap, the second mmap takes its own refs and
the pages survive until both mmaps are closed. The nommu rb-tree BUG_ON
on duplicate vm_start is a separate mm/nommu.c concern (it should share
the existing region rather than BUG), but the page lifetime is now
correct.
Cc: Jens Axboe <axboe@kernel.dk>
Reported-by: Anthropic
Assisted-by: gkh_clanker_t1000
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patch.msgid.link/2026042115-body-attention-d15b@gregkh
[axboe: get rid of region lookup, just iterate pages in vma]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
# Analysis: io_uring NOMMU pbuf_ring page UAF fix
## PHASE 1: COMMIT MESSAGE FORENSICS
**Step 1.1 - Subject line:**
Record: subsystem `io_uring`, action verb `take`, summary: "take page
references for NOMMU pbuf_ring mmaps" — wraps a fix for a page lifetime
/ use-after-free issue under `!CONFIG_MMU`.
**Step 1.2 - Tags:**
Record:
- `Cc: Jens Axboe <axboe@kernel.dk>`
- `Reported-by: Anthropic` (AI bug report)
- `Assisted-by: gkh_clanker_t1000` (an unusual tag — verified this is
identical to upstream commit `d0be8884f56b0`, not pipeline-injected)
- `Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>`
(author SOB, kernel veteran)
- `Link: https://patch.msgid.link/2026042115-body-attention-d15b@gregkh`
- `[axboe: get rid of region lookup, just iterate pages in vma]`
(maintainer-folded change)
- `Signed-off-by: Jens Axboe <axboe@kernel.dk>` (subsystem maintainer)
No `Cc: stable` or `Fixes:` — expected.
**Step 1.3 - Body text:**
Record: Author explains a use-after-free root cause precisely:
- NOMMU `io_uring_get_unmapped_area()` returns a kernel virtual address;
user VMA aliases the kernel pages.
- `io_uring_mmap()` returns 0 without taking page references.
- `io_unregister_pbuf_ring -> io_put_bl -> io_free_region ->
release_pages` drops the only reference; pages return to the buddy
allocator while the user's VMA still maps them.
- "The user can then write into whatever the allocator hands out next."
— this is a write-after-free.
- Fix mirrors MMU lifetime by `get_page` per page in `mmap()` and
`put_page` via `vm_ops->close`.
- Also addresses the duplicate-vm_start case for SQ/CQ.
**Step 1.4 - Hidden bug fix?**
Record: Not hidden — the commit body explicitly describes a use-after-
free / write-after-free of pages handed to userspace, which is a serious
memory-safety / security bug.
## PHASE 2: DIFF ANALYSIS
**Step 2.1 - Inventory:**
Record: 1 file changed (`io_uring/memmap.c`); +44 lines, -1 line. Adds
`io_uring_nommu_vm_close()`, `io_uring_nommu_vm_ops`, expands
`io_uring_mmap()` (`!CONFIG_MMU` branch). Single-file, surgical NOMMU-
only change.
**Step 2.2 - Code flow change:**
Before: `io_uring_mmap()` for NOMMU only validated flags; returned 0
with no page references taken.
After: validates flags, looks up the region under `ctx->mmap_lock`,
validates region is set and the VMA size matches `region->nr_pages`,
takes a `get_page()` per backing page, and installs `vm_ops->close` to
drop those references at unmap.
**Step 2.3 - Bug mechanism:**
Record: Use-after-free / write-after-free of kernel pages still mapped
in userspace. Category: memory safety + reference counting (missing
`get_page` on the mmap path that aliases kernel allocations). The fix
balances the lifetime by adding `get_page()` on map and `put_page()` on
close.
**Step 2.4 - Fix quality:**
Record: Small, contained. Logic is straightforward: per-page `get_page`
on map, mirrored `put_page` on close. The validation that `vma->vm_end -
vma->vm_start == region->nr_pages << PAGE_SHIFT` guards the close-time
`virt_to_page` walk over the VMA address range. Risk that
`vma->vm_start` no longer points to those pages is addressed by holding
the page references — the kernel virtual address remains valid as long
as the page is alive. Fix is obviously correct for the NOMMU case
described.
## PHASE 3: GIT HISTORY
**Step 3.1 - Blame:**
Record: The vulnerable line `return
is_nommu_shared_mapping(vma->vm_flags) ? 0 : -EINVAL;` has been present
in NOMMU `io_uring_mmap()` since `f15ed8b4d0ce2 io_uring: move
mapping/allocation helpers to a separate file` (v6.10) and earlier in
`io_uring/io_uring.c` going back to v6.0 era when io_uring moved into
its own subdirectory (`ed29b0b4fd835`, v6.0).
**Step 3.2 - Fixes: tag:**
Record: No Fixes: tag. The specific UAF via the `pbuf_ring`
`release_pages` path requires the region API on the pbuf side, which
arrived with `ef62de3c4ad58 io_uring/kbuf: use region api for pbuf
rings` and the sibling memmap commits, all in v6.14-rc1.
**Step 3.3 - Related changes:**
Record: Relevant series: `7cd7b9575270e io_uring/memmap: unify io_uring
mmap'ing code`, `ef62de3c4ad58 io_uring/kbuf: use region api for pbuf
rings`, `90175f3f50321 io_uring/kbuf: remove pbuf ring refcounting` (all
v6.14-rc1). These restructured pbuf_ring mmap to share the region
machinery — the same machinery whose `release_pages` now drops the only
reference under NOMMU.
**Step 3.4 - Author:**
Record: Author is Greg Kroah-Hartman (LTS maintainer). Folded by Jens
Axboe (io_uring maintainer). Both highly authoritative.
**Step 3.5 - Dependencies:**
Record: The fix uses `io_mmap_get_region()`, `io_region_is_set()`,
`region->pages`, `region->nr_pages`, `ctx->mmap_lock` — all introduced
in v6.14. For v6.14+ stable trees, this should apply standalone. For
older trees (≤v6.12), the patch will not apply as-is.
## PHASE 4: MAILING LIST AND EXTERNAL RESEARCH
**Step 4.1 - Original submission:**
Record: `b4 dig -c d0be8884f56b0` returned thread
`https://lore.kernel.org/all/2026042115-body-attention-d15b@gregkh/`.
The series went through one revision — Jens folded a simplification
("get rid of region lookup, just iterate pages in vma") with size
validation before applying.
**Step 4.2 - Reviewers:**
Record: To `io-uring@vger.kernel.org`, Cc: Jens Axboe (subsystem
maintainer). Maintainer folded changes and pushed.
**Step 4.3 - Bug report:**
Record: Greg's email confirms this was an AI-generated report.
**However**, Greg explicitly built a PoC (poc.c + run-poc.sh attached to
the thread) which:
- Builds a riscv64 NOMMU kernel and boots in QEMU with `init_on_free=1`
- As init, registers a pbuf_ring with `IOU_PBUF_RING_MMAP`, mmaps a
page, writes a 0x55 canary, unregisters the pbuf_ring, then re-reads
- On unfixed: canary becomes 0x00 (page freed and zeroed), then re-
registering reuses the same page demonstrating write-after-free
- On fixed: canary is intact
- Greg replied `Tested-by: Greg Kroah-Hartman
<gregkh@linuxfoundation.org>` after Jens's folded version
The CVE-style identifiers `ANT-2026-02884` (the UAF) and
`ANT-2026-02650` (related duplicate vm_start) are referenced in the PoC.
**Step 4.4 - Series context:**
Record: Single patch (no series). Greg also has an alternative patch
that disables io_uring on `!MMU` entirely, which Jens did not accept in
favor of this fix.
**Step 4.5 - Stable discussion:**
Record: No explicit `Cc: stable` mention in the thread, and no
`stable@vger.kernel.org` in the discussion. However, this is a confirmed
UAF reachable from unprivileged userspace with a working exploit
reproducer — clearly stable material.
## PHASE 5: CODE SEMANTIC ANALYSIS
**Step 5.1 - Modified functions:**
Record: `io_uring_mmap()` (NOMMU branch), new
`io_uring_nommu_vm_close()`, new `io_uring_nommu_vm_ops`.
**Step 5.2 - Callers:**
Record: `io_uring_mmap` is the file_operations `.mmap` for the io_uring
fd; reachable from any userspace `mmap()` on an io_uring fd.
`io_uring_nommu_vm_close` is invoked by `delete_vma()` in `mm/nommu.c`
on `munmap`/exit. The bug path: `io_unregister_pbuf_ring()` →
`io_put_bl()` (`io_uring/kbuf.c:445`) → `io_free_region()`
(`io_uring/memmap.c:91`) → `release_pages()` — confirmed by `git grep`.
**Step 5.3 - Callees:**
Record: `get_page()`, `put_page()`, `is_nommu_shared_mapping()`,
`io_mmap_get_region()`, `io_region_is_set()`, `virt_to_page()`. All
standard kernel APIs.
**Step 5.4 - Reachability:**
Record: io_uring `register`/`unregister` and `mmap` are unprivileged
syscalls (no `CAP_SYS_ADMIN` for these paths — verified by grep across
`io_uring/`). The PoC demonstrates a full unprivileged trigger.
**Step 5.5 - Similar patterns:**
Record: The MMU path uses `vm_insert_pages()` (which does its own
`get_page` per inserted page, released on VMA teardown via
`zap_pte_range -> put_page`). The fix gives NOMMU equivalent symmetry.
Searching for other `is_nommu_shared_mapping` users (`fc4f4be9b5271`) —
io_uring is the only file_ops user adding such page lifetime semantics
manually.
## PHASE 6: STABLE TREE ANALYSIS
**Step 6.1 - Bug presence in stable:**
Record: Verified `git show v6.18:io_uring/memmap.c` and `git show
v7.0:io_uring/memmap.c` — both contain the unfixed `return
is_nommu_shared_mapping(vma->vm_flags) ? 0 : -EINVAL;`. The pbuf_ring
region API (the trigger surface for this exact UAF) exists from v6.14
onward. Affected trees with this exact bug: v6.14, v6.15, v6.16, v6.17,
**v6.18 LTS**, v6.19, **v7.0** (this branch).
**Step 6.2 - Backport complications:**
Record: For v6.14 → v7.0, all helpers (`io_mmap_get_region`,
`io_region_is_set`, `ctx->mmap_lock`, `region->pages/nr_pages`,
`guard(mutex)`) exist; the patch should apply cleanly or with trivial
adjustment. For v6.12 LTS and older, `io_mmap_get_region()` does not
exist (region API absent in pbuf path) — the same conceptual UAF may
exist via different code, but the fix as-presented does not apply. v6.6
LTS — same story.
**Step 6.3 - Related fixes already in stable:**
Record: No prior fix found. This is a new, recently-discovered class of
bug.
## PHASE 7: SUBSYSTEM CONTEXT
**Step 7.1 - Subsystem and criticality:**
Record: `io_uring` — IMPORTANT (heavily used subsystem; security-
relevant; reachable from unprivileged userspace). Criticality of this
specific config (NOMMU): PERIPHERAL (only `!CONFIG_MMU` builds, mostly
RISC-V/embedded). Net assessment: IMPORTANT-but-PERIPHERAL —
unprivileged UAF in a security-sensitive subsystem, on a small but real
config.
**Step 7.2 - Subsystem activity:**
Record: io_uring is one of the most actively developed kernel
subsystems; the affected code (region API) is recent (v6.14) and well
maintained.
## PHASE 8: IMPACT AND RISK ASSESSMENT
**Step 8.1 - Affected users:**
Record: Users of `!CONFIG_MMU` kernels (RISC-V nommu, ARM nommu,
Blackfin successors, some MicroBlaze configs, embedded NOMMU systems
with io_uring enabled). Small population, but real and the bug is
unconditional on those builds when pbuf_ring mmap is used.
**Step 8.2 - Trigger:**
Record: Trivial — unprivileged process calls `io_uring_setup`,
`io_uring_register(IORING_REGISTER_PBUF_RING, ..., IOU_PBUF_RING_MMAP)`,
`mmap(IORING_OFF_PBUF_RING)`, then
`io_uring_register(IORING_UNREGISTER_PBUF_RING, ...)`. PoC demonstrates
this path. Same pattern for SQ/CQ rings.
**Step 8.3 - Failure mode:**
Record: Use-after-free → write-after-free of kernel pages from
userspace. With the page returned to the buddy allocator and reused
(kernel-side allocation hands the same page back), the user can
read/write whatever the kernel later places there — heap-spray-friendly,
security-CRITICAL. PoC ends with sysrq-c kernel panic for proof.
**Step 8.4 - Risk-benefit:**
Record:
- Benefit: prevents an unprivileged user-triggered UAF / write-after-
free on NOMMU systems — exactly the stable mandate.
- Risk: minimal. The change is confined to the `!CONFIG_MMU` branch of
`io_uring/memmap.c` (net +44 lines), so it cannot affect any MMU build.
Even on NOMMU, the fix only adds `get_page`/`put_page` symmetry to
mirror the MMU path. Tested by Greg Kroah-Hartman with an explicit PoC
and a boot test.
Ratio: very high benefit / very low risk.
## PHASE 9: FINAL SYNTHESIS
**Step 9.1 - Evidence:**
- FOR: confirmed unprivileged-reachable UAF/WAF on NOMMU; PoC exists and
panics unfixed kernels; small, surgical, single-file fix; written and
tested by the LTS maintainer (Greg KH); applied by the io_uring
maintainer with a folded-in simplification; merged upstream as
`d0be8884f56b0`; only touches the NOMMU branch; depends on helpers that
are all present in v6.14+.
- AGAINST: NOMMU is uncommon; the bug was AI-discovered and Greg
initially expressed low confidence in the fix details; pre-v6.14
stable trees would need a different patch.
**Step 9.2 - Stable rules:**
1. Obviously correct and tested? YES — PoC + Tested-by from LTS
maintainer.
2. Fixes a real bug affecting users? YES — UAF reachable by any
unprivileged process on NOMMU.
3. Important issue? YES — security-critical (write-after-free of
arbitrary kernel pages).
4. Small and contained? YES — 1 file, ~44 lines, NOMMU-only branch.
5. No new features/APIs? YES — internal fix only.
6. Applies to stable? YES for v6.14+ (cleanly to v6.18, v6.19, v7.0);
needs adaptation for older trees.
**Step 9.3 - Exceptions:** Not applicable — this is a normal bug fix,
not a quirk/device-id/DT update. Stands on its merits.
**Step 9.4 - Decision:** Backport. This is a confirmed unprivileged-
reachable use-after-free with a working PoC. The fix is small, surgical,
and isolated to the NOMMU code path so it cannot regress MMU builds. It
applies cleanly to v6.14+ stable trees including the v7.0.y autosel
target (HEAD here).
## Verification
- [Phase 1] Read full upstream commit `d0be8884f56b0` via `git show`;
confirmed `Reported-by: Anthropic` and `Assisted-by:
gkh_clanker_t1000` are part of the upstream commit, not pipeline-
injected.
- [Phase 2] Diff inspection confirms +45/-1 lines (net +44) in
`io_uring/memmap.c` only, all in the `!CONFIG_MMU` branch.
- [Phase 3] `git log --oneline -- io_uring/memmap.c` and `git describe
--contains` confirm the region API arrived in v6.14-rc1 (`ef62de3c4ad58`,
`7cd7b9575270e`); pre-v6.14 NOMMU mmap had the same conceptual flaw but
went through different (refcounted) pbuf paths.
- [Phase 3] `git show v6.6:io_uring/io_uring.c`,
`v6.12:io_uring/memmap.c`, `v6.18:io_uring/memmap.c`,
`v7.0:io_uring/memmap.c` confirm the unfixed `return
is_nommu_shared_mapping(vma->vm_flags) ? 0 : -EINVAL;` is present from
v6.6 through v7.0.
- [Phase 4] `b4 dig -c d0be8884f56b0` returned thread
`https://lore.kernel.org/all/2026042115-body-attention-d15b@gregkh/`.
- [Phase 4] `b4 dig -c d0be8884f56b0 -a` showed v1 only; Jens folded an
inline simplification when applying.
- [Phase 4] `b4 dig -c d0be8884f56b0 -m /tmp/io_uring_thread.mbox` saved
the thread; read confirms PoC (poc.c, run-poc.sh) tests vulnerable vs.
fixed kernels with `init_on_free=1`, and `Tested-by: Greg Kroah-
Hartman` on Jens's folded version.
- [Phase 4] PoC references CVE-style identifiers `ANT-2026-02884` (this
UAF) and `ANT-2026-02650` (related duplicate vm_start case).
- [Phase 5] `grep` in `io_uring/kbuf.c` confirmed
`io_unregister_pbuf_ring -> io_put_bl -> io_free_region` call chain at
lines 445, 698, 719.
- [Phase 5] `grep` for `capable\|CAP_` in `io_uring/io_uring.c, kbuf.c,
register.c` confirms IORING_REGISTER_PBUF_RING and io_uring_mmap are
unprivileged.
- [Phase 6] Verified `io_mmap_get_region` and `io_region_is_set` exist
in v6.18, v7.0; do not exist in v6.12.
- [Phase 8] Failure mode: UAF and WAF, observable from userspace via the
PoC. Severity: CRITICAL (security).
- UNVERIFIED: Did not attempt to actually run the PoC under QEMU in this
session; relied on Greg KH's `Tested-by` and PoC source code
inspection.
- UNVERIFIED: Did not check whether stable maintainers (separate from
the discussion thread) have already queued or rejected this for
stable.
The fix addresses a confirmed unprivileged-reachable use-after-free /
write-after-free in io_uring under `!CONFIG_MMU`, is small and contained
to the NOMMU branch only, was tested by the LTS maintainer with a
working PoC, and applies cleanly to v6.14+ stable trees (including the
v7.0.y target this branch represents). It meets every stable rule.
**YES**
io_uring/memmap.c | 46 +++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 45 insertions(+), 1 deletion(-)
diff --git a/io_uring/memmap.c b/io_uring/memmap.c
index e6958968975a8..4f9b439319c46 100644
--- a/io_uring/memmap.c
+++ b/io_uring/memmap.c
@@ -366,9 +366,53 @@ unsigned long io_uring_get_unmapped_area(struct file *filp, unsigned long addr,
 #else /* !CONFIG_MMU */
+/*
+ * Drop the pages that were initially referenced and added in
+ * io_uring_mmap(). We cannot have had a mremap() as that isn't supported,
+ * hence the vma should be identical to the one we initially referenced and
+ * mapped, and partial unmaps and splitting isn't possible on a file backed
+ * mapping.
+ */
+static void io_uring_nommu_vm_close(struct vm_area_struct *vma)
+{
+	unsigned long index;
+
+	for (index = vma->vm_start; index < vma->vm_end; index += PAGE_SIZE)
+		put_page(virt_to_page((void *) index));
+}
+
+static const struct vm_operations_struct io_uring_nommu_vm_ops = {
+	.close = io_uring_nommu_vm_close,
+};
+
 int io_uring_mmap(struct file *file, struct vm_area_struct *vma)
 {
-	return is_nommu_shared_mapping(vma->vm_flags) ? 0 : -EINVAL;
+	struct io_ring_ctx *ctx = file->private_data;
+	struct io_mapped_region *region;
+	unsigned long i;
+
+	if (!is_nommu_shared_mapping(vma->vm_flags))
+		return -EINVAL;
+
+	guard(mutex)(&ctx->mmap_lock);
+	region = io_mmap_get_region(ctx, vma->vm_pgoff);
+	if (!region || !io_region_is_set(region))
+		return -EINVAL;
+
+	if ((vma->vm_end - vma->vm_start) !=
+	    (unsigned long) region->nr_pages << PAGE_SHIFT)
+		return -EINVAL;
+
+	/*
+	 * Pin the pages so io_free_region()'s release_pages() does not
+	 * drop the last reference while this VMA exists. delete_vma()
+	 * in mm/nommu.c calls vma_close() which runs ->close above.
+	 */
+	for (i = 0; i < region->nr_pages; i++)
+		get_page(region->pages[i]);
+
+	vma->vm_ops = &io_uring_nommu_vm_ops;
+	return 0;
 }
 unsigned int io_uring_nommu_mmap_capabilities(struct file *file)
--
2.53.0