From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Pavel Begunkov <asml.silence@gmail.com>,
Jens Axboe <axboe@kernel.dk>, Sasha Levin <sashal@kernel.org>,
io-uring@vger.kernel.org
Subject: [PATCH AUTOSEL 6.17-6.16] io_uring/zcrx: account niov arrays to cgroup
Date: Thu, 9 Oct 2025 11:55:47 -0400
Message-ID: <20251009155752.773732-81-sashal@kernel.org>
In-Reply-To: <20251009155752.773732-1-sashal@kernel.org>
From: Pavel Begunkov <asml.silence@gmail.com>
[ Upstream commit 31bf77dcc3810e08bcc7d15470e92cdfffb7f7f1 ]
net_iov / freelist / etc. arrays can be quite long, make sure they're
accounted.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
YES
- What it does: Switches three kvmalloc_array() allocations used by
io_uring zcrx areas from GFP_KERNEL to GFP_KERNEL_ACCOUNT so their
memory is charged to the creating task’s memcg. Specifically:
- `area->nia.niovs` allocation: io_uring/zcrx.c:425
- `area->freelist` allocation: io_uring/zcrx.c:430
- `area->user_refs` allocation: io_uring/zcrx.c:435
Why it matters
- Fixes unaccounted kernel memory: These arrays can be very large (one
entry per page of the registered area). Without GFP_KERNEL_ACCOUNT, a
cgroup can allocate significant kernel memory that is not charged to
its memcg, breaking containment and potentially causing host memory
pressure. The commit explicitly addresses this: “arrays can be quite
long, make sure they're accounted.”
- Brings consistency with existing accounting in the same path: The
scatterlist table describing the area's pages is already allocated
with `GFP_KERNEL_ACCOUNT` via
`sg_alloc_table_from_pages(..., GFP_KERNEL_ACCOUNT)`
(io_uring/zcrx.c:196), and the pinned user pages are accounted to the
io_uring context via `io_account_mem()` (io_uring/zcrx.c:205).
Accounting these control arrays aligns with that design and closes a
loophole where the page backing was accounted but the (potentially
multi-MiB) array metadata was not.
- Scope is tiny and contained: The change is three flag substitutions
within `io_zcrx_create_area()` and has no API/ABI or behavioral
changes beyond proper memcg charging. No architectural changes; hot
paths are unaffected (this is registration-time allocation).
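To put a rough number on the "potentially multi-MiB" metadata above,
here is a small userspace sketch of the arithmetic. The page size and
the three per-entry sizes are assumptions picked for illustration; the
real values depend on kernel version, architecture and config, and
zcrx_metadata_bytes() is a hypothetical helper, not a kernel function.

```c
#include <stdint.h>

/* Assumed values for illustration only, not taken from the kernel. */
#define ASSUMED_PAGE_SIZE      4096ULL
#define ASSUMED_NIOV_SIZE      64ULL /* hypothetical sizeof(niovs[0]) */
#define ASSUMED_FREELIST_SIZE  4ULL  /* hypothetical sizeof(freelist[0]) */
#define ASSUMED_USER_REF_SIZE  4ULL  /* hypothetical sizeof(user_refs[0]) */

/*
 * Bytes needed for the three per-area arrays, with one entry per page
 * of the registered area (hypothetical helper).
 */
static uint64_t zcrx_metadata_bytes(uint64_t area_bytes)
{
	uint64_t nr_iovs = area_bytes / ASSUMED_PAGE_SIZE;

	return nr_iovs * (ASSUMED_NIOV_SIZE + ASSUMED_FREELIST_SIZE +
			  ASSUMED_USER_REF_SIZE);
}
```

Under these assumptions a 1 GiB area has 262144 entries, so
zcrx_metadata_bytes(1ULL << 30) comes to 18874368 bytes (18 MiB) of
kernel memory that would be uncharged with plain GFP_KERNEL.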
Risk assessment
- Low regression risk: Uses a long-standing flag (`GFP_KERNEL_ACCOUNT`)
already used in this file (io_uring/zcrx.c:196). The only behavioral
change is that these allocations can now fail with `-ENOMEM` when a
cgroup's limit would be exceeded, which is the intended behavior for
an accounting fix.
- No ordering dependencies: The patch doesn’t rely on recent refactors;
the affected allocations exist in v6.15–v6.17 and are currently done
with `GFP_KERNEL`. The change applies cleanly to those stable series
where `io_uring/zcrx.c` is present.
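The behavioral difference described above can be sketched with a toy
userspace model. This is not the kernel's memcg implementation: struct
toy_memcg and both toy_alloc_*() helpers are hypothetical names that
only mimic "charge the group, fail if over the limit" versus "allocate
without charging".

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a memcg: a byte limit and a usage counter. */
struct toy_memcg {
	size_t limit;
	size_t usage;	/* invariant: usage <= limit */
};

/*
 * Toy model of an accounted, zeroed array allocation. Returns NULL
 * (which a caller would map to -ENOMEM) if charging n * size bytes
 * would push the group over its limit.
 */
static void *toy_alloc_accounted(struct toy_memcg *cg, size_t n, size_t size)
{
	size_t bytes;
	void *p;

	if (size && n > (size_t)-1 / size)
		return NULL;		/* n * size would overflow */
	bytes = n * size;
	if (bytes > cg->limit - cg->usage)
		return NULL;		/* over the group's limit */
	p = calloc(n, size);
	if (p)
		cg->usage += bytes;	/* charge only on success */
	return p;
}

/* Toy model of the unaccounted path: the group's usage is not touched. */
static void *toy_alloc_unaccounted(size_t n, size_t size)
{
	return calloc(n, size);
}
```

In this model the unaccounted helper keeps succeeding regardless of the
group's limit, which is the containment gap the patch closes for the
three arrays.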
Stable tree fit
- Fixes a real bug affecting users: memcg under-accounting in a new but
shipped subsystem (zcrx is present since v6.15).
- Minimal, localized, and low risk: Three flag changes in one function.
- No feature additions or architectural changes: Pure accounting fix.
- Consistent with stable policy: Similar accounting fixes are regularly
accepted; related earlier work in this area explicitly targeted stable
(e.g., “io_uring/zcrx: account area memory” carries a `Cc:
stable@vger.kernel.org`, complementing this change).
Conclusion
- Backporting will prevent unaccounted kernel memory growth from zcrx
area metadata, aligning with memcg expectations and improving
containment with negligible risk.
io_uring/zcrx.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c
index 39d1ef52a57b1..5928544cd1687 100644
--- a/io_uring/zcrx.c
+++ b/io_uring/zcrx.c
@@ -426,17 +426,17 @@ static int io_zcrx_create_area(struct io_zcrx_ifq *ifq,
 
 	ret = -ENOMEM;
 	area->nia.niovs = kvmalloc_array(nr_iovs, sizeof(area->nia.niovs[0]),
-					 GFP_KERNEL | __GFP_ZERO);
+					 GFP_KERNEL_ACCOUNT | __GFP_ZERO);
 	if (!area->nia.niovs)
 		goto err;
 
 	area->freelist = kvmalloc_array(nr_iovs, sizeof(area->freelist[0]),
-					GFP_KERNEL | __GFP_ZERO);
+					GFP_KERNEL_ACCOUNT | __GFP_ZERO);
 	if (!area->freelist)
 		goto err;
 
 	area->user_refs = kvmalloc_array(nr_iovs, sizeof(area->user_refs[0]),
-					 GFP_KERNEL | __GFP_ZERO);
+					 GFP_KERNEL_ACCOUNT | __GFP_ZERO);
 	if (!area->user_refs)
 		goto err;
 
--
2.51.0