public inbox for io-uring@vger.kernel.org
From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Pavel Begunkov <asml.silence@gmail.com>,
	Jens Axboe <axboe@kernel.dk>, Sasha Levin <sashal@kernel.org>,
	io-uring@vger.kernel.org
Subject: [PATCH AUTOSEL 6.17-6.16] io_uring/zcrx: account niov arrays to cgroup
Date: Thu,  9 Oct 2025 11:55:47 -0400
Message-ID: <20251009155752.773732-81-sashal@kernel.org>
In-Reply-To: <20251009155752.773732-1-sashal@kernel.org>

From: Pavel Begunkov <asml.silence@gmail.com>

[ Upstream commit 31bf77dcc3810e08bcc7d15470e92cdfffb7f7f1 ]

net_iov / freelist / etc. arrays can be quite long, make sure they're
accounted.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

YES

What it does
- Switches three kvmalloc_array() allocations used by io_uring zcrx
  areas from GFP_KERNEL to GFP_KERNEL_ACCOUNT so their memory is charged
  to the creating task’s memcg. Specifically:
  - `area->nia.niovs` allocation: io_uring/zcrx.c:425
  - `area->freelist` allocation: io_uring/zcrx.c:430
  - `area->user_refs` allocation: io_uring/zcrx.c:435

Why it matters
- Fixes unaccounted kernel memory: These arrays can be very large (one
  entry per page of the registered area). Without GFP_KERNEL_ACCOUNT, a
  cgroup can allocate significant kernel memory that is not charged to
  its memcg, breaking containment and potentially causing host memory
  pressure. The commit explicitly addresses this: “arrays can be quite
  long, make sure they're accounted.”
- Brings consistency with existing accounting in the same path: The user
  memory backing the area is already accounted to memcg via
  `sg_alloc_table_from_pages(..., GFP_KERNEL_ACCOUNT)`
  (io_uring/zcrx.c:196) and to the io_uring context via
  `io_account_mem()` (io_uring/zcrx.c:205). Accounting these control
  arrays aligns with that design and closes a loophole where only the
  big page backing was charged but the (potentially multi‑MiB) array
  metadata was not.
- Scope is tiny and contained: The change is three flag substitutions
  within `io_zcrx_create_area()` and has no API/ABI or behavioral
  changes beyond proper memcg charging. No architectural changes; hot
  paths are unaffected (this is registration-time allocation).
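
To give a rough sense of scale for the "one entry per page" point above,
here is a back-of-the-envelope sketch. The per-entry sizes and the 4 KiB
page size are assumptions made for illustration, not values taken from
the patch; the real `sizeof()` results depend on kernel version and
configuration. The helper names are hypothetical.

```c
#include <stdint.h>

/* All sizes below are ASSUMPTIONS for illustration only. */
#define ASSUMED_PAGE_SIZE      4096u /* one niov per 4 KiB page */
#define ASSUMED_NIOV_SIZE      64u   /* hypothetical sizeof(area->nia.niovs[0]) */
#define ASSUMED_FREELIST_ENTRY 4u    /* hypothetical sizeof(area->freelist[0]) */
#define ASSUMED_USER_REF_ENTRY 4u    /* hypothetical sizeof(area->user_refs[0]) */

/* Number of array entries for an area of the given size. */
static uint64_t zcrx_nr_iovs(uint64_t area_bytes)
{
	return area_bytes / ASSUMED_PAGE_SIZE;
}

/* Combined size of the three per-page arrays under the assumptions above. */
static uint64_t zcrx_metadata_bytes(uint64_t area_bytes)
{
	uint64_t per_entry = ASSUMED_NIOV_SIZE + ASSUMED_FREELIST_ENTRY +
			     ASSUMED_USER_REF_ENTRY;

	return zcrx_nr_iovs(area_bytes) * per_entry;
}
```

Under these assumed sizes, `zcrx_metadata_bytes(1ULL << 30)` for a 1 GiB
area covers 262144 entries and comes to 18 MiB of array metadata, which
is the kind of allocation that previously went uncharged.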

Risk assessment
- Low regression risk: Uses a long-standing flag (`GFP_KERNEL_ACCOUNT`)
  already used in this file when setting up the area's page backing
  (io_uring/zcrx.c:196). The only behavioral change is that these
  allocations can now fail with `-ENOMEM` when a cgroup’s limit would be
  exceeded, which is the intended outcome of an accounting fix.
- No ordering dependencies: The patch doesn’t rely on recent refactors;
  the affected allocations exist in v6.15–v6.17 and are currently done
  with `GFP_KERNEL`. The change applies cleanly to those stable series
  where `io_uring/zcrx.c` is present.
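
The behavioral difference described above can be sketched with a toy
userspace model. This is not kernel code and does not reflect how memcg
is implemented; `struct toy_memcg` and `toy_alloc()` are hypothetical
names invented for this illustration. It only shows the idea that an
accounted allocation is charged against a limit and fails when the limit
would be exceeded, while an unaccounted one bypasses the limit entirely.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a memory cgroup: a limit and a running charge.
 * Assumes usage never exceeds limit. */
struct toy_memcg {
	size_t limit;
	size_t usage;
};

/*
 * Allocate zeroed memory. When 'account' is true the size is charged to
 * 'cg' first and the allocation fails (returns NULL) if that would
 * exceed the limit; when false, the limit is never consulted.
 */
static void *toy_alloc(struct toy_memcg *cg, size_t size, bool account)
{
	void *p;

	if (account) {
		if (size > cg->limit - cg->usage)
			return NULL;	/* analogous to -ENOMEM in the patch */
		cg->usage += size;
	}

	p = calloc(1, size ? size : 1);
	if (!p && account)
		cg->usage -= size;	/* undo the charge on failure */
	return p;
}
```

With a 1024-byte limit, an unaccounted 4096-byte request succeeds and
leaves `usage` at 0, whereas the same request with accounting enabled
returns NULL. That mirrors the before/after situation for the three
arrays in this patch.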

Stable tree fit
- Fixes a real bug affecting users: memcg under-accounting in a new but
  shipped subsystem (zcrx is present since v6.15).
- Minimal, localized, and low risk: Three flag changes in one function.
- No feature additions or architectural changes: Pure accounting fix.
- Consistent with stable policy: Similar accounting fixes are regularly
  accepted; related earlier work in this area explicitly targeted stable
  (e.g., “io_uring/zcrx: account area memory” carries a `Cc:
  stable@vger.kernel.org`, complementing this change).

Conclusion
- Backporting will prevent unaccounted kernel memory growth from zcrx
  area metadata, aligning with memcg expectations and improving
  containment with negligible risk.

 io_uring/zcrx.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c
index 39d1ef52a57b1..5928544cd1687 100644
--- a/io_uring/zcrx.c
+++ b/io_uring/zcrx.c
@@ -426,17 +426,17 @@ static int io_zcrx_create_area(struct io_zcrx_ifq *ifq,
 
 	ret = -ENOMEM;
 	area->nia.niovs = kvmalloc_array(nr_iovs, sizeof(area->nia.niovs[0]),
-					 GFP_KERNEL | __GFP_ZERO);
+					 GFP_KERNEL_ACCOUNT | __GFP_ZERO);
 	if (!area->nia.niovs)
 		goto err;
 
 	area->freelist = kvmalloc_array(nr_iovs, sizeof(area->freelist[0]),
-					GFP_KERNEL | __GFP_ZERO);
+					GFP_KERNEL_ACCOUNT | __GFP_ZERO);
 	if (!area->freelist)
 		goto err;
 
 	area->user_refs = kvmalloc_array(nr_iovs, sizeof(area->user_refs[0]),
-					GFP_KERNEL | __GFP_ZERO);
+					GFP_KERNEL_ACCOUNT | __GFP_ZERO);
 	if (!area->user_refs)
 		goto err;
 
-- 
2.51.0


Thread overview: 4+ messages
     [not found] <20251009155752.773732-1-sashal@kernel.org>
2025-10-09 15:54 ` [PATCH AUTOSEL 6.17] io_uring/zcrx: check all niovs filled with dma addresses Sasha Levin
2025-10-09 15:55 ` [PATCH AUTOSEL 6.17-6.12] io_uring/zctx: check chained notif contexts Sasha Levin
2025-10-09 15:55 ` [PATCH AUTOSEL 6.17-6.16] io_uring/rsrc: respect submitter_task in io_register_clone_buffers() Sasha Levin
2025-10-09 15:55 ` Sasha Levin [this message]
