From: "Clément Léger" <cleger@meta.com>
To: <io-uring@vger.kernel.org>,
Pavel Begunkov <asml.silence@gmail.com>,
"Jens Axboe" <axboe@kernel.dk>
Cc: "Clément Léger" <cleger@meta.com>,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-kselftest@vger.kernel.org, netdev@vger.kernel.org,
"David S. Miller" <davem@davemloft.net>,
"Eric Dumazet" <edumazet@google.com>,
"Jakub Kicinski" <kuba@kernel.org>,
"Paolo Abeni" <pabeni@redhat.com>,
"Simon Horman" <horms@kernel.org>,
"Jonathan Corbet" <corbet@lwn.net>,
"Shuah Khan" <skhan@linuxfoundation.org>,
"Vishwanath Seshagiri" <vishs@fb.com>
Subject: [PATCH 4/5] Documentation: networking: document zcrx notifications and statistics
Date: Wed, 22 Apr 2026 04:25:15 -0700
Message-ID: <20260422112522.3316660-5-cleger@meta.com>
In-Reply-To: <20260422112522.3316660-1-cleger@meta.com>

Document the zcrx notification system and shared-memory statistics
that were introduced to let userspace monitor zero-copy receive health.

The notification section covers the two notification types
(ZCRX_NOTIF_NO_BUFFERS, ZCRX_NOTIF_COPY), registration via
zcrx_notification_desc, and the fire-once / re-arm mechanism via
ZCRX_CTRL_ARM_NOTIFICATION. The statistics section covers the optional
shared-memory io_uring_zcrx_notif_stats structure placed in the refill
ring region, including how to query its layout via
IO_URING_QUERY_ZCRX_NOTIF.

Signed-off-by: Clément Léger <cleger@meta.com>
---
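Editorial note, not part of the patch: the fire-once / re-arm behaviour
described under "Rate limiting" below can be pictured with a small
userspace model. This is only a sketch of the documented semantics, not
kernel code. The EX_* bit values and every ex_* name are hypothetical and
exist only for this illustration; the real constants come from the UAPI
header added by this series. The model also assumes that a re-arm request
for a type that has not fired is silently ignored, which the text below
does not spell out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit values, for illustration only. */
#define EX_NOTIF_NO_BUFFERS (1u << 0)
#define EX_NOTIF_COPY       (1u << 1)

struct ex_notif_state {
	uint32_t registered; /* type_mask given at registration */
	uint32_t armed;      /* types that may still post a CQE */
};

static inline void ex_notif_init(struct ex_notif_state *st, uint32_t type_mask)
{
	assert(st);
	st->registered = type_mask;
	st->armed = type_mask;
}

/*
 * Model an event. Returns the bitmask that would be reported in
 * cqe->res, or 0 when no CQE would be posted. Each type that fires
 * is disarmed until it is explicitly re-armed.
 */
static inline uint32_t ex_notif_event(struct ex_notif_state *st, uint32_t events)
{
	uint32_t fire = events & st->armed;

	st->armed &= ~fire;
	return fire;
}

/*
 * Model a re-arm request. Only registered types that have already
 * fired are affected (assumption: other bits are ignored). Returns
 * the types that were actually re-armed.
 */
static inline uint32_t ex_notif_rearm(struct ex_notif_state *st, uint32_t type_mask)
{
	uint32_t fired = st->registered & ~st->armed;
	uint32_t rearm = type_mask & fired;

	st->armed |= rearm;
	return rearm;
}
```

In this model a second copy event after the first produces no CQE until
ex_notif_rearm() is called with the copy bit set, which mirrors what an
application does with ZCRX_CTRL_ARM_NOTIFICATION.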
Documentation/networking/iou-zcrx.rst | 106 ++++++++++++++++++++++++++
1 file changed, 106 insertions(+)
diff --git a/Documentation/networking/iou-zcrx.rst b/Documentation/networking/iou-zcrx.rst
index 7f3f4b2e6cf2..b17205fe55aa 100644
--- a/Documentation/networking/iou-zcrx.rst
+++ b/Documentation/networking/iou-zcrx.rst
@@ -196,6 +196,112 @@ Return buffers back to the kernel to be used again::
rqe->len = cqe->res;
IO_URING_WRITE_ONCE(*refill_ring.ktail, ++refill_ring.rq_tail);
+Notifications
+-------------
+
+When zero-copy receive encounters conditions that affect performance or
+functionality, the kernel can notify userspace by posting dedicated CQEs.
+To receive them, the application must pass a notification descriptor when
+registering the interface queue with ``IORING_REGISTER_ZCRX_IFQ``.
+
+Notification support is advertised by the ``ZCRX_FEATURE_NOTIFICATION`` bit
+in the features bitmask returned by ``IO_URING_QUERY_ZCRX``.
+
+**Notification types**
+
+``ZCRX_NOTIF_NO_BUFFERS``
+ Fired when the page pool fails to allocate because the zcrx buffer area is
+ exhausted.
+
+``ZCRX_NOTIF_COPY``
+ Fired when a received fragment could not be delivered zero-copy and was
+ instead copied into a buffer.
+
+**Registering notifications**
+
+Allocate and fill a ``struct zcrx_notification_desc``::
+
+ struct zcrx_notification_desc notif = {
+ .user_data = MY_NOTIF_USER_DATA,
+ .type_mask = ZCRX_NOTIF_NO_BUFFERS | ZCRX_NOTIF_COPY,
+ };
+
+ reg.notif_desc = (__u64)(unsigned long)&notif;
+
+``user_data`` is the value that will appear in the notification CQE's
+``user_data`` field. ``type_mask`` selects which notification types the
+application wants to receive.
+
+When a registered event occurs, the kernel posts a CQE with the specified
+``user_data`` and ``cqe->res`` set to a bitmask of the triggered notification
+types.
+
+**Rate limiting**
+
+Each notification type fires once until the application explicitly re-arms it.
+To re-arm, issue ``IORING_REGISTER_ZCRX_CTRL`` with
+``ZCRX_CTRL_ARM_NOTIFICATION``::
+
+ struct zcrx_ctrl ctrl = {
+ .zcrx_id = zcrx_id,
+ .op = ZCRX_CTRL_ARM_NOTIFICATION,
+ .zc_arm_notif = {
+ .type_mask = ZCRX_NOTIF_NO_BUFFERS | ZCRX_NOTIF_COPY,
+ },
+ };
+
+ io_uring_register(ring_fd, IORING_REGISTER_ZCRX_CTRL, &ctrl, 0);
+
+Only notification types that have previously fired can be re-armed.
+
+Notification statistics
+-----------------------
+
+In addition to CQE-based notifications, the kernel can maintain a shared-memory
+statistics structure that is updated on every relevant event. All stats are
+updated regardless of which notification types were selected in ``type_mask``.
+
+The statistics structure layout and alignment requirements can be queried via
+``IO_URING_QUERY_ZCRX_NOTIF``. The application must query the structure size
+and alignment requirements so that it allocates enough memory for the region
+to fit both the refill ring and the stats structure.
+
+To enable statistics, place the stats structure after the refill ring entries
+within the same mapped region, and set the ``ZCRX_NOTIF_DESC_FLAG_STATS`` flag
+in the notification descriptor::
+
+ /* Compute offset for the stats struct (after refill ring entries) */
+ size_t stats_offset = ring_size;
+ ring_size += ALIGN_UP(sizeof(struct io_uring_zcrx_notif_stats), PAGE_SIZE);
+
+ /* Map the region with the extra space */
+ ring_ptr = mmap(NULL, ring_size, PROT_READ | PROT_WRITE,
+ MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
+
+ struct zcrx_notification_desc notif = {
+ .user_data = MY_NOTIF_USER_DATA,
+ .type_mask = ZCRX_NOTIF_COPY,
+ .flags = ZCRX_NOTIF_DESC_FLAG_STATS,
+ .stats_offset = stats_offset,
+ };
+
+The ``stats_offset`` must satisfy the alignment reported by
+``notif_stats_off_alignment`` and must point to a location within the mapped
+region that does not overlap with the refill ring header or entries.
+
+Applications can read the stat counters at any time::
+
+ volatile struct io_uring_zcrx_notif_stats *stats =
+ (void *)((char *)ring_ptr + stats_offset);
+
+ printf("copy fallbacks: %llu (%llu bytes)\n",
+ (unsigned long long)IO_URING_READ_ONCE(stats->copy_count),
+ (unsigned long long)IO_URING_READ_ONCE(stats->copy_bytes));
+
+``copy_count`` is incremented each time a fragment is copied instead of being
+delivered via zero-copy. ``copy_bytes`` accumulates the total number of bytes
+copied.
+
Area chunking
-------------
--
2.52.0
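
Editorial note, not part of the patch: the placement rule for the stats
structure (after the refill ring entries, at an offset that satisfies the
reported alignment, inside the same mapped region) can be sketched as a
small layout helper. This is an illustration under stated assumptions
rather than the method used by the series: stats_size and stats_align
stand in for whatever IO_URING_QUERY_ZCRX_NOTIF reports, ring_bytes stands
in for the space taken by the refill ring header and entries, and all ex_*
names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct ex_region_layout {
	size_t stats_offset; /* byte offset of the stats struct in the region */
	size_t region_size;  /* total number of bytes to map */
};

/* Round x up to a multiple of align; align must be a power of two. */
static inline size_t ex_align_up(size_t x, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/*
 * Place the stats struct right after the refill ring, at the first
 * offset that satisfies stats_align, then round the whole region up
 * to a page multiple so it can be mapped in one call.
 */
static inline struct ex_region_layout
ex_stats_layout(size_t ring_bytes, size_t stats_size,
		size_t stats_align, size_t page_size)
{
	struct ex_region_layout l;

	l.stats_offset = ex_align_up(ring_bytes, stats_align);
	l.region_size = ex_align_up(l.stats_offset + stats_size, page_size);
	return l;
}
```

The resulting stats_offset would go into the stats_offset field of the
notification descriptor and region_size into the mmap() length. Because
the offset is never smaller than ring_bytes, the stats structure cannot
overlap the refill ring header or entries.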