From: Pavel Begunkov <[email protected]>
To: [email protected], [email protected],
[email protected]
Cc: "David S . Miller" <[email protected]>,
Jakub Kicinski <[email protected]>,
Jonathan Lemon <[email protected]>,
Willem de Bruijn <[email protected]>,
Jens Axboe <[email protected]>,
[email protected], Pavel Begunkov <[email protected]>
Subject: [RFC net-next v3 28/29] io_uring: batch submission notif referencing
Date: Tue, 28 Jun 2022 19:56:50 +0100
Message-ID: <bbf76e9185c50a51c121153cd4c3bd7a6b830778.1653992701.git.asml.silence@gmail.com>
In-Reply-To: <[email protected]>
Batch the taking of notifier references and use ->msg_ubuf_ref to hand off
one ref per sendzc request to the network layer. This amortises the
submission-side net_zcopy_get() atomics. Note that we always keep at least
one reference in the cache, because we only check after the send whether
->msg_ubuf_ref was consumed or not.
Signed-off-by: Pavel Begunkov <[email protected]>
---
fs/io_uring.c | 32 +++++++++++++++++++++++++++++---
1 file changed, 29 insertions(+), 3 deletions(-)
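For reviewers who want to check the accounting by hand, below is a minimal
userspace sketch of the scheme, not kernel code. struct notif_model and the
notif_model_*() helpers are hypothetical names made up for illustration.
Plain ints stand in for refcount_t, and atomic_ops only counts how often the
shared counter would be touched after allocation.

```c
#include <assert.h>
#include <stdbool.h>

#define REF_CACHE_NR 64	/* mirrors IO_NOTIF_REF_CACHE_NR */

/* Hypothetical stand-in for struct io_notif, illustration only. */
struct notif_model {
	int refcnt;		/* models uarg.refcnt (atomic in the kernel) */
	int cached_refs;	/* refs already taken but not yet handed out */
	int atomic_ops;		/* updates of refcnt after allocation */
};

/* Models io_alloc_notif(): pre-take a batch plus the slot's master ref. */
static void notif_model_init(struct notif_model *n)
{
	n->cached_refs = REF_CACHE_NR;
	n->refcnt = REF_CACHE_NR + 1;
	n->atomic_ops = 0;
}

/*
 * Models io_notif_consume_ref(): called after a send that consumed the
 * handed-off ref. Refill as soon as the cache hits zero, so the next send
 * always has at least one cached ref to hand off.
 */
static void notif_model_consume_ref(struct notif_model *n)
{
	n->cached_refs--;
	if (!n->cached_refs) {
		n->refcnt += REF_CACHE_NR;	/* refcount_add() in the patch */
		n->cached_refs += REF_CACHE_NR;
		n->atomic_ops++;
	}
}

/*
 * Models io_notif_slot_flush(): drop the unused cached refs and the master
 * ref in one subtraction. Returns true if no send still holds a ref.
 */
static bool notif_model_flush(struct notif_model *n)
{
	int refs = n->cached_refs + 1;

	n->cached_refs = 0;
	n->refcnt -= refs;			/* refcount_sub_and_test() */
	n->atomic_ops++;
	return n->refcnt == 0;
}
```

In this model, 100 sends followed by a flush touch the counter twice (one
refill and one flush), and the counter ends up at 100, i.e. exactly the refs
still owned by the in-flight sends.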
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 08c98a4d9bd2..78990a130b66 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -374,6 +374,7 @@ struct io_ev_fd {
};
#define IO_NOTIF_MAX_SLOTS (1U << 10)
+#define IO_NOTIF_REF_CACHE_NR 64
struct io_notif {
struct ubuf_info uarg;
@@ -384,6 +385,8 @@ struct io_notif {
u64 tag;
/* see struct io_notif_slot::seq */
u32 seq;
+ /* extra uarg->refcnt refs */
+ int cached_refs;
/* hook into ctx->notif_list and ctx->notif_list_locked */
struct list_head cache_node;
@@ -2949,14 +2952,30 @@ static struct io_notif *io_alloc_notif(struct io_ring_ctx *ctx,
notif->seq = slot->seq++;
notif->tag = slot->tag;
+ notif->cached_refs = IO_NOTIF_REF_CACHE_NR;
/* master ref owned by io_notif_slot, will be dropped on flush */
- refcount_set(&notif->uarg.refcnt, 1);
+ refcount_set(&notif->uarg.refcnt, IO_NOTIF_REF_CACHE_NR + 1);
percpu_ref_get(&ctx->refs);
notif->rsrc_node = ctx->rsrc_node;
io_charge_rsrc_node(ctx);
return notif;
}
+static inline void io_notif_consume_ref(struct io_notif *notif)
+ __must_hold(&ctx->uring_lock)
+{
+ notif->cached_refs--;
+
+ /*
+ * Issue sends without looking at notif->cached_refs first, so we
+ * always have to have at least one ref cached
+ */
+ if (unlikely(!notif->cached_refs)) {
+ refcount_add(IO_NOTIF_REF_CACHE_NR, &notif->uarg.refcnt);
+ notif->cached_refs += IO_NOTIF_REF_CACHE_NR;
+ }
+}
+
static inline struct io_notif *io_get_notif(struct io_ring_ctx *ctx,
struct io_notif_slot *slot)
{
@@ -2979,13 +2998,15 @@ static void io_notif_slot_flush(struct io_notif_slot *slot)
__must_hold(&ctx->uring_lock)
{
struct io_notif *notif = slot->notif;
+ int refs = notif->cached_refs + 1;
slot->notif = NULL;
+ notif->cached_refs = 0;
if (WARN_ON_ONCE(in_interrupt()))
return;
- /* drop slot's master ref */
- if (refcount_dec_and_test(&notif->uarg.refcnt))
+ /* drop all cached refs and the slot's master ref */
+ if (refcount_sub_and_test(refs, &notif->uarg.refcnt))
io_notif_complete(notif);
}
@@ -6653,6 +6674,7 @@ static int io_sendzc(struct io_kiocb *req, unsigned int issue_flags)
msg.msg_controllen = 0;
msg.msg_namelen = 0;
msg.msg_managed_data = 1;
+ msg.msg_ubuf_ref = 1;
if (req->msgzc.zc_flags & IORING_SENDZC_FIXED_BUF) {
ret = __io_import_fixed(WRITE, &msg.msg_iter, req->imu,
@@ -6686,6 +6708,10 @@ static int io_sendzc(struct io_kiocb *req, unsigned int issue_flags)
msg.msg_ubuf = &notif->uarg;
ret = sock_sendmsg(sock, &msg);
+ /* check if the send consumed an additional ref */
+ if (likely(!msg.msg_ubuf_ref))
+ io_notif_consume_ref(notif);
+
if (likely(ret >= min_ret)) {
unsigned zc_flags = req->msgzc.zc_flags;
--
2.36.1