* [RFC 0/3] request parameter set api and wait termination tuning
@ 2024-11-10 14:56 Pavel Begunkov
  2024-11-10 14:56 ` [RFC 1/3] io_uring: introduce request parameter sets Pavel Begunkov
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Pavel Begunkov @ 2024-11-10 14:56 UTC (permalink / raw)
  To: io-uring; +Cc: asml.silence

A crude prototype for probing opinions on the API. Not suitable for
upstream in the current form. Not properly tested either.

Patch 1 adds indirection for new parameters and flags by allowing
the user to register certain combinations of them; requests then
refer to a combination by an index passed in sqe->personality. The use
case in mind is tuning wake-ups and wait loop termination conditions.

Patch 3 is not complete, and I have doubts about the semantics of
Patch 2, but together they showcase what the series is targeting and how.
Note, these are made as hints and can be seamlessly deprecated and
removed from the kernel, in which case the user will get woken up
more often / earlier, which should be tolerated.
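
To illustrate what "tolerated" could mean on the application side, here is
a minimal userspace sketch, not part of the series. It assumes the
application tags send requests with a bit in user_data (MODEL_TAG_SEND is a
hypothetical convention) and simply recounts the completions it cares about
after every wake-up, so an early or extra wake-up only costs another wait.
All names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tag the application stores in user_data for sends. */
#define MODEL_TAG_SEND (1ULL << 63)

/* Minimal stand-in for the fields of a CQE this sketch looks at. */
struct model_cqe {
	uint64_t user_data;
	int32_t res;
};

/* A completion is urgent if it is not a send, or if the send failed. */
size_t model_count_urgent(const struct model_cqe *cqes, size_t nr)
{
	size_t i, urgent = 0;

	assert(cqes || nr == 0);
	for (i = 0; i < nr; i++) {
		int is_send = (cqes[i].user_data & MODEL_TAG_SEND) != 0;

		if (!is_send || cqes[i].res < 0)
			urgent++;
	}
	return urgent;
}

/* How many more urgent completions the caller should still wait for. */
size_t model_still_needed(const struct model_cqe *cqes, size_t nr,
			  size_t wanted)
{
	size_t urgent = model_count_urgent(cqes, nr);

	return urgent >= wanted ? 0 : wanted - urgent;
}
```

If the kernel ever dropped the hints, model_still_needed() would just
return non-zero more often and the application would wait again.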

Jens Axboe (1):
  io_uring: add support for ignoring inline completions for waits

Pavel Begunkov (2):
  io_uring: introduce request parameter sets
  io_uring: allow waiting loop to ignore some CQEs

 include/linux/io_uring_types.h |  9 ++++
 include/uapi/linux/io_uring.h  | 14 ++++++
 io_uring/io_uring.c            | 91 +++++++++++++++++++++++-----------
 io_uring/msg_ring.c            |  1 +
 io_uring/net.c                 |  1 +
 io_uring/register.c            | 52 +++++++++++++++++++
 6 files changed, 139 insertions(+), 29 deletions(-)

-- 
2.46.0



* [RFC 1/3] io_uring: introduce request parameter sets
  2024-11-10 14:56 [RFC 0/3] request parameter set api and wait termination tuning Pavel Begunkov
@ 2024-11-10 14:56 ` Pavel Begunkov
  2024-11-10 14:56 ` [RFC 2/3] io_uring: add support for ignoring inline completions for waits Pavel Begunkov
  2024-11-10 14:56 ` [RFC 3/3] io_uring: allow waiting loop to ignore some CQEs Pavel Begunkov
  2 siblings, 0 replies; 4+ messages in thread
From: Pavel Begunkov @ 2024-11-10 14:56 UTC (permalink / raw)
  To: io-uring; +Cc: asml.silence

There are lots of parameters we might want to additionally pass to a
request, but the SQE has limited space, and each new field may require
additional parsing and checking in the hot path. Instead, let the user
register sets of parameters in advance; requests then take an index
specifying which parameter set to use.
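
The lookup can be modelled in plain userspace C roughly as below. This is
only a sketch of the idea, with hypothetical model_* names standing in for
the kernel structures. It assumes that without the setup flag every request
uses set 0, and that a failed lookup also falls back to set 0.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_IOSETS 16

/* Hypothetical userspace mirror of a registered parameter set. */
struct model_io_set {
	uint32_t flags;
};

struct model_ring {
	int ioset_mode;			/* models IORING_SETUP_IOSET */
	unsigned int nr_iosets;		/* number of registered sets */
	struct model_io_set iosets[MODEL_MAX_IOSETS];
};

/*
 * Resolve the parameter set for one request. In ioset mode the
 * personality value is an index into the table. Otherwise it keeps its
 * credentials meaning (not modelled here) and set 0 is used.
 */
int model_pick_ioset(struct model_ring *ring, uint16_t personality,
		     struct model_io_set **out)
{
	assert(ring && out);

	if (!ring->ioset_mode) {
		*out = &ring->iosets[0];
		return 0;
	}
	if (personality >= ring->nr_iosets) {
		/* failed requests still get a valid set to point at */
		*out = &ring->iosets[0];
		return -EINVAL;
	}
	*out = &ring->iosets[personality];
	return 0;
}
```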

The benefit for the kernel is that we can put any number of arguments in
there and then do pre-processing at initialisation time, like
renumbering flags and enabling static keys for performance deprecated
features. The obvious downside is that the user can't use the entire
parameter space, as there can only be a limited number of sets. The
main target here is tuning the waiting loop with finer-grained control
over when we should wake the task and return to the user.
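
As a rough illustration of doing the checks once at registration time
rather than per request, here is a userspace model, again with hypothetical
model_* names. Unlike the patch below, it validates into a temporary array
first so that a failed registration leaves the table untouched; that is a
choice of this sketch, and MODEL_KNOWN_FLAGS is an assumed mask.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define MODEL_MAX_IOSETS 16
#define MODEL_KNOWN_FLAGS 0x3ULL	/* assumed mask of accepted hints */

struct model_ioset_reg {
	uint64_t flags;
	uint64_t resv[3];
};

struct model_io_set {
	uint32_t flags;
};

/* One-time registration: validate everything, then commit. */
int model_register_iosets(struct model_io_set *sets, unsigned int *nr_sets,
			  const struct model_ioset_reg *reg, unsigned int nr)
{
	struct model_io_set tmp[MODEL_MAX_IOSETS];
	unsigned int i;

	assert(sets && nr_sets && (reg || nr == 0));

	if (*nr_sets)			/* already registered */
		return -EINVAL;
	if (nr > MODEL_MAX_IOSETS)
		return -EINVAL;

	for (i = 0; i < nr; i++) {
		if (reg[i].flags & ~MODEL_KNOWN_FLAGS)
			return -EINVAL;
		if (reg[i].resv[0] || reg[i].resv[1] || reg[i].resv[2])
			return -EINVAL;
		tmp[i].flags = (uint32_t)reg[i].flags;
	}

	/* nothing failed, so the hot path never has to re-check */
	memcpy(sets, tmp, nr * sizeof(tmp[0]));
	*nr_sets = nr;
	return 0;
}
```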

The current implementation is crude: it needs a SETUP flag that disables
creds/personalities, and it is limited to a single registration of at
most 16 sets. It could be made to co-exist with creds and to be registered
and expanded more flexibly.

Signed-off-by: Pavel Begunkov <[email protected]>
---
 include/linux/io_uring_types.h |  8 ++++++
 include/uapi/linux/io_uring.h  |  9 ++++++
 io_uring/io_uring.c            | 36 ++++++++++++++++--------
 io_uring/msg_ring.c            |  1 +
 io_uring/net.c                 |  1 +
 io_uring/register.c            | 51 ++++++++++++++++++++++++++++++++++
 6 files changed, 94 insertions(+), 12 deletions(-)

diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index ad5001102c86..79f38c07642d 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -75,6 +75,10 @@ struct io_hash_table {
 	unsigned		hash_bits;
 };
 
+struct io_set {
+	u32 flags;
+};
+
 /*
  * Arbitrary limit, can be raised if need be
  */
@@ -268,6 +272,9 @@ struct io_ring_ctx {
 		unsigned		cached_sq_head;
 		unsigned		sq_entries;
 
+		struct io_set		iosets[16];
+		unsigned int		nr_iosets;
+
 		/*
 		 * Fixed resources fast path, should be accessed only under
 		 * uring_lock, and updated through io_uring_register(2)
@@ -635,6 +642,7 @@ struct io_kiocb {
 
 	struct io_ring_ctx		*ctx;
 	struct io_uring_task		*tctx;
+	struct io_set			*ioset;
 
 	union {
 		/* stores selected buf, valid IFF REQ_F_BUFFER_SELECTED is set */
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index ba373deb8406..6a432383e7c3 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -158,6 +158,8 @@ enum io_uring_sqe_flags_bit {
 #define IORING_SETUP_ATTACH_WQ	(1U << 5)	/* attach to existing wq */
 #define IORING_SETUP_R_DISABLED	(1U << 6)	/* start with ring disabled */
 #define IORING_SETUP_SUBMIT_ALL	(1U << 7)	/* continue submit on error */
+#define IORING_SETUP_IOSET	(1U << 8)
+
 /*
  * Cooperative task running. When requests complete, they often require
  * forcing the submitter to transition to the kernel to complete. If this
@@ -634,6 +636,8 @@ enum io_uring_register_op {
 	/* register fixed io_uring_reg_wait arguments */
 	IORING_REGISTER_CQWAIT_REG		= 34,
 
+	IORING_REGISTER_IOSETS			= 35,
+
 	/* this goes last */
 	IORING_REGISTER_LAST,
 
@@ -895,6 +899,11 @@ struct io_uring_recvmsg_out {
 	__u32 flags;
 };
 
+struct io_uring_ioset_reg {
+	__u64 flags;
+	__u64 __resv[3];
+};
+
 /*
  * Argument for IORING_OP_URING_CMD when file is a socket
  */
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index f34fa1ead2cf..cf688a9ff737 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2156,6 +2156,7 @@ static void io_init_req_drain(struct io_kiocb *req)
 
 static __cold int io_init_fail_req(struct io_kiocb *req, int err)
 {
+	req->ioset = &req->ctx->iosets[0];
 	/* ensure per-opcode data is cleared if we fail before prep */
 	memset(&req->cmd.data, 0, sizeof(req->cmd.data));
 	return err;
@@ -2238,19 +2239,27 @@ static int io_init_req(struct io_ring_ctx *ctx, struct io_kiocb *req,
 	}
 
 	personality = READ_ONCE(sqe->personality);
-	if (personality) {
-		int ret;
-
-		req->creds = xa_load(&ctx->personalities, personality);
-		if (!req->creds)
+	if (ctx->flags & IORING_SETUP_IOSET) {
+		if (unlikely(personality >= ctx->nr_iosets))
 			return io_init_fail_req(req, -EINVAL);
-		get_cred(req->creds);
-		ret = security_uring_override_creds(req->creds);
-		if (ret) {
-			put_cred(req->creds);
-			return io_init_fail_req(req, ret);
+		personality = array_index_nospec(personality, ctx->nr_iosets);
+		req->ioset = &ctx->iosets[personality];
+	} else {
+		if (personality) {
+			int ret;
+
+			req->creds = xa_load(&ctx->personalities, personality);
+			if (!req->creds)
+				return io_init_fail_req(req, -EINVAL);
+			get_cred(req->creds);
+			ret = security_uring_override_creds(req->creds);
+			if (ret) {
+				put_cred(req->creds);
+				return io_init_fail_req(req, ret);
+			}
+			req->flags |= REQ_F_CREDS;
 		}
-		req->flags |= REQ_F_CREDS;
+		req->ioset = &ctx->iosets[0];
 	}
 
 	return def->prep(req, sqe);
@@ -3909,6 +3918,8 @@ static __cold int io_uring_create(unsigned entries, struct io_uring_params *p,
 	if (!ctx)
 		return -ENOMEM;
 
+	ctx->nr_iosets = 0;
+
 	ctx->clockid = CLOCK_MONOTONIC;
 	ctx->clock_offset = 0;
 
@@ -4076,7 +4087,8 @@ static long io_uring_setup(u32 entries, struct io_uring_params __user *params)
 			IORING_SETUP_SQE128 | IORING_SETUP_CQE32 |
 			IORING_SETUP_SINGLE_ISSUER | IORING_SETUP_DEFER_TASKRUN |
 			IORING_SETUP_NO_MMAP | IORING_SETUP_REGISTERED_FD_ONLY |
-			IORING_SETUP_NO_SQARRAY | IORING_SETUP_HYBRID_IOPOLL))
+			IORING_SETUP_NO_SQARRAY | IORING_SETUP_HYBRID_IOPOLL |
+			IORING_SETUP_IOSET))
 		return -EINVAL;
 
 	return io_uring_create(entries, &p, params);
diff --git a/io_uring/msg_ring.c b/io_uring/msg_ring.c
index e63af34004b7..f5a747aa255c 100644
--- a/io_uring/msg_ring.c
+++ b/io_uring/msg_ring.c
@@ -98,6 +98,7 @@ static int io_msg_remote_post(struct io_ring_ctx *ctx, struct io_kiocb *req,
 	io_req_set_res(req, res, cflags);
 	percpu_ref_get(&ctx->refs);
 	req->ctx = ctx;
+	req->ioset = &ctx->iosets[0];
 	req->io_task_work.func = io_msg_tw_complete;
 	io_req_task_work_add_remote(req, ctx, IOU_F_TWQ_LAZY_WAKE);
 	return 0;
diff --git a/io_uring/net.c b/io_uring/net.c
index 2ccc2b409431..785987bf9e6a 100644
--- a/io_uring/net.c
+++ b/io_uring/net.c
@@ -1242,6 +1242,7 @@ int io_send_zc_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 	notif = zc->notif = io_alloc_notif(ctx);
 	if (!notif)
 		return -ENOMEM;
+	notif->ioset = req->ioset;
 	notif->cqe.user_data = req->cqe.user_data;
 	notif->cqe.res = 0;
 	notif->cqe.flags = IORING_CQE_F_NOTIF;
diff --git a/io_uring/register.c b/io_uring/register.c
index 45edfc57963a..e7571dc46da5 100644
--- a/io_uring/register.c
+++ b/io_uring/register.c
@@ -86,6 +86,48 @@ int io_unregister_personality(struct io_ring_ctx *ctx, unsigned id)
 	return -EINVAL;
 }
 
+static int io_update_ioset(struct io_ring_ctx *ctx,
+			   const struct io_uring_ioset_reg *reg,
+			   struct io_set *set)
+{
+	if (!(ctx->flags & IORING_SETUP_IOSET))
+		return -EINVAL;
+	if (reg->flags)
+		return -EINVAL;
+	if (reg->__resv[0] || reg->__resv[1] || reg->__resv[2])
+		return -EINVAL;
+
+	set->flags = reg->flags;
+	return 0;
+}
+
+static int io_register_iosets(struct io_ring_ctx *ctx,
+			      void __user *arg, unsigned int nr_args)
+{
+	struct io_uring_ioset_reg __user *uptr = arg;
+	struct io_uring_ioset_reg reg[16];
+	int i, ret;
+
+	/* TODO: one time setup, max 16 entries, should be made more dynamic */
+	if (ctx->nr_iosets)
+		return -EINVAL;
+	if (nr_args > ARRAY_SIZE(ctx->iosets))
+		return -EINVAL;
+
+	if (copy_from_user(reg, uptr, sizeof(reg[0]) * nr_args))
+		return -EFAULT;
+
+	for (i = 0; i < nr_args; i++) {
+		ret = io_update_ioset(ctx, &reg[i], &ctx->iosets[i]);
+		if (ret) {
+			memset(&ctx->iosets[0], 0, sizeof(ctx->iosets[0]));
+			return ret;
+		}
+	}
+
+	ctx->nr_iosets = nr_args;
+	return 0;
+}
 
 static int io_register_personality(struct io_ring_ctx *ctx)
 {
@@ -93,6 +135,9 @@ static int io_register_personality(struct io_ring_ctx *ctx)
 	u32 id;
 	int ret;
 
+	if (ctx->flags & IORING_SETUP_IOSET)
+		return -EINVAL;
+
 	creds = get_current_cred();
 
 	ret = xa_alloc_cyclic(&ctx->personalities, &id, (void *)creds,
@@ -846,6 +891,12 @@ static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode,
 			break;
 		ret = io_register_cqwait_reg(ctx, arg);
 		break;
+	case IORING_REGISTER_IOSETS:
+		ret = -EINVAL;
+		if (!arg)
+			break;
+		ret = io_register_iosets(ctx, arg, nr_args);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
-- 
2.46.0



* [RFC 2/3] io_uring: add support for ignoring inline completions for waits
  2024-11-10 14:56 [RFC 0/3] request parameter set api and wait termination tuning Pavel Begunkov
  2024-11-10 14:56 ` [RFC 1/3] io_uring: introduce request parameter sets Pavel Begunkov
@ 2024-11-10 14:56 ` Pavel Begunkov
  2024-11-10 14:56 ` [RFC 3/3] io_uring: allow waiting loop to ignore some CQEs Pavel Begunkov
  2 siblings, 0 replies; 4+ messages in thread
From: Pavel Begunkov @ 2024-11-10 14:56 UTC (permalink / raw)
  To: io-uring; +Cc: asml.silence, Jens Axboe

From: Jens Axboe <[email protected]>

io_uring treats all completions the same - they post a completion event,
or more, and anyone waiting on event completions will see each event as
it gets posted.

However, some events may be more interesting than others. For a request
and response type model, it's not uncommon to have send/write events
that are submitted with a recv/read type of request. While the app does
want to see a successful send/write completion eventually, it need not
handle it upfront as it would want to do with a recv/read, as it isn't
time sensitive. Generally, a send/write completion will just mean that
a buffer can get recycled/reused, whereas a recv/read completion needs
acting upon (and a response sent).

This can be somewhat tricky to handle if many requests and responses
are being handled, and the app generally needs to track the number of
pending sends/writes to be able to sanely wait on just new incoming
recv/read requests. And even with that, an application would still
like to see a completion for a short/failed send/write immediately.

Add infrastructure to account inline completions, such that they can
be deducted from the 'wait_nr' being passed in via a submit_and_wait()
type of situation. Inline completions are ones that complete directly
inline from submission, such as a send to a socket where there's
enough space to accommodate the data being sent.
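
The accounting can be sketched in userspace as follows. This only models
the arithmetic as I read the diff below: the ignored inline completions are
already sitting in the CQ, so the wait target is raised by their count,
which has the same effect as deducting them from what the waiter sees. The
model_* names and MODEL_HINT_IGNORE_INLINE are illustrative stand-ins.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for IOSQE_SET_F_HINT_IGNORE_INLINE. */
#define MODEL_HINT_IGNORE_INLINE 1u

/* Count inline completions whose parameter set asks to be ignored. */
unsigned int model_count_ignored(const uint32_t *set_flags, size_t nr)
{
	unsigned int ignored = 0;
	size_t i;

	assert(set_flags || nr == 0);
	for (i = 0; i < nr; i++) {
		if (set_flags[i] & MODEL_HINT_IGNORE_INLINE)
			ignored++;
	}
	return ignored;
}

/*
 * Compute the number of CQEs to wait for. An oversized request is
 * clamped to the CQ size; otherwise the target is raised by the number
 * of ignored inline completions.
 */
unsigned int model_wait_target(unsigned int min_complete,
			       unsigned int cq_entries,
			       unsigned int inline_completions)
{
	if (min_complete > cq_entries)
		return cq_entries;
	return min_complete + inline_completions;
}
```

For example, submitting three sends that complete inline together with one
recv and waiting for 1 event yields a target of 4, so the wait returns
once the recv (or anything else) has completed.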

Signed-off-by: Jens Axboe <[email protected]>
[pavel: rebased onto iosets]
Signed-off-by: Pavel Begunkov <[email protected]>
---
 include/linux/io_uring_types.h |  1 +
 include/uapi/linux/io_uring.h  |  4 ++++
 io_uring/io_uring.c            | 12 ++++++++++--
 io_uring/register.c            |  2 +-
 4 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index 79f38c07642d..f04444f9356a 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -213,6 +213,7 @@ struct io_submit_state {
 	bool			need_plug;
 	bool			cq_flush;
 	unsigned short		submit_nr;
+	unsigned short		inline_completions;
 	struct blk_plug		plug;
 };
 
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 6a432383e7c3..e6d10fba8ae2 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -899,6 +899,10 @@ struct io_uring_recvmsg_out {
 	__u32 flags;
 };
 
+enum {
+	IOSQE_SET_F_HINT_IGNORE_INLINE		= 1,
+};
+
 struct io_uring_ioset_reg {
 	__u64 flags;
 	__u64 __resv[3];
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index cf688a9ff737..6e89435c243d 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1575,6 +1575,9 @@ void __io_submit_flush_completions(struct io_ring_ctx *ctx)
 		struct io_kiocb *req = container_of(node, struct io_kiocb,
 					    comp_list);
 
+		if (req->ioset->flags & IOSQE_SET_F_HINT_IGNORE_INLINE)
+			state->inline_completions++;
+
 		if (unlikely(req->flags & (REQ_F_CQE_SKIP | REQ_F_GROUP))) {
 			if (req->flags & REQ_F_GROUP) {
 				io_complete_group_req(req);
@@ -2511,6 +2514,7 @@ static void io_submit_state_start(struct io_submit_state *state,
 	state->plug_started = false;
 	state->need_plug = max_ios > 2;
 	state->submit_nr = max_ios;
+	state->inline_completions = 0;
 	/* set only head, no need to init link_last in advance */
 	state->link.head = NULL;
 	state->group.head = NULL;
@@ -3611,6 +3615,7 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
 		size_t, argsz)
 {
 	struct io_ring_ctx *ctx;
+	int inline_complete = 0;
 	struct file *file;
 	long ret;
 
@@ -3676,6 +3681,7 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
 			mutex_unlock(&ctx->uring_lock);
 			goto out;
 		}
+		inline_complete = ctx->submit_state.inline_completions;
 		if (flags & IORING_ENTER_GETEVENTS) {
 			if (ctx->syscall_iopoll)
 				goto iopoll_locked;
@@ -3713,8 +3719,10 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
 
 			ret2 = io_get_ext_arg(ctx, flags, argp, &ext_arg);
 			if (likely(!ret2)) {
-				min_complete = min(min_complete,
-						   ctx->cq_entries);
+				if (min_complete > ctx->cq_entries)
+					min_complete = ctx->cq_entries;
+				else
+					min_complete += inline_complete;
 				ret2 = io_cqring_wait(ctx, min_complete, flags,
 						      &ext_arg);
 			}
diff --git a/io_uring/register.c b/io_uring/register.c
index e7571dc46da5..f87ec7b773bd 100644
--- a/io_uring/register.c
+++ b/io_uring/register.c
@@ -92,7 +92,7 @@ static int io_update_ioset(struct io_ring_ctx *ctx,
 {
 	if (!(ctx->flags & IORING_SETUP_IOSET))
 		return -EINVAL;
-	if (reg->flags)
+	if (reg->flags & ~IOSQE_SET_F_HINT_IGNORE_INLINE)
 		return -EINVAL;
 	if (reg->__resv[0] || reg->__resv[1] || reg->__resv[2])
 		return -EINVAL;
-- 
2.46.0



* [RFC 3/3] io_uring: allow waiting loop to ignore some CQEs
  2024-11-10 14:56 [RFC 0/3] request parameter set api and wait termination tuning Pavel Begunkov
  2024-11-10 14:56 ` [RFC 1/3] io_uring: introduce request parameter sets Pavel Begunkov
  2024-11-10 14:56 ` [RFC 2/3] io_uring: add support for ignoring inline completions for waits Pavel Begunkov
@ 2024-11-10 14:56 ` Pavel Begunkov
  2 siblings, 0 replies; 4+ messages in thread
From: Pavel Begunkov @ 2024-11-10 14:56 UTC (permalink / raw)
  To: io-uring; +Cc: asml.silence

The user might not care about getting the results of certain requests, but
their completions will still wake up the task (i.e. task_work) and trigger
the waiting loop to terminate.

IOSQE_SET_F_HINT_SILENT attempts to de-prioritise such completions.
The completion will be eventually posted, however the execution of the
request can and likely will be delayed to batch it with other requests.
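
A simplified userspace model of the wake-up decision may help. It is a
sketch under assumptions, not the kernel code: a single counter stands in
for the nr_tw value carried by the task_work list, UINT_MAX stands in for
"nobody is waiting", and MODEL_WAKE_FORCE is an assumed sentinel larger
than any real wait count. Silent items leave the counter alone and never
wake the waiter.

```c
#include <assert.h>
#include <limits.h>

#define MODEL_NOBODY_WAITING	UINT_MAX
#define MODEL_WAKE_FORCE	(UINT_MAX >> 1)

struct model_tw_queue {
	unsigned int nr_tw;	/* count carried by the newest queued item */
	unsigned int nr_wait;	/* completions the waiter asked for */
};

/*
 * Queue one task_work item. Returns 1 if the waiter should be woken.
 * 'silent' models IOSQE_SET_F_HINT_SILENT, 'lazy_wake' models
 * IOU_F_TWQ_LAZY_WAKE.
 */
int model_tw_add(struct model_tw_queue *q, int silent, int lazy_wake)
{
	unsigned int prev;

	assert(q);
	prev = q->nr_tw;

	if (silent)
		return 0;	/* queued, but not counted and no wake-up */

	q->nr_tw = lazy_wake ? prev + 1 : MODEL_WAKE_FORCE;

	if (q->nr_tw < q->nr_wait)
		return 0;	/* not enough yet, or nobody is waiting */
	if (prev >= q->nr_wait)
		return 0;	/* an earlier add already woke the waiter */
	return 1;
}
```

With a waiter asking for 2 completions, any number of silent items can be
queued without waking it; the wake-up happens only on the second counted
item.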

It's an incomplete prototype: it only works with DEFER_TASKRUN, fails to
apply the optimisation for task_works queued before the waiting loop
starts, and the interaction with IOSQE_SET_F_HINT_IGNORE_INLINE is likely
broken.

Signed-off-by: Pavel Begunkov <[email protected]>
---
 include/uapi/linux/io_uring.h |  1 +
 io_uring/io_uring.c           | 43 +++++++++++++++++++++++------------
 io_uring/register.c           |  3 ++-
 3 files changed, 31 insertions(+), 16 deletions(-)

diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index e6d10fba8ae2..6dff0ee4e20c 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -901,6 +901,7 @@ struct io_uring_recvmsg_out {
 
 enum {
 	IOSQE_SET_F_HINT_IGNORE_INLINE		= 1,
+	IOSQE_SET_F_HINT_SILENT			= 2,
 };
 
 struct io_uring_ioset_reg {
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 6e89435c243d..2e1af10fd4f2 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1270,6 +1270,7 @@ static inline void io_req_local_work_add(struct io_kiocb *req,
 {
 	unsigned nr_wait, nr_tw, nr_tw_prev;
 	struct llist_node *head;
+	bool ignore = req->ioset->flags & IOSQE_SET_F_HINT_SILENT;
 
 	/* See comment above IO_CQ_WAKE_INIT */
 	BUILD_BUG_ON(IO_CQ_WAKE_FORCE <= IORING_MAX_CQ_ENTRIES);
@@ -1297,13 +1298,17 @@ static inline void io_req_local_work_add(struct io_kiocb *req,
 			nr_tw_prev = READ_ONCE(first_req->nr_tw);
 		}
 
-		/*
-		 * Theoretically, it can overflow, but that's fine as one of
-		 * previous adds should've tried to wake the task.
-		 */
-		nr_tw = nr_tw_prev + 1;
-		if (!(flags & IOU_F_TWQ_LAZY_WAKE))
-			nr_tw = IO_CQ_WAKE_FORCE;
+		nr_tw = nr_tw_prev;
+
+		if (!ignore) {
+			/*
+			 * Theoretically, it can overflow, but that's fine as
+			 * one of previous adds should've tried to wake the task.
+			 */
+			nr_tw += 1;
+			if (!(flags & IOU_F_TWQ_LAZY_WAKE))
+				nr_tw = IO_CQ_WAKE_FORCE;
+		}
 
 		req->nr_tw = nr_tw;
 		req->io_task_work.node.next = head;
@@ -1325,6 +1330,9 @@ static inline void io_req_local_work_add(struct io_kiocb *req,
 			io_eventfd_signal(ctx);
 	}
 
+	if (ignore)
+		return;
+
 	nr_wait = atomic_read(&ctx->cq_wait_nr);
 	/* not enough or no one is waiting */
 	if (nr_tw < nr_wait)
@@ -1405,7 +1413,7 @@ static bool io_run_local_work_continue(struct io_ring_ctx *ctx, int events,
 }
 
 static int __io_run_local_work(struct io_ring_ctx *ctx, struct io_tw_state *ts,
-			       int min_events)
+			       int min_events, struct io_wait_queue *waitq)
 {
 	struct llist_node *node;
 	unsigned int loops = 0;
@@ -1425,6 +1433,10 @@ static int __io_run_local_work(struct io_ring_ctx *ctx, struct io_tw_state *ts,
 		struct llist_node *next = node->next;
 		struct io_kiocb *req = container_of(node, struct io_kiocb,
 						    io_task_work.node);
+
+		if (waitq && (req->ioset->flags & IOSQE_SET_F_HINT_SILENT))
+			waitq->cq_tail++;
+
 		INDIRECT_CALL_2(req->io_task_work.func,
 				io_poll_task_func, io_req_rw_complete,
 				req, ts);
@@ -1450,16 +1462,17 @@ static inline int io_run_local_work_locked(struct io_ring_ctx *ctx,
 
 	if (llist_empty(&ctx->work_llist))
 		return 0;
-	return __io_run_local_work(ctx, &ts, min_events);
+	return __io_run_local_work(ctx, &ts, min_events, NULL);
 }
 
-static int io_run_local_work(struct io_ring_ctx *ctx, int min_events)
+static int io_run_local_work(struct io_ring_ctx *ctx, int min_events,
+			      struct io_wait_queue *waitq)
 {
 	struct io_tw_state ts = {};
 	int ret;
 
 	mutex_lock(&ctx->uring_lock);
-	ret = __io_run_local_work(ctx, &ts, min_events);
+	ret = __io_run_local_work(ctx, &ts, min_events, waitq);
 	mutex_unlock(&ctx->uring_lock);
 	return ret;
 }
@@ -2643,7 +2656,7 @@ int io_run_task_work_sig(struct io_ring_ctx *ctx)
 {
 	if (!llist_empty(&ctx->work_llist)) {
 		__set_current_state(TASK_RUNNING);
-		if (io_run_local_work(ctx, INT_MAX) > 0)
+		if (io_run_local_work(ctx, INT_MAX, NULL) > 0)
 			return 0;
 	}
 	if (io_run_task_work() > 0)
@@ -2806,7 +2819,7 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events, u32 flags,
 	if (!io_allowed_run_tw(ctx))
 		return -EEXIST;
 	if (!llist_empty(&ctx->work_llist))
-		io_run_local_work(ctx, min_events);
+		io_run_local_work(ctx, min_events, NULL);
 	io_run_task_work();
 
 	if (unlikely(test_bit(IO_CHECK_CQ_OVERFLOW_BIT, &ctx->check_cq)))
@@ -2877,7 +2890,7 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events, u32 flags,
 		 * now rather than let the caller do another wait loop.
 		 */
 		if (!llist_empty(&ctx->work_llist))
-			io_run_local_work(ctx, nr_wait);
+			io_run_local_work(ctx, nr_wait, &iowq);
 		io_run_task_work();
 
 		/*
@@ -3389,7 +3402,7 @@ static __cold bool io_uring_try_cancel_requests(struct io_ring_ctx *ctx,
 
 	if ((ctx->flags & IORING_SETUP_DEFER_TASKRUN) &&
 	    io_allowed_defer_tw_run(ctx))
-		ret |= io_run_local_work(ctx, INT_MAX) > 0;
+		ret |= io_run_local_work(ctx, INT_MAX, NULL) > 0;
 	ret |= io_cancel_defer_files(ctx, tctx, cancel_all);
 	mutex_lock(&ctx->uring_lock);
 	ret |= io_poll_remove_all(ctx, tctx, cancel_all);
diff --git a/io_uring/register.c b/io_uring/register.c
index f87ec7b773bd..5462c49bebd3 100644
--- a/io_uring/register.c
+++ b/io_uring/register.c
@@ -92,7 +92,8 @@ static int io_update_ioset(struct io_ring_ctx *ctx,
 {
 	if (!(ctx->flags & IORING_SETUP_IOSET))
 		return -EINVAL;
-	if (reg->flags & ~IOSQE_SET_F_HINT_IGNORE_INLINE)
+	if (reg->flags & ~(IOSQE_SET_F_HINT_IGNORE_INLINE |
+			   IOSQE_SET_F_HINT_SILENT))
 		return -EINVAL;
 	if (reg->__resv[0] || reg->__resv[1] || reg->__resv[2])
 		return -EINVAL;
-- 
2.46.0

