* [PATCHSET REBASE 01/10] io_uring: rearrange defer list checks
2023-01-05 11:22 [PATCHSET REBASE 00/10] cq wait refactoring rebase Pavel Begunkov
@ 2023-01-05 11:22 ` Pavel Begunkov
2023-01-05 11:22 ` [PATCHSET REBASE 02/10] io_uring: don't iterate cq wait fast path Pavel Begunkov
` (9 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Pavel Begunkov @ 2023-01-05 11:22 UTC (permalink / raw)
To: io-uring; +Cc: Jens Axboe, asml.silence
There should be nothing in ->work_llist for non-DEFER_TASKRUN rings, so
we can skip the flag checks and test the list emptiness directly. Also
move the check out of io_run_local_work() so it can be inlined.
Signed-off-by: Pavel Begunkov <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
---
io_uring/io_uring.c | 3 ---
io_uring/io_uring.h | 2 +-
2 files changed, 1 insertion(+), 4 deletions(-)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index fefffb02e6b0..cddfcfddcb52 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1338,9 +1338,6 @@ int io_run_local_work(struct io_ring_ctx *ctx)
bool locked;
int ret;
- if (llist_empty(&ctx->work_llist))
- return 0;
-
__set_current_state(TASK_RUNNING);
locked = mutex_trylock(&ctx->uring_lock);
ret = __io_run_local_work(ctx, &locked);
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index e9f0d41ebb99..46c0f765a77a 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -274,7 +274,7 @@ static inline int io_run_task_work_ctx(struct io_ring_ctx *ctx)
int ret = 0;
int ret2;
- if (ctx->flags & IORING_SETUP_DEFER_TASKRUN)
+ if (!llist_empty(&ctx->work_llist))
ret = io_run_local_work(ctx);
/* want to run this after in case more is added */
--
2.38.1
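The point of the change is that the emptiness test moves into an inline helper so callers skip the out-of-line call entirely when nothing is queued. A standalone userspace sketch of that pattern (illustrative names, not the kernel code):

```c
#include <assert.h>

/* Counts how often the out-of-line path actually runs. */
static int run_local_work_calls;

/* Stand-in for the out-of-line io_run_local_work(): pretend one
 * item was processed. */
int run_local_work(void)
{
	run_local_work_calls++;
	return 1;
}

/* Inline fast path: when the list is empty, return without making
 * a function call at all. */
static inline int maybe_run_local_work(int list_empty)
{
	if (list_empty)
		return 0;
	return run_local_work();
}
```

With the check hoisted like this, rings that never queue deferred work pay only for one branch in the caller.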
* [PATCHSET REBASE 02/10] io_uring: don't iterate cq wait fast path
2023-01-05 11:22 [PATCHSET REBASE 00/10] cq wait refactoring rebase Pavel Begunkov
2023-01-05 11:22 ` [PATCHSET REBASE 01/10] io_uring: rearrange defer list checks Pavel Begunkov
@ 2023-01-05 11:22 ` Pavel Begunkov
2023-01-05 11:22 ` [PATCHSET REBASE 03/10] io_uring: kill io_run_task_work_ctx Pavel Begunkov
` (8 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Pavel Begunkov @ 2023-01-05 11:22 UTC (permalink / raw)
To: io-uring; +Cc: Jens Axboe, asml.silence
Task work runners keep running until all queued tw items are exhausted.
It's also rare for deferred tw to queue normal tw and vice versa. Taking
that into account, there is only a slim chance that further iterating
the io_cqring_wait() fast path will get us anything, so we can remove
the loop there.
Signed-off-by: Pavel Begunkov <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
---
io_uring/io_uring.c | 18 ++++++++----------
1 file changed, 8 insertions(+), 10 deletions(-)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index cddfcfddcb52..ec800a8bed28 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2507,18 +2507,16 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
if (!io_allowed_run_tw(ctx))
return -EEXIST;
-
- do {
- /* always run at least 1 task work to process local work */
- ret = io_run_task_work_ctx(ctx);
+ if (!llist_empty(&ctx->work_llist)) {
+ ret = io_run_local_work(ctx);
if (ret < 0)
return ret;
- io_cqring_overflow_flush(ctx);
-
- /* if user messes with these they will just get an early return */
- if (__io_cqring_events_user(ctx) >= min_events)
- return 0;
- } while (ret > 0);
+ }
+ io_run_task_work();
+ io_cqring_overflow_flush(ctx);
+ /* if user messes with these they will just get an early return */
+ if (__io_cqring_events_user(ctx) >= min_events)
+ return 0;
if (sig) {
#ifdef CONFIG_COMPAT
--
2.38.1
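After this patch the fast path is a single ordered pass rather than a retry loop. A hypothetical sketch of that pass, with stand-in helpers that just record call order (not the kernel functions):

```c
#include <assert.h>
#include <string.h>

static char order[5];
static int pos;

static void record(char c)       { order[pos++] = c; }
static void run_local_work(void) { record('L'); }
static void run_task_work(void)  { record('T'); }
static void overflow_flush(void) { record('F'); }

/* One pass: run local tw only if it is queued, then generic task
 * work, then flush overflows, then test the events condition.
 * Returns 1 when enough events were ready. */
int cq_wait_fast_path(int local_queued, int events, int min_events)
{
	pos = 0;
	memset(order, 0, sizeof(order));
	if (local_queued)
		run_local_work();
	run_task_work();
	overflow_flush();
	return events >= min_events;
}
```

The order matters: local work first, then normal task work, then the overflow flush, mirroring the hunk above.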
* [PATCHSET REBASE 03/10] io_uring: kill io_run_task_work_ctx
2023-01-05 11:22 [PATCHSET REBASE 00/10] cq wait refactoring rebase Pavel Begunkov
2023-01-05 11:22 ` [PATCHSET REBASE 01/10] io_uring: rearrange defer list checks Pavel Begunkov
2023-01-05 11:22 ` [PATCHSET REBASE 02/10] io_uring: don't iterate cq wait fast path Pavel Begunkov
@ 2023-01-05 11:22 ` Pavel Begunkov
2023-01-05 11:22 ` [PATCHSET REBASE 04/10] io_uring: move defer tw task checks Pavel Begunkov
` (7 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Pavel Begunkov @ 2023-01-05 11:22 UTC (permalink / raw)
To: io-uring; +Cc: Jens Axboe, asml.silence
There is only one user of io_run_task_work_ctx() left; inline it.
Signed-off-by: Pavel Begunkov <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
---
io_uring/io_uring.c | 6 +++++-
io_uring/io_uring.h | 20 --------------------
2 files changed, 5 insertions(+), 21 deletions(-)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index ec800a8bed28..bf6f9777d165 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2452,7 +2452,11 @@ static int io_wake_function(struct wait_queue_entry *curr, unsigned int mode,
int io_run_task_work_sig(struct io_ring_ctx *ctx)
{
- if (io_run_task_work_ctx(ctx) > 0)
+ if (!llist_empty(&ctx->work_llist)) {
+ if (io_run_local_work(ctx) > 0)
+ return 1;
+ }
+ if (io_run_task_work() > 0)
return 1;
if (task_sigpending(current))
return -EINTR;
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index 46c0f765a77a..8a5c3affd724 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -269,26 +269,6 @@ static inline bool io_task_work_pending(struct io_ring_ctx *ctx)
return task_work_pending(current) || !wq_list_empty(&ctx->work_llist);
}
-static inline int io_run_task_work_ctx(struct io_ring_ctx *ctx)
-{
- int ret = 0;
- int ret2;
-
- if (!llist_empty(&ctx->work_llist))
- ret = io_run_local_work(ctx);
-
- /* want to run this after in case more is added */
- ret2 = io_run_task_work();
-
- /* Try propagate error in favour of if tasks were run,
- * but still make sure to run them if requested
- */
- if (ret >= 0)
- ret += ret2;
-
- return ret;
-}
-
static inline int io_run_local_work_locked(struct io_ring_ctx *ctx)
{
bool locked;
--
2.38.1
* [PATCHSET REBASE 04/10] io_uring: move defer tw task checks
2023-01-05 11:22 [PATCHSET REBASE 00/10] cq wait refactoring rebase Pavel Begunkov
` (2 preceding siblings ...)
2023-01-05 11:22 ` [PATCHSET REBASE 03/10] io_uring: kill io_run_task_work_ctx Pavel Begunkov
@ 2023-01-05 11:22 ` Pavel Begunkov
2023-01-05 11:22 ` [PATCHSET REBASE 05/10] io_uring: parse check_cq out of wq waiting Pavel Begunkov
` (6 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Pavel Begunkov @ 2023-01-05 11:22 UTC (permalink / raw)
To: io-uring; +Cc: Jens Axboe, asml.silence
Most places that want to run local tw explicitly check in advance
whether they are allowed to do so. Don't rely on a similar check in
__io_run_local_work(); leave it as a just-in-case warning and make sure
the callers check the capability themselves.
Signed-off-by: Pavel Begunkov <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
---
io_uring/io_uring.c | 15 ++++++---------
io_uring/io_uring.h | 5 +++++
2 files changed, 11 insertions(+), 9 deletions(-)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index bf6f9777d165..3bb3e9889717 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1296,14 +1296,13 @@ int __io_run_local_work(struct io_ring_ctx *ctx, bool *locked)
struct llist_node *node;
struct llist_node fake;
struct llist_node *current_final = NULL;
- int ret;
+ int ret = 0;
unsigned int loops = 1;
- if (unlikely(ctx->submitter_task != current))
+ if (WARN_ON_ONCE(ctx->submitter_task != current))
return -EEXIST;
node = io_llist_xchg(&ctx->work_llist, &fake);
- ret = 0;
again:
while (node != current_final) {
struct llist_node *next = node->next;
@@ -2511,11 +2510,8 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
if (!io_allowed_run_tw(ctx))
return -EEXIST;
- if (!llist_empty(&ctx->work_llist)) {
- ret = io_run_local_work(ctx);
- if (ret < 0)
- return ret;
- }
+ if (!llist_empty(&ctx->work_llist))
+ io_run_local_work(ctx);
io_run_task_work();
io_cqring_overflow_flush(ctx);
/* if user messes with these they will just get an early return */
@@ -3052,7 +3048,8 @@ static __cold bool io_uring_try_cancel_requests(struct io_ring_ctx *ctx,
}
}
- if (ctx->flags & IORING_SETUP_DEFER_TASKRUN)
+ if ((ctx->flags & IORING_SETUP_DEFER_TASKRUN) &&
+ io_allowed_defer_tw_run(ctx))
ret |= io_run_local_work(ctx) > 0;
ret |= io_cancel_defer_files(ctx, task, cancel_all);
mutex_lock(&ctx->uring_lock);
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index 8a5c3affd724..9b7baeff5a1c 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -352,6 +352,11 @@ static inline struct io_kiocb *io_alloc_req(struct io_ring_ctx *ctx)
return container_of(node, struct io_kiocb, comp_list);
}
+static inline bool io_allowed_defer_tw_run(struct io_ring_ctx *ctx)
+{
+ return likely(ctx->submitter_task == current);
+}
+
static inline bool io_allowed_run_tw(struct io_ring_ctx *ctx)
{
return likely(!(ctx->flags & IORING_SETUP_DEFER_TASKRUN) ||
--
2.38.1
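The __io_run_local_work() hunk above drains ->work_llist by atomically exchanging the list head. A simplified userspace analogue of that multi-producer, single-consumer drain, using C11 atomics (the kernel swaps in a placeholder node rather than NULL so it can detect entries queued while draining; that detail is omitted here):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct node {
	struct node *next;
	int value;
};

/* Lockless LIFO head, shared between producers and the consumer. */
static _Atomic(struct node *) head = NULL;

/* Producers push with a compare-and-swap loop on the head. */
void push(struct node *n)
{
	struct node *old = atomic_load(&head);

	do {
		n->next = old;
	} while (!atomic_compare_exchange_weak(&head, &old, n));
}

/* The single consumer detaches the whole chain in one atomic
 * exchange, then walks it without any further synchronisation.
 * Returns the number of entries processed. */
int drain(void)
{
	struct node *node = atomic_exchange(&head, NULL);
	int count = 0;

	while (node) {
		struct node *next = node->next;
		/* process the entry here */
		count++;
		node = next;
	}
	return count;
}
```

Because only the submitter task is ever allowed to drain, the walk after the exchange needs no locking, which is exactly what the submitter_task == current check protects.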
* [PATCHSET REBASE 05/10] io_uring: parse check_cq out of wq waiting
2023-01-05 11:22 [PATCHSET REBASE 00/10] cq wait refactoring rebase Pavel Begunkov
` (3 preceding siblings ...)
2023-01-05 11:22 ` [PATCHSET REBASE 04/10] io_uring: move defer tw task checks Pavel Begunkov
@ 2023-01-05 11:22 ` Pavel Begunkov
2023-01-05 11:22 ` [PATCHSET REBASE 06/10] io_uring: minimise io_cqring_wait_schedule Pavel Begunkov
` (5 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Pavel Begunkov @ 2023-01-05 11:22 UTC (permalink / raw)
To: io-uring; +Cc: Jens Axboe, asml.silence
We already avoid flushing overflows in io_cqring_wait_schedule() and
only return an error for the outer loop to handle. Slim it down even
further by moving all of the ->check_cq parsing out into that loop.
Signed-off-by: Pavel Begunkov <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
---
io_uring/io_uring.c | 32 ++++++++++++++++++--------------
1 file changed, 18 insertions(+), 14 deletions(-)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 3bb3e9889717..067e3577ac9b 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2468,21 +2468,13 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
ktime_t *timeout)
{
int ret;
- unsigned long check_cq;
+ if (unlikely(READ_ONCE(ctx->check_cq)))
+ return 1;
/* make sure we run task_work before checking for signals */
ret = io_run_task_work_sig(ctx);
if (ret || io_should_wake(iowq))
return ret;
-
- check_cq = READ_ONCE(ctx->check_cq);
- if (unlikely(check_cq)) {
- /* let the caller flush overflows, retry */
- if (check_cq & BIT(IO_CHECK_CQ_OVERFLOW_BIT))
- return 1;
- if (check_cq & BIT(IO_CHECK_CQ_DROPPED_BIT))
- return -EBADR;
- }
if (!schedule_hrtimeout(timeout, HRTIMER_MODE_ABS))
return -ETIME;
@@ -2548,13 +2540,25 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
trace_io_uring_cqring_wait(ctx, min_events);
do {
- if (test_bit(IO_CHECK_CQ_OVERFLOW_BIT, &ctx->check_cq)) {
- finish_wait(&ctx->cq_wait, &iowq.wq);
- io_cqring_do_overflow_flush(ctx);
- }
+ unsigned long check_cq;
+
prepare_to_wait_exclusive(&ctx->cq_wait, &iowq.wq,
TASK_INTERRUPTIBLE);
ret = io_cqring_wait_schedule(ctx, &iowq, &timeout);
+
+ check_cq = READ_ONCE(ctx->check_cq);
+ if (unlikely(check_cq)) {
+ /* let the caller flush overflows, retry */
+ if (check_cq & BIT(IO_CHECK_CQ_OVERFLOW_BIT)) {
+ finish_wait(&ctx->cq_wait, &iowq.wq);
+ io_cqring_do_overflow_flush(ctx);
+ }
+ if (check_cq & BIT(IO_CHECK_CQ_DROPPED_BIT)) {
+ ret = -EBADR;
+ break;
+ }
+ }
+
if (__io_cqring_events_user(ctx) >= min_events)
break;
cond_resched();
--
2.38.1
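The ->check_cq parsing moved into the wait loop boils down to two bit tests. A standalone sketch of that logic (the bit names mirror the kernel's, but this is an illustration, not kernel code):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#ifndef EBADR
#define EBADR 53	/* Linux value, in case the libc lacks it */
#endif

#define BIT(n) (1UL << (n))

enum {
	IO_CHECK_CQ_OVERFLOW_BIT,
	IO_CHECK_CQ_DROPPED_BIT,
};

/* Returns 0 to keep waiting, negative to abort the wait. Sets
 * *flush_overflow when the caller should flush overflowed CQEs
 * before retrying. */
int parse_check_cq(unsigned long check_cq, bool *flush_overflow)
{
	*flush_overflow = false;
	if (check_cq & BIT(IO_CHECK_CQ_OVERFLOW_BIT))
		*flush_overflow = true;
	if (check_cq & BIT(IO_CHECK_CQ_DROPPED_BIT))
		return -EBADR;
	return 0;
}
```

Keeping io_cqring_wait_schedule() to a single `READ_ONCE(ctx->check_cq)` test and doing this decoding in the outer loop is what lets the inner function stay trivial.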
* [PATCHSET REBASE 06/10] io_uring: minimise io_cqring_wait_schedule
2023-01-05 11:22 [PATCHSET REBASE 00/10] cq wait refactoring rebase Pavel Begunkov
` (4 preceding siblings ...)
2023-01-05 11:22 ` [PATCHSET REBASE 05/10] io_uring: parse check_cq out of wq waiting Pavel Begunkov
@ 2023-01-05 11:22 ` Pavel Begunkov
2023-01-05 11:22 ` [PATCHSET REBASE 07/10] io_uring: simplify io_has_work Pavel Begunkov
` (4 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Pavel Begunkov @ 2023-01-05 11:22 UTC (permalink / raw)
To: io-uring; +Cc: Jens Axboe, asml.silence
io_cqring_wait_schedule() is called after we have started waiting on the
cq wq and set the task state to TASK_INTERRUPTIBLE, and for that reason
we constantly have to worry about whether the state has been returned to
running or not. Leave only quick checks in io_cqring_wait_schedule() and
move the rest, including running task work, into the callers. Note that
we run tw in the loop after the sched checks because of the fast path at
the beginning of the function.
Signed-off-by: Pavel Begunkov <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
---
io_uring/io_uring.c | 39 +++++++++++++++++++++++----------------
1 file changed, 23 insertions(+), 16 deletions(-)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 067e3577ac9b..b4ca238cbd63 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2467,24 +2467,19 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
struct io_wait_queue *iowq,
ktime_t *timeout)
{
- int ret;
-
if (unlikely(READ_ONCE(ctx->check_cq)))
return 1;
- /* make sure we run task_work before checking for signals */
- ret = io_run_task_work_sig(ctx);
- if (ret || io_should_wake(iowq))
- return ret;
+ if (unlikely(!llist_empty(&ctx->work_llist)))
+ return 1;
+ if (unlikely(test_thread_flag(TIF_NOTIFY_SIGNAL)))
+ return 1;
+ if (unlikely(task_sigpending(current)))
+ return -EINTR;
+ if (unlikely(io_should_wake(iowq)))
+ return 0;
if (!schedule_hrtimeout(timeout, HRTIMER_MODE_ABS))
return -ETIME;
-
- /*
- * Run task_work after scheduling. If we got woken because of
- * task_work being processed, run it now rather than let the caller
- * do another wait loop.
- */
- ret = io_run_task_work_sig(ctx);
- return ret < 0 ? ret : 1;
+ return 0;
}
/*
@@ -2545,6 +2540,16 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
prepare_to_wait_exclusive(&ctx->cq_wait, &iowq.wq,
TASK_INTERRUPTIBLE);
ret = io_cqring_wait_schedule(ctx, &iowq, &timeout);
+ if (ret < 0)
+ break;
+ /*
+ * Run task_work after scheduling and before io_should_wake().
+ * If we got woken because of task_work being processed, run it
+ * now rather than let the caller do another wait loop.
+ */
+ io_run_task_work();
+ if (!llist_empty(&ctx->work_llist))
+ io_run_local_work(ctx);
check_cq = READ_ONCE(ctx->check_cq);
if (unlikely(check_cq)) {
@@ -2559,10 +2564,12 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
}
}
- if (__io_cqring_events_user(ctx) >= min_events)
+ if (io_should_wake(&iowq)) {
+ ret = 0;
break;
+ }
cond_resched();
- } while (ret > 0);
+ } while (1);
finish_wait(&ctx->cq_wait, &iowq.wq);
restore_saved_sigmask_unless(ret == -EINTR);
--
2.38.1
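The slimmed-down pre-sleep checks keep the kernel's return convention: 1 means let the caller retry, 0 means proceed (wake up or actually sleep), negative means error. A standalone sketch with plain flags standing in for the kernel-side state tests (illustrative, not the kernel function):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Cheap checks only; everything heavier (running tw, flushing
 * overflows) has been moved to the caller's loop. */
int wait_schedule_checks(bool check_cq, bool local_work,
			 bool notify_signal, bool sigpending,
			 bool should_wake)
{
	if (check_cq)
		return 1;	/* caller parses ->check_cq and flushes */
	if (local_work)
		return 1;	/* deferred tw pending, run it in caller */
	if (notify_signal)
		return 1;	/* generic task_work pending */
	if (sigpending)
		return -EINTR;
	if (should_wake)
		return 0;
	/* the real code would schedule() here */
	return 0;
}
```

Nothing in this sequence is allowed to set the task back to TASK_RUNNING, which is what makes it safe to call between prepare_to_wait_exclusive() and schedule().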
* [PATCHSET REBASE 07/10] io_uring: simplify io_has_work
2023-01-05 11:22 [PATCHSET REBASE 00/10] cq wait refactoring rebase Pavel Begunkov
` (5 preceding siblings ...)
2023-01-05 11:22 ` [PATCHSET REBASE 06/10] io_uring: minimise io_cqring_wait_schedule Pavel Begunkov
@ 2023-01-05 11:22 ` Pavel Begunkov
2023-01-05 11:22 ` [PATCHSET REBASE 08/10] io_uring: set TASK_RUNNING right after schedule Pavel Begunkov
` (3 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Pavel Begunkov @ 2023-01-05 11:22 UTC (permalink / raw)
To: io-uring; +Cc: Jens Axboe, asml.silence
->work_llist should never be non-empty for a non-DEFER_TASKRUN ring, so
we can safely skip checking the flag.
Signed-off-by: Pavel Begunkov <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
---
io_uring/io_uring.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index b4ca238cbd63..2376adce9570 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2416,8 +2416,7 @@ struct io_wait_queue {
static inline bool io_has_work(struct io_ring_ctx *ctx)
{
return test_bit(IO_CHECK_CQ_OVERFLOW_BIT, &ctx->check_cq) ||
- ((ctx->flags & IORING_SETUP_DEFER_TASKRUN) &&
- !llist_empty(&ctx->work_llist));
+ !llist_empty(&ctx->work_llist);
}
static inline bool io_should_wake(struct io_wait_queue *iowq)
--
2.38.1
* [PATCHSET REBASE 08/10] io_uring: set TASK_RUNNING right after schedule
2023-01-05 11:22 [PATCHSET REBASE 00/10] cq wait refactoring rebase Pavel Begunkov
` (6 preceding siblings ...)
2023-01-05 11:22 ` [PATCHSET REBASE 07/10] io_uring: simplify io_has_work Pavel Begunkov
@ 2023-01-05 11:22 ` Pavel Begunkov
2023-01-05 11:22 ` [PATCHSET REBASE 09/10] io_uring: optimise non-timeout waiting Pavel Begunkov
` (2 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Pavel Begunkov @ 2023-01-05 11:22 UTC (permalink / raw)
To: io-uring; +Cc: Jens Axboe, asml.silence
Instead of constantly checking that the task state is back to running
before executing tw or taking locks in io_cqring_wait(), switch it back
to TASK_RUNNING right after schedule().
Signed-off-by: Pavel Begunkov <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
---
io_uring/io_uring.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 2376adce9570..54ec0106ab83 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2541,6 +2541,7 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
ret = io_cqring_wait_schedule(ctx, &iowq, &timeout);
if (ret < 0)
break;
+ __set_current_state(TASK_RUNNING);
/*
* Run task_work after scheduling and before io_should_wake().
* If we got woken because of task_work being processed, run it
@@ -2553,10 +2554,8 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
check_cq = READ_ONCE(ctx->check_cq);
if (unlikely(check_cq)) {
/* let the caller flush overflows, retry */
- if (check_cq & BIT(IO_CHECK_CQ_OVERFLOW_BIT)) {
- finish_wait(&ctx->cq_wait, &iowq.wq);
+ if (check_cq & BIT(IO_CHECK_CQ_OVERFLOW_BIT))
io_cqring_do_overflow_flush(ctx);
- }
if (check_cq & BIT(IO_CHECK_CQ_DROPPED_BIT)) {
ret = -EBADR;
break;
--
2.38.1
* [PATCHSET REBASE 09/10] io_uring: optimise non-timeout waiting
2023-01-05 11:22 [PATCHSET REBASE 00/10] cq wait refactoring rebase Pavel Begunkov
` (7 preceding siblings ...)
2023-01-05 11:22 ` [PATCHSET REBASE 08/10] io_uring: set TASK_RUNNING right after schedule Pavel Begunkov
@ 2023-01-05 11:22 ` Pavel Begunkov
2023-01-05 11:22 ` [PATCHSET REBASE 10/10] io_uring: keep timeout in io_wait_queue Pavel Begunkov
2023-01-05 15:28 ` [PATCHSET REBASE 00/10] cq wait refactoring rebase Jens Axboe
10 siblings, 0 replies; 12+ messages in thread
From: Pavel Begunkov @ 2023-01-05 11:22 UTC (permalink / raw)
To: io-uring; +Cc: Jens Axboe, asml.silence
Unlike the jiffy-based scheduling version, schedule_hrtimeout() goes
through a few extra functions before reaching schedule(), even when no
actual timeout is needed. Some tests showed that this takes up to 1% of
CPU cycles.
Signed-off-by: Pavel Begunkov <[email protected]>
---
io_uring/io_uring.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 54ec0106ab83..420b022f6c31 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2476,7 +2476,9 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
return -EINTR;
if (unlikely(io_should_wake(iowq)))
return 0;
- if (!schedule_hrtimeout(timeout, HRTIMER_MODE_ABS))
+ if (*timeout == KTIME_MAX)
+ schedule();
+ else if (!schedule_hrtimeout(timeout, HRTIMER_MODE_ABS))
return -ETIME;
return 0;
}
--
2.38.1
* [PATCHSET REBASE 10/10] io_uring: keep timeout in io_wait_queue
2023-01-05 11:22 [PATCHSET REBASE 00/10] cq wait refactoring rebase Pavel Begunkov
` (8 preceding siblings ...)
2023-01-05 11:22 ` [PATCHSET REBASE 09/10] io_uring: optimise non-timeout waiting Pavel Begunkov
@ 2023-01-05 11:22 ` Pavel Begunkov
2023-01-05 15:28 ` [PATCHSET REBASE 00/10] cq wait refactoring rebase Jens Axboe
10 siblings, 0 replies; 12+ messages in thread
From: Pavel Begunkov @ 2023-01-05 11:22 UTC (permalink / raw)
To: io-uring; +Cc: Jens Axboe, asml.silence
Move the waiting timeout into struct io_wait_queue.
Signed-off-by: Pavel Begunkov <[email protected]>
---
io_uring/io_uring.c | 28 ++++++++++++++--------------
1 file changed, 14 insertions(+), 14 deletions(-)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 420b022f6c31..a44b3b5813df 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2411,6 +2411,7 @@ struct io_wait_queue {
struct io_ring_ctx *ctx;
unsigned cq_tail;
unsigned nr_timeouts;
+ ktime_t timeout;
};
static inline bool io_has_work(struct io_ring_ctx *ctx)
@@ -2463,8 +2464,7 @@ int io_run_task_work_sig(struct io_ring_ctx *ctx)
/* when returns >0, the caller should retry */
static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
- struct io_wait_queue *iowq,
- ktime_t *timeout)
+ struct io_wait_queue *iowq)
{
if (unlikely(READ_ONCE(ctx->check_cq)))
return 1;
@@ -2476,9 +2476,9 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
return -EINTR;
if (unlikely(io_should_wake(iowq)))
return 0;
- if (*timeout == KTIME_MAX)
+ if (iowq->timeout == KTIME_MAX)
schedule();
- else if (!schedule_hrtimeout(timeout, HRTIMER_MODE_ABS))
+ else if (!schedule_hrtimeout(&iowq->timeout, HRTIMER_MODE_ABS))
return -ETIME;
return 0;
}
@@ -2493,7 +2493,6 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
{
struct io_wait_queue iowq;
struct io_rings *rings = ctx->rings;
- ktime_t timeout = KTIME_MAX;
int ret;
if (!io_allowed_run_tw(ctx))
@@ -2519,20 +2518,21 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
return ret;
}
- if (uts) {
- struct timespec64 ts;
-
- if (get_timespec64(&ts, uts))
- return -EFAULT;
- timeout = ktime_add_ns(timespec64_to_ktime(ts), ktime_get_ns());
- }
-
init_waitqueue_func_entry(&iowq.wq, io_wake_function);
iowq.wq.private = current;
INIT_LIST_HEAD(&iowq.wq.entry);
iowq.ctx = ctx;
iowq.nr_timeouts = atomic_read(&ctx->cq_timeouts);
iowq.cq_tail = READ_ONCE(ctx->rings->cq.head) + min_events;
+ iowq.timeout = KTIME_MAX;
+
+ if (uts) {
+ struct timespec64 ts;
+
+ if (get_timespec64(&ts, uts))
+ return -EFAULT;
+ iowq.timeout = ktime_add_ns(timespec64_to_ktime(ts), ktime_get_ns());
+ }
trace_io_uring_cqring_wait(ctx, min_events);
do {
@@ -2540,7 +2540,7 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
prepare_to_wait_exclusive(&ctx->cq_wait, &iowq.wq,
TASK_INTERRUPTIBLE);
- ret = io_cqring_wait_schedule(ctx, &iowq, &timeout);
+ ret = io_cqring_wait_schedule(ctx, &iowq);
if (ret < 0)
break;
__set_current_state(TASK_RUNNING);
--
2.38.1
* Re: [PATCHSET REBASE 00/10] cq wait refactoring rebase
2023-01-05 11:22 [PATCHSET REBASE 00/10] cq wait refactoring rebase Pavel Begunkov
` (9 preceding siblings ...)
2023-01-05 11:22 ` [PATCHSET REBASE 10/10] io_uring: keep timeout in io_wait_queue Pavel Begunkov
@ 2023-01-05 15:28 ` Jens Axboe
10 siblings, 0 replies; 12+ messages in thread
From: Jens Axboe @ 2023-01-05 15:28 UTC (permalink / raw)
To: io-uring, Pavel Begunkov
On Thu, 05 Jan 2023 11:22:19 +0000, Pavel Begunkov wrote:
> Rebase of the CQ waiting refactoring series (aimed at 6.3) on top of
> the just-sent 6.2 patch ("io_uring: fix CQ waiting timeout handling").
>
> Apart from that there are 2 more patches on top; 9/10 squeezes out an
> extra 1% of performance in one of my tests.
>
> Pavel Begunkov (10):
> io_uring: rearrange defer list checks
> io_uring: don't iterate cq wait fast path
> io_uring: kill io_run_task_work_ctx
> io_uring: move defer tw task checks
> io_uring: parse check_cq out of wq waiting
> io_uring: minimise io_cqring_wait_schedule
> io_uring: simplify io_has_work
> io_uring: set TASK_RUNNING right after schedule
> io_uring: optimise non-timeout waiting
> io_uring: keep timeout in io_wait_queue
>
> [...]
Applied, thanks!
[01/10] io_uring: rearrange defer list checks
commit: d6d505fb8cafd1d78ca438fcd2d9a460c9695a1a
[02/10] io_uring: don't iterate cq wait fast path
commit: ba5636ba85eb7b01bcbb03b8855ec444f78779ed
[03/10] io_uring: kill io_run_task_work_ctx
commit: a95f8ea66f4804c9faeb51bf496008d09d43bf41
[04/10] io_uring: move defer tw task checks
commit: 07ef9bf92577095d2e6612b010267f6bd5139936
[05/10] io_uring: parse check_cq out of wq waiting
commit: 4a42b391367e7df55f8fd5f7a4e42cdf796ef36a
[06/10] io_uring: minimise io_cqring_wait_schedule
commit: 4fbb75ef0d9ed027d098394227596d91c59e246a
[07/10] io_uring: simplify io_has_work
commit: 8d1f973f5025d62f88a9b3e4406c74c857ef49de
[08/10] io_uring: set TASK_RUNNING right after schedule
commit: 2fc7bfa59e195d2a1fed638a35dde7ab794f0178
[09/10] io_uring: optimise non-timeout waiting
commit: dc53cdbd5a63ef78c1acc34b2fda535bbcffd5c5
[10/10] io_uring: keep timeout in io_wait_queue
commit: a367ffbe03a6da8d97f5f8057114bada90ccaea8
Best regards,
--
Jens Axboe