* [PATCH 0/4] sqpoll fixes
From: Pavel Begunkov
To: Jens Axboe, io-uring
Date: 2021-03-11 23:29 UTC

1-3 are relatively easy. 2/4 is rather a cleanup, but it's better to
throw it out of the equation; that makes life easier.

4/4 removes the park/unpark dance for cancellations, and fixes a couple
of issues (will send one test later). Signals in io_sq_thread() add
trouble, bringing in more cases to think about and more races to care
about between a dying SQPOLL task, dying ctxs and exiting userspace
tasks.

Pavel Begunkov (4):
  io_uring: cancel deferred requests in try_cancel
  io_uring: remove useless ->startup completion
  io_uring: prevent racy sqd->thread checks
  io_uring: cancel sqpoll via task_work

 fs/io_uring.c | 195 +++++++++++++++++++++++++------------------------
 1 file changed, 99 insertions(+), 96 deletions(-)

--
2.24.0
* [PATCH 1/4] io_uring: cancel deferred requests in try_cancel
From: Pavel Begunkov
To: Jens Axboe, io-uring
Date: 2021-03-11 23:29 UTC

As io_uring_cancel_files() and others let the SQPOLL task run in
between calls to io_uring_try_cancel_requests(), the SQPOLL task may
generate new deferred requests, so it's safer to try to cancel them in
io_uring_try_cancel_requests() itself.

Signed-off-by: Pavel Begunkov <[email protected]>
---
 fs/io_uring.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 49f85f49e1c3..56f3d8f408c9 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -8577,11 +8577,11 @@ static bool io_cancel_task_cb(struct io_wq_work *work, void *data)
 	return ret;
 }
 
-static void io_cancel_defer_files(struct io_ring_ctx *ctx,
+static bool io_cancel_defer_files(struct io_ring_ctx *ctx,
 				  struct task_struct *task,
 				  struct files_struct *files)
 {
-	struct io_defer_entry *de = NULL;
+	struct io_defer_entry *de;
 	LIST_HEAD(list);
 
 	spin_lock_irq(&ctx->completion_lock);
@@ -8592,6 +8592,8 @@ static void io_cancel_defer_files(struct io_ring_ctx *ctx,
 		}
 	}
 	spin_unlock_irq(&ctx->completion_lock);
+	if (list_empty(&list))
+		return false;
 
 	while (!list_empty(&list)) {
 		de = list_first_entry(&list, struct io_defer_entry, list);
@@ -8601,6 +8603,7 @@ static void io_cancel_defer_files(struct io_ring_ctx *ctx,
 		io_req_complete(de->req, -ECANCELED);
 		kfree(de);
 	}
+	return true;
 }
 
 static bool io_cancel_ctx_cb(struct io_wq_work *work, void *data)
@@ -8666,6 +8669,7 @@ static void io_uring_try_cancel_requests(struct io_ring_ctx *ctx,
 		}
 	}
 
+	ret |= io_cancel_defer_files(ctx, task, files);
 	ret |= io_poll_remove_all(ctx, task, files);
 	ret |= io_kill_timeouts(ctx, task, files);
 	ret |= io_run_task_work();
@@ -8734,8 +8738,6 @@ static void io_uring_cancel_task_requests(struct io_ring_ctx *ctx,
 		atomic_inc(&task->io_uring->in_idle);
 	}
 
-	io_cancel_defer_files(ctx, task, files);
-
 	io_uring_cancel_files(ctx, task, files);
 	if (!files)
 		io_uring_try_cancel_requests(ctx, task, NULL);
 
--
2.24.0
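The shape of the change — each cancellation helper reporting whether it
made progress, and the caller looping until a full pass finds nothing —
can be sketched as a standalone C program (all names here are
illustrative stand-ins, not the actual kernel helpers):

#include <stdbool.h>
#include <stdio.h>

/* toy counters standing in for outstanding deferred/poll requests */
static int deferred = 3, polls = 1;

/* each pass reports progress, as io_cancel_defer_files() does after
 * this patch */
static bool cancel_deferred(void) { if (!deferred) return false; deferred--; return true; }
static bool cancel_polls(void)    { if (!polls) return false; polls--; return true; }

int main(void)
{
	bool found;

	do {
		found  = cancel_deferred();
		found |= cancel_polls();
	} while (found);	/* retry until a whole pass cancels nothing */

	printf("quiesced: deferred=%d polls=%d\n", deferred, polls);
	return 0;
}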
* [PATCH 2/4] io_uring: remove useless ->startup completion
From: Pavel Begunkov
To: Jens Axboe, io-uring
Date: 2021-03-11 23:29 UTC

We always do complete(&sqd->startup) almost immediately after creating
sqd->thread, either in the success path or in io_sq_thread_finish().
The thread is specifically created stopped so that we can set up things
like sqd->thread and io_uring_alloc_task_context() first, with
wake_up_new_task() following right after, which makes the ->startup
handshake redundant.

Signed-off-by: Pavel Begunkov <[email protected]>
---
 fs/io_uring.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 56f3d8f408c9..6349374d715d 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -272,7 +272,6 @@ struct io_sq_data {
 	pid_t			task_tgid;
 
 	unsigned long		state;
-	struct completion	startup;
 	struct completion	exited;
 };
 
@@ -6656,8 +6655,6 @@ static int io_sq_thread(void *data)
 		set_cpus_allowed_ptr(current, cpu_online_mask);
 	current->flags |= PF_NO_SETAFFINITY;
 
-	wait_for_completion(&sqd->startup);
-
 	down_read(&sqd->rw_lock);
 
 	while (!test_bit(IO_SQ_THREAD_SHOULD_STOP, &sqd->state)) {
@@ -7080,7 +7077,6 @@ static void io_sq_thread_finish(struct io_ring_ctx *ctx)
 	struct io_sq_data *sqd = ctx->sq_data;
 
 	if (sqd) {
-		complete(&sqd->startup);
 		io_sq_thread_park(sqd);
 		list_del(&ctx->sqd_list);
 		io_sqd_update_thread_idle(sqd);
@@ -7144,7 +7140,6 @@ static struct io_sq_data *io_get_sq_data(struct io_uring_params *p)
 	INIT_LIST_HEAD(&sqd->ctx_list);
 	init_rwsem(&sqd->rw_lock);
 	init_waitqueue_head(&sqd->wait);
-	init_completion(&sqd->startup);
 	init_completion(&sqd->exited);
 	return sqd;
 }
@@ -7856,7 +7851,6 @@ static int io_sq_offload_create(struct io_ring_ctx *ctx,
 		wake_up_new_task(tsk);
 		if (ret)
 			goto err;
-		complete(&sqd->startup);
 	} else if (p->flags & IORING_SETUP_SQ_AFF) {
 		/* Can't have SQ_AFF without SQPOLL */
 		ret = -EINVAL;
--
2.24.0
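A userspace analogue of why the startup handshake is redundant: if all
shared state is initialised before the thread can possibly run, the
thread needs no extra synchronisation before touching it. A minimal
pthreads sketch (hypothetical names; pthread_create() here plays the
role of the create-stopped-then-wake sequence):

#include <pthread.h>
#include <stdio.h>

struct sq_data {
	pthread_t thread;
	int sq_thread_idle;	/* stands in for state set up before start */
};

static void *sq_thread(void *arg)
{
	struct sq_data *sqd = arg;

	/* safe without a handshake: everything was initialised before
	 * the thread was made runnable */
	printf("idle=%d\n", sqd->sq_thread_idle);
	return NULL;
}

int main(void)
{
	struct sq_data sqd = { .sq_thread_idle = 1000 };

	/* all setup happens before the thread exists, mirroring how
	 * create_io_thread() leaves the task stopped until
	 * wake_up_new_task() */
	pthread_create(&sqd.thread, NULL, sq_thread, &sqd);
	pthread_join(sqd.thread, NULL);
	return 0;
}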
* [PATCH 3/4] io_uring: prevent racy sqd->thread checks
From: Pavel Begunkov
To: Jens Axboe, io-uring
Date: 2021-03-11 23:29 UTC

The SQPOLL thread we're trying to attach to may be going away. That by
itself is not nice, but a more serious problem is io_sq_offload_create()
seeing sqd->thread == NULL and trying to init it with a new thread.
There are tons of ways that can fail or be exploited.

Signed-off-by: Pavel Begunkov <[email protected]>
---
 fs/io_uring.c | 28 ++++++++++++++++++++++------
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 6349374d715d..cdec59510433 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -7119,14 +7119,18 @@ static struct io_sq_data *io_attach_sq_data(struct io_uring_params *p)
 	return sqd;
 }
 
-static struct io_sq_data *io_get_sq_data(struct io_uring_params *p)
+static struct io_sq_data *io_get_sq_data(struct io_uring_params *p,
+					 bool *attached)
 {
 	struct io_sq_data *sqd;
 
+	*attached = false;
 	if (p->flags & IORING_SETUP_ATTACH_WQ) {
 		sqd = io_attach_sq_data(p);
-		if (!IS_ERR(sqd))
+		if (!IS_ERR(sqd)) {
+			*attached = true;
 			return sqd;
+		}
 		/* fall through for EPERM case, setup new sqd/task */
 		if (PTR_ERR(sqd) != -EPERM)
 			return sqd;
@@ -7799,12 +7803,13 @@ static int io_sq_offload_create(struct io_ring_ctx *ctx,
 	if (ctx->flags & IORING_SETUP_SQPOLL) {
 		struct task_struct *tsk;
 		struct io_sq_data *sqd;
+		bool attached;
 
 		ret = -EPERM;
 		if (!capable(CAP_SYS_ADMIN) && !capable(CAP_SYS_NICE))
 			goto err;
 
-		sqd = io_get_sq_data(p);
+		sqd = io_get_sq_data(p, &attached);
 		if (IS_ERR(sqd)) {
 			ret = PTR_ERR(sqd);
 			goto err;
@@ -7816,13 +7821,24 @@ static int io_sq_offload_create(struct io_ring_ctx *ctx,
 		if (!ctx->sq_thread_idle)
 			ctx->sq_thread_idle = HZ;
 
+		ret = 0;
 		io_sq_thread_park(sqd);
-		list_add(&ctx->sqd_list, &sqd->ctx_list);
-		io_sqd_update_thread_idle(sqd);
+		/* don't attach to a dying SQPOLL thread, would be racy */
+		if (attached && !sqd->thread) {
+			ret = -ENXIO;
+		} else {
+			list_add(&ctx->sqd_list, &sqd->ctx_list);
+			io_sqd_update_thread_idle(sqd);
+		}
 		io_sq_thread_unpark(sqd);
 
-		if (sqd->thread)
+		if (ret < 0) {
+			io_put_sq_data(sqd);
+			ctx->sq_data = NULL;
+			return ret;
+		} else if (attached) {
 			return 0;
+		}
 
 		if (p->flags & IORING_SETUP_SQ_AFF) {
 			int cpu = p->sq_thread_cpu;
--
2.24.0
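The rule the patch enforces — decide whether to attach only while
holding the lock that a dying SQPOLL thread also takes on exit, so the
sqd->thread check can't go stale — can be modelled standalone with a
mutex standing in for park/unpark (hypothetical names only):

#include <errno.h>
#include <pthread.h>
#include <stdio.h>

struct sq_data {
	pthread_mutex_t lock;	/* io_sq_thread_park()/unpark() */
	int alive;		/* stands in for sqd->thread != NULL */
};

static int attach(struct sq_data *sqd, int attached)
{
	int ret = 0;

	pthread_mutex_lock(&sqd->lock);
	if (attached && !sqd->alive)
		ret = -ENXIO;	/* refuse to attach to a dying thread */
	/* else: the ctx would be added to sqd->ctx_list here */
	pthread_mutex_unlock(&sqd->lock);
	return ret;
}

int main(void)
{
	struct sq_data sqd = { PTHREAD_MUTEX_INITIALIZER, 0 };

	printf("dead sqd: %d (expect %d)\n", attach(&sqd, 1), -ENXIO);
	sqd.alive = 1;
	printf("live sqd: %d (expect 0)\n", attach(&sqd, 1));
	return 0;
}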
* [PATCH 4/4] io_uring: cancel sqpoll via task_work
From: Pavel Begunkov
To: Jens Axboe, io-uring
Date: 2021-03-11 23:29 UTC

1) The first problem is io_uring_cancel_sqpoll() ->
io_uring_cancel_task_requests() basically doing park(); park(); and so
hanging.

2) Another one is more subtle: the master task is doing cancellations,
but the SQPOLL task submits in between the end of the cancellation and
finish(), with the requests taking a ref to the ctx and so eternally
locking it up.

3) Yet another is a dying SQPOLL task doing io_uring_cancel_sqpoll()
while the owner task does the same io_uring_cancel_sqpoll(); they race
for tctx->wait events. And there are probably more of them.

Instead, do SQPOLL cancellations from within the SQPOLL task context
via task_work, see io_sqpoll_cancel_sync(). With that we don't need
the temporary park()/unpark() during cancellation, which is ugly and
subtle and in any case doesn't let us do io_run_task_work() properly.

io_uring_cancel_sqpoll() is now called only from SQPOLL task context
and under sqd locking, so all parking is removed from there. As a
result, io_sq_thread_[un]park() and io_sq_thread_stop() are no longer
used by the SQPOLL task, which spares us some headache.

Also remove the ctx from the sqd list early to avoid 2). And kill
tctx->sqpoll, which is not used anymore.

Signed-off-by: Pavel Begunkov <[email protected]>
---
 fs/io_uring.c | 151 ++++++++++++++++++++++++--------------------------
 1 file changed, 71 insertions(+), 80 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index cdec59510433..70286b393c0e 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -6665,6 +6665,7 @@ static int io_sq_thread(void *data)
 			up_read(&sqd->rw_lock);
 			cond_resched();
 			down_read(&sqd->rw_lock);
+			io_run_task_work();
 			timeout = jiffies + sqd->sq_thread_idle;
 			continue;
 		}
@@ -6720,18 +6721,22 @@ static int io_sq_thread(void *data)
 		finish_wait(&sqd->wait, &wait);
 		timeout = jiffies + sqd->sq_thread_idle;
 	}
-
-	list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
-		io_uring_cancel_sqpoll(ctx);
 	up_read(&sqd->rw_lock);
-
+	down_write(&sqd->rw_lock);
+	/*
+	 * someone may have parked and added a cancellation task_work, run
+	 * it first because we don't want it in io_uring_cancel_sqpoll()
+	 */
 	io_run_task_work();
 
-	down_write(&sqd->rw_lock);
+	list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
+		io_uring_cancel_sqpoll(ctx);
 	sqd->thread = NULL;
 	list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
 		io_ring_set_wakeup_flag(ctx);
 	up_write(&sqd->rw_lock);
+
+	io_run_task_work();
 	complete(&sqd->exited);
 	do_exit(0);
 }
@@ -7033,8 +7038,8 @@ static int io_sqe_files_unregister(struct io_ring_ctx *ctx)
 static void io_sq_thread_unpark(struct io_sq_data *sqd)
 	__releases(&sqd->rw_lock)
 {
-	if (sqd->thread == current)
-		return;
+	WARN_ON_ONCE(sqd->thread == current);
+
 	clear_bit(IO_SQ_THREAD_SHOULD_PARK, &sqd->state);
 	up_write(&sqd->rw_lock);
 }
@@ -7042,8 +7047,8 @@ static void io_sq_thread_unpark(struct io_sq_data *sqd)
 static void io_sq_thread_park(struct io_sq_data *sqd)
 	__acquires(&sqd->rw_lock)
 {
-	if (sqd->thread == current)
-		return;
+	WARN_ON_ONCE(sqd->thread == current);
+
 	set_bit(IO_SQ_THREAD_SHOULD_PARK, &sqd->state);
 	down_write(&sqd->rw_lock);
 	/* set again for consistency, in case concurrent parks are happening */
@@ -7054,8 +7059,8 @@ static void io_sq_thread_park(struct io_sq_data *sqd)
 
 static void io_sq_thread_stop(struct io_sq_data *sqd)
 {
-	if (test_bit(IO_SQ_THREAD_SHOULD_STOP, &sqd->state))
-		return;
+	WARN_ON_ONCE(sqd->thread == current);
+
 	down_write(&sqd->rw_lock);
 	set_bit(IO_SQ_THREAD_SHOULD_STOP, &sqd->state);
 	if (sqd->thread)
@@ -7078,7 +7083,7 @@ static void io_sq_thread_finish(struct io_ring_ctx *ctx)
 
 	if (sqd) {
 		io_sq_thread_park(sqd);
-		list_del(&ctx->sqd_list);
+		list_del_init(&ctx->sqd_list);
 		io_sqd_update_thread_idle(sqd);
 		io_sq_thread_unpark(sqd);
 
@@ -7760,7 +7765,6 @@ static int io_uring_alloc_task_context(struct task_struct *task,
 	init_waitqueue_head(&tctx->wait);
 	tctx->last = NULL;
 	atomic_set(&tctx->in_idle, 0);
-	tctx->sqpoll = false;
 	task->io_uring = tctx;
 	spin_lock_init(&tctx->task_lock);
 	INIT_WQ_LIST(&tctx->task_list);
@@ -8719,43 +8723,12 @@ static void io_uring_cancel_files(struct io_ring_ctx *ctx,
 
 		io_uring_try_cancel_requests(ctx, task, files);
 
-		if (ctx->sq_data)
-			io_sq_thread_unpark(ctx->sq_data);
 		prepare_to_wait(&task->io_uring->wait, &wait,
 				TASK_UNINTERRUPTIBLE);
 		if (inflight == io_uring_count_inflight(ctx, task, files))
 			schedule();
 		finish_wait(&task->io_uring->wait, &wait);
-		if (ctx->sq_data)
-			io_sq_thread_park(ctx->sq_data);
-	}
-}
-
-/*
- * We need to iteratively cancel requests, in case a request has dependent
- * hard links. These persist even for failure of cancelations, hence keep
- * looping until none are found.
- */
-static void io_uring_cancel_task_requests(struct io_ring_ctx *ctx,
-					  struct files_struct *files)
-{
-	struct task_struct *task = current;
-
-	if ((ctx->flags & IORING_SETUP_SQPOLL) && ctx->sq_data) {
-		io_sq_thread_park(ctx->sq_data);
-		task = ctx->sq_data->thread;
-		if (task)
-			atomic_inc(&task->io_uring->in_idle);
 	}
-
-	io_uring_cancel_files(ctx, task, files);
-	if (!files)
-		io_uring_try_cancel_requests(ctx, task, NULL);
-
-	if (task)
-		atomic_dec(&task->io_uring->in_idle);
-	if (ctx->sq_data)
-		io_sq_thread_unpark(ctx->sq_data);
 }
@@ -8796,15 +8769,6 @@ static int io_uring_add_task_file(struct io_ring_ctx *ctx)
 		}
 		tctx->last = ctx;
 	}
-
-	/*
-	 * This is race safe in that the task itself is doing this, hence it
-	 * cannot be going through the exit/cancel paths at the same time.
-	 * This cannot be modified while exit/cancel is running.
-	 */
-	if (!tctx->sqpoll && (ctx->flags & IORING_SETUP_SQPOLL))
-		tctx->sqpoll = true;
-
 	return 0;
 }
@@ -8847,6 +8811,44 @@ static void io_uring_clean_tctx(struct io_uring_task *tctx)
 	}
 }
 
+static s64 tctx_inflight(struct io_uring_task *tctx)
+{
+	return percpu_counter_sum(&tctx->inflight);
+}
+
+static void io_sqpoll_cancel_cb(struct callback_head *cb)
+{
+	struct io_tctx_exit *work = container_of(cb, struct io_tctx_exit, task_work);
+	struct io_ring_ctx *ctx = work->ctx;
+	struct io_sq_data *sqd = ctx->sq_data;
+
+	if (sqd->thread)
+		io_uring_cancel_sqpoll(ctx);
+	complete(&work->completion);
+}
+
+static void io_sqpoll_cancel_sync(struct io_ring_ctx *ctx)
+{
+	struct io_sq_data *sqd = ctx->sq_data;
+	struct io_tctx_exit work = { .ctx = ctx, };
+	struct task_struct *task;
+
+	io_sq_thread_park(sqd);
+	list_del_init(&ctx->sqd_list);
+	io_sqd_update_thread_idle(sqd);
+	task = sqd->thread;
+	if (task) {
+		init_completion(&work.completion);
+		init_task_work(&work.task_work, io_sqpoll_cancel_cb);
+		WARN_ON_ONCE(task_work_add(task, &work.task_work, TWA_SIGNAL));
+		wake_up_process(task);
+	}
+	io_sq_thread_unpark(sqd);
+
+	if (task)
+		wait_for_completion(&work.completion);
+}
+
 void __io_uring_files_cancel(struct files_struct *files)
 {
 	struct io_uring_task *tctx = current->io_uring;
@@ -8855,41 +8857,40 @@ void __io_uring_files_cancel(struct files_struct *files)
 
 	/* make sure overflow events are dropped */
 	atomic_inc(&tctx->in_idle);
-	xa_for_each(&tctx->xa, index, node)
-		io_uring_cancel_task_requests(node->ctx, files);
+	xa_for_each(&tctx->xa, index, node) {
+		struct io_ring_ctx *ctx = node->ctx;
+
+		if (ctx->sq_data) {
+			io_sqpoll_cancel_sync(ctx);
+			continue;
+		}
+		io_uring_cancel_files(ctx, current, files);
+		if (!files)
+			io_uring_try_cancel_requests(ctx, current, NULL);
+	}
 	atomic_dec(&tctx->in_idle);
 
 	if (files)
 		io_uring_clean_tctx(tctx);
 }
 
-static s64 tctx_inflight(struct io_uring_task *tctx)
-{
-	return percpu_counter_sum(&tctx->inflight);
-}
-
+/* should only be called by SQPOLL task */
 static void io_uring_cancel_sqpoll(struct io_ring_ctx *ctx)
 {
 	struct io_sq_data *sqd = ctx->sq_data;
-	struct io_uring_task *tctx;
+	struct io_uring_task *tctx = current->io_uring;
 	s64 inflight;
 	DEFINE_WAIT(wait);
 
-	if (!sqd)
-		return;
-	io_sq_thread_park(sqd);
-	if (!sqd->thread || !sqd->thread->io_uring) {
-		io_sq_thread_unpark(sqd);
-		return;
-	}
-	tctx = ctx->sq_data->thread->io_uring;
+	WARN_ON_ONCE(!sqd || ctx->sq_data->thread != current);
+
 	atomic_inc(&tctx->in_idle);
 	do {
 		/* read completions before cancelations */
 		inflight = tctx_inflight(tctx);
 		if (!inflight)
 			break;
-		io_uring_cancel_task_requests(ctx, NULL);
+		io_uring_try_cancel_requests(ctx, current, NULL);
 
 		prepare_to_wait(&tctx->wait, &wait, TASK_UNINTERRUPTIBLE);
 		/*
@@ -8902,7 +8903,6 @@ static void io_uring_cancel_sqpoll(struct io_ring_ctx *ctx)
 		finish_wait(&tctx->wait, &wait);
 	} while (1);
 	atomic_dec(&tctx->in_idle);
-	io_sq_thread_unpark(sqd);
 }
@@ -8917,15 +8917,6 @@ void __io_uring_task_cancel(void)
 
 	/* make sure overflow events are dropped */
 	atomic_inc(&tctx->in_idle);
-
-	if (tctx->sqpoll) {
-		struct io_tctx_node *node;
-		unsigned long index;
-
-		xa_for_each(&tctx->xa, index, node)
-			io_uring_cancel_sqpoll(node->ctx);
-	}
-
 	do {
 		/* read completions before cancelations */
 		inflight = tctx_inflight(tctx);
--
2.24.0
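The handoff at the heart of io_sqpoll_cancel_sync() — queue a callback
to the target thread, wake it, and wait on a completion instead of
parking it and cancelling on its behalf — can be modelled in userspace
with pthreads. A simplified sketch; the names mirror the kernel ones
but none of this is the actual implementation:

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

/* userspace stand-in for the kernel's struct completion */
struct completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

static void init_completion(struct completion *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = false;
}

static void complete(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_signal(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

static void wait_for_completion(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* stands in for the task_work item queued by io_sqpoll_cancel_sync() */
static struct completion cancel_done;

static void *sq_thread(void *arg)
{
	(void)arg;
	/* ... the submission loop would run here ... */

	/* io_run_task_work() picks the item up in SQPOLL context */
	printf("running cancellation in SQPOLL task context\n");
	complete(&cancel_done);		/* io_sqpoll_cancel_cb() */
	return NULL;
}

int main(void)
{
	pthread_t t;

	init_completion(&cancel_done);
	/* task_work_add() + wake_up_process() */
	pthread_create(&t, NULL, sq_thread, NULL);
	/* io_sqpoll_cancel_sync() waits for the SQPOLL side to finish */
	wait_for_completion(&cancel_done);
	pthread_join(t, NULL);
	return 0;
}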
* Re: [PATCH 4/4] io_uring: cancel sqpoll via task_work
From: Pavel Begunkov
To: Jens Axboe, io-uring
Date: 2021-03-12 19:35 UTC

On 11/03/2021 23:29, Pavel Begunkov wrote:
> 1) The first problem is io_uring_cancel_sqpoll() ->
> io_uring_cancel_task_requests() basically doing park(); park(); and so
> hanging.
>
> 2) Another one is more subtle: the master task is doing cancellations,
> but the SQPOLL task submits in between the end of the cancellation and
> finish(), with the requests taking a ref to the ctx and so eternally
> locking it up.
>
> 3) Yet another is a dying SQPOLL task doing io_uring_cancel_sqpoll()
> while the owner task does the same io_uring_cancel_sqpoll(); they race
> for tctx->wait events. And there are probably more of them.
>
> Instead, do SQPOLL cancellations from within the SQPOLL task context
> via task_work, see io_sqpoll_cancel_sync(). With that we don't need
> the temporary park()/unpark() during cancellation, which is ugly and
> subtle and in any case doesn't let us do io_run_task_work() properly.
>
> io_uring_cancel_sqpoll() is now called only from SQPOLL task context
> and under sqd locking, so all parking is removed from there. As a
> result, io_sq_thread_[un]park() and io_sq_thread_stop() are no longer
> used by the SQPOLL task, which spares us some headache.
>
> Also remove the ctx from the sqd list early to avoid 2). And kill
> tctx->sqpoll, which is not used anymore.

Looks like the chunk below somehow slipped from the patch. Not
important for 5.12, but it can be folded in anyway.

diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h
index 9761a0ec9f95..c24c62b47745 100644
--- a/include/linux/io_uring.h
+++ b/include/linux/io_uring.h
@@ -22,7 +22,6 @@ struct io_uring_task {
 	void			*io_wq;
 	struct percpu_counter	inflight;
 	atomic_t		in_idle;
-	bool			sqpoll;
 
 	spinlock_t		task_lock;
 	struct io_wq_work_list	task_list;

--
Pavel Begunkov
* Re: [PATCH 4/4] io_uring: cancel sqpoll via task_work
From: Jens Axboe
To: Pavel Begunkov, io-uring
Date: 2021-03-12 19:40 UTC

On 3/12/21 12:35 PM, Pavel Begunkov wrote:
> On 11/03/2021 23:29, Pavel Begunkov wrote:
>> 1) The first problem is io_uring_cancel_sqpoll() ->
>> io_uring_cancel_task_requests() basically doing park(); park(); and so
>> hanging.
>>
>> 2) Another one is more subtle: the master task is doing cancellations,
>> but the SQPOLL task submits in between the end of the cancellation and
>> finish(), with the requests taking a ref to the ctx and so eternally
>> locking it up.
>>
>> 3) Yet another is a dying SQPOLL task doing io_uring_cancel_sqpoll()
>> while the owner task does the same io_uring_cancel_sqpoll(); they race
>> for tctx->wait events. And there are probably more of them.
>>
>> Instead, do SQPOLL cancellations from within the SQPOLL task context
>> via task_work, see io_sqpoll_cancel_sync(). With that we don't need
>> the temporary park()/unpark() during cancellation, which is ugly and
>> subtle and in any case doesn't let us do io_run_task_work() properly.
>>
>> io_uring_cancel_sqpoll() is now called only from SQPOLL task context
>> and under sqd locking, so all parking is removed from there. As a
>> result, io_sq_thread_[un]park() and io_sq_thread_stop() are no longer
>> used by the SQPOLL task, which spares us some headache.
>>
>> Also remove the ctx from the sqd list early to avoid 2). And kill
>> tctx->sqpoll, which is not used anymore.
>
> Looks like the chunk below somehow slipped from the patch. Not
> important for 5.12, but it can be folded in anyway.
>
> diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h
> index 9761a0ec9f95..c24c62b47745 100644
> --- a/include/linux/io_uring.h
> +++ b/include/linux/io_uring.h
> @@ -22,7 +22,6 @@ struct io_uring_task {
>  	void			*io_wq;
>  	struct percpu_counter	inflight;
>  	atomic_t		in_idle;
> -	bool			sqpoll;
>  
>  	spinlock_t		task_lock;
>  	struct io_wq_work_list	task_list;

Let's do it as a separate patch instead.

--
Jens Axboe
* Re: [PATCH 4/4] io_uring: cancel sqpoll via task_work
From: Pavel Begunkov
To: Jens Axboe, io-uring
Date: 2021-03-12 21:41 UTC

On 12/03/2021 19:40, Jens Axboe wrote:
> On 3/12/21 12:35 PM, Pavel Begunkov wrote:
>> On 11/03/2021 23:29, Pavel Begunkov wrote:
>>> 1) The first problem is io_uring_cancel_sqpoll() ->
>>> io_uring_cancel_task_requests() basically doing park(); park(); and
>>> so hanging.
>>>
>>> 2) Another one is more subtle: the master task is doing
>>> cancellations, but the SQPOLL task submits in between the end of the
>>> cancellation and finish(), with the requests taking a ref to the ctx
>>> and so eternally locking it up.
>>>
>>> 3) Yet another is a dying SQPOLL task doing io_uring_cancel_sqpoll()
>>> while the owner task does the same io_uring_cancel_sqpoll(); they
>>> race for tctx->wait events. And there are probably more of them.
>>>
>>> Instead, do SQPOLL cancellations from within the SQPOLL task context
>>> via task_work, see io_sqpoll_cancel_sync(). With that we don't need
>>> the temporary park()/unpark() during cancellation, which is ugly and
>>> subtle and in any case doesn't let us do io_run_task_work() properly.
>>>
>>> io_uring_cancel_sqpoll() is now called only from SQPOLL task context
>>> and under sqd locking, so all parking is removed from there. As a
>>> result, io_sq_thread_[un]park() and io_sq_thread_stop() are no longer
>>> used by the SQPOLL task, which spares us some headache.
>>>
>>> Also remove the ctx from the sqd list early to avoid 2). And kill
>>> tctx->sqpoll, which is not used anymore.
>>
>> Looks like the chunk below somehow slipped from the patch. Not
>> important for 5.12, but it can be folded in anyway.
>>
>> diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h
>> index 9761a0ec9f95..c24c62b47745 100644
>> --- a/include/linux/io_uring.h
>> +++ b/include/linux/io_uring.h
>> @@ -22,7 +22,6 @@ struct io_uring_task {
>>  	void			*io_wq;
>>  	struct percpu_counter	inflight;
>>  	atomic_t		in_idle;
>> -	bool			sqpoll;
>>  
>>  	spinlock_t		task_lock;
>>  	struct io_wq_work_list	task_list;
>
> Let's do it as a separate patch instead.

Ok, I'll send it for-5.13 when it's appropriate.

--
Pavel Begunkov
* Re: [PATCH 4/4] io_uring: cancel sqpoll via task_work
From: Jens Axboe
To: Pavel Begunkov, io-uring
Date: 2021-03-13 2:44 UTC

On 3/12/21 2:41 PM, Pavel Begunkov wrote:
> On 12/03/2021 19:40, Jens Axboe wrote:
>> On 3/12/21 12:35 PM, Pavel Begunkov wrote:
>>> On 11/03/2021 23:29, Pavel Begunkov wrote:
>>>> 1) The first problem is io_uring_cancel_sqpoll() ->
>>>> io_uring_cancel_task_requests() basically doing park(); park(); and
>>>> so hanging.
>>>>
>>>> 2) Another one is more subtle: the master task is doing
>>>> cancellations, but the SQPOLL task submits in between the end of the
>>>> cancellation and finish(), with the requests taking a ref to the ctx
>>>> and so eternally locking it up.
>>>>
>>>> 3) Yet another is a dying SQPOLL task doing io_uring_cancel_sqpoll()
>>>> while the owner task does the same io_uring_cancel_sqpoll(); they
>>>> race for tctx->wait events. And there are probably more of them.
>>>>
>>>> Instead, do SQPOLL cancellations from within the SQPOLL task context
>>>> via task_work, see io_sqpoll_cancel_sync(). With that we don't need
>>>> the temporary park()/unpark() during cancellation, which is ugly and
>>>> subtle and in any case doesn't let us do io_run_task_work()
>>>> properly.
>>>>
>>>> io_uring_cancel_sqpoll() is now called only from SQPOLL task context
>>>> and under sqd locking, so all parking is removed from there. As a
>>>> result, io_sq_thread_[un]park() and io_sq_thread_stop() are no
>>>> longer used by the SQPOLL task, which spares us some headache.
>>>>
>>>> Also remove the ctx from the sqd list early to avoid 2). And kill
>>>> tctx->sqpoll, which is not used anymore.
>>>
>>> Looks like the chunk below somehow slipped from the patch. Not
>>> important for 5.12, but it can be folded in anyway.
>>>
>>> diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h
>>> index 9761a0ec9f95..c24c62b47745 100644
>>> --- a/include/linux/io_uring.h
>>> +++ b/include/linux/io_uring.h
>>> @@ -22,7 +22,6 @@ struct io_uring_task {
>>>  	void			*io_wq;
>>>  	struct percpu_counter	inflight;
>>>  	atomic_t		in_idle;
>>> -	bool			sqpoll;
>>>  
>>>  	spinlock_t		task_lock;
>>>  	struct io_wq_work_list	task_list;
>>
>> Let's do it as a separate patch instead.
>
> Ok, I'll send it for-5.13 when it's appropriate.

Yeah that's fine, obviously no rush. I'll rebase for-5.13/io_uring when
-rc3 is out.

--
Jens Axboe
* Re: [PATCH 0/4] sqpoll fixes
From: Jens Axboe
To: Pavel Begunkov, io-uring
Date: 2021-03-12 18:19 UTC

On 3/11/21 4:29 PM, Pavel Begunkov wrote:
> 1-3 are relatively easy. 2/4 is rather a cleanup, but it's better to
> throw it out of the equation; that makes life easier.
>
> 4/4 removes the park/unpark dance for cancellations, and fixes a couple
> of issues (will send one test later). Signals in io_sq_thread() add
> trouble, bringing in more cases to think about and more races to care
> about between a dying SQPOLL task, dying ctxs and exiting userspace
> tasks.
>
> Pavel Begunkov (4):
>   io_uring: cancel deferred requests in try_cancel
>   io_uring: remove useless ->startup completion
>   io_uring: prevent racy sqd->thread checks
>   io_uring: cancel sqpoll via task_work
>
>  fs/io_uring.c | 195 +++++++++++++++++++++++++------------------------
>  1 file changed, 99 insertions(+), 96 deletions(-)

Added for 5.12, thanks.

--
Jens Axboe