* [PATCH 0/3] rsrc quiesce fixes/hardening @ 2021-02-19 20:45 Pavel Begunkov 2021-02-19 20:45 ` [PATCH 1/3] io_uring: zero ref_node after killing it Pavel Begunkov ` (4 more replies) 0 siblings, 5 replies; 12+ messages in thread From: Pavel Begunkov @ 2021-02-19 20:45 UTC (permalink / raw) To: Jens Axboe, io-uring 1/3 addresses new races in io_rsrc_ref_quiesce(), others are hardenings. Pavel Begunkov (3): io_uring: zero ref_node after killing it io_uring: fix io_rsrc_ref_quiesce races io_uring: keep generic rsrc infra generic fs/io_uring.c | 65 +++++++++++++++++---------------------------------- 1 file changed, 21 insertions(+), 44 deletions(-) -- 2.24.0 ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/3] io_uring: zero ref_node after killing it 2021-02-19 20:45 [PATCH 0/3] rsrc quiesce fixes/hardening Pavel Begunkov @ 2021-02-19 20:45 ` Pavel Begunkov 2021-02-19 20:45 ` [PATCH 2/3] io_uring: fix io_rsrc_ref_quiesce races Pavel Begunkov ` (3 subsequent siblings) 4 siblings, 0 replies; 12+ messages in thread From: Pavel Begunkov @ 2021-02-19 20:45 UTC (permalink / raw) To: Jens Axboe, io-uring After a rsrc/files reference node's refs are killed, it must never be used. And that's how it works, it either assigns a new node or kills the whole data table. Let's explicitly NULL it, that shouldn't be necessary, but if something would go wrong I'd rather catch a NULL dereference to using a dangling pointer. Signed-off-by: Pavel Begunkov <[email protected]> --- fs/io_uring.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/io_uring.c b/fs/io_uring.c index b7bae301744b..50d4dba08f82 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -7335,6 +7335,7 @@ static void io_sqe_rsrc_kill_node(struct io_ring_ctx *ctx, struct fixed_rsrc_dat io_rsrc_ref_lock(ctx); ref_node = data->node; + data->node = NULL; io_rsrc_ref_unlock(ctx); if (ref_node) percpu_ref_kill(&ref_node->refs); -- 2.24.0 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 2/3] io_uring: fix io_rsrc_ref_quiesce races 2021-02-19 20:45 [PATCH 0/3] rsrc quiesce fixes/hardening Pavel Begunkov 2021-02-19 20:45 ` [PATCH 1/3] io_uring: zero ref_node after killing it Pavel Begunkov @ 2021-02-19 20:45 ` Pavel Begunkov 2021-02-20 6:31 ` Hao Xu 2021-02-19 20:45 ` [PATCH 3/3] io_uring: keep generic rsrc infra generic Pavel Begunkov ` (2 subsequent siblings) 4 siblings, 1 reply; 12+ messages in thread From: Pavel Begunkov @ 2021-02-19 20:45 UTC (permalink / raw) To: Jens Axboe, io-uring There are different types of races in io_rsrc_ref_quiesce() between ->release() of percpu_refs and reinit_completion(), fix them by always resurrecting between iterations. BTW, clean the function up, because DRY. Fixes: 0ce4a72632317 ("io_uring: don't hold uring_lock when calling io_run_task_work*") Signed-off-by: Pavel Begunkov <[email protected]> --- fs/io_uring.c | 46 +++++++++++++--------------------------------- 1 file changed, 13 insertions(+), 33 deletions(-) diff --git a/fs/io_uring.c b/fs/io_uring.c index 50d4dba08f82..38ed52065a29 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -7316,19 +7316,6 @@ static void io_sqe_rsrc_set_node(struct io_ring_ctx *ctx, percpu_ref_get(&rsrc_data->refs); } -static int io_sqe_rsrc_add_node(struct io_ring_ctx *ctx, struct fixed_rsrc_data *data) -{ - struct fixed_rsrc_ref_node *backup_node; - - backup_node = alloc_fixed_rsrc_ref_node(ctx); - if (!backup_node) - return -ENOMEM; - init_fixed_file_ref_node(ctx, backup_node); - io_sqe_rsrc_set_node(ctx, data, backup_node); - - return 0; -} - static void io_sqe_rsrc_kill_node(struct io_ring_ctx *ctx, struct fixed_rsrc_data *data) { struct fixed_rsrc_ref_node *ref_node = NULL; @@ -7347,36 +7334,29 @@ static int io_rsrc_ref_quiesce(struct fixed_rsrc_data *data, { int ret; - io_sqe_rsrc_kill_node(ctx, data); - percpu_ref_kill(&data->refs); - - /* wait for all refs nodes to complete */ - flush_delayed_work(&ctx->rsrc_put_work); do { + io_sqe_rsrc_kill_node(ctx, data); + percpu_ref_kill(&data->refs); + flush_delayed_work(&ctx->rsrc_put_work); + ret = wait_for_completion_interruptible(&data->done); if (!ret) break; - ret = io_sqe_rsrc_add_node(ctx, data); - if (ret < 0) - break; - /* - * There is small possibility that data->done is already completed - * So reinit it here - */ + percpu_ref_resurrect(&data->refs); + io_sqe_rsrc_set_node(ctx, data, backup_node); reinit_completion(&data->done); mutex_unlock(&ctx->uring_lock); ret = io_run_task_work_sig(); mutex_lock(&ctx->uring_lock); - io_sqe_rsrc_kill_node(ctx, data); - } while (ret >= 0); - if (ret < 0) { - percpu_ref_resurrect(&data->refs); - reinit_completion(&data->done); - io_sqe_rsrc_set_node(ctx, data, backup_node); - return ret; - } + if (ret < 0) + return ret; + backup_node = alloc_fixed_rsrc_ref_node(ctx); + if (!backup_node) + return -ENOMEM; + init_fixed_file_ref_node(ctx, backup_node); + } while (1); destroy_fixed_rsrc_ref_node(backup_node); return 0; -- 2.24.0 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 2/3] io_uring: fix io_rsrc_ref_quiesce races 2021-02-19 20:45 ` [PATCH 2/3] io_uring: fix io_rsrc_ref_quiesce races Pavel Begunkov @ 2021-02-20 6:31 ` Hao Xu 2021-02-20 14:07 ` Pavel Begunkov 0 siblings, 1 reply; 12+ messages in thread From: Hao Xu @ 2021-02-20 6:31 UTC (permalink / raw) To: Pavel Begunkov, Jens Axboe, io-uring 在 2021/2/20 上午4:45, Pavel Begunkov 写道: > There are different types of races in io_rsrc_ref_quiesce() between > ->release() of percpu_refs and reinit_completion(), fix them by always > resurrecting between iterations. BTW, clean the function up, because > DRY. > > Fixes: 0ce4a72632317 ("io_uring: don't hold uring_lock when calling io_run_task_work*") > Signed-off-by: Pavel Begunkov <[email protected]> > --- > fs/io_uring.c | 46 +++++++++++++--------------------------------- > 1 file changed, 13 insertions(+), 33 deletions(-) > > diff --git a/fs/io_uring.c b/fs/io_uring.c > index 50d4dba08f82..38ed52065a29 100644 > --- a/fs/io_uring.c > +++ b/fs/io_uring.c > @@ -7316,19 +7316,6 @@ static void io_sqe_rsrc_set_node(struct io_ring_ctx *ctx, > percpu_ref_get(&rsrc_data->refs); > } > > -static int io_sqe_rsrc_add_node(struct io_ring_ctx *ctx, struct fixed_rsrc_data *data) > -{ > - struct fixed_rsrc_ref_node *backup_node; > - > - backup_node = alloc_fixed_rsrc_ref_node(ctx); > - if (!backup_node) > - return -ENOMEM; > - init_fixed_file_ref_node(ctx, backup_node); > - io_sqe_rsrc_set_node(ctx, data, backup_node); > - > - return 0; > -} > - > static void io_sqe_rsrc_kill_node(struct io_ring_ctx *ctx, struct fixed_rsrc_data *data) > { > struct fixed_rsrc_ref_node *ref_node = NULL; > @@ -7347,36 +7334,29 @@ static int io_rsrc_ref_quiesce(struct fixed_rsrc_data *data, > { > int ret; > > - io_sqe_rsrc_kill_node(ctx, data); > - percpu_ref_kill(&data->refs); > - > - /* wait for all refs nodes to complete */ > - flush_delayed_work(&ctx->rsrc_put_work); > do { > + io_sqe_rsrc_kill_node(ctx, data); > + percpu_ref_kill(&data->refs); > + flush_delayed_work(&ctx->rsrc_put_work); > + > ret = wait_for_completion_interruptible(&data->done); > if (!ret) > break; > > - ret = io_sqe_rsrc_add_node(ctx, data); > - if (ret < 0) > - break; > - /* > - * There is small possibility that data->done is already completed > - * So reinit it here > - */ > + percpu_ref_resurrect(&data->refs); I've thought about this, if we resurrect data->refs, then we can't avoid parallel files unregister by percpu_refs_is_dying. > + io_sqe_rsrc_set_node(ctx, data, backup_node); > reinit_completion(&data->done); > mutex_unlock(&ctx->uring_lock); > ret = io_run_task_work_sig(); > mutex_lock(&ctx->uring_lock); > - io_sqe_rsrc_kill_node(ctx, data); > - } while (ret >= 0); > > - if (ret < 0) { > - percpu_ref_resurrect(&data->refs); > - reinit_completion(&data->done); > - io_sqe_rsrc_set_node(ctx, data, backup_node); > - return ret; > - } > + if (ret < 0) > + return ret; > + backup_node = alloc_fixed_rsrc_ref_node(ctx); > + if (!backup_node) > + return -ENOMEM; Should we resurrect data->refs and reinit completion before signal or error return? > + init_fixed_file_ref_node(ctx, backup_node); > + } while (1); > > destroy_fixed_rsrc_ref_node(backup_node); > return 0; > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/3] io_uring: fix io_rsrc_ref_quiesce races 2021-02-20 6:31 ` Hao Xu @ 2021-02-20 14:07 ` Pavel Begunkov 2021-02-20 14:33 ` Pavel Begunkov 2021-02-21 9:46 ` Hao Xu 0 siblings, 2 replies; 12+ messages in thread From: Pavel Begunkov @ 2021-02-20 14:07 UTC (permalink / raw) To: Hao Xu, Jens Axboe, io-uring On 20/02/2021 06:31, Hao Xu wrote: > 在 2021/2/20 上午4:45, Pavel Begunkov 写道: >> There are different types of races in io_rsrc_ref_quiesce() between >> ->release() of percpu_refs and reinit_completion(), fix them by always >> resurrecting between iterations. BTW, clean the function up, because >> DRY. >> >> Fixes: 0ce4a72632317 ("io_uring: don't hold uring_lock when calling io_run_task_work*") >> Signed-off-by: Pavel Begunkov <[email protected]> >> --- >> fs/io_uring.c | 46 +++++++++++++--------------------------------- >> 1 file changed, 13 insertions(+), 33 deletions(-) >> >> diff --git a/fs/io_uring.c b/fs/io_uring.c >> index 50d4dba08f82..38ed52065a29 100644 >> --- a/fs/io_uring.c >> +++ b/fs/io_uring.c >> @@ -7316,19 +7316,6 @@ static void io_sqe_rsrc_set_node(struct io_ring_ctx *ctx, >> percpu_ref_get(&rsrc_data->refs); >> } >> -static int io_sqe_rsrc_add_node(struct io_ring_ctx *ctx, struct fixed_rsrc_data *data) >> -{ >> - struct fixed_rsrc_ref_node *backup_node; >> - >> - backup_node = alloc_fixed_rsrc_ref_node(ctx); >> - if (!backup_node) >> - return -ENOMEM; >> - init_fixed_file_ref_node(ctx, backup_node); >> - io_sqe_rsrc_set_node(ctx, data, backup_node); >> - >> - return 0; >> -} >> - >> static void io_sqe_rsrc_kill_node(struct io_ring_ctx *ctx, struct fixed_rsrc_data *data) >> { >> struct fixed_rsrc_ref_node *ref_node = NULL; >> @@ -7347,36 +7334,29 @@ static int io_rsrc_ref_quiesce(struct fixed_rsrc_data *data, >> { >> int ret; >> - io_sqe_rsrc_kill_node(ctx, data); >> - percpu_ref_kill(&data->refs); >> - >> - /* wait for all refs nodes to complete */ >> - flush_delayed_work(&ctx->rsrc_put_work); >> do { >> + io_sqe_rsrc_kill_node(ctx, data); >> + percpu_ref_kill(&data->refs); >> + flush_delayed_work(&ctx->rsrc_put_work); >> + >> ret = wait_for_completion_interruptible(&data->done); >> if (!ret) >> break; >> - ret = io_sqe_rsrc_add_node(ctx, data); >> - if (ret < 0) >> - break; >> - /* >> - * There is small possibility that data->done is already completed >> - * So reinit it here >> - */ >> + percpu_ref_resurrect(&data->refs); > I've thought about this, if we resurrect data->refs, then we can't > avoid parallel files unregister by percpu_refs_is_dying. Right, totally forgot about it, but otherwise we race with data->done. Didn't test yet, but a diff below should do the trick. >> + if (ret < 0) >> + return ret; >> + backup_node = alloc_fixed_rsrc_ref_node(ctx); >> + if (!backup_node) >> + return -ENOMEM; > Should we resurrect data->refs and reinit completion before > signal or error return? Not sure what exactly you mean, we should not release uring_lock with inconsistent state, so it's done right before unlock. And we should not do it before wait_for_completion_interruptible() before it would take a reference. >> + init_fixed_file_ref_node(ctx, backup_node); >> + } while (1); >> destroy_fixed_rsrc_ref_node(backup_node); >> return 0; >> > diff --git a/fs/io_uring.c b/fs/io_uring.c index ce5fccf00367..0af1572634de 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -236,6 +236,7 @@ struct fixed_rsrc_data { struct fixed_rsrc_ref_node *node; struct percpu_ref refs; struct completion done; + bool quiesce; }; struct io_buffer { @@ -7335,6 +7336,7 @@ static int io_rsrc_ref_quiesce(struct fixed_rsrc_data *data, struct fixed_rsrc_ref_node *backup_node; int ret; + data->quiesce = true; do { backup_node = alloc_fixed_rsrc_ref_node(ctx); if (!backup_node) @@ -7353,16 +7355,19 @@ static int io_rsrc_ref_quiesce(struct fixed_rsrc_data *data, percpu_ref_resurrect(&data->refs); synchronize_rcu(); io_sqe_rsrc_set_node(ctx, data, backup_node); + backup_node = NULL; reinit_completion(&data->done); mutex_unlock(&ctx->uring_lock); ret = io_run_task_work_sig(); mutex_lock(&ctx->uring_lock); if (ret < 0) return ret; - } while (1); + } while (ret >= 0); - destroy_fixed_rsrc_ref_node(backup_node); - return 0; + data->quiesce = false; + if (backup_node) + destroy_fixed_rsrc_ref_node(backup_node); + return ret; } static struct fixed_rsrc_data *alloc_fixed_rsrc_data(struct io_ring_ctx *ctx) @@ -7401,7 +7406,7 @@ static int io_sqe_files_unregister(struct io_ring_ctx *ctx) * Since we possibly drop uring lock later in this function to * run task work. */ - if (!data || percpu_ref_is_dying(&data->refs)) + if (!data || data->quiesce) return -ENXIO; ret = io_rsrc_ref_quiesce(data, ctx, io_ring_file_put); if (ret) ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 2/3] io_uring: fix io_rsrc_ref_quiesce races 2021-02-20 14:07 ` Pavel Begunkov @ 2021-02-20 14:33 ` Pavel Begunkov 2021-02-21 9:46 ` Hao Xu 1 sibling, 0 replies; 12+ messages in thread From: Pavel Begunkov @ 2021-02-20 14:33 UTC (permalink / raw) To: Hao Xu, Jens Axboe, io-uring On 20/02/2021 14:07, Pavel Begunkov wrote: > On 20/02/2021 06:31, Hao Xu wrote: >> 在 2021/2/20 上午4:45, Pavel Begunkov 写道: >>> There are different types of races in io_rsrc_ref_quiesce() between >>> ->release() of percpu_refs and reinit_completion(), fix them by always >>> resurrecting between iterations. BTW, clean the function up, because >>> DRY. >>> >>> Fixes: 0ce4a72632317 ("io_uring: don't hold uring_lock when calling io_run_task_work*") >>> Signed-off-by: Pavel Begunkov <[email protected]> >>> --- >>> - /* >>> - * There is small possibility that data->done is already completed >>> - * So reinit it here >>> - */ >>> + percpu_ref_resurrect(&data->refs); >> I've thought about this, if we resurrect data->refs, then we can't >> avoid parallel files unregister by percpu_refs_is_dying. > > Right, totally forgot about it, but otherwise we race with data->done. > Didn't test yet, but a diff below should do the trick. > >>> + if (ret < 0) >>> + return ret; >>> + backup_node = alloc_fixed_rsrc_ref_node(ctx); >>> + if (!backup_node) >>> + return -ENOMEM; >> Should we resurrect data->refs and reinit completion before >> signal or error return? > > Not sure what exactly you mean, we should not release uring_lock with > inconsistent state, so it's done right before unlock. And we should not > do it before wait_for_completion_interruptible() before it would take a > reference. > >>> + init_fixed_file_ref_node(ctx, backup_node); >>> + } while (1); >>> destroy_fixed_rsrc_ref_node(backup_node); >>> return 0; upd diff --git a/fs/io_uring.c b/fs/io_uring.c index ce5fccf00367..12796de8ad10 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -236,6 +236,7 @@ struct fixed_rsrc_data { struct fixed_rsrc_ref_node *node; struct percpu_ref refs; struct completion done; + bool quiesce; }; struct io_buffer { @@ -7335,10 +7336,12 @@ static int io_rsrc_ref_quiesce(struct fixed_rsrc_data *data, struct fixed_rsrc_ref_node *backup_node; int ret; + data->quiesce = true; do { backup_node = alloc_fixed_rsrc_ref_node(ctx); + ret = -ENOMEM; if (!backup_node) - return -ENOMEM; + break; backup_node->rsrc_data = data; backup_node->rsrc_put = rsrc_put; @@ -7353,16 +7356,17 @@ static int io_rsrc_ref_quiesce(struct fixed_rsrc_data *data, percpu_ref_resurrect(&data->refs); synchronize_rcu(); io_sqe_rsrc_set_node(ctx, data, backup_node); + backup_node = NULL; reinit_completion(&data->done); mutex_unlock(&ctx->uring_lock); ret = io_run_task_work_sig(); mutex_lock(&ctx->uring_lock); - if (ret < 0) - return ret; - } while (1); + } while (ret >= 0); - destroy_fixed_rsrc_ref_node(backup_node); - return 0; + data->quiesce = false; + if (backup_node) + destroy_fixed_rsrc_ref_node(backup_node); + return ret; } static struct fixed_rsrc_data *alloc_fixed_rsrc_data(struct io_ring_ctx *ctx) @@ -7401,7 +7405,7 @@ static int io_sqe_files_unregister(struct io_ring_ctx *ctx) * Since we possibly drop uring lock later in this function to * run task work. */ - if (!data || percpu_ref_is_dying(&data->refs)) + if (!data || data->quiesce) return -ENXIO; ret = io_rsrc_ref_quiesce(data, ctx, io_ring_file_put); if (ret) -- Pavel Begunkov ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 2/3] io_uring: fix io_rsrc_ref_quiesce races 2021-02-20 14:07 ` Pavel Begunkov 2021-02-20 14:33 ` Pavel Begunkov @ 2021-02-21 9:46 ` Hao Xu 1 sibling, 0 replies; 12+ messages in thread From: Hao Xu @ 2021-02-21 9:46 UTC (permalink / raw) To: Pavel Begunkov, Jens Axboe, io-uring 在 2021/2/20 下午10:07, Pavel Begunkov 写道: > On 20/02/2021 06:31, Hao Xu wrote: >> 在 2021/2/20 上午4:45, Pavel Begunkov 写道: >>> There are different types of races in io_rsrc_ref_quiesce() between >>> ->release() of percpu_refs and reinit_completion(), fix them by always >>> resurrecting between iterations. BTW, clean the function up, because >>> DRY. >>> >>> Fixes: 0ce4a72632317 ("io_uring: don't hold uring_lock when calling io_run_task_work*") >>> Signed-off-by: Pavel Begunkov <[email protected]> >>> --- >>> fs/io_uring.c | 46 +++++++++++++--------------------------------- >>> 1 file changed, 13 insertions(+), 33 deletions(-) >>> >>> diff --git a/fs/io_uring.c b/fs/io_uring.c >>> index 50d4dba08f82..38ed52065a29 100644 >>> --- a/fs/io_uring.c >>> +++ b/fs/io_uring.c >>> @@ -7316,19 +7316,6 @@ static void io_sqe_rsrc_set_node(struct io_ring_ctx *ctx, >>> percpu_ref_get(&rsrc_data->refs); >>> } >>> -static int io_sqe_rsrc_add_node(struct io_ring_ctx *ctx, struct fixed_rsrc_data *data) >>> -{ >>> - struct fixed_rsrc_ref_node *backup_node; >>> - >>> - backup_node = alloc_fixed_rsrc_ref_node(ctx); >>> - if (!backup_node) >>> - return -ENOMEM; >>> - init_fixed_file_ref_node(ctx, backup_node); >>> - io_sqe_rsrc_set_node(ctx, data, backup_node); >>> - >>> - return 0; >>> -} >>> - >>> static void io_sqe_rsrc_kill_node(struct io_ring_ctx *ctx, struct fixed_rsrc_data *data) >>> { >>> struct fixed_rsrc_ref_node *ref_node = NULL; >>> @@ -7347,36 +7334,29 @@ static int io_rsrc_ref_quiesce(struct fixed_rsrc_data *data, >>> { >>> int ret; >>> - io_sqe_rsrc_kill_node(ctx, data); >>> - percpu_ref_kill(&data->refs); >>> - >>> - /* wait for all refs nodes to complete */ >>> - flush_delayed_work(&ctx->rsrc_put_work); >>> do { >>> + io_sqe_rsrc_kill_node(ctx, data); >>> + percpu_ref_kill(&data->refs); >>> + flush_delayed_work(&ctx->rsrc_put_work); >>> + >>> ret = wait_for_completion_interruptible(&data->done); >>> if (!ret) >>> break; >>> - ret = io_sqe_rsrc_add_node(ctx, data); >>> - if (ret < 0) >>> - break; >>> - /* >>> - * There is small possibility that data->done is already completed >>> - * So reinit it here >>> - */ >>> + percpu_ref_resurrect(&data->refs); >> I've thought about this, if we resurrect data->refs, then we can't >> avoid parallel files unregister by percpu_refs_is_dying. > > Right, totally forgot about it, but otherwise we race with data->done. > Didn't test yet, but a diff below should do the trick. I'll test the latest version soon. > >>> + if (ret < 0) >>> + return ret; >>> + backup_node = alloc_fixed_rsrc_ref_node(ctx); >>> + if (!backup_node) >>> + return -ENOMEM; >> Should we resurrect data->refs and reinit completion before >> signal or error return? > > Not sure what exactly you mean, we should not release uring_lock with > inconsistent state, so it's done right before unlock. And we should not > do it before wait_for_completion_interruptible() before it would take a > reference. Nevermind, I misread the code. > >>> + init_fixed_file_ref_node(ctx, backup_node); >>> + } while (1); >>> destroy_fixed_rsrc_ref_node(backup_node); >>> return 0; >>> >> > > diff --git a/fs/io_uring.c b/fs/io_uring.c > index ce5fccf00367..0af1572634de 100644 > --- a/fs/io_uring.c > +++ b/fs/io_uring.c > @@ -236,6 +236,7 @@ struct fixed_rsrc_data { > struct fixed_rsrc_ref_node *node; > struct percpu_ref refs; > struct completion done; > + bool quiesce; > }; > > struct io_buffer { > @@ -7335,6 +7336,7 @@ static int io_rsrc_ref_quiesce(struct fixed_rsrc_data *data, > struct fixed_rsrc_ref_node *backup_node; > int ret; > > + data->quiesce = true; > do { > backup_node = alloc_fixed_rsrc_ref_node(ctx); > if (!backup_node) > @@ -7353,16 +7355,19 @@ static int io_rsrc_ref_quiesce(struct fixed_rsrc_data *data, > percpu_ref_resurrect(&data->refs); > synchronize_rcu(); > io_sqe_rsrc_set_node(ctx, data, backup_node); > + backup_node = NULL; > reinit_completion(&data->done); > mutex_unlock(&ctx->uring_lock); > ret = io_run_task_work_sig(); > mutex_lock(&ctx->uring_lock); > if (ret < 0) > return ret; > - } while (1); > + } while (ret >= 0); > > - destroy_fixed_rsrc_ref_node(backup_node); > - return 0; > + data->quiesce = false; > + if (backup_node) > + destroy_fixed_rsrc_ref_node(backup_node); > + return ret; > } > > static struct fixed_rsrc_data *alloc_fixed_rsrc_data(struct io_ring_ctx *ctx) > @@ -7401,7 +7406,7 @@ static int io_sqe_files_unregister(struct io_ring_ctx *ctx) > * Since we possibly drop uring lock later in this function to > * run task work. > */ > - if (!data || percpu_ref_is_dying(&data->refs)) > + if (!data || data->quiesce) > return -ENXIO; > ret = io_rsrc_ref_quiesce(data, ctx, io_ring_file_put); > if (ret) > ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 3/3] io_uring: keep generic rsrc infra generic 2021-02-19 20:45 [PATCH 0/3] rsrc quiesce fixes/hardening Pavel Begunkov 2021-02-19 20:45 ` [PATCH 1/3] io_uring: zero ref_node after killing it Pavel Begunkov 2021-02-19 20:45 ` [PATCH 2/3] io_uring: fix io_rsrc_ref_quiesce races Pavel Begunkov @ 2021-02-19 20:45 ` Pavel Begunkov 2021-02-19 20:49 ` [PATCH 0/3] rsrc quiesce fixes/hardening Pavel Begunkov 2021-02-20 0:28 ` Jens Axboe 4 siblings, 0 replies; 12+ messages in thread From: Pavel Begunkov @ 2021-02-19 20:45 UTC (permalink / raw) To: Jens Axboe, io-uring io_rsrc_ref_quiesce() is a generic resource function, though now it was wired to allocate and initialise ref nodes with file-specific callbacks/etc. Keep it sane by passing in as a parameters everything we need for initialisations, otherwise it will hurt us badly one day. Signed-off-by: Pavel Begunkov <[email protected]> --- fs/io_uring.c | 26 +++++++++++--------------- 1 file changed, 11 insertions(+), 15 deletions(-) diff --git a/fs/io_uring.c b/fs/io_uring.c index 38ed52065a29..f3af499b12a9 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -1036,8 +1036,7 @@ static void io_uring_try_cancel_requests(struct io_ring_ctx *ctx, static void destroy_fixed_rsrc_ref_node(struct fixed_rsrc_ref_node *ref_node); static struct fixed_rsrc_ref_node *alloc_fixed_rsrc_ref_node( struct io_ring_ctx *ctx); -static void init_fixed_file_ref_node(struct io_ring_ctx *ctx, - struct fixed_rsrc_ref_node *ref_node); +static void io_ring_file_put(struct io_ring_ctx *ctx, struct io_rsrc_put *prsrc); static bool io_rw_reissue(struct io_kiocb *req); static void io_cqring_fill_event(struct io_kiocb *req, long res); @@ -7330,11 +7329,19 @@ static void io_sqe_rsrc_kill_node(struct io_ring_ctx *ctx, struct fixed_rsrc_dat static int io_rsrc_ref_quiesce(struct fixed_rsrc_data *data, struct io_ring_ctx *ctx, - struct fixed_rsrc_ref_node *backup_node) + void (*rsrc_put)(struct io_ring_ctx *ctx, + struct io_rsrc_put *prsrc)) { + struct fixed_rsrc_ref_node *backup_node; int ret; do { + backup_node = alloc_fixed_rsrc_ref_node(ctx); + if (!backup_node) + return -ENOMEM; + backup_node->rsrc_data = data; + backup_node->rsrc_put = rsrc_put; + io_sqe_rsrc_kill_node(ctx, data); percpu_ref_kill(&data->refs); flush_delayed_work(&ctx->rsrc_put_work); @@ -7349,13 +7356,8 @@ static int io_rsrc_ref_quiesce(struct fixed_rsrc_data *data, mutex_unlock(&ctx->uring_lock); ret = io_run_task_work_sig(); mutex_lock(&ctx->uring_lock); - if (ret < 0) return ret; - backup_node = alloc_fixed_rsrc_ref_node(ctx); - if (!backup_node) - return -ENOMEM; - init_fixed_file_ref_node(ctx, backup_node); } while (1); destroy_fixed_rsrc_ref_node(backup_node); @@ -7390,7 +7392,6 @@ static void free_fixed_rsrc_data(struct fixed_rsrc_data *data) static int io_sqe_files_unregister(struct io_ring_ctx *ctx) { struct fixed_rsrc_data *data = ctx->file_data; - struct fixed_rsrc_ref_node *backup_node; unsigned nr_tables, i; int ret; @@ -7401,12 +7402,7 @@ static int io_sqe_files_unregister(struct io_ring_ctx *ctx) */ if (!data || percpu_ref_is_dying(&data->refs)) return -ENXIO; - backup_node = alloc_fixed_rsrc_ref_node(ctx); - if (!backup_node) - return -ENOMEM; - init_fixed_file_ref_node(ctx, backup_node); - - ret = io_rsrc_ref_quiesce(data, ctx, backup_node); + ret = io_rsrc_ref_quiesce(data, ctx, io_ring_file_put); if (ret) return ret; -- 2.24.0 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 0/3] rsrc quiesce fixes/hardening 2021-02-19 20:45 [PATCH 0/3] rsrc quiesce fixes/hardening Pavel Begunkov ` (2 preceding siblings ...) 2021-02-19 20:45 ` [PATCH 3/3] io_uring: keep generic rsrc infra generic Pavel Begunkov @ 2021-02-19 20:49 ` Pavel Begunkov 2021-02-20 6:32 ` Hao Xu 2021-02-20 0:28 ` Jens Axboe 4 siblings, 1 reply; 12+ messages in thread From: Pavel Begunkov @ 2021-02-19 20:49 UTC (permalink / raw) To: Jens Axboe, io-uring, Hao Xu On 19/02/2021 20:45, Pavel Begunkov wrote: > 1/3 addresses new races in io_rsrc_ref_quiesce(), others are hardenings. s/1/2/ Hao, any chance you guys can drag these patches through the same tests you've done for "io_uring: don't hold uring_lock ..."? > > Pavel Begunkov (3): > io_uring: zero ref_node after killing it > io_uring: fix io_rsrc_ref_quiesce races > io_uring: keep generic rsrc infra generic > > fs/io_uring.c | 65 +++++++++++++++++---------------------------------- > 1 file changed, 21 insertions(+), 44 deletions(-) > -- Pavel Begunkov ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/3] rsrc quiesce fixes/hardening 2021-02-19 20:49 ` [PATCH 0/3] rsrc quiesce fixes/hardening Pavel Begunkov @ 2021-02-20 6:32 ` Hao Xu 2021-02-20 14:14 ` Pavel Begunkov 0 siblings, 1 reply; 12+ messages in thread From: Hao Xu @ 2021-02-20 6:32 UTC (permalink / raw) To: Pavel Begunkov, Jens Axboe, io-uring 在 2021/2/20 上午4:49, Pavel Begunkov 写道: > On 19/02/2021 20:45, Pavel Begunkov wrote: >> 1/3 addresses new races in io_rsrc_ref_quiesce(), others are hardenings. > > s/1/2/ > > Hao, any chance you guys can drag these patches through the same > tests you've done for "io_uring: don't hold uring_lock ..."? > gotcha, will test it soon. >> >> Pavel Begunkov (3): >> io_uring: zero ref_node after killing it >> io_uring: fix io_rsrc_ref_quiesce races >> io_uring: keep generic rsrc infra generic >> >> fs/io_uring.c | 65 +++++++++++++++++---------------------------------- >> 1 file changed, 21 insertions(+), 44 deletions(-) >> > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/3] rsrc quiesce fixes/hardening 2021-02-20 6:32 ` Hao Xu @ 2021-02-20 14:14 ` Pavel Begunkov 0 siblings, 0 replies; 12+ messages in thread From: Pavel Begunkov @ 2021-02-20 14:14 UTC (permalink / raw) To: Hao Xu, Jens Axboe, io-uring On 20/02/2021 06:32, Hao Xu wrote: > 在 2021/2/20 上午4:49, Pavel Begunkov 写道: >> On 19/02/2021 20:45, Pavel Begunkov wrote: >>> 1/3 addresses new races in io_rsrc_ref_quiesce(), others are hardenings. >> >> s/1/2/ >> >> Hao, any chance you guys can drag these patches through the same >> tests you've done for "io_uring: don't hold uring_lock ..."? >> > gotcha, will test it soon. awesome, thanks >>> >>> Pavel Begunkov (3): >>> io_uring: zero ref_node after killing it >>> io_uring: fix io_rsrc_ref_quiesce races >>> io_uring: keep generic rsrc infra generic >>> >>> fs/io_uring.c | 65 +++++++++++++++++---------------------------------- >>> 1 file changed, 21 insertions(+), 44 deletions(-) >>> >> > -- Pavel Begunkov ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/3] rsrc quiesce fixes/hardening 2021-02-19 20:45 [PATCH 0/3] rsrc quiesce fixes/hardening Pavel Begunkov ` (3 preceding siblings ...) 2021-02-19 20:49 ` [PATCH 0/3] rsrc quiesce fixes/hardening Pavel Begunkov @ 2021-02-20 0:28 ` Jens Axboe 4 siblings, 0 replies; 12+ messages in thread From: Jens Axboe @ 2021-02-20 0:28 UTC (permalink / raw) To: Pavel Begunkov, io-uring On 2/19/21 1:45 PM, Pavel Begunkov wrote: > 1/3 addresses new races in io_rsrc_ref_quiesce(), others are hardenings. > > Pavel Begunkov (3): > io_uring: zero ref_node after killing it > io_uring: fix io_rsrc_ref_quiesce races > io_uring: keep generic rsrc infra generic > > fs/io_uring.c | 65 +++++++++++++++++---------------------------------- > 1 file changed, 21 insertions(+), 44 deletions(-) Looks (and tests) good to me, applied. -- Jens Axboe ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2021-02-21 9:47 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2021-02-19 20:45 [PATCH 0/3] rsrc quiesce fixes/hardening Pavel Begunkov 2021-02-19 20:45 ` [PATCH 1/3] io_uring: zero ref_node after killing it Pavel Begunkov 2021-02-19 20:45 ` [PATCH 2/3] io_uring: fix io_rsrc_ref_quiesce races Pavel Begunkov 2021-02-20 6:31 ` Hao Xu 2021-02-20 14:07 ` Pavel Begunkov 2021-02-20 14:33 ` Pavel Begunkov 2021-02-21 9:46 ` Hao Xu 2021-02-19 20:45 ` [PATCH 3/3] io_uring: keep generic rsrc infra generic Pavel Begunkov 2021-02-19 20:49 ` [PATCH 0/3] rsrc quiesce fixes/hardening Pavel Begunkov 2021-02-20 6:32 ` Hao Xu 2021-02-20 14:14 ` Pavel Begunkov 2021-02-20 0:28 ` Jens Axboe
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox