* [PATCH 0/2] poll fixes
From: Hao Xu @ 2021-09-12 16:23 UTC
To: Jens Axboe; +Cc: io-uring, Pavel Begunkov, Joseph Qi

Two fixes for poll handling.

Hao Xu (2):
  io_uring: fix tw list mess-up by adding tw while it's already in tw list
  io_uring: fix race between poll completion and cancel_hash insertion

 fs/io_uring.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

--
2.24.4
* [PATCH 1/2] io_uring: fix tw list mess-up by adding tw while it's already in tw list
From: Hao Xu @ 2021-09-12 16:23 UTC
To: Jens Axboe; +Cc: io-uring, Pavel Begunkov, Joseph Qi

For multishot mode, there may be cases like:

io_poll_task_func()
  -> add_wait_queue()
                              async_wake()
                                -> io_req_task_work_add()

The second add messes up the running task_work list, since
req->io_task_work.node is still in use. A similar situation exists for
req->io_task_work.fallback_node.

Fix it by setting node->next = NULL before we run the tw, so that when
the req is added back to the wait queue in the middle of running the tw,
it can safely be re-added to the tw list.

Fixes: 7cbf1722d5fc ("io_uring: provide FIFO ordering for task_work")
Signed-off-by: Hao Xu <[email protected]>
---

 fs/io_uring.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 30d959416eba..c16f6be3d46b 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1216,13 +1216,17 @@ static void io_fallback_req_func(struct work_struct *work)
 	struct io_ring_ctx *ctx = container_of(work, struct io_ring_ctx,
 						fallback_work.work);
 	struct llist_node *node = llist_del_all(&ctx->fallback_llist);
-	struct io_kiocb *req, *tmp;
+	struct io_kiocb *req;
 	bool locked = false;
 
 	percpu_ref_get(&ctx->refs);
-	llist_for_each_entry_safe(req, tmp, node, io_task_work.fallback_node)
+	req = llist_entry(node, struct io_kiocb, io_task_work.fallback_node);
+	while (member_address_is_nonnull(req, io_task_work.fallback_node)) {
+		node = req->io_task_work.fallback_node.next;
+		req->io_task_work.fallback_node.next = NULL;
 		req->io_task_work.func(req, &locked);
-
+		req = llist_entry(node, struct io_kiocb, io_task_work.fallback_node);
+	}
 	if (locked) {
 		if (ctx->submit_state.compl_nr)
 			io_submit_flush_completions(ctx);
@@ -2126,6 +2130,7 @@ static void tctx_task_work(struct callback_head *cb)
 			locked = mutex_trylock(&ctx->uring_lock);
 			percpu_ref_get(&ctx->refs);
 		}
+		node->next = NULL;
 		req->io_task_work.func(req, &locked);
 		node = next;
 	} while (node);
--
2.24.4
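To make the pattern the patch adopts easier to follow, here is a minimal,
self-contained userspace sketch. It is a model only, not the kernel code:
all names in it (tw_node, tw_req, tw_add, tw_func, run_task_work, pending)
are invented, and the real io_uring lists are per-task llist/wq lists. The
point it demonstrates is that the runner detaches each node and clears its
next pointer before invoking the callback, so a callback that re-arms the
request can safely hand the same node back to the pending list.

#include <stdio.h>
#include <stddef.h>

struct tw_node { struct tw_node *next; };

struct tw_req {
	struct tw_node node;
	int id;
	int rearm;                 /* model of multishot: re-queue once */
};

static struct tw_node *pending;    /* model of the per-task tw list head */

static void tw_add(struct tw_req *req)
{
	/* would corrupt the list if req->node.next still pointed into
	 * the batch currently being executed */
	req->node.next = pending;
	pending = &req->node;
}

static void tw_func(struct tw_req *req)
{
	printf("run req %d\n", req->id);
	if (req->rearm--)
		tw_add(req);       /* like async_wake() during the tw run */
}

static void run_task_work(void)
{
	struct tw_node *node = pending;

	pending = NULL;            /* like llist_del_all() */
	while (node) {
		struct tw_node *next = node->next;
		struct tw_req *req = (struct tw_req *)
			((char *)node - offsetof(struct tw_req, node));

		node->next = NULL; /* the fix: detach before running */
		tw_func(req);
		node = next;
	}
}

int main(void)
{
	struct tw_req a = { .id = 1, .rearm = 1 }, b = { .id = 2 };

	tw_add(&a);
	tw_add(&b);
	run_task_work();           /* req 1 re-queues itself safely */
	run_task_work();
	return 0;
}

Built with any C compiler, this prints req 2 and req 1 from the first
batch, then req 1 again from the batch it re-queued onto, with no list
corruption: the re-add in tw_func() never touches a node that is still
linked into the batch being walked.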
* Re: [PATCH 1/2] io_uring: fix tw list mess-up by adding tw while it's already in tw list
From: Pavel Begunkov @ 2021-09-15 9:44 UTC
To: Hao Xu, Jens Axboe; +Cc: io-uring, Joseph Qi

On 9/12/21 5:23 PM, Hao Xu wrote:
> For multishot mode, there may be cases like:
>
> io_poll_task_func()
>   -> add_wait_queue()
>                               async_wake()
>                                 -> io_req_task_work_add()
>
> The second add messes up the running task_work list, since
> req->io_task_work.node is still in use. A similar situation exists for
> req->io_task_work.fallback_node.
>
> Fix it by setting node->next = NULL before we run the tw, so that when
> the req is added back to the wait queue in the middle of running the tw,
> it can safely be re-added to the tw list.

It may get screwed before we get to "node->next = NULL;":

-> async_wake()
    -> io_req_task_work_add()
-> async_wake()
    -> io_req_task_work_add()
tctx_task_work()

> Fixes: 7cbf1722d5fc ("io_uring: provide FIFO ordering for task_work")
> Signed-off-by: Hao Xu <[email protected]>
> [...]

--
Pavel Begunkov
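A standalone model of the corruption being pointed at here (plain C with
invented names, not the kernel's llist/io_task_work machinery): pushing the
same node a second time, before the runner has detached it and reset its
next pointer, links the pending list back onto itself.

#include <stdio.h>

struct node { struct node *next; };

int main(void)
{
	struct node a = { 0 }, b = { 0 };
	struct node *head = NULL;
	int steps = 0;

	/* first async_wake(): push the request's node */
	a.next = head; head = &a;
	/* some unrelated task_work gets queued in between */
	b.next = head; head = &b;
	/* second async_wake() on the SAME request, before the runner has
	 * detached the node and cleared a.next: the list now loops */
	a.next = head; head = &a;

	/* a -> b -> a -> b -> ... : the walk never terminates on its own */
	for (struct node *n = head; n && steps < 6; n = n->next, steps++)
		printf("visit %s\n", n == &a ? "a" : "b");
	return 0;
}

This is why clearing node->next inside tctx_task_work() only protects
against re-adds that happen after the runner has reached that node, not
against a second wakeup that races in earlier.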
* Re: [PATCH 1/2] io_uring: fix tw list mess-up by adding tw while it's already in tw list
From: Hao Xu @ 2021-09-15 10:48 UTC
To: Pavel Begunkov, Jens Axboe; +Cc: io-uring, Joseph Qi

On 2021/9/15 5:44 PM, Pavel Begunkov wrote:
> On 9/12/21 5:23 PM, Hao Xu wrote:
>> For multishot mode, there may be cases like:
>>
>> io_poll_task_func()
>>   -> add_wait_queue()
>>                               async_wake()
>>                                 -> io_req_task_work_add()
>>
>> The second add messes up the running task_work list, since
>> req->io_task_work.node is still in use. A similar situation exists for
>> req->io_task_work.fallback_node.
>>
>> Fix it by setting node->next = NULL before we run the tw, so that when
>> the req is added back to the wait queue in the middle of running the tw,
>> it can safely be re-added to the tw list.
>
> It may get screwed before we get to "node->next = NULL;":
>
> -> async_wake()
>     -> io_req_task_work_add()
> -> async_wake()
>     -> io_req_task_work_add()
> tctx_task_work()

True, this may happen if there is a second poll wait entry. This patch
is for the single wait entry case only. I'm thinking about the second
poll entry issue; that will be a separate patch.

>> [...]
* Re: [PATCH 1/2] io_uring: fix tw list mess-up by adding tw while it's already in tw list
From: Hao Xu @ 2021-09-26 9:48 UTC
To: Pavel Begunkov, Jens Axboe; +Cc: io-uring, Joseph Qi

On 2021/9/15 6:48 PM, Hao Xu wrote:
> On 2021/9/15 5:44 PM, Pavel Begunkov wrote:
>> On 9/12/21 5:23 PM, Hao Xu wrote:
>>> [...]
>>
>> It may get screwed before we get to "node->next = NULL;":
>>
>> -> async_wake()
>>     -> io_req_task_work_add()
>> -> async_wake()
>>     -> io_req_task_work_add()
>> tctx_task_work()
>
> True, this may happen if there is a second poll wait entry. This patch
> is for the single wait entry case only. I'm thinking about the second
> poll entry issue; that will be a separate patch.

Hmm, I reviewed this email again and now I think I got what you were
saying. Do you mean the second async_wake() is triggered before we
remove the wait entry in the first async_wake()? Like:

async_wake
                  async_wake
  -> del wait entry

>>> [...]
* Re: [PATCH 1/2] io_uring: fix tw list mess-up by adding tw while it's already in tw list
From: Pavel Begunkov @ 2021-09-29 11:16 UTC
To: Hao Xu, Jens Axboe; +Cc: io-uring, Joseph Qi

On 9/26/21 10:48 AM, Hao Xu wrote:
> On 2021/9/15 6:48 PM, Hao Xu wrote:
>> On 2021/9/15 5:44 PM, Pavel Begunkov wrote:
>>> On 9/12/21 5:23 PM, Hao Xu wrote:
>>>> [...]
>>>
>>> It may get screwed before we get to "node->next = NULL;":
>>>
>>> -> async_wake()
>>>     -> io_req_task_work_add()
>>> -> async_wake()
>>>     -> io_req_task_work_add()
>>> tctx_task_work()
>> True, this may happen if there is a second poll wait entry. This patch
>> is for the single wait entry case only. I'm thinking about the second
>> poll entry issue; that will be a separate patch.
> Hmm, I reviewed this email again and now I think I got what you were
> saying. Do you mean the second async_wake() is triggered before we
> remove the wait entry in the first async_wake()? Like:
>
> async_wake
>                   async_wake
>   -> del wait entry

Looks like we had different problems in mind; let's move the conversation
to the new thread with the resent patches.

>>>> [...]

--
Pavel Begunkov
* [PATCH 2/2] io_uring: fix race between poll completion and cancel_hash insertion
From: Hao Xu @ 2021-09-12 16:23 UTC
To: Jens Axboe; +Cc: io-uring, Pavel Begunkov, Joseph Qi

If poll arming and poll completion run in parallel, there may be races.
For instance, run io_poll_add in iowq and io_poll_task_func in the
original context, then:

         iowq                        original context
  io_poll_add
    vfs_poll
     (interruption happens
      tw queued to original
      context)                  io_poll_task_func
                                  generate cqe
                                  del from cancel_hash[]
    if !poll.done
      insert to cancel_hash[]

The entry is left in cancel_hash[]; a similar case exists for fast poll.
Fix it by setting poll.done = true when deleting from cancel_hash[].

Signed-off-by: Hao Xu <[email protected]>
---

Didn't find the exact commit to add Fixes: for..

 fs/io_uring.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index c16f6be3d46b..988679e5063f 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -5118,10 +5118,8 @@ static bool __io_poll_complete(struct io_kiocb *req, __poll_t mask)
 	}
 	if (req->poll.events & EPOLLONESHOT)
 		flags = 0;
-	if (!io_cqring_fill_event(ctx, req->user_data, error, flags)) {
-		req->poll.done = true;
+	if (!io_cqring_fill_event(ctx, req->user_data, error, flags))
 		flags = 0;
-	}
 	if (flags & IORING_CQE_F_MORE)
 		ctx->cq_extra++;
 
@@ -5152,6 +5150,7 @@ static void io_poll_task_func(struct io_kiocb *req, bool *locked)
 	if (done) {
 		io_poll_remove_double(req);
 		hash_del(&req->hash_node);
+		req->poll.done = true;
 	} else {
 		req->result = 0;
 		add_wait_queue(req->poll.head, &req->poll.wait);
@@ -5289,6 +5288,7 @@ static void io_async_task_func(struct io_kiocb *req, bool *locked)
 
 	hash_del(&req->hash_node);
 	io_poll_remove_double(req);
+	req->poll.done = true;
 	spin_unlock(&ctx->completion_lock);
 
 	if (!READ_ONCE(apoll->poll.canceled))
--
2.24.4
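The window this patch closes can be modeled in userspace with a plain mutex
standing in for ctx->completion_lock. Everything below (fake_req,
complete_poll, arm_poll) is an invented stand-in, not the io_uring code;
the real completion path also posts the CQE, and the real arming path
hashes the request into cancel_hash[] keyed by user_data. Marking the
request done at the same point it is unhashed means a late-arriving arming
path can no longer re-insert it.

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

/* toy stand-in for struct io_kiocb plus its cancel_hash linkage */
struct fake_req {
	bool done;         /* poll.done */
	bool hashed;       /* "is on cancel_hash[]" */
};

static pthread_mutex_t completion_lock = PTHREAD_MUTEX_INITIALIZER;

/* original-context side: poll completion (io_poll_task_func model) */
static void *complete_poll(void *arg)
{
	struct fake_req *req = arg;

	pthread_mutex_lock(&completion_lock);
	req->hashed = false;   /* hash_del() */
	req->done = true;      /* the fix: mark done while still locked */
	pthread_mutex_unlock(&completion_lock);
	return NULL;
}

/* iowq side: tail of poll arming after vfs_poll (io_poll_add model) */
static void *arm_poll(void *arg)
{
	struct fake_req *req = arg;

	pthread_mutex_lock(&completion_lock);
	if (!req->done)        /* without the done store above, this re-inserts */
		req->hashed = true;
	pthread_mutex_unlock(&completion_lock);
	return NULL;
}

int main(void)
{
	struct fake_req req = { .hashed = false };
	pthread_t a, b;

	pthread_create(&a, NULL, complete_poll, &req);
	pthread_create(&b, NULL, arm_poll, &req);
	pthread_join(a, NULL);
	pthread_join(b, NULL);

	/* with the fix, a completed request can never be left hashed */
	printf("left on cancel_hash: %s\n", req.hashed ? "yes (bug)" : "no");
	return 0;
}

With the done store placed next to hash_del() under the lock, both
interleavings end with the request unhashed; dropping that store reproduces
the stale-entry case described in the commit message.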
* Re: [PATCH 2/2] io_uring: fix race between poll completion and cancel_hash insertion
From: Pavel Begunkov @ 2021-09-15 9:50 UTC
To: Hao Xu, Jens Axboe; +Cc: io-uring, Joseph Qi

On 9/12/21 5:23 PM, Hao Xu wrote:
> If poll arming and poll completion run in parallel, there may be races.
> For instance, run io_poll_add in iowq and io_poll_task_func in the
> original context, then:
>
>          iowq                        original context
>   io_poll_add
>     vfs_poll
>      (interruption happens
>       tw queued to original
>       context)                  io_poll_task_func
>                                   generate cqe
>                                   del from cancel_hash[]
>     if !poll.done
>       insert to cancel_hash[]
>
> The entry is left in cancel_hash[]; a similar case exists for fast poll.
> Fix it by setting poll.done = true when deleting from cancel_hash[].

Sounds like a valid concern. And the code looks good, but one of the two
patches crashed the kernel somewhere in io_read().

./232c93d07b74-test

Will be retesting.

> Signed-off-by: Hao Xu <[email protected]>
> ---
>
> Didn't find the exact commit to add Fixes: for..

That's ok. Maybe you can find which kernel versions are affected, so we
can add stable and specify where it should be backported? E.g.:

Cc: [email protected] # 5.12+

> [...]

--
Pavel Begunkov
* Re: [PATCH 2/2] io_uring: fix race between poll completion and cancel_hash insertion
From: Hao Xu @ 2021-09-15 10:49 UTC
To: Pavel Begunkov, Jens Axboe; +Cc: io-uring, Joseph Qi

On 2021/9/15 5:50 PM, Pavel Begunkov wrote:
> On 9/12/21 5:23 PM, Hao Xu wrote:
>> [...]
>
> Sounds like a valid concern. And the code looks good, but one of the two
> patches crashed the kernel somewhere in io_read().
>
> ./232c93d07b74-test

I'll run it too; I didn't hit that when I did the test.

> Will be retesting.
>
>> Signed-off-by: Hao Xu <[email protected]>
>> ---
>>
>> Didn't find the exact commit to add Fixes: for..
>
> That's ok. Maybe you can find which kernel versions are affected, so we
> can add stable and specify where it should be backported? E.g.:

Sure, will do that.

> Cc: [email protected] # 5.12+
>
>> [...]
* Re: [PATCH 2/2] io_uring: fix race between poll completion and cancel_hash insertion
From: Pavel Begunkov @ 2021-09-15 10:12 UTC
To: Hao Xu, Jens Axboe; +Cc: io-uring, Joseph Qi

On 9/12/21 5:23 PM, Hao Xu wrote:
> If poll arming and poll completion run in parallel, there may be races.
> [...]
> @@ -5289,6 +5288,7 @@ static void io_async_task_func(struct io_kiocb *req, bool *locked)
>
>  	hash_del(&req->hash_node);
>  	io_poll_remove_double(req);
> +	req->poll.done = true;

Only a poll request has req->poll. E.g. it overwrites parts of
req->rw.kiocb, I guess .ki_complete in particular.

	struct async_poll *apoll = req->apoll;
	apoll->poll.done = true;

>  	spin_unlock(&ctx->completion_lock);
>
>  	if (!READ_ONCE(apoll->poll.canceled))

--
Pavel Begunkov
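The layout issue behind this comment can be sketched with trimmed-down
stand-ins for the kernel structs. The field names and offsets below are
invented and do not match fs/io_uring.c; the relevant facts from the real
code are that struct io_kiocb keeps its per-opcode data in a union, and
that struct async_poll is allocated separately and reached via req->apoll.
Writing through the wrong union member scribbles over state belonging to a
different request type, which is why the fast-poll path has to use
apoll->poll.done.

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

/* trimmed-down, hypothetical stand-ins for the kernel structs */
struct fake_kiocb { void (*ki_complete)(void); long other_state; };
struct fake_rw   { struct fake_kiocb kiocb; };
struct fake_poll { void *head; int events; bool done; };

struct fake_async_poll { struct fake_poll poll; struct fake_poll *double_poll; };

struct fake_io_kiocb {
	union {                          /* only one member is valid per request */
		struct fake_rw   rw;     /* read/write request state */
		struct fake_poll poll;   /* IORING_OP_POLL_ADD state */
	};
	struct fake_async_poll *apoll;   /* fast-poll state, allocated separately */
};

static void ki_done(void) { puts("kiocb completion"); }

int main(void)
{
	struct fake_io_kiocb req = { .rw.kiocb.ki_complete = ki_done };

	req.apoll = calloc(1, sizeof(*req.apoll));

	/* wrong: this request is a read, so .poll aliases .rw in the union */
	/* req.poll.done = true;  -- would overwrite kiocb state            */

	/* right: fast-poll bookkeeping lives in the separate apoll object  */
	req.apoll->poll.done = true;

	req.rw.kiocb.ki_complete();      /* still intact */
	free(req.apoll);
	return 0;
}

For the plain IORING_OP_POLL_ADD path, req->poll is the live union member,
so io_poll_task_func() can keep setting req->poll.done directly.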
* Re: [PATCH 2/2] io_uring: fix race between poll completion and cancel_hash insertion
From: Hao Xu @ 2021-09-15 10:50 UTC
To: Pavel Begunkov, Jens Axboe; +Cc: io-uring, Joseph Qi

On 2021/9/15 6:12 PM, Pavel Begunkov wrote:
> On 9/12/21 5:23 PM, Hao Xu wrote:
>> [...]
>> @@ -5289,6 +5288,7 @@ static void io_async_task_func(struct io_kiocb *req, bool *locked)
>>
>>  	hash_del(&req->hash_node);
>>  	io_poll_remove_double(req);
>> +	req->poll.done = true;
>
> Only a poll request has req->poll. E.g. it overwrites parts of
> req->rw.kiocb, I guess .ki_complete in particular.
>
> 	struct async_poll *apoll = req->apoll;
> 	apoll->poll.done = true;

Thanks!

>>  	spin_unlock(&ctx->completion_lock);
>>
>>  	if (!READ_ONCE(apoll->poll.canceled))