* [PATCH 0/2] poll fixes
@ 2021-09-12 16:23 Hao Xu
2021-09-12 16:23 ` [PATCH 1/2] io_uring: fix tw list mess-up by adding tw while it's already in tw list Hao Xu
2021-09-12 16:23 ` [PATCH 2/2] io_uring: fix race between poll completion and cancel_hash insertion Hao Xu
0 siblings, 2 replies; 11+ messages in thread
From: Hao Xu @ 2021-09-12 16:23 UTC (permalink / raw)
To: Jens Axboe; +Cc: io-uring, Pavel Begunkov, Joseph Qi
Two fixes for poll handling.
Hao Xu (2):
io_uring: fix tw list mess-up by adding tw while it's already in tw
list
io_uring: fix race between poll completion and cancel_hash insertion
fs/io_uring.c | 17 +++++++++++------
1 file changed, 11 insertions(+), 6 deletions(-)
--
2.24.4
* [PATCH 1/2] io_uring: fix tw list mess-up by adding tw while it's already in tw list
2021-09-12 16:23 [PATCH 0/2] poll fixes Hao Xu
@ 2021-09-12 16:23 ` Hao Xu
2021-09-15 9:44 ` Pavel Begunkov
2021-09-12 16:23 ` [PATCH 2/2] io_uring: fix race between poll completion and cancel_hash insertion Hao Xu
1 sibling, 1 reply; 11+ messages in thread
From: Hao Xu @ 2021-09-12 16:23 UTC (permalink / raw)
To: Jens Axboe; +Cc: io-uring, Pavel Begunkov, Joseph Qi
For multishot mode, there may be cases like:

io_poll_task_func()
  -> add_wait_queue()
                                  async_wake()
                                    -> io_req_task_work_add()

This messes up the running task_work list, since req->io_task_work.node
is still in use. A similar situation exists for
req->io_task_work.fallback_node.

Fix it by setting node->next = NULL before we run the tw, so that when
we add the req back to the wait queue in the middle of tw running, we
can safely re-add it to the tw list.
Fixes: 7cbf1722d5fc ("io_uring: provide FIFO ordering for task_work")
Signed-off-by: Hao Xu <[email protected]>
---
fs/io_uring.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 30d959416eba..c16f6be3d46b 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1216,13 +1216,17 @@ static void io_fallback_req_func(struct work_struct *work)
struct io_ring_ctx *ctx = container_of(work, struct io_ring_ctx,
fallback_work.work);
struct llist_node *node = llist_del_all(&ctx->fallback_llist);
- struct io_kiocb *req, *tmp;
+ struct io_kiocb *req;
bool locked = false;
percpu_ref_get(&ctx->refs);
- llist_for_each_entry_safe(req, tmp, node, io_task_work.fallback_node)
+ req = llist_entry(node, struct io_kiocb, io_task_work.fallback_node);
+ while (member_address_is_nonnull(req, io_task_work.fallback_node)) {
+ node = req->io_task_work.fallback_node.next;
+ req->io_task_work.fallback_node.next = NULL;
req->io_task_work.func(req, &locked);
-
+ req = llist_entry(node, struct io_kiocb, io_task_work.fallback_node);
+ }
if (locked) {
if (ctx->submit_state.compl_nr)
io_submit_flush_completions(ctx);
@@ -2126,6 +2130,7 @@ static void tctx_task_work(struct callback_head *cb)
locked = mutex_trylock(&ctx->uring_lock);
percpu_ref_get(&ctx->refs);
}
+ node->next = NULL;
req->io_task_work.func(req, &locked);
node = next;
} while (node);
--
2.24.4
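
To make the failure mode concrete, here is a minimal userspace sketch of
the idea behind the fix. It is not the kernel code: tw_node, tw_list,
tw_add(), tw_run_all() and the callbacks are invented stand-ins for
io_wq_work_node, the per-task tw list, io_req_task_work_add() and
tctx_task_work(), under the assumption that the tail-add (like the
kernel's wq_list_add_tail()) expects node->next to already be NULL.

#include <stdio.h>
#include <stddef.h>

/* Simplified stand-ins for io_wq_work_node / the per-task tw list. */
struct tw_node {
	struct tw_node *next;
	void (*cb)(struct tw_node *node);
	const char *name;
};

struct tw_list {
	struct tw_node *first;
	struct tw_node *last;
};

static struct tw_list pending;

/* Tail add, like wq_list_add_tail(): it assumes node->next is NULL. */
static void tw_add(struct tw_node *node)
{
	if (!pending.first)
		pending.first = node;
	else
		pending.last->next = node;
	pending.last = node;
}

/* Roughly tctx_task_work(): detach the list, then run each entry. */
static void tw_run_all(void)
{
	struct tw_node *node = pending.first;

	pending.first = pending.last = NULL;
	while (node) {
		struct tw_node *next = node->next;

		/*
		 * The fix: clear ->next before the callback runs.  If the
		 * callback re-queues this node (a multishot poll request
		 * re-armed and woken again), tw_add() no longer inherits a
		 * stale ->next pointing into the batch we are consuming.
		 */
		node->next = NULL;
		node->cb(node);
		node = next;
	}
}

static int wakeups_left = 2;

static void multishot_cb(struct tw_node *node)
{
	printf("%s ran, %d wakeup(s) left\n", node->name, wakeups_left);
	if (wakeups_left-- > 0)
		tw_add(node);	/* re-queued while the old batch still runs */
}

static void oneshot_cb(struct tw_node *node)
{
	printf("%s ran\n", node->name);
}

int main(void)
{
	struct tw_node multi = { .cb = multishot_cb, .name = "multishot req" };
	struct tw_node other = { .cb = oneshot_cb, .name = "other req" };

	tw_add(&multi);
	tw_add(&other);
	while (pending.first)
		tw_run_all();
	return 0;
}

Dropping the node->next = NULL line reproduces the corruption the patch
describes: the re-queued node still chains to the batch being consumed,
so a later tw run walks requests that have already completed.
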
* [PATCH 2/2] io_uring: fix race between poll completion and cancel_hash insertion
2021-09-12 16:23 [PATCH 0/2] poll fixes Hao Xu
2021-09-12 16:23 ` [PATCH 1/2] io_uring: fix tw list mess-up by adding tw while it's already in tw list Hao Xu
@ 2021-09-12 16:23 ` Hao Xu
2021-09-15 9:50 ` Pavel Begunkov
2021-09-15 10:12 ` Pavel Begunkov
1 sibling, 2 replies; 11+ messages in thread
From: Hao Xu @ 2021-09-12 16:23 UTC (permalink / raw)
To: Jens Axboe; +Cc: io-uring, Pavel Begunkov, Joseph Qi
If poll arming and poll completion run in parallel, there may be races.
For instance, run io_poll_add in iowq and io_poll_task_func in the
original context, then:

      iowq                              original context
  io_poll_add
    vfs_poll
     (interruption happens,
      tw queued to original
      context)                      io_poll_task_func
                                      generate cqe
                                      del from cancel_hash[]
    if !poll.done
      insert to cancel_hash[]

The entry is left in cancel_hash[]; a similar case exists for fast poll.

Fix it by setting poll.done = true when deleting from cancel_hash[].
Signed-off-by: Hao Xu <[email protected]>
---
Didn't find the exact commit to add Fixes: for..
fs/io_uring.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index c16f6be3d46b..988679e5063f 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -5118,10 +5118,8 @@ static bool __io_poll_complete(struct io_kiocb *req, __poll_t mask)
}
if (req->poll.events & EPOLLONESHOT)
flags = 0;
- if (!io_cqring_fill_event(ctx, req->user_data, error, flags)) {
- req->poll.done = true;
+ if (!io_cqring_fill_event(ctx, req->user_data, error, flags))
flags = 0;
- }
if (flags & IORING_CQE_F_MORE)
ctx->cq_extra++;
@@ -5152,6 +5150,7 @@ static void io_poll_task_func(struct io_kiocb *req, bool *locked)
if (done) {
io_poll_remove_double(req);
hash_del(&req->hash_node);
+ req->poll.done = true;
} else {
req->result = 0;
add_wait_queue(req->poll.head, &req->poll.wait);
@@ -5289,6 +5288,7 @@ static void io_async_task_func(struct io_kiocb *req, bool *locked)
hash_del(&req->hash_node);
io_poll_remove_double(req);
+ req->poll.done = true;
spin_unlock(&ctx->completion_lock);
if (!READ_ONCE(apoll->poll.canceled))
--
2.24.4
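
For readers less familiar with this path, here is a minimal userspace
sketch (plain C with pthreads) of the check-then-insert pattern the
patch relies on. The names fake_req, poll_complete() and
poll_arm_finish() are invented; this is only an illustration of "mark
done under the lock that protects the hash, and check the flag under the
same lock before inserting", not io_uring code.

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

/* Stand-in for the io_kiocb fields involved in the race. */
struct fake_req {
	bool done;		/* poll.done in the patch               */
	bool hashed;		/* "is linked into cancel_hash[]"       */
};

static pthread_mutex_t completion_lock = PTHREAD_MUTEX_INITIALIZER;

/* Completion side, roughly io_poll_task_func() for the oneshot case. */
static void *poll_complete(void *arg)
{
	struct fake_req *req = arg;

	pthread_mutex_lock(&completion_lock);
	req->hashed = false;	/* hash_del(&req->hash_node)              */
	req->done = true;	/* the fix: set before dropping the lock  */
	pthread_mutex_unlock(&completion_lock);
	return NULL;
}

/* Arming side, roughly the tail of poll arming after vfs_poll(). */
static void *poll_arm_finish(void *arg)
{
	struct fake_req *req = arg;

	pthread_mutex_lock(&completion_lock);
	if (!req->done)			/* without the fix this can pass ... */
		req->hashed = true;	/* ... and leave a stale hash entry  */
	pthread_mutex_unlock(&completion_lock);
	return NULL;
}

int main(void)
{
	struct fake_req req = { .done = false, .hashed = false };
	pthread_t a, b;

	pthread_create(&a, NULL, poll_arm_finish, &req);
	pthread_create(&b, NULL, poll_complete, &req);
	pthread_join(a, NULL);
	pthread_join(b, NULL);

	printf("done=%d hashed=%d\n", req.done, req.hashed);
	puts(req.hashed ? "entry left in the hash (the bug)" : "hash is clean");
	return 0;
}

With both sides checking and updating the flag under completion_lock,
either interleaving ends with the hash clean, which is the invariant the
patch restores.
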
* Re: [PATCH 1/2] io_uring: fix tw list mess-up by adding tw while it's already in tw list
2021-09-12 16:23 ` [PATCH 1/2] io_uring: fix tw list mess-up by adding tw while it's already in tw list Hao Xu
@ 2021-09-15 9:44 ` Pavel Begunkov
2021-09-15 10:48 ` Hao Xu
0 siblings, 1 reply; 11+ messages in thread
From: Pavel Begunkov @ 2021-09-15 9:44 UTC (permalink / raw)
To: Hao Xu, Jens Axboe; +Cc: io-uring, Joseph Qi
On 9/12/21 5:23 PM, Hao Xu wrote:
> For multishot mode, there may be cases like:
> io_poll_task_func()
> -> add_wait_queue()
> async_wake()
> ->io_req_task_work_add()
> this one mess up the running task_work list
> since req->io_task_work.node is in use.
>
> similar situation for req->io_task_work.fallback_node.
> Fix it by set node->next = NULL before we run the tw, so that when we
> add req back to the wait queue in middle of tw running, we can safely
> re-add it to the tw list.
It may get screwed before we get to "node->next = NULL;",
-> async_wake()
-> io_req_task_work_add()
-> async_wake()
-> io_req_task_work_add()
tctx_task_work()
> Fixes: 7cbf1722d5fc ("io_uring: provide FIFO ordering for task_work")
> Signed-off-by: Hao Xu <[email protected]>
> ---
>
> fs/io_uring.c | 11 ++++++++---
> 1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 30d959416eba..c16f6be3d46b 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -1216,13 +1216,17 @@ static void io_fallback_req_func(struct work_struct *work)
> struct io_ring_ctx *ctx = container_of(work, struct io_ring_ctx,
> fallback_work.work);
> struct llist_node *node = llist_del_all(&ctx->fallback_llist);
> - struct io_kiocb *req, *tmp;
> + struct io_kiocb *req;
> bool locked = false;
>
> percpu_ref_get(&ctx->refs);
> - llist_for_each_entry_safe(req, tmp, node, io_task_work.fallback_node)
> + req = llist_entry(node, struct io_kiocb, io_task_work.fallback_node);
> + while (member_address_is_nonnull(req, io_task_work.fallback_node)) {
> + node = req->io_task_work.fallback_node.next;
> + req->io_task_work.fallback_node.next = NULL;
> req->io_task_work.func(req, &locked);
> -
> + req = llist_entry(node, struct io_kiocb, io_task_work.fallback_node);
> + }
> if (locked) {
> if (ctx->submit_state.compl_nr)
> io_submit_flush_completions(ctx);
> @@ -2126,6 +2130,7 @@ static void tctx_task_work(struct callback_head *cb)
> locked = mutex_trylock(&ctx->uring_lock);
> percpu_ref_get(&ctx->refs);
> }
> + node->next = NULL;
> req->io_task_work.func(req, &locked);
> node = next;
> } while (node);
>
--
Pavel Begunkov
* Re: [PATCH 2/2] io_uring: fix race between poll completion and cancel_hash insertion
2021-09-12 16:23 ` [PATCH 2/2] io_uring: fix race between poll completion and cancel_hash insertion Hao Xu
@ 2021-09-15 9:50 ` Pavel Begunkov
2021-09-15 10:49 ` Hao Xu
2021-09-15 10:12 ` Pavel Begunkov
1 sibling, 1 reply; 11+ messages in thread
From: Pavel Begunkov @ 2021-09-15 9:50 UTC (permalink / raw)
To: Hao Xu, Jens Axboe; +Cc: io-uring, Joseph Qi
On 9/12/21 5:23 PM, Hao Xu wrote:
> If poll arming and poll completion runs parallelly, there maybe races.
> For instance, run io_poll_add in iowq and io_poll_task_func in original
> context, then:
> iowq original context
> io_poll_add
> vfs_poll
> (interruption happens
> tw queued to original
> context) io_poll_task_func
> generate cqe
> del from cancel_hash[]
> if !poll.done
> insert to cancel_hash[]
>
> The entry left in cancel_hash[], similar case for fast poll.
> Fix it by set poll.done = true when del from cancel_hash[].
Sounds like a valid concern. And the code looks good, but one
of the two patches crashed the kernel somewhere in io_read().

./232c93d07b74-test

Will be retesting.
> Signed-off-by: Hao Xu <[email protected]>
> ---
>
> Didn't find the exact commit to add Fixes: for..
That's ok. Maybe you can find which kernel versions are
affected? So we can add stable and specify where it should
be backported? E.g.
Cc: [email protected] # 5.12+
> fs/io_uring.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index c16f6be3d46b..988679e5063f 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -5118,10 +5118,8 @@ static bool __io_poll_complete(struct io_kiocb *req, __poll_t mask)
> }
> if (req->poll.events & EPOLLONESHOT)
> flags = 0;
> - if (!io_cqring_fill_event(ctx, req->user_data, error, flags)) {
> - req->poll.done = true;
> + if (!io_cqring_fill_event(ctx, req->user_data, error, flags))
> flags = 0;
> - }
> if (flags & IORING_CQE_F_MORE)
> ctx->cq_extra++;
>
> @@ -5152,6 +5150,7 @@ static void io_poll_task_func(struct io_kiocb *req, bool *locked)
> if (done) {
> io_poll_remove_double(req);
> hash_del(&req->hash_node);
> + req->poll.done = true;
> } else {
> req->result = 0;
> add_wait_queue(req->poll.head, &req->poll.wait);
> @@ -5289,6 +5288,7 @@ static void io_async_task_func(struct io_kiocb *req, bool *locked)
>
> hash_del(&req->hash_node);
> io_poll_remove_double(req);
> + req->poll.done = true;
> spin_unlock(&ctx->completion_lock);
>
> if (!READ_ONCE(apoll->poll.canceled))
>
--
Pavel Begunkov
* Re: [PATCH 2/2] io_uring: fix race between poll completion and cancel_hash insertion
2021-09-12 16:23 ` [PATCH 2/2] io_uring: fix race between poll completion and cancel_hash insertion Hao Xu
2021-09-15 9:50 ` Pavel Begunkov
@ 2021-09-15 10:12 ` Pavel Begunkov
2021-09-15 10:50 ` Hao Xu
1 sibling, 1 reply; 11+ messages in thread
From: Pavel Begunkov @ 2021-09-15 10:12 UTC (permalink / raw)
To: Hao Xu, Jens Axboe; +Cc: io-uring, Joseph Qi
On 9/12/21 5:23 PM, Hao Xu wrote:
> If poll arming and poll completion runs parallelly, there maybe races.
> For instance, run io_poll_add in iowq and io_poll_task_func in original
> context, then:
> iowq original context
> io_poll_add
> vfs_poll
> (interruption happens
> tw queued to original
> context) io_poll_task_func
> generate cqe
> del from cancel_hash[]
> if !poll.done
> insert to cancel_hash[]
>
> The entry left in cancel_hash[], similar case for fast poll.
> Fix it by set poll.done = true when del from cancel_hash[].
>
> Signed-off-by: Hao Xu <[email protected]>
> ---
>
> Didn't find the exact commit to add Fixes: for..
>
> fs/io_uring.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index c16f6be3d46b..988679e5063f 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -5118,10 +5118,8 @@ static bool __io_poll_complete(struct io_kiocb *req, __poll_t mask)
> }
> if (req->poll.events & EPOLLONESHOT)
> flags = 0;
> - if (!io_cqring_fill_event(ctx, req->user_data, error, flags)) {
> - req->poll.done = true;
> + if (!io_cqring_fill_event(ctx, req->user_data, error, flags))
> flags = 0;
> - }
> if (flags & IORING_CQE_F_MORE)
> ctx->cq_extra++;
>
> @@ -5152,6 +5150,7 @@ static void io_poll_task_func(struct io_kiocb *req, bool *locked)
> if (done) {
> io_poll_remove_double(req);
> hash_del(&req->hash_node);
> + req->poll.done = true;
> } else {
> req->result = 0;
> add_wait_queue(req->poll.head, &req->poll.wait);
> @@ -5289,6 +5288,7 @@ static void io_async_task_func(struct io_kiocb *req, bool *locked)
>
> hash_del(&req->hash_node);
> io_poll_remove_double(req);
> + req->poll.done = true;
Only poll requests have req->poll. E.g. here it overwrites parts of
req->rw.kiocb, I guess .ki_complete in particular. It should be:

struct async_poll *apoll = req->apoll;
apoll->poll.done = true;
> spin_unlock(&ctx->completion_lock);
>
> if (!READ_ONCE(apoll->poll.canceled))
>
--
Pavel Begunkov
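
Pavel's point is about the per-opcode data layout: in io_kiocb the
command-specific state shares storage, so for, say, a read armed via
fast poll, writing req->poll scribbles over the read's kiocb, while the
fast-poll state actually lives in the separately allocated req->apoll.
Below is a toy illustration of that aliasing concern; fake_req, fake_rw,
fake_poll_iocb, fake_kiocb and read_done are invented names and the
field layout is deliberately simplified, not the real io_kiocb.

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

struct fake_kiocb {
	void (*ki_complete)(void);	/* completion callback for reads/writes */
};

struct fake_rw {
	struct fake_kiocb kiocb;
};

struct fake_poll_iocb {
	bool done;			/* in this toy layout, overlaps the kiocb */
	void *head;
};

struct async_poll {
	struct fake_poll_iocb poll;	/* separate allocation used by fast poll */
};

struct fake_req {
	union {				/* per-opcode state shares storage */
		struct fake_rw rw;
		struct fake_poll_iocb poll;
	};
	struct async_poll *apoll;	/* only set for fast-poll-armed requests */
};

static void read_done(void) { puts("read completed"); }

int main(void)
{
	struct fake_req req = { .rw.kiocb.ki_complete = read_done };

	req.apoll = calloc(1, sizeof(*req.apoll));

	/* Wrong for a read request: in this toy layout it would clobber
	 * the low byte of ki_complete:
	 * req.poll.done = true;
	 */

	/* What the review suggests: the fast-poll state is in apoll. */
	req.apoll->poll.done = true;

	req.rw.kiocb.ki_complete();	/* still intact */
	free(req.apoll);
	return 0;
}
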
* Re: [PATCH 1/2] io_uring: fix tw list mess-up by adding tw while it's already in tw list
2021-09-15 9:44 ` Pavel Begunkov
@ 2021-09-15 10:48 ` Hao Xu
2021-09-26 9:48 ` Hao Xu
0 siblings, 1 reply; 11+ messages in thread
From: Hao Xu @ 2021-09-15 10:48 UTC (permalink / raw)
To: Pavel Begunkov, Jens Axboe; +Cc: io-uring, Joseph Qi
On 2021/9/15 5:44 PM, Pavel Begunkov wrote:
> On 9/12/21 5:23 PM, Hao Xu wrote:
>> For multishot mode, there may be cases like:
>> io_poll_task_func()
>> -> add_wait_queue()
>> async_wake()
>> ->io_req_task_work_add()
>> this one mess up the running task_work list
>> since req->io_task_work.node is in use.
>>
>> similar situation for req->io_task_work.fallback_node.
>> Fix it by set node->next = NULL before we run the tw, so that when we
>> add req back to the wait queue in middle of tw running, we can safely
>> re-add it to the tw list.
>
> It may get screwed before we get to "node->next = NULL;",
>
> -> async_wake()
> -> io_req_task_work_add()
> -> async_wake()
> -> io_req_task_work_add()
> tctx_task_work()
True, this may happen if there is a second poll wait entry.
This patch is for the single wait entry case only.
I'm thinking about the second poll entry issue; that would be in a
separate patch.
>
>
>> Fixes: 7cbf1722d5fc ("io_uring: provide FIFO ordering for task_work")
>> Signed-off-by: Hao Xu <[email protected]>
>> ---
>>
>> fs/io_uring.c | 11 ++++++++---
>> 1 file changed, 8 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index 30d959416eba..c16f6be3d46b 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -1216,13 +1216,17 @@ static void io_fallback_req_func(struct work_struct *work)
>> struct io_ring_ctx *ctx = container_of(work, struct io_ring_ctx,
>> fallback_work.work);
>> struct llist_node *node = llist_del_all(&ctx->fallback_llist);
>> - struct io_kiocb *req, *tmp;
>> + struct io_kiocb *req;
>> bool locked = false;
>>
>> percpu_ref_get(&ctx->refs);
>> - llist_for_each_entry_safe(req, tmp, node, io_task_work.fallback_node)
>> + req = llist_entry(node, struct io_kiocb, io_task_work.fallback_node);
>> + while (member_address_is_nonnull(req, io_task_work.fallback_node)) {
>> + node = req->io_task_work.fallback_node.next;
>> + req->io_task_work.fallback_node.next = NULL;
>> req->io_task_work.func(req, &locked);
>> -
>> + req = llist_entry(node, struct io_kiocb, io_task_work.fallback_node);
>> + }
>> if (locked) {
>> if (ctx->submit_state.compl_nr)
>> io_submit_flush_completions(ctx);
>> @@ -2126,6 +2130,7 @@ static void tctx_task_work(struct callback_head *cb)
>> locked = mutex_trylock(&ctx->uring_lock);
>> percpu_ref_get(&ctx->refs);
>> }
>> + node->next = NULL;
>> req->io_task_work.func(req, &locked);
>> node = next;
>> } while (node);
>>
>
* Re: [PATCH 2/2] io_uring: fix race between poll completion and cancel_hash insertion
2021-09-15 9:50 ` Pavel Begunkov
@ 2021-09-15 10:49 ` Hao Xu
0 siblings, 0 replies; 11+ messages in thread
From: Hao Xu @ 2021-09-15 10:49 UTC (permalink / raw)
To: Pavel Begunkov, Jens Axboe; +Cc: io-uring, Joseph Qi
On 2021/9/15 5:50 PM, Pavel Begunkov wrote:
> On 9/12/21 5:23 PM, Hao Xu wrote:
>> If poll arming and poll completion runs parallelly, there maybe races.
>> For instance, run io_poll_add in iowq and io_poll_task_func in original
>> context, then:
>> iowq original context
>> io_poll_add
>> vfs_poll
>> (interruption happens
>> tw queued to original
>> context) io_poll_task_func
>> generate cqe
>> del from cancel_hash[]
>> if !poll.done
>> insert to cancel_hash[]
>>
>> The entry left in cancel_hash[], similar case for fast poll.
>> Fix it by set poll.done = true when del from cancel_hash[].
>
> Sounds like a valid concern. And the code looks good, but one
> of two patches crashed the kernel somewhere in io_read().
>
> ./232c93d07b74-test
I'll run it too; I didn't hit that when I did the test.
>
> will be retesting
>
>
>> Signed-off-by: Hao Xu <[email protected]>
>> ---
>>
>> Didn't find the exact commit to add Fixes: for..
>
> That's ok. Maybe you can find which kernel versions are
> affected? So we can add stable and specify where it should
> be backported? E.g.
Sure, will do that.
>
> Cc: [email protected] # 5.12+
>
>
>> fs/io_uring.c | 6 +++---
>> 1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index c16f6be3d46b..988679e5063f 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -5118,10 +5118,8 @@ static bool __io_poll_complete(struct io_kiocb *req, __poll_t mask)
>> }
>> if (req->poll.events & EPOLLONESHOT)
>> flags = 0;
>> - if (!io_cqring_fill_event(ctx, req->user_data, error, flags)) {
>> - req->poll.done = true;
>> + if (!io_cqring_fill_event(ctx, req->user_data, error, flags))
>> flags = 0;
>> - }
>> if (flags & IORING_CQE_F_MORE)
>> ctx->cq_extra++;
>>
>> @@ -5152,6 +5150,7 @@ static void io_poll_task_func(struct io_kiocb *req, bool *locked)
>> if (done) {
>> io_poll_remove_double(req);
>> hash_del(&req->hash_node);
>> + req->poll.done = true;
>> } else {
>> req->result = 0;
>> add_wait_queue(req->poll.head, &req->poll.wait);
>> @@ -5289,6 +5288,7 @@ static void io_async_task_func(struct io_kiocb *req, bool *locked)
>>
>> hash_del(&req->hash_node);
>> io_poll_remove_double(req);
>> + req->poll.done = true;
>> spin_unlock(&ctx->completion_lock);
>>
>> if (!READ_ONCE(apoll->poll.canceled))
>>
>
* Re: [PATCH 2/2] io_uring: fix race between poll completion and cancel_hash insertion
2021-09-15 10:12 ` Pavel Begunkov
@ 2021-09-15 10:50 ` Hao Xu
0 siblings, 0 replies; 11+ messages in thread
From: Hao Xu @ 2021-09-15 10:50 UTC (permalink / raw)
To: Pavel Begunkov, Jens Axboe; +Cc: io-uring, Joseph Qi
On 2021/9/15 6:12 PM, Pavel Begunkov wrote:
> On 9/12/21 5:23 PM, Hao Xu wrote:
>> If poll arming and poll completion runs parallelly, there maybe races.
>> For instance, run io_poll_add in iowq and io_poll_task_func in original
>> context, then:
>> iowq original context
>> io_poll_add
>> vfs_poll
>> (interruption happens
>> tw queued to original
>> context) io_poll_task_func
>> generate cqe
>> del from cancel_hash[]
>> if !poll.done
>> insert to cancel_hash[]
>>
>> The entry left in cancel_hash[], similar case for fast poll.
>> Fix it by set poll.done = true when del from cancel_hash[].
>>
>> Signed-off-by: Hao Xu <[email protected]>
>> ---
>>
>> Didn't find the exact commit to add Fixes: for..
>>
>> fs/io_uring.c | 6 +++---
>> 1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index c16f6be3d46b..988679e5063f 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -5118,10 +5118,8 @@ static bool __io_poll_complete(struct io_kiocb *req, __poll_t mask)
>> }
>> if (req->poll.events & EPOLLONESHOT)
>> flags = 0;
>> - if (!io_cqring_fill_event(ctx, req->user_data, error, flags)) {
>> - req->poll.done = true;
>> + if (!io_cqring_fill_event(ctx, req->user_data, error, flags))
>> flags = 0;
>> - }
>> if (flags & IORING_CQE_F_MORE)
>> ctx->cq_extra++;
>>
>> @@ -5152,6 +5150,7 @@ static void io_poll_task_func(struct io_kiocb *req, bool *locked)
>> if (done) {
>> io_poll_remove_double(req);
>> hash_del(&req->hash_node);
>> + req->poll.done = true;
>> } else {
>> req->result = 0;
>> add_wait_queue(req->poll.head, &req->poll.wait);
>> @@ -5289,6 +5288,7 @@ static void io_async_task_func(struct io_kiocb *req, bool *locked)
>>
>> hash_del(&req->hash_node);
>> io_poll_remove_double(req);
>> + req->poll.done = true;
>
> Only poll request has req->poll. E.g. it overwrites parts of req->rw.kiocb,
> I guess .ki_complete in particular.
>
> struct async_poll *apoll = req->apoll;
> apoll->poll.done = true;
Thanks!
>
>
>> spin_unlock(&ctx->completion_lock);
>>
>> if (!READ_ONCE(apoll->poll.canceled))
>>
>
* Re: [PATCH 1/2] io_uring: fix tw list mess-up by adding tw while it's already in tw list
2021-09-15 10:48 ` Hao Xu
@ 2021-09-26 9:48 ` Hao Xu
2021-09-29 11:16 ` Pavel Begunkov
0 siblings, 1 reply; 11+ messages in thread
From: Hao Xu @ 2021-09-26 9:48 UTC (permalink / raw)
To: Pavel Begunkov, Jens Axboe; +Cc: io-uring, Joseph Qi
On 2021/9/15 6:48 PM, Hao Xu wrote:
> On 2021/9/15 5:44 PM, Pavel Begunkov wrote:
>> On 9/12/21 5:23 PM, Hao Xu wrote:
>>> For multishot mode, there may be cases like:
>>> io_poll_task_func()
>>> -> add_wait_queue()
>>> async_wake()
>>> ->io_req_task_work_add()
>>> this one mess up the running task_work list
>>> since req->io_task_work.node is in use.
>>>
>>> similar situation for req->io_task_work.fallback_node.
>>> Fix it by set node->next = NULL before we run the tw, so that when we
>>> add req back to the wait queue in middle of tw running, we can safely
>>> re-add it to the tw list.
>>
>> It may get screwed before we get to "node->next = NULL;",
>>
>> -> async_wake()
>> -> io_req_task_work_add()
>> -> async_wake()
>> -> io_req_task_work_add()
>> tctx_task_work()
> True, this may happen if there is second poll wait entry.
> This pacth is for single wait entry only..
> I'm thinking about the second poll entry issue, would be in a separate
> patch.
Hmm, I reviewed this email again and now I think I got what you were
saying. Do you mean the second async_wake() is triggered before we
remove the wait entry in the first async_wake(), like:
async_wake
async_wake
->del wait entry
>>
>>
>>> Fixes: 7cbf1722d5fc ("io_uring: provide FIFO ordering for task_work")
>>> Signed-off-by: Hao Xu <[email protected]>
>>> ---
>>>
>>> fs/io_uring.c | 11 ++++++++---
>>> 1 file changed, 8 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>>> index 30d959416eba..c16f6be3d46b 100644
>>> --- a/fs/io_uring.c
>>> +++ b/fs/io_uring.c
>>> @@ -1216,13 +1216,17 @@ static void io_fallback_req_func(struct
>>> work_struct *work)
>>> struct io_ring_ctx *ctx = container_of(work, struct io_ring_ctx,
>>> fallback_work.work);
>>> struct llist_node *node = llist_del_all(&ctx->fallback_llist);
>>> - struct io_kiocb *req, *tmp;
>>> + struct io_kiocb *req;
>>> bool locked = false;
>>> percpu_ref_get(&ctx->refs);
>>> - llist_for_each_entry_safe(req, tmp, node,
>>> io_task_work.fallback_node)
>>> + req = llist_entry(node, struct io_kiocb,
>>> io_task_work.fallback_node);
>>> + while (member_address_is_nonnull(req,
>>> io_task_work.fallback_node)) {
>>> + node = req->io_task_work.fallback_node.next;
>>> + req->io_task_work.fallback_node.next = NULL;
>>> req->io_task_work.func(req, &locked);
>>> -
>>> + req = llist_entry(node, struct io_kiocb,
>>> io_task_work.fallback_node);
>>> + }
>>> if (locked) {
>>> if (ctx->submit_state.compl_nr)
>>> io_submit_flush_completions(ctx);
>>> @@ -2126,6 +2130,7 @@ static void tctx_task_work(struct callback_head
>>> *cb)
>>> locked = mutex_trylock(&ctx->uring_lock);
>>> percpu_ref_get(&ctx->refs);
>>> }
>>> + node->next = NULL;
>>> req->io_task_work.func(req, &locked);
>>> node = next;
>>> } while (node);
>>>
>>
* Re: [PATCH 1/2] io_uring: fix tw list mess-up by adding tw while it's already in tw list
2021-09-26 9:48 ` Hao Xu
@ 2021-09-29 11:16 ` Pavel Begunkov
0 siblings, 0 replies; 11+ messages in thread
From: Pavel Begunkov @ 2021-09-29 11:16 UTC (permalink / raw)
To: Hao Xu, Jens Axboe; +Cc: io-uring, Joseph Qi
On 9/26/21 10:48 AM, Hao Xu wrote:
> On 2021/9/15 6:48 PM, Hao Xu wrote:
>> On 2021/9/15 5:44 PM, Pavel Begunkov wrote:
>>> On 9/12/21 5:23 PM, Hao Xu wrote:
>>>> For multishot mode, there may be cases like:
>>>> io_poll_task_func()
>>>> -> add_wait_queue()
>>>> async_wake()
>>>> ->io_req_task_work_add()
>>>> this one mess up the running task_work list
>>>> since req->io_task_work.node is in use.
>>>>
>>>> similar situation for req->io_task_work.fallback_node.
>>>> Fix it by set node->next = NULL before we run the tw, so that when we
>>>> add req back to the wait queue in middle of tw running, we can safely
>>>> re-add it to the tw list.
>>>
>>> It may get screwed before we get to "node->next = NULL;",
>>>
>>> -> async_wake()
>>> -> io_req_task_work_add()
>>> -> async_wake()
>>> -> io_req_task_work_add()
>>> tctx_task_work()
>> True, this may happen if there is second poll wait entry.
>> This pacth is for single wait entry only..
>> I'm thinking about the second poll entry issue, would be in a separate
>> patch.
> hmm, reviewed this email again and now I think I got what you were
> saying, do you mean the second async_wake() triggered before we removed
> the wait entry in the first async_wake(), like
>
> async_wake
> async_wake
> ->del wait entry
Looks like we had different problems in mind; let's move the
conversation to the new thread with the resent patches.
>>>> Fixes: 7cbf1722d5fc ("io_uring: provide FIFO ordering for task_work")
>>>> Signed-off-by: Hao Xu <[email protected]>
>>>> ---
>>>>
>>>> fs/io_uring.c | 11 ++++++++---
>>>> 1 file changed, 8 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>>>> index 30d959416eba..c16f6be3d46b 100644
>>>> --- a/fs/io_uring.c
>>>> +++ b/fs/io_uring.c
>>>> @@ -1216,13 +1216,17 @@ static void io_fallback_req_func(struct work_struct *work)
>>>> struct io_ring_ctx *ctx = container_of(work, struct io_ring_ctx,
>>>> fallback_work.work);
>>>> struct llist_node *node = llist_del_all(&ctx->fallback_llist);
>>>> - struct io_kiocb *req, *tmp;
>>>> + struct io_kiocb *req;
>>>> bool locked = false;
>>>> percpu_ref_get(&ctx->refs);
>>>> - llist_for_each_entry_safe(req, tmp, node, io_task_work.fallback_node)
>>>> + req = llist_entry(node, struct io_kiocb, io_task_work.fallback_node);
>>>> + while (member_address_is_nonnull(req, io_task_work.fallback_node)) {
>>>> + node = req->io_task_work.fallback_node.next;
>>>> + req->io_task_work.fallback_node.next = NULL;
>>>> req->io_task_work.func(req, &locked);
>>>> -
>>>> + req = llist_entry(node, struct io_kiocb, io_task_work.fallback_node);
>>>> + }
>>>> if (locked) {
>>>> if (ctx->submit_state.compl_nr)
>>>> io_submit_flush_completions(ctx);
>>>> @@ -2126,6 +2130,7 @@ static void tctx_task_work(struct callback_head *cb)
>>>> locked = mutex_trylock(&ctx->uring_lock);
>>>> percpu_ref_get(&ctx->refs);
>>>> }
>>>> + node->next = NULL;
>>>> req->io_task_work.func(req, &locked);
>>>> node = next;
>>>> } while (node);
>>>>
>>>
>
--
Pavel Begunkov