public inbox for [email protected]
 help / color / mirror / Atom feed
* [PATCH 0/2] post F_COMP_LOCKED optimisations
@ 2020-10-14 19:44 Pavel Begunkov
  2020-10-14 19:44 ` [PATCH 1/2] io_uring: optimise COMP_LOCK-less flush_completion Pavel Begunkov
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Pavel Begunkov @ 2020-10-14 19:44 UTC (permalink / raw)
  To: Jens Axboe, io-uring

A naive test with io_uring-bench with [nops, submit/complete batch=32]
shows 5500 vs 8500 KIOPS. Not really representative but might hurt if
left without [1/2].

Pavel Begunkov (2):
  io_uring: optimise COMP_LOCK-less flush_completion
  io_uring: optimise io_fail_links()

 fs/io_uring.c | 24 ++++++++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)

-- 
2.24.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/2] io_uring: optimise COMP_LOCK-less flush_completion
  2020-10-14 19:44 [PATCH 0/2] post F_COMP_LOCKED optimisations Pavel Begunkov
@ 2020-10-14 19:44 ` Pavel Begunkov
  2020-10-14 19:44 ` [PATCH 2/2] io_uring: optimise io_fail_links() Pavel Begunkov
  2020-10-14 19:53 ` [PATCH 0/2] post F_COMP_LOCKED optimisations Jens Axboe
  2 siblings, 0 replies; 9+ messages in thread
From: Pavel Begunkov @ 2020-10-14 19:44 UTC (permalink / raw)
  To: Jens Axboe, io-uring

After removing REQ_F_COMP_LOCKED io_submit_flush_completions() was left
deferring completions via task_work_add() in the hot path, that might
end up in a performance regression.

io_put_req() takes ->completion_lock only when at least one of
REQ_F_FAIL_LINK or REQ_F_LINK_TIMEOUT is set. There is also
REQ_F_WORK_INITIALIZED, freeing which while holding the lock have to be
avoided because it may deadlock with work.fs->lock. And if none of them
is set we can put under ->completion_lock and save an extra unlock/lock.

That actually works even better because it also works for most linked
requests unlike it was before.

Signed-off-by: Pavel Begunkov <[email protected]>
---
 fs/io_uring.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index cb2640f6fdb2..f61af4d487fd 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1601,12 +1601,19 @@ static void io_submit_flush_completions(struct io_comp_state *cs)
 		req = list_first_entry(&cs->list, struct io_kiocb, compl.list);
 		list_del(&req->compl.list);
 		__io_cqring_fill_event(req, req->result, req->compl.cflags);
-		if (!(req->flags & REQ_F_LINK_HEAD)) {
-			io_put_req_deferred(req, 1);
-		} else {
+
+		/*
+		 * io_free_req() doesn't care about completion_lock unless one
+		 * of these flags is set. REQ_F_WORK_INITIALIZED is in the list
+		 * because of a potential deadlock with req->work.fs->lock
+		 */
+		if (req->flags & (REQ_F_FAIL_LINK|REQ_F_LINK_TIMEOUT
+				 |REQ_F_WORK_INITIALIZED)) {
 			spin_unlock_irq(&ctx->completion_lock);
 			io_put_req(req);
 			spin_lock_irq(&ctx->completion_lock);
+		} else {
+			io_put_req(req);
 		}
 	}
 	io_commit_cqring(ctx);
-- 
2.24.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/2] io_uring: optimise io_fail_links()
  2020-10-14 19:44 [PATCH 0/2] post F_COMP_LOCKED optimisations Pavel Begunkov
  2020-10-14 19:44 ` [PATCH 1/2] io_uring: optimise COMP_LOCK-less flush_completion Pavel Begunkov
@ 2020-10-14 19:44 ` Pavel Begunkov
  2020-10-15  8:53   ` Nathan Chancellor
  2020-10-14 19:53 ` [PATCH 0/2] post F_COMP_LOCKED optimisations Jens Axboe
  2 siblings, 1 reply; 9+ messages in thread
From: Pavel Begunkov @ 2020-10-14 19:44 UTC (permalink / raw)
  To: Jens Axboe, io-uring

Optimise put handling in __io_fail_links() after removing
REQ_F_COMP_LOCKED.

Signed-off-by: Pavel Begunkov <[email protected]>
---
 fs/io_uring.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index f61af4d487fd..271a016e8507 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1816,7 +1816,16 @@ static void __io_fail_links(struct io_kiocb *req)
 		trace_io_uring_fail_link(req, link);
 
 		io_cqring_fill_event(link, -ECANCELED);
-		io_put_req_deferred(link, 2);
+
+		/*
+		 * It's ok to free under spinlock as they're not linked anymore,
+		 * but avoid REQ_F_WORK_INITIALIZED because it may deadlock on
+		 * work.fs->lock.
+		 */
+		if (link->flags | REQ_F_WORK_INITIALIZED)
+			io_put_req_deferred(link, 2);
+		else
+			io_double_put_req(link);
 	}
 
 	io_commit_cqring(ctx);
-- 
2.24.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/2] post F_COMP_LOCKED optimisations
  2020-10-14 19:44 [PATCH 0/2] post F_COMP_LOCKED optimisations Pavel Begunkov
  2020-10-14 19:44 ` [PATCH 1/2] io_uring: optimise COMP_LOCK-less flush_completion Pavel Begunkov
  2020-10-14 19:44 ` [PATCH 2/2] io_uring: optimise io_fail_links() Pavel Begunkov
@ 2020-10-14 19:53 ` Jens Axboe
  2020-10-14 20:00   ` Pavel Begunkov
  2 siblings, 1 reply; 9+ messages in thread
From: Jens Axboe @ 2020-10-14 19:53 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring

On 10/14/20 1:44 PM, Pavel Begunkov wrote:
> A naive test with io_uring-bench with [nops, submit/complete batch=32]
> shows 5500 vs 8500 KIOPS. Not really representative but might hurt if
> left without [1/2].

Part of this is undoubtedly that we only use the task_work for things
that need signaling, for the deferred put case we should be able to
get by with using notfy == 0 for task_work.

That said, it's nice to avoid when possible. At this point, I'd like
to fold these into the last patch of the original series. What do you
think?

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/2] post F_COMP_LOCKED optimisations
  2020-10-14 19:53 ` [PATCH 0/2] post F_COMP_LOCKED optimisations Jens Axboe
@ 2020-10-14 20:00   ` Pavel Begunkov
  2020-10-14 20:24     ` Jens Axboe
  0 siblings, 1 reply; 9+ messages in thread
From: Pavel Begunkov @ 2020-10-14 20:00 UTC (permalink / raw)
  To: Jens Axboe, io-uring

On 14/10/2020 20:53, Jens Axboe wrote:
> On 10/14/20 1:44 PM, Pavel Begunkov wrote:
>> A naive test with io_uring-bench with [nops, submit/complete batch=32]
>> shows 5500 vs 8500 KIOPS. Not really representative but might hurt if
>> left without [1/2].
> 
> Part of this is undoubtedly that we only use the task_work for things
> that need signaling, for the deferred put case we should be able to
> get by with using notfy == 0 for task_work.
> 
> That said, it's nice to avoid when possible. At this point, I'd like
> to fold these into the last patch of the original series. What do you
> think?

Yep, was thinking the same. It's better to be in 5.10, and that would
be easier to do by squashing.

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/2] post F_COMP_LOCKED optimisations
  2020-10-14 20:00   ` Pavel Begunkov
@ 2020-10-14 20:24     ` Jens Axboe
  0 siblings, 0 replies; 9+ messages in thread
From: Jens Axboe @ 2020-10-14 20:24 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring

On 10/14/20 2:00 PM, Pavel Begunkov wrote:
> On 14/10/2020 20:53, Jens Axboe wrote:
>> On 10/14/20 1:44 PM, Pavel Begunkov wrote:
>>> A naive test with io_uring-bench with [nops, submit/complete batch=32]
>>> shows 5500 vs 8500 KIOPS. Not really representative but might hurt if
>>> left without [1/2].
>>
>> Part of this is undoubtedly that we only use the task_work for things
>> that need signaling, for the deferred put case we should be able to
>> get by with using notfy == 0 for task_work.
>>
>> That said, it's nice to avoid when possible. At this point, I'd like
>> to fold these into the last patch of the original series. What do you
>> think?
> 
> Yep, was thinking the same. It's better to be in 5.10, and that would
> be easier to do by squashing.

OK, done.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] io_uring: optimise io_fail_links()
  2020-10-14 19:44 ` [PATCH 2/2] io_uring: optimise io_fail_links() Pavel Begunkov
@ 2020-10-15  8:53   ` Nathan Chancellor
  2020-10-15 10:11     ` Pavel Begunkov
  0 siblings, 1 reply; 9+ messages in thread
From: Nathan Chancellor @ 2020-10-15  8:53 UTC (permalink / raw)
  To: Pavel Begunkov; +Cc: Jens Axboe, io-uring, clang-built-linux

On Wed, Oct 14, 2020 at 08:44:22PM +0100, Pavel Begunkov wrote:
> Optimise put handling in __io_fail_links() after removing
> REQ_F_COMP_LOCKED.
> 
> Signed-off-by: Pavel Begunkov <[email protected]>
> ---
>  fs/io_uring.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index f61af4d487fd..271a016e8507 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -1816,7 +1816,16 @@ static void __io_fail_links(struct io_kiocb *req)
>  		trace_io_uring_fail_link(req, link);
>  
>  		io_cqring_fill_event(link, -ECANCELED);
> -		io_put_req_deferred(link, 2);
> +
> +		/*
> +		 * It's ok to free under spinlock as they're not linked anymore,
> +		 * but avoid REQ_F_WORK_INITIALIZED because it may deadlock on
> +		 * work.fs->lock.
> +		 */
> +		if (link->flags | REQ_F_WORK_INITIALIZED)
> +			io_put_req_deferred(link, 2);
> +		else
> +			io_double_put_req(link);
>  	}
>  
>  	io_commit_cqring(ctx);
> -- 
> 2.24.0
> 

This part of commit 9c357fed168a ("io_uring: fix REQ_F_COMP_LOCKED by
killing it") causes a clang warning:

fs/io_uring.c:1816:19: warning: bitwise or with non-zero value always
evaluates to true [-Wtautological-bitwise-compare]
                if (link->flags | REQ_F_WORK_INITIALIZED)
                    ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
1 warning generated.

According to the comment, was it intended for that to be a bitwise AND
then negated to check for the absence of it? If so, wouldn't it be
clearer to flip the condition so that a negation is not necessary like
below? I can send a formal patch if my analysis is correct but if not,
feel free to fix it yourself and add

Reported-by: Nathan Chancellor <[email protected]>

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 66c41d53a9d3..28b1a0fede3e 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1813,10 +1813,10 @@ static void __io_fail_links(struct io_kiocb *req)
 		 * but avoid REQ_F_WORK_INITIALIZED because it may deadlock on
 		 * work.fs->lock.
 		 */
-		if (link->flags | REQ_F_WORK_INITIALIZED)
-			io_put_req_deferred(link, 2);
-		else
+		if (link->flags & REQ_F_WORK_INITIALIZED)
 			io_double_put_req(link);
+		else
+			io_put_req_deferred(link, 2);
 	}
 
 	io_commit_cqring(ctx);

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] io_uring: optimise io_fail_links()
  2020-10-15  8:53   ` Nathan Chancellor
@ 2020-10-15 10:11     ` Pavel Begunkov
  2020-10-15 13:09       ` Jens Axboe
  0 siblings, 1 reply; 9+ messages in thread
From: Pavel Begunkov @ 2020-10-15 10:11 UTC (permalink / raw)
  To: Nathan Chancellor; +Cc: Jens Axboe, io-uring, clang-built-linux

On 15/10/2020 09:53, Nathan Chancellor wrote:
> On Wed, Oct 14, 2020 at 08:44:22PM +0100, Pavel Begunkov wrote:
>> -		io_put_req_deferred(link, 2);
>> +
>> +		/*
>> +		 * It's ok to free under spinlock as they're not linked anymore,
>> +		 * but avoid REQ_F_WORK_INITIALIZED because it may deadlock on
>> +		 * work.fs->lock.
>> +		 */
>> +		if (link->flags | REQ_F_WORK_INITIALIZED)
>> +			io_put_req_deferred(link, 2);
>> +		else
>> +			io_double_put_req(link);
> 
> fs/io_uring.c:1816:19: warning: bitwise or with non-zero value always
> evaluates to true [-Wtautological-bitwise-compare]
>                 if (link->flags | REQ_F_WORK_INITIALIZED)
>                     ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
> 1 warning generated.
> 
> According to the comment, was it intended for that to be a bitwise AND
> then negated to check for the absence of it? If so, wouldn't it be
> clearer to flip the condition so that a negation is not necessary like
> below? I can send a formal patch if my analysis is correct but if not,
> feel free to fix it yourself and add

I have no idea what have happened, but yeah, there should be "&",
though without any additional negation. That's because deferred
version is safer. 

Nathan, thanks for letting know!
Jens, could you please fold in the change below.


diff --git a/fs/io_uring.c b/fs/io_uring.c
index 66c41d53a9d3..2c83c2688ec5 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1813,7 +1813,7 @@ static void __io_fail_links(struct io_kiocb *req)
 		 * but avoid REQ_F_WORK_INITIALIZED because it may deadlock on
 		 * work.fs->lock.
 		 */
-		if (link->flags | REQ_F_WORK_INITIALIZED)
+		if (link->flags & REQ_F_WORK_INITIALIZED)
 			io_put_req_deferred(link, 2);
 		else
 			io_double_put_req(link);


-- 
Pavel Begunkov



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] io_uring: optimise io_fail_links()
  2020-10-15 10:11     ` Pavel Begunkov
@ 2020-10-15 13:09       ` Jens Axboe
  0 siblings, 0 replies; 9+ messages in thread
From: Jens Axboe @ 2020-10-15 13:09 UTC (permalink / raw)
  To: Pavel Begunkov, Nathan Chancellor; +Cc: io-uring, clang-built-linux

On 10/15/20 4:11 AM, Pavel Begunkov wrote:
> On 15/10/2020 09:53, Nathan Chancellor wrote:
>> On Wed, Oct 14, 2020 at 08:44:22PM +0100, Pavel Begunkov wrote:
>>> -		io_put_req_deferred(link, 2);
>>> +
>>> +		/*
>>> +		 * It's ok to free under spinlock as they're not linked anymore,
>>> +		 * but avoid REQ_F_WORK_INITIALIZED because it may deadlock on
>>> +		 * work.fs->lock.
>>> +		 */
>>> +		if (link->flags | REQ_F_WORK_INITIALIZED)
>>> +			io_put_req_deferred(link, 2);
>>> +		else
>>> +			io_double_put_req(link);
>>
>> fs/io_uring.c:1816:19: warning: bitwise or with non-zero value always
>> evaluates to true [-Wtautological-bitwise-compare]
>>                 if (link->flags | REQ_F_WORK_INITIALIZED)
>>                     ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
>> 1 warning generated.
>>
>> According to the comment, was it intended for that to be a bitwise AND
>> then negated to check for the absence of it? If so, wouldn't it be
>> clearer to flip the condition so that a negation is not necessary like
>> below? I can send a formal patch if my analysis is correct but if not,
>> feel free to fix it yourself and add
> 
> I have no idea what have happened, but yeah, there should be "&",
> though without any additional negation. That's because deferred
> version is safer. 
> 
> Nathan, thanks for letting know!
> Jens, could you please fold in the change below.

Done.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2020-10-15 13:09 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-10-14 19:44 [PATCH 0/2] post F_COMP_LOCKED optimisations Pavel Begunkov
2020-10-14 19:44 ` [PATCH 1/2] io_uring: optimise COMP_LOCK-less flush_completion Pavel Begunkov
2020-10-14 19:44 ` [PATCH 2/2] io_uring: optimise io_fail_links() Pavel Begunkov
2020-10-15  8:53   ` Nathan Chancellor
2020-10-15 10:11     ` Pavel Begunkov
2020-10-15 13:09       ` Jens Axboe
2020-10-14 19:53 ` [PATCH 0/2] post F_COMP_LOCKED optimisations Jens Axboe
2020-10-14 20:00   ` Pavel Begunkov
2020-10-14 20:24     ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox