public inbox for [email protected]
 help / color / mirror / Atom feed
From: Pavel Begunkov <[email protected]>
To: Ming Lei <[email protected]>
Cc: Jens Axboe <[email protected]>,
	[email protected], [email protected],
	Kanchan Joshi <[email protected]>
Subject: Re: (subset) [PATCH 00/11] remove aux CQE caches
Date: Sat, 16 Mar 2024 04:13:41 +0000	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <ZfUX/kSYOW6we1SB@fedora>

On 3/16/24 03:54, Ming Lei wrote:
> On Sat, Mar 16, 2024 at 02:54:19AM +0000, Pavel Begunkov wrote:
>> On 3/16/24 02:24, Ming Lei wrote:
>>> On Sat, Mar 16, 2024 at 10:04 AM Ming Lei <[email protected]> wrote:
>>>>
>>>> On Fri, Mar 15, 2024 at 04:53:21PM -0600, Jens Axboe wrote:
>>>>>
>>>>> On Fri, 15 Mar 2024 15:29:50 +0000, Pavel Begunkov wrote:
>>>>>> Patch 1 is a fix.
>>>>>>
>>>>>> Patches 2-7 are cleanups mainly dealing with issue_flags conversions,
>>>>>> misundertsandings of the flags and of the tw state. It'd be great to have
>>>>>> even without even w/o the rest.
>>>>>>
>>>>>> 8-11 mandate ctx locking for task_work and finally removes the CQE
>>>>>> caches, instead we post directly into the CQ. Note that the cache is
>>>>>> used by multishot auxiliary completions.
>>>>>>
>>>>>> [...]
>>>>>
>>>>> Applied, thanks!
>>>>
>>>> Hi Jens and Pavel,
>>>>
>>>> Looks this patch causes hang when running './check ublk/002' in blktests.
>>>
>>> Not take close look, and  I guess it hangs in
>>>
>>> io_uring_cmd_del_cancelable() -> io_ring_submit_lock
>>
>> Thanks, the trace doesn't completely explains it, but my blind spot
>> was io_uring_cmd_done() potentially grabbing the mutex. They're
>> supposed to be irq safe mimicking io_req_task_work_add(), that's how
>> nvme passthrough uses it as well (but at least it doesn't need the
>> cancellation bits).
>>
>> One option is to replace it with a spinlock, the other is to delay
>> the io_uring_cmd_del_cancelable() call to the task_work callback.
>> The latter would be cleaner and more preferable, but I'm lacking
>> context to tell if that would be correct. Ming, what do you think?
> 
> I prefer to the latter approach because the two cancelable helpers are
> run in fast path.
> 
> Looks all new io_uring_cmd_complete() in ublk have this issue, and the
> following patch should avoid them all.

The one I have in mind on top of the current tree would be like below.
Untested, and doesn't allow this cancellation thing for iopoll. I'll
prepare something tomorrow.


diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
index e45d4cd5ef82..000ba435451c 100644
--- a/io_uring/uring_cmd.c
+++ b/io_uring/uring_cmd.c
@@ -14,19 +14,15 @@
  #include "rsrc.h"
  #include "uring_cmd.h"
  
-static void io_uring_cmd_del_cancelable(struct io_uring_cmd *cmd,
-		unsigned int issue_flags)
+static void io_uring_cmd_del_cancelable(struct io_uring_cmd *cmd)
  {
  	struct io_kiocb *req = cmd_to_io_kiocb(cmd);
-	struct io_ring_ctx *ctx = req->ctx;
  
  	if (!(cmd->flags & IORING_URING_CMD_CANCELABLE))
  		return;
  
  	cmd->flags &= ~IORING_URING_CMD_CANCELABLE;
-	io_ring_submit_lock(ctx, issue_flags);
  	hlist_del(&req->hash_node);
-	io_ring_submit_unlock(ctx, issue_flags);
  }
  
  /*
@@ -80,6 +76,15 @@ static inline void io_req_set_cqe32_extra(struct io_kiocb *req,
  	req->big_cqe.extra2 = extra2;
  }
  
+static void io_req_task_cmd_complete(struct io_kiocb *req,
+				     struct io_tw_state *ts)
+{
+	struct io_uring_cmd *ioucmd = io_kiocb_to_cmd(req, struct io_uring_cmd);
+
+	io_uring_cmd_del_cancelable(ioucmd);
+	io_req_task_complete(req, ts);
+}
+
  /*
   * Called by consumers of io_uring_cmd, if they originally returned
   * -EIOCBQUEUED upon receiving the command.
@@ -89,8 +94,6 @@ void io_uring_cmd_done(struct io_uring_cmd *ioucmd, ssize_t ret, ssize_t res2,
  {
  	struct io_kiocb *req = cmd_to_io_kiocb(ioucmd);
  
-	io_uring_cmd_del_cancelable(ioucmd, issue_flags);
-
  	if (ret < 0)
  		req_set_fail(req);
  
@@ -105,7 +108,7 @@ void io_uring_cmd_done(struct io_uring_cmd *ioucmd, ssize_t ret, ssize_t res2,
  			return;
  		io_req_complete_defer(req);
  	} else {
-		req->io_task_work.func = io_req_task_complete;
+		req->io_task_work.func = io_req_task_cmd_complete;
  		io_req_task_work_add(req);
  	}
  }

-- 
Pavel Begunkov

  reply	other threads:[~2024-03-16  4:15 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-15 15:29 [PATCH 00/11] remove aux CQE caches Pavel Begunkov
2024-03-15 15:29 ` [PATCH 01/11] io_uring: fix poll_remove stalled req completion Pavel Begunkov
2024-03-15 15:29 ` [PATCH 02/11] io_uring/cmd: kill one issue_flags to tw conversion Pavel Begunkov
2024-03-15 15:29 ` [PATCH 03/11] io_uring/cmd: fix tw <-> issue_flags conversion Pavel Begunkov
2024-03-15 15:29 ` [PATCH 04/11] io_uring/cmd: introduce io_uring_cmd_complete Pavel Begunkov
2024-03-15 15:29 ` [PATCH 05/11] ublk: don't hard code IO_URING_F_UNLOCKED Pavel Begunkov
2024-03-15 15:29 ` [PATCH 06/11] nvme/io_uring: " Pavel Begunkov
2024-03-15 15:29 ` [PATCH 07/11] io_uring/rw: avoid punting to io-wq directly Pavel Begunkov
2024-03-15 15:29 ` [PATCH 08/11] io_uring: force tw ctx locking Pavel Begunkov
2024-03-15 15:40   ` Jens Axboe
2024-03-15 16:14     ` Pavel Begunkov
2024-03-15 15:29 ` [PATCH 09/11] io_uring: remove struct io_tw_state::locked Pavel Begunkov
2024-03-15 15:30 ` [PATCH 10/11] io_uring: refactor io_fill_cqe_req_aux Pavel Begunkov
2024-03-15 15:30 ` [PATCH 11/11] io_uring: get rid of intermediate aux cqe caches Pavel Begunkov
2024-03-15 16:20   ` Jens Axboe
2024-03-15 16:23     ` Pavel Begunkov
2024-03-15 16:25       ` Jens Axboe
2024-03-15 16:27         ` Jens Axboe
2024-03-15 16:44           ` Pavel Begunkov
2024-03-15 16:49             ` Jens Axboe
2024-03-15 17:26               ` Pavel Begunkov
2024-03-15 18:26                 ` Jens Axboe
2024-03-15 18:51                   ` Pavel Begunkov
2024-03-15 19:02                     ` Jens Axboe
2024-03-15 16:29         ` Pavel Begunkov
2024-03-15 16:33           ` Jens Axboe
2024-03-15 15:42 ` [PATCH 00/11] remove aux CQE caches Jens Axboe
2024-03-15 16:00 ` Jens Axboe
2024-03-15 22:53 ` (subset) " Jens Axboe
2024-03-16  2:03   ` Ming Lei
2024-03-16  2:24     ` Ming Lei
2024-03-16  2:54       ` Pavel Begunkov
2024-03-16  3:54         ` Ming Lei
2024-03-16  4:13           ` Pavel Begunkov [this message]
2024-03-16  4:20             ` Pavel Begunkov
2024-03-16  9:53               ` Ming Lei
2024-03-16 11:52   ` Ming Lei
2024-03-16 13:27     ` Pavel Begunkov
2024-03-16 13:56       ` Ming Lei
2024-03-17 20:55         ` Pavel Begunkov
2024-03-17 21:24           ` Jens Axboe
2024-03-17 21:29             ` Pavel Begunkov
2024-03-17 21:32               ` Jens Axboe
2024-03-17 21:34                 ` Pavel Begunkov
2024-03-17 21:47                   ` Pavel Begunkov
2024-03-17 21:51                     ` Jens Axboe
2024-03-17 22:07                       ` Jens Axboe
2024-03-17 22:24                         ` Jens Axboe
2024-03-18  0:15                           ` Ming Lei
2024-03-18  1:34                             ` Jens Axboe
2024-03-18  1:44                               ` Jens Axboe
2024-03-18  1:49                               ` Ming Lei
2024-03-17 23:16                       ` Pavel Begunkov
2024-03-16 14:39       ` Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox