From: Kanchan Joshi <[email protected]>
To: Jens Axboe <[email protected]>
Cc: [email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected], [email protected],
[email protected], [email protected], [email protected]
Subject: Re: [PATCH v4 1/5] fs,io_uring: add infrastructure for uring-cmd
Date: Fri, 6 May 2022 12:42:16 +0530
Message-ID: <20220506071216.GB20217@test-zns>
In-Reply-To: <[email protected]>
On Thu, May 05, 2022 at 10:17:39AM -0600, Jens Axboe wrote:
>On 5/5/22 12:06 AM, Kanchan Joshi wrote:
>> +static int io_uring_cmd_prep(struct io_kiocb *req,
>> +		const struct io_uring_sqe *sqe)
>> +{
>> +	struct io_uring_cmd *ioucmd = &req->uring_cmd;
>> +	struct io_ring_ctx *ctx = req->ctx;
>> +
>> +	if (ctx->flags & IORING_SETUP_IOPOLL)
>> +		return -EOPNOTSUPP;
>> +	/* do not support uring-cmd without big SQE/CQE */
>> +	if (!(ctx->flags & IORING_SETUP_SQE128))
>> +		return -EOPNOTSUPP;
>> +	if (!(ctx->flags & IORING_SETUP_CQE32))
>> +		return -EOPNOTSUPP;
>> +	if (sqe->ioprio || sqe->rw_flags)
>> +		return -EINVAL;
>> +	ioucmd->cmd = sqe->cmd;
>> +	ioucmd->cmd_op = READ_ONCE(sqe->cmd_op);
>> +	return 0;
>> +}
>
>While looking at the other suggested changes, I noticed a more
>fundamental issue with the passthrough support. For any other command,
>SQE contents are stable once prep has been done. The above does do that
>for the basic items, but this case is special as the lower level command
>itself resides in the SQE.
>
>For cases where the command needs deferral, it's problematic. There are
>two main cases where this can happen:
>
>- The issue attempt yields -EAGAIN (we ran out of requests, etc). If you
> look at other commands, if they have data that doesn't fit in the
> io_kiocb itself, then they need to allocate room for that data and have
> it be persistent
>
>- Deferral is specified by the application, using eg IOSQE_IO_LINK or
> IOSQE_ASYNC.
>
>We're totally missing support for both of these cases. Consider the case
>where the ring is setup with an SQ size of 1. You prep a passthrough
>command (request A) and issue it with io_uring_submit(). Due to one of
>the two above mentioned conditions, the internal request is deferred.
>Either it was sent to ->uring_cmd() but we got -EAGAIN, or it was
>deferred even before that happened. The application doesn't know this
>happened, it gets another SQE to submit a new request (request B). Fills
>it in, calls io_uring_submit(). Since we only have one SQE available in
>that ring, when request A gets re-issued, it's now happily reading SQE
>contents from command B. Oops.
>
>This is why prep handlers are the only ones that get an sqe passed to
>them. They are supposed to ensure that we no longer read from the SQE
>past that. Applications can always rely on the fact that once
>io_uring_submit() has been done, which consumes the SQE in the SQ ring,
>no further reads are done from that SQE.
>
Thanks for explaining; that gives a great deal of clarity.
Are there already some tests (liburing, fio, etc.) that you use to test
this part?
Different from what you mentioned, but I was forcing a failure scenario
by setting a low QD in nvme and pumping commands at a higher QD than that.
But this only tested that we return failure to userspace (since
deferral was not there).
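
The stale-SQE hazard described in the quoted text can be illustrated with
a small user-space model. This is a minimal sketch, not kernel code:
struct toy_sqe, struct toy_req, CMD_BYTES and every function name here are
hypothetical stand-ins, and the "ring" is a single reusable slot. It
contrasts a prep step that only keeps a pointer into the SQE (as
ioucmd->cmd = sqe->cmd does in the patch) with one that copies the command
into storage owned by the request.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical payload size for this toy model only. */
#define CMD_BYTES 80

/* Toy stand-in for a submission queue entry carrying an inline command. */
struct toy_sqe {
	uint32_t cmd_op;
	uint8_t cmd[CMD_BYTES];
};

/* Toy stand-in for the per-request state that outlives the SQE. */
struct toy_req {
	uint32_t cmd_op;
	const uint8_t *cmd;          /* what the issue path will read */
	uint8_t cmd_copy[CMD_BYTES]; /* persistent storage owned by the request */
};

/* Keeps only a pointer into the SQE, so later reads still hit the ring. */
static void prep_by_pointer(struct toy_req *req, const struct toy_sqe *sqe)
{
	req->cmd_op = sqe->cmd_op;
	req->cmd = sqe->cmd;
}

/* Copies the command at prep time, so the SQE is never read again. */
static void prep_by_copy(struct toy_req *req, const struct toy_sqe *sqe)
{
	assert(sizeof(req->cmd_copy) == sizeof(sqe->cmd));
	req->cmd_op = sqe->cmd_op;
	memcpy(req->cmd_copy, sqe->cmd, CMD_BYTES);
	req->cmd = req->cmd_copy;
}

/*
 * SQ size of 1: prep request A, let the application reuse the slot for
 * request B, then re-issue A. Returns the first command byte A sees.
 */
uint8_t run_scenario(int copy_at_prep)
{
	struct toy_sqe slot; /* the ring's only SQE */
	struct toy_req req_a;

	/* Application fills in request A (payload 0xAA) and submits. */
	memset(&slot, 0, sizeof(slot));
	slot.cmd_op = 1;
	memset(slot.cmd, 0xAA, CMD_BYTES);

	if (copy_at_prep)
		prep_by_copy(&req_a, &slot);
	else
		prep_by_pointer(&req_a, &slot);

	/* A is deferred; the application reuses the slot for request B. */
	slot.cmd_op = 2;
	memset(slot.cmd, 0xBB, CMD_BYTES);

	/* A is re-issued and reads its command. */
	return req_a.cmd[0];
}
```

In this model run_scenario(0) returns 0xBB, meaning request A ends up
reading request B's command, while run_scenario(1) returns 0xAA because
the command was made persistent at prep time.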