public inbox for [email protected]
 help / color / mirror / Atom feed
From: Kanchan Joshi <[email protected]>
To: Christoph Hellwig <[email protected]>
Cc: "Kanchan Joshi" <[email protected]>,
	"Jens Axboe" <[email protected]>,
	[email protected], [email protected],
	"Pavel Begunkov" <[email protected]>,
	"Ming Lei" <[email protected]>,
	"Luis Chamberlain" <[email protected]>,
	"Pankaj Raghav" <[email protected]>,
	"Javier González" <[email protected]>,
	"Anuj Gupta" <[email protected]>
Subject: Re: [RFC 3/5] io_uring: add infra and support for IORING_OP_URING_CMD
Date: Mon, 4 Apr 2022 20:44:20 +0530	[thread overview]
Message-ID: <CA+1E3r+nHBace_K1Zt-FrOgGF5d0=TDoNtU65bFuWX8R7p8+DQ@mail.gmail.com> (raw)
In-Reply-To: <[email protected]>

On Mon, Apr 4, 2022 at 12:46 PM Christoph Hellwig <[email protected]> wrote:
>
> Cann we plese spell out instastructure here?  Or did you mean
> infraread anyway :)

:-) sure, I see the problem with this shorthand now.

> > -enum io_uring_cmd_flags {
> > -     IO_URING_F_COMPLETE_DEFER       = 1,
> > -     IO_URING_F_UNLOCKED             = 2,
> > -     /* int's last bit, sign checks are usually faster than a bit test */
> > -     IO_URING_F_NONBLOCK             = INT_MIN,
> > -};
>
> This doesn't actually get used anywhere outside of io_uring.c, so why
> move it?

This got missed out.
We are going to use it for things like this in nvme (from past series):

+ if (ioucmd && (ioucmd->flags & IO_URING_F_NONBLOCK)) {
+ rq_flags |= REQ_NOWAIT;
+ blk_flags |= BLK_MQ_REQ_NOWAIT;
+ }
+ req = nvme_alloc_request(q, cmd, blk_flags, rq_flags);

Also for polling too. Will fix this up in the next version.

> > +static void io_uring_cmd_work(struct io_kiocb *req, bool *locked)
> > +{
> > +     req->uring_cmd.driver_cb(&req->uring_cmd);
> > +}
> > +
> > +void io_uring_cmd_complete_in_task(struct io_uring_cmd *ioucmd,
> > +                     void (*driver_cb)(struct io_uring_cmd *))
> > +{
> > +     struct io_kiocb *req = container_of(ioucmd, struct io_kiocb, uring_cmd);
> > +
> > +     req->uring_cmd.driver_cb = driver_cb;
> > +     req->io_task_work.func = io_uring_cmd_work;
> > +     io_req_task_work_add(req, !!(req->ctx->flags & IORING_SETUP_SQPOLL));
> > +}
> > +EXPORT_SYMBOL_GPL(io_uring_cmd_complete_in_task);
>
> I'm still not a fund of the double indirect call here.  I don't really
> have a good idea yet, but I plan to look into it.
>
> >  static void io_req_task_queue_fail(struct io_kiocb *req, int ret)
>
> Also it would be great to not add it between io_req_task_queue_fail and
> the callback set by it.

Right. Will change.

> > +void io_uring_cmd_done(struct io_uring_cmd *ioucmd, ssize_t ret)
> > +{
> > +     struct io_kiocb *req = container_of(ioucmd, struct io_kiocb, uring_cmd);
> > +
> > +     if (ret < 0)
> > +             req_set_fail(req);
> > +     io_req_complete(req, ret);
> > +}
> > +EXPORT_SYMBOL_GPL(io_uring_cmd_done);
>
> It seems like all callers of io_req_complete actually call req_set_fail
> on failure.  So maybe it would be nice pre-cleanup to handle the
> req_set_fail call from ĩo_req_complete?

io_tee and io_splice do slightly different handling.
If we take this route, it seems they can use __io_req_complete() instead.

> > +     /* queued async, consumer will call io_uring_cmd_done() when complete */
> > +     if (ret == -EIOCBQUEUED)
> > +             return 0;
> > +     io_uring_cmd_done(ioucmd, ret);
>
> Why not:
>         if (ret != -EIOCBQUEUED)
>                 io_uring_cmd_done(ioucmd, ret);
>         return 0;
>
> That being said I wonder why not just remove the retun value from
> ->async_cmd entirely and just require the implementation to always call
> io_uring_cmd_done?  That removes the confusion on who needs to call it
> entirely, similarly to what we do in the block layer for ->submit_bio.

Right, this seems doable at this point as we do not do any special
handling (in io_uring) on the return code.

> > +struct io_uring_cmd {
> > +     struct file     *file;
> > +     void            *cmd;
> > +     /* for irq-completion - if driver requires doing stuff in task-context*/
> > +     void (*driver_cb)(struct io_uring_cmd *cmd);
> > +     u32             flags;
> > +     u32             cmd_op;
> > +     u16             cmd_len;
>
> The cmd_len field does not seem to actually be used anywhere.

Another stuff that got left out from the previous series :-(
Using this field for a bit of sanity checking at the moment. Like this in nvme:

+ if (ioucmd->cmd_len != sizeof(struct nvme_passthru_cmd64))
+ return -EINVAL;
+ cptr = (struct nvme_passthru_cmd64 *)ioucmd->cmd;

> > +++ b/include/uapi/linux/io_uring.h
> > @@ -22,10 +22,12 @@ struct io_uring_sqe {
> >       union {
> >               __u64   off;    /* offset into file */
> >               __u64   addr2;
> > +             __u32   cmd_op;
> >       };
> >       union {
> >               __u64   addr;   /* pointer to buffer or iovecs */
> >               __u64   splice_off_in;
> > +             __u16   cmd_len;
> >       };
> >       __u32   len;            /* buffer size or number of iovecs */
> >       union {
> > @@ -60,7 +62,10 @@ struct io_uring_sqe {
> >               __s32   splice_fd_in;
> >               __u32   file_index;
> >       };
> > -     __u64   __pad2[2];
> > +     union {
> > +             __u64   __pad2[2];
> > +             __u64   cmd;
> > +     };
>
> Can someone explain these changes to me a little more?

All these three (cmd_op, cmd_len and cmd) are operation specific
fields. user-space supplies these into SQE, io_uring packages those
into "struct io_uring_cmd" and pass that down to the provider for
doing the real processing.

1. cmd_op = operation code for async cmd (e.g. passthru ioctl opcode
or whatever else we want to turn async from user-space)
2. cmd_len = actual length of async command (e.g. we have max 80 bytes
with big-sqe and for nvme-passthru we use the max, but for some other
usecase one can go with smaller length)
3. cmd = this is the starting-offset where async-command is placed (by
user-space) inside the big-sqe. 16 bytes in first-sqe, and next 64
bytes comes from second-sqe.

And if we were doing pointer-based command submission, this "cmd"
would have pointed to user-space command (of whatever length). Just to
put the entire thought-process across; I understand that we are not
taking that route.

  parent reply	other threads:[~2022-04-04 15:14 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CGME20220401110829epcas5p39f3cf4d3f6eb8a5c59794787a2b72b15@epcas5p3.samsung.com>
2022-04-01 11:03 ` [RFC 0/5] big-cqe based uring-passthru Kanchan Joshi
     [not found]   ` <CGME20220401110831epcas5p403bacabe8f7e5262356fdc1a2e66df90@epcas5p4.samsung.com>
2022-04-01 11:03     ` [RFC 1/5] io_uring: add support for 128-byte SQEs Kanchan Joshi
     [not found]   ` <CGME20220401110833epcas5p18e828a307a646cef5b7aa429be4396e0@epcas5p1.samsung.com>
2022-04-01 11:03     ` [RFC 2/5] fs: add file_operations->async_cmd() Kanchan Joshi
2022-04-04  7:09       ` Christoph Hellwig
     [not found]   ` <CGME20220401110834epcas5p4d1e5e8d1beb1a6205d670bbcb932bf77@epcas5p4.samsung.com>
2022-04-01 11:03     ` [RFC 3/5] io_uring: add infra and support for IORING_OP_URING_CMD Kanchan Joshi
2022-04-04  7:16       ` Christoph Hellwig
2022-04-04  8:20         ` Pavel Begunkov
2022-04-05  5:58           ` Christoph Hellwig
2022-04-06  6:37             ` Kanchan Joshi
2022-04-04 15:14         ` Kanchan Joshi [this message]
2022-04-05  6:00           ` Christoph Hellwig
2022-04-05 16:27             ` Kanchan Joshi
     [not found]   ` <CGME20220401110836epcas5p37bd59ab5a48cf77ca3ac05052a164b0b@epcas5p3.samsung.com>
2022-04-01 11:03     ` [RFC 4/5] io_uring: add support for big-cqe Kanchan Joshi
2022-04-04  7:07       ` Christoph Hellwig
2022-04-04 14:04         ` Kanchan Joshi
     [not found]   ` <CGME20220401110838epcas5p2c1a2e776923dfe5bf65a3e7946820150@epcas5p2.samsung.com>
2022-04-01 11:03     ` [RFC 5/5] nvme: wire-up support for async-passthru on char-device Kanchan Joshi
2022-04-04  7:20       ` Christoph Hellwig
2022-04-04 14:25         ` Kanchan Joshi
2022-04-05  6:02           ` Christoph Hellwig
2022-04-05 15:40             ` Jens Axboe
2022-04-05 15:49             ` Kanchan Joshi
2022-04-06  5:20               ` Kanchan Joshi
2022-04-06  5:23                 ` Christoph Hellwig
2022-04-23 17:53                 ` Christoph Hellwig
2022-04-25 17:38                   ` Kanchan Joshi
2022-04-29 13:16                     ` Kanchan Joshi
2022-04-04  7:21   ` [RFC 0/5] big-cqe based uring-passthru Christoph Hellwig
2022-04-05 15:37     ` Kanchan Joshi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CA+1E3r+nHBace_K1Zt-FrOgGF5d0=TDoNtU65bFuWX8R7p8+DQ@mail.gmail.com' \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox