public inbox for [email protected]
 help / color / mirror / Atom feed
From: Jens Axboe <[email protected]>
To: "Darrick J. Wong" <[email protected]>
Cc: [email protected]
Subject: Re: [PATCH 2/5] io_uring: add support for IORING_OP_URING_CMD
Date: Wed, 27 Jan 2021 18:45:38 -0700	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <20210128003831.GE7695@magnolia>

On 1/27/21 5:38 PM, Darrick J. Wong wrote:
> On Wed, Jan 27, 2021 at 02:25:38PM -0700, Jens Axboe wrote:
>> This is a file private kind of request. io_uring doesn't know what's
>> in this command type, it's for the file_operations->uring_cmd()
>> handler to deal with.
>>
>> Signed-off-by: Jens Axboe <[email protected]>
>> ---
>>  fs/io_uring.c                 | 59 +++++++++++++++++++++++++++++++++++
>>  include/linux/io_uring.h      | 12 +++++++
>>  include/uapi/linux/io_uring.h |  1 +
>>  3 files changed, 72 insertions(+)
>>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index 03748faa5295..55c2714a591e 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -712,6 +712,7 @@ struct io_kiocb {
>>  		struct io_shutdown	shutdown;
>>  		struct io_rename	rename;
>>  		struct io_unlink	unlink;
>> +		struct io_uring_cmd	uring_cmd;
>>  		/* use only after cleaning per-op data, see io_clean_op() */
>>  		struct io_completion	compl;
>>  	};
>> @@ -805,6 +806,8 @@ struct io_op_def {
>>  	unsigned		needs_async_data : 1;
>>  	/* should block plug */
>>  	unsigned		plug : 1;
>> +	/* doesn't support personality */
>> +	unsigned		no_personality : 1;
>>  	/* size of async data needed, if any */
>>  	unsigned short		async_size;
>>  	unsigned		work_flags;
>> @@ -998,6 +1001,11 @@ static const struct io_op_def io_op_defs[] = {
>>  		.work_flags		= IO_WQ_WORK_MM | IO_WQ_WORK_FILES |
>>  						IO_WQ_WORK_FS | IO_WQ_WORK_BLKCG,
>>  	},
>> +	[IORING_OP_URING_CMD] = {
>> +		.needs_file		= 1,
>> +		.no_personality		= 1,
>> +		.work_flags		= IO_WQ_WORK_MM | IO_WQ_WORK_BLKCG,
>> +	},
>>  };
>>  
>>  enum io_mem_account {
>> @@ -3797,6 +3805,47 @@ static int io_unlinkat(struct io_kiocb *req, bool force_nonblock)
>>  	return 0;
>>  }
>>  
>> +static void io_uring_cmd_done(struct io_uring_cmd *cmd, ssize_t ret)
>> +{
>> +	struct io_kiocb *req = container_of(cmd, struct io_kiocb, uring_cmd);
>> +
>> +	if (ret < 0)
>> +		req_set_fail_links(req);
>> +	io_req_complete(req, ret);
>> +}
>> +
>> +static int io_uring_cmd_prep(struct io_kiocb *req,
>> +			     const struct io_uring_sqe *sqe)
>> +{
>> +	struct io_uring_cmd *cmd = &req->uring_cmd;
>> +
>> +	if (!req->file->f_op->uring_cmd)
>> +		return -EOPNOTSUPP;
>> +
>> +	memcpy(&cmd->pdu, (void *) &sqe->off, sizeof(cmd->pdu));
> 
> Hmmm.  struct io_uring_pdu is (by my count) 6x uint64_t (==48 bytes) in
> size.  This starts copying the pdu from byte 8 in struct io_uring_sqe,
> and the sqe is 64 bytes in size.

Correct

> I guess (having not played much with io_uring) that the stuff in the
> first eight bytes of the sqe are header info that's common to all
> io_uring operations, and hence not passed to io_uring_cmd*.

Exactly

> Assuming that I got that right, that means that the pdu information
> doesn't actually go all the way to the end of the sqe, which currently
> is just a bunch of padding.  Was that intentional, or does this mean
> that io_uring_pdu could actually be 8 bytes longer?

Also correct. The reason is actually kind of stupid, and I think we
should just fix that up. struct io_uring_cmd should fit within the first
cacheline of io_kiocb, to avoid bloating that one. But with the members
in there, it ends up being 8 bytes too big, if we grab those 8 bytes.
What I think we should do is get rid of ->done, and just have drivers
call io_uring_cmd_done() instead. We can provide an empty hook for that.
Then we can reclaim the 8 bytes, and grow the io_uring_cmd to 56 bytes.

> Also, I thought io_uring_seq.user_data was supposed to coincide with
> io_uring_pdu.reserved?  They don't seem to...?
>
> (I could be totally off here, fwiw.)

I think you are, I even added a BUILD check for that:

BUILD_BUG_ON(offsetof(struct io_uring_sqe, user_data) !=
	     offsetof(struct io_uring_pdu, reserved));

to ensure that that is the case.

> The reason why I'm counting bytes so stingily is that xfs_scrub issues
> millions upon millions of ioctl calls to scrub an XFS.  Wouldn't it be
> nice if there was a way to submit a single userspace buffer to the
> kernel and let it run every scrubber for that fs object in order?  I
> could cram all that data into the pdu struct ... if it had 56 bytes of
> space.

For other purposes too, the bigger we can make the inline data, the more
likely we are that we can fit everything we need in there. I'm going to
make the change to bump it to 56 bytes.

> If not, it wouldn't be a big deal to use one of the data[4] fields as a
> pointer to a larger struct, but where's the fun in that? :)

Agree :-)

> Granted I'm programming speculatively in my head, not building an actual
> prototype.  There are all kinds of other questions I have, like, can a
> uring command handler access the task struct or the userspace memory of
> the process it was called from?  What happens when the user is madly
> pounding on ^C while uring commands are running?  I should probably
> figure out the answers to those questions and maybe even write/crib a
> program first... 

Well, either the program would exit if it had no SIGINT handler, and
that would signal the async task handling it and cancel it. Or you
handle it, and then you need to cancel on your own.

-- 
Jens Axboe


  reply	other threads:[~2021-01-28  1:46 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-27 21:25 [PATCHSET RFC 0/5] file_operations based io_uring commands Jens Axboe
2021-01-27 21:25 ` [PATCH 1/5] fs: add file_operations->uring_cmd() Jens Axboe
2021-01-27 21:25 ` [PATCH 2/5] io_uring: add support for IORING_OP_URING_CMD Jens Axboe
2021-01-28  0:38   ` Darrick J. Wong
2021-01-28  1:45     ` Jens Axboe [this message]
2021-01-28  2:19       ` Jens Axboe
2021-02-20  3:57         ` Stefan Metzmacher
2021-02-20 14:50           ` Jens Axboe
2021-02-20 16:45             ` Jens Axboe
2021-02-22 20:04               ` Stefan Metzmacher
2021-02-22 20:14                 ` Jens Axboe
2021-02-23  8:14                   ` Stefan Metzmacher
2021-02-23 13:21                     ` Pavel Begunkov
2021-01-27 21:25 ` [PATCH 3/5] block: wire up support for file_operations->uring_cmd() Jens Axboe
2021-01-27 21:25 ` [PATCH 4/5] block: add example ioctl Jens Axboe
2021-01-27 21:25 ` [PATCH 5/5] net: wire up support for file_operations->uring_cmd() Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox