public inbox for [email protected]
 help / color / mirror / Atom feed
From: Bijan Mottahedeh <[email protected]>
To: Pavel Begunkov <[email protected]>, [email protected]
Cc: [email protected]
Subject: Re: [PATCH 7/8] io_uring: support readv/writev with fixed buffers
Date: Tue, 17 Nov 2020 14:59:53 -0800	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>


>> Support readv/writev with fixed buffers, and introduce IOSQE_FIXED_BUFFER,
>> consistent with fixed files.
> 
> I don't like it at all, see issues below. The actual implementation would
> be much uglier.
> 
> I propose you to split the series and push separately. Your first 6 patches
> first, I don't have conceptual objections to them. Then registration sharing
> (I still need to look it up). And then we can return to this, if you're not
> yet convinced.

Ok.  The sharing patch is actually the highest priority for us so it'd 
be great to know if you think it's in the right direction.

Should I submit them as they are or address your fixed_file_ref comments 
in Patch 4/8 as well?  Would I need your prep patch beforehand?

>> +static ssize_t io_import_iovec_fixed(int rw, struct io_kiocb *req, void *buf,
>> +				     unsigned segs, unsigned fast_segs,
>> +				     struct iovec **iovec,
>> +				     struct iov_iter *iter)
>> +{
>> +	struct io_ring_ctx *ctx = req->ctx;
>> +	struct io_mapped_ubuf *imu;
>> +	struct iovec *iov;
>> +	u16 index, buf_index;
>> +	ssize_t ret;
>> +	unsigned long seg;
>> +
>> +	if (unlikely(!ctx->buf_data))
>> +		return -EFAULT;
>> +
>> +	ret = import_iovec(rw, buf, segs, fast_segs, iovec, iter);
> 
> Did you test it? import_iovec() does access_ok() against each iov_base,
> which in your case are an index.

I used liburing:test/file-{register,update} as models for the equivalent 
buffer tests and they seem to work.  I can send out the tests and the 
liburing changes if you want.

The fixed io test registers an empty iov table first:

	ret = io_uring_register_buffers(ring, iovs, UIO_MAXIOV);

It next updates the table with two actual buffers at offset 768:

         ret = io_uring_register_buffers_update(ring, 768, ups, 2);

It later uses the buffer at index 768 for writev similar to the 
file-register test (IOSQE_FIXED_BUFFER instead of IOSQE_FIXED_FILE):

         iovs[768].iov_base = (void *)768;
         iovs[768].iov_len = pagesize;


         io_uring_prep_writev(sqe, fd, iovs, 1, 0);
         sqe->flags |= IOSQE_FIXED_BUFFER;

         ret = io_uring_submit(ring);

Below is the iovec returned from

io_import_iovec_fixed()
-> io_import_vec()

{iov_base = 0x300 <dm_early_create+412>, iov_len = 4096}

>> +	if (ret < 0)
>> +		return ret;
>> +
>> +	iov = (struct iovec *)iter->iov;
>> +
>> +	for (seg = 0; seg < iter->nr_segs; seg++) {
>> +		buf_index = *(u16 *)(&iov[seg].iov_base);
> 
> That's ugly, and also not consistent with rw_fixed, because iov_base is
> used to calculate offset.

Would offset be applicable when using readv/writev?

My thinkig was that for those cases, each iovec should be used exactly 
as registered.

> 
>> +		if (unlikely(buf_index < 0 || buf_index >= ctx->nr_user_bufs))
>> +			return -EFAULT;
>> +
>> +		index = array_index_nospec(buf_index, ctx->nr_user_bufs);
>> +		imu = io_buf_from_index(ctx, index);
>> +		if (!imu->ubuf || !imu->len)
>> +			return -EFAULT;
>> +		if (iov[seg].iov_len > imu->len)
>> +			return -EFAULT;
>> +
>> +		iov[seg].iov_base = (void *)imu->ubuf;
> 
> Nope, that's not different from non registered version.
> What import_fixed actually do is setting up the iter argument to point
> to a bvec (a vector of struct page *).

So in fact, the buffers end up being pinned again because they are not 
being as bvecs?

> 
> So it either would need to keep a vector of bvecs, that's a vector of vectors,
> that's not supported by iter, etc., so you'll also need to iterate over them
> in io_read/write and so on. Or flat 2D structure into 1D, but that's still ugly.

So you're saying there's no clean way to create a readv/writev + fixed 
buffers API?  It would've been nice to have a consistent API between 
files and buffers.


>> @@ -5692,7 +5743,7 @@ static int io_timeout_remove_prep(struct io_kiocb *req,
>>   {
>>   	if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
>>   		return -EINVAL;
>> -	if (unlikely(req->flags & (REQ_F_FIXED_FILE | REQ_F_BUFFER_SELECT)))
>> +	if (unlikely(req->flags & (REQ_F_FIXED_RSRC | REQ_F_BUFFER_SELECT)))
> 
> Why it's here?
> 
> #define REQ_F_FIXED_RSRC	(REQ_F_FIXED_FILE | REQ_F_FIXED_BUFFER)
> So, why do you | with REQ_F_BUFFER_SELECT again here?

I thought to group fixed files/buffers but distinguish them from 
selected buffers.

>> @@ -87,6 +88,8 @@ enum {
>>   #define IOSQE_ASYNC		(1U << IOSQE_ASYNC_BIT)
>>   /* select buffer from sqe->buf_group */
>>   #define IOSQE_BUFFER_SELECT	(1U << IOSQE_BUFFER_SELECT_BIT)
>> +/* use fixed buffer set */
>> +#define IOSQE_FIXED_BUFFER	(1U << IOSQE_FIXED_BUFFER_BIT)
> 
> Unfortenatuly, we're almost out of flags bits -- it's a 1 byte
> field and 6 bits are already taken. Let's not use it.

  reply	other threads:[~2020-11-17 23:00 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-12 23:00 [PATCH 0/8] io_uring: buffer registration enhancements Bijan Mottahedeh
2020-11-12 23:00 ` [PATCH 1/8] io_uring: modularize io_sqe_buffer_register Bijan Mottahedeh
2020-11-12 23:00 ` [PATCH 2/8] io_uring: modularize io_sqe_buffers_register Bijan Mottahedeh
2020-11-12 23:00 ` [PATCH 3/8] io_uring: generalize fixed file functionality Bijan Mottahedeh
2020-11-12 23:00 ` [PATCH 4/8] io_uring: implement fixed buffers registration similar to fixed files Bijan Mottahedeh
2020-11-15 13:33   ` Pavel Begunkov
2020-11-16 21:24     ` Bijan Mottahedeh
2020-11-16 23:09       ` Pavel Begunkov
2020-11-17  0:41         ` Bijan Mottahedeh
2020-11-12 23:00 ` [PATCH 5/8] io_uring: generalize files_update functionlity to rsrc_update Bijan Mottahedeh
2020-11-12 23:00 ` [PATCH 6/8] io_uring: support buffer registration updates Bijan Mottahedeh
2020-11-18 20:17   ` Pavel Begunkov
2020-12-09  0:42     ` Bijan Mottahedeh
2020-11-12 23:00 ` [PATCH 7/8] io_uring: support readv/writev with fixed buffers Bijan Mottahedeh
2020-11-17 11:04   ` Pavel Begunkov
2020-11-17 22:59     ` Bijan Mottahedeh [this message]
2020-11-18  9:14       ` Pavel Begunkov
2020-11-18 20:12       ` Pavel Begunkov
     [not found]         ` <[email protected]>
     [not found]           ` <[email protected]>
2020-11-19 19:27             ` Bijan Mottahedeh
2020-11-12 23:00 ` [PATCH 8/8] io_uring: support buffer registration sharing Bijan Mottahedeh
2020-11-16 23:28 ` [PATCH 0/8] io_uring: buffer registration enhancements Pavel Begunkov
2020-11-17  0:21   ` Bijan Mottahedeh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox