public inbox for [email protected]
 help / color / mirror / Atom feed
* [POC RFC 0/3] splice(2) support for io_uring
@ 2020-01-22  0:05 Pavel Begunkov
  2020-01-22  1:55 ` Jens Axboe
  0 siblings, 1 reply; 6+ messages in thread
From: Pavel Begunkov @ 2020-01-22  0:05 UTC (permalink / raw)
  To: Jens Axboe, io-uring, linux-kernel, linux-fsdevel; +Cc: Alexander Viro

It works well for basic cases, but there is still work to be done. E.g.
it misses @hash_reg_file checks for the second (output) file. Anyway,
there are some questions I want to discuss:

- why sqe->len is __u32? Splice uses size_t, and I think it's better
to have something wider (e.g. u64) for fututre use. That's the story
behind added sqe->splice_len.

- it requires 2 fds, and it's painful. Currently file managing is done
by common path (e.g. io_req_set_file(), __io_req_aux_free()). I'm
thinking to make each opcode function handle file grabbing/putting
themself with some helpers, as it's done in the patch for splice's
out-file.
    1. Opcode handler knows, whether it have/needs a file, and thus
       doesn't need extra checks done in common path.
    2. It will be more consistent with splice.
Objections? Ideas?

- do we need offset pointers with fallback to file->f_pos? Or is it
enough to have offset value. Jens, I remember you added the first
option somewhere, could you tell the reasoning?


Pavel Begunkov (3):
  splice: make do_splice public
  io_uring: add interface for getting files
  io_uring: add splice(2) support

 fs/io_uring.c                 | 152 ++++++++++++++++++++++++++++------
 fs/splice.c                   |   6 +-
 include/linux/splice.h        |   3 +
 include/uapi/linux/io_uring.h |  16 +++-
 4 files changed, 147 insertions(+), 30 deletions(-)

-- 
2.24.0


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [POC RFC 0/3] splice(2) support for io_uring
  2020-01-22  0:05 Pavel Begunkov
@ 2020-01-22  1:55 ` Jens Axboe
  2020-01-22  3:11   ` Pavel Begunkov
  0 siblings, 1 reply; 6+ messages in thread
From: Jens Axboe @ 2020-01-22  1:55 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring, linux-kernel, linux-fsdevel; +Cc: Alexander Viro

On 1/21/20 5:05 PM, Pavel Begunkov wrote:
> It works well for basic cases, but there is still work to be done. E.g.
> it misses @hash_reg_file checks for the second (output) file. Anyway,
> there are some questions I want to discuss:
> 
> - why sqe->len is __u32? Splice uses size_t, and I think it's better
> to have something wider (e.g. u64) for fututre use. That's the story
> behind added sqe->splice_len.

IO operations in Linux generally are INT_MAX, so the u32 is plenty big.
That's why I chose it. For this specifically, if you look at splice:

	if (unlikely(len > MAX_RW_COUNT))
		len = MAX_RW_COUNT;

so anything larger is truncated anyway.

> - it requires 2 fds, and it's painful. Currently file managing is done
> by common path (e.g. io_req_set_file(), __io_req_aux_free()). I'm
> thinking to make each opcode function handle file grabbing/putting
> themself with some helpers, as it's done in the patch for splice's
> out-file.
>     1. Opcode handler knows, whether it have/needs a file, and thus
>        doesn't need extra checks done in common path.
>     2. It will be more consistent with splice.
> Objections? Ideas?

Sounds reasonable to me, but always easier to judge in patch form :-)

> - do we need offset pointers with fallback to file->f_pos? Or is it
> enough to have offset value. Jens, I remember you added the first
> option somewhere, could you tell the reasoning?

I recently added support for -1/cur position, which splice also uses. So
you should be fine with that.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [POC RFC 0/3] splice(2) support for io_uring
  2020-01-22  1:55 ` Jens Axboe
@ 2020-01-22  3:11   ` Pavel Begunkov
  2020-01-22  3:30     ` Jens Axboe
  0 siblings, 1 reply; 6+ messages in thread
From: Pavel Begunkov @ 2020-01-22  3:11 UTC (permalink / raw)
  To: Jens Axboe, io-uring, linux-kernel, linux-fsdevel; +Cc: Alexander Viro


[-- Attachment #1.1: Type: text/plain, Size: 2148 bytes --]

On 22/01/2020 04:55, Jens Axboe wrote:
> On 1/21/20 5:05 PM, Pavel Begunkov wrote:
>> It works well for basic cases, but there is still work to be done. E.g.
>> it misses @hash_reg_file checks for the second (output) file. Anyway,
>> there are some questions I want to discuss:
>>
>> - why sqe->len is __u32? Splice uses size_t, and I think it's better
>> to have something wider (e.g. u64) for fututre use. That's the story
>> behind added sqe->splice_len.
> 
> IO operations in Linux generally are INT_MAX, so the u32 is plenty big.
> That's why I chose it. For this specifically, if you look at splice:
> 
> 	if (unlikely(len > MAX_RW_COUNT))
> 		len = MAX_RW_COUNT;
> 
> so anything larger is truncated anyway.

Yeah, I saw this one, but that was rather an argument for the future. It's
pretty easy to transfer more than 4GB with sg list, but that would be the case
for splice.

> 
>> - it requires 2 fds, and it's painful. Currently file managing is done
>> by common path (e.g. io_req_set_file(), __io_req_aux_free()). I'm
>> thinking to make each opcode function handle file grabbing/putting
>> themself with some helpers, as it's done in the patch for splice's
>> out-file.
>>     1. Opcode handler knows, whether it have/needs a file, and thus
>>        doesn't need extra checks done in common path.
>>     2. It will be more consistent with splice.
>> Objections? Ideas?
> 
> Sounds reasonable to me, but always easier to judge in patch form :-)
> 
>> - do we need offset pointers with fallback to file->f_pos? Or is it
>> enough to have offset value. Jens, I remember you added the first
>> option somewhere, could you tell the reasoning?
> 
> I recently added support for -1/cur position, which splice also uses. So
> you should be fine with that.
> 

I always have been thinking about it as a legacy from old days, and one of the
problems of posix. It's not hard to count it in the userspace especially in C++
or high-level languages, and is just another obstacle for having a performant
API. So, I'd rather get rid of it here. But is there any reasons against?

-- 
Pavel Begunkov


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [POC RFC 0/3] splice(2) support for io_uring
  2020-01-22  3:11   ` Pavel Begunkov
@ 2020-01-22  3:30     ` Jens Axboe
  0 siblings, 0 replies; 6+ messages in thread
From: Jens Axboe @ 2020-01-22  3:30 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring, linux-kernel, linux-fsdevel; +Cc: Alexander Viro

On 1/21/20 8:11 PM, Pavel Begunkov wrote:
> On 22/01/2020 04:55, Jens Axboe wrote:
>> On 1/21/20 5:05 PM, Pavel Begunkov wrote:
>>> It works well for basic cases, but there is still work to be done. E.g.
>>> it misses @hash_reg_file checks for the second (output) file. Anyway,
>>> there are some questions I want to discuss:
>>>
>>> - why sqe->len is __u32? Splice uses size_t, and I think it's better
>>> to have something wider (e.g. u64) for fututre use. That's the story
>>> behind added sqe->splice_len.
>>
>> IO operations in Linux generally are INT_MAX, so the u32 is plenty big.
>> That's why I chose it. For this specifically, if you look at splice:
>>
>> 	if (unlikely(len > MAX_RW_COUNT))
>> 		len = MAX_RW_COUNT;
>>
>> so anything larger is truncated anyway.
> 
> Yeah, I saw this one, but that was rather an argument for the future.
> It's pretty easy to transfer more than 4GB with sg list, but that
> would be the case for splice.

I don't see this changing, ever, basically. And probably not a big deal,
if you want to do more than 2GB worth of IO, you simply splice them over
multiple commands. At those sizes, the overhead there is negligible.

>>> - it requires 2 fds, and it's painful. Currently file managing is done
>>> by common path (e.g. io_req_set_file(), __io_req_aux_free()). I'm
>>> thinking to make each opcode function handle file grabbing/putting
>>> themself with some helpers, as it's done in the patch for splice's
>>> out-file.
>>>     1. Opcode handler knows, whether it have/needs a file, and thus
>>>        doesn't need extra checks done in common path.
>>>     2. It will be more consistent with splice.
>>> Objections? Ideas?
>>
>> Sounds reasonable to me, but always easier to judge in patch form :-)
>>
>>> - do we need offset pointers with fallback to file->f_pos? Or is it
>>> enough to have offset value. Jens, I remember you added the first
>>> option somewhere, could you tell the reasoning?
>>
>> I recently added support for -1/cur position, which splice also uses. So
>> you should be fine with that.
>>
> 
> I always have been thinking about it as a legacy from old days, and
> one of the problems of posix. It's not hard to count it in the
> userspace especially in C++ or high-level languages, and is just
> another obstacle for having a performant API. So, I'd rather get rid
> of it here. But is there any reasons against?

It's not always trivial to do in libraries, or programming languages
even. That's why it exists. I would not expect anyone to use it outside
of that.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [POC RFC 0/3] splice(2) support for io_uring
@ 2020-04-06 19:09 Askar Safin
  2020-04-06 20:01 ` Pavel Begunkov
  0 siblings, 1 reply; 6+ messages in thread
From: Askar Safin @ 2020-04-06 19:09 UTC (permalink / raw)
  To: asml.silence; +Cc: io-uring, linux-kernel, linux-fsdevel

Hi. Thanks for your splice io_uring patch. Maybe it will be good idea to add uring operation, which will unify splice, sendfile and copy_file_range instead of just IORING_OP_SPLICE?
==
Askar Safin
https://github.com/safinaskar

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [POC RFC 0/3] splice(2) support for io_uring
  2020-04-06 19:09 [POC RFC 0/3] splice(2) support for io_uring Askar Safin
@ 2020-04-06 20:01 ` Pavel Begunkov
  0 siblings, 0 replies; 6+ messages in thread
From: Pavel Begunkov @ 2020-04-06 20:01 UTC (permalink / raw)
  To: Askar Safin; +Cc: io-uring, linux-kernel, linux-fsdevel, Jens Axboe


[-- Attachment #1.1: Type: text/plain, Size: 682 bytes --]

On 06/04/2020 22:09, Askar Safin wrote:
> Hi. Thanks for your splice io_uring patch. Maybe it will be good idea to add uring operation, which will unify splice, sendfile and copy_file_range instead of just IORING_OP_SPLICE?

It doesn't have to follow splice(2) semantics, so can be extended, in theory.

Though I don't see much profit in doing that for now. sendfile(2) is done by
splicing through an internal pipe, and this pipe will complicate things for
io-wq (i.e. io_uring's thread pool). On the other hand, it'd be of the same
performance and even more flexible to send 2 linked splice requests with a
pre-allocated pipe from the userspace.

-- 
Pavel Begunkov


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-04-06 20:02 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-04-06 19:09 [POC RFC 0/3] splice(2) support for io_uring Askar Safin
2020-04-06 20:01 ` Pavel Begunkov
  -- strict thread matches above, loose matches on Subject: below --
2020-01-22  0:05 Pavel Begunkov
2020-01-22  1:55 ` Jens Axboe
2020-01-22  3:11   ` Pavel Begunkov
2020-01-22  3:30     ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox