* [PATCH] io_uring: read/readv must commit ring mapped buffers upfront
@ 2022-06-16 1:55 Jens Axboe
2022-06-16 4:41 ` Hao Xu
0 siblings, 1 reply; 3+ messages in thread
From: Jens Axboe @ 2022-06-16 1:55 UTC (permalink / raw)
To: io-uring; +Cc: Pavel Begunkov
For recv/recvmsg, IO either completes immediately or gets queued for a
retry. This isn't the case for read/readv, if eg a normal file or a block
device is used. Here, an operation can get queued with the block layer.
If this happens, ring mapped buffers must get committed immediately to
avoid that the next read can consume the same buffer.
Add an io_op_def flag for this, buffer_ring_commit. If set, when a mapped
buffer is selected, it is immediately committed.
Fixes: c7fb19428d67 ("io_uring: add support for ring mapped supplied buffers")
Signed-off-by: Jens Axboe <[email protected]>
---
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 5d479428d8e5..05703bcf73fd 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1098,6 +1098,8 @@ struct io_op_def {
unsigned poll_exclusive : 1;
/* op supports buffer selection */
unsigned buffer_select : 1;
+ /* op needs immediate commit of ring mapped buffers */
+ unsigned buffer_ring_commit : 1;
/* do prep async if is going to be punted */
unsigned needs_async_setup : 1;
/* opcode is not supported by this kernel */
@@ -1122,6 +1124,7 @@ static const struct io_op_def io_op_defs[] = {
.unbound_nonreg_file = 1,
.pollin = 1,
.buffer_select = 1,
+ .buffer_ring_commit = 1,
.needs_async_setup = 1,
.plug = 1,
.audit_skip = 1,
@@ -1239,6 +1242,7 @@ static const struct io_op_def io_op_defs[] = {
.unbound_nonreg_file = 1,
.pollin = 1,
.buffer_select = 1,
+ .buffer_ring_commit = 1,
.plug = 1,
.audit_skip = 1,
.ioprio = 1,
@@ -3836,7 +3840,8 @@ static void __user *io_ring_buffer_select(struct io_kiocb *req, size_t *len,
req->buf_list = bl;
req->buf_index = buf->bid;
- if (issue_flags & IO_URING_F_UNLOCKED) {
+ if (issue_flags & IO_URING_F_UNLOCKED ||
+ io_op_defs[req->opcode].buffer_ring_commit) {
/*
* If we came in unlocked, we have no choice but to consume the
* buffer here. This does mean it'll be pinned until the IO
--
Jens Axboe
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH] io_uring: read/readv must commit ring mapped buffers upfront
2022-06-16 1:55 [PATCH] io_uring: read/readv must commit ring mapped buffers upfront Jens Axboe
@ 2022-06-16 4:41 ` Hao Xu
2022-06-16 11:14 ` Jens Axboe
0 siblings, 1 reply; 3+ messages in thread
From: Hao Xu @ 2022-06-16 4:41 UTC (permalink / raw)
To: Jens Axboe, io-uring; +Cc: Pavel Begunkov
On 6/16/22 09:55, Jens Axboe wrote:
> For recv/recvmsg, IO either completes immediately or gets queued for a
> retry. This isn't the case for read/readv, if eg a normal file or a block
> device is used. Here, an operation can get queued with the block layer.
> If this happens, ring mapped buffers must get committed immediately to
> avoid that the next read can consume the same buffer.
>
> Add an io_op_def flag for this, buffer_ring_commit. If set, when a mapped
> buffer is selected, it is immediately committed.
>
> Fixes: c7fb19428d67 ("io_uring: add support for ring mapped supplied buffers")
> Signed-off-by: Jens Axboe <[email protected]>
>
> ---
>
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 5d479428d8e5..05703bcf73fd 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -1098,6 +1098,8 @@ struct io_op_def {
> unsigned poll_exclusive : 1;
> /* op supports buffer selection */
> unsigned buffer_select : 1;
> + /* op needs immediate commit of ring mapped buffers */
> + unsigned buffer_ring_commit : 1;
> /* do prep async if is going to be punted */
> unsigned needs_async_setup : 1;
> /* opcode is not supported by this kernel */
> @@ -1122,6 +1124,7 @@ static const struct io_op_def io_op_defs[] = {
> .unbound_nonreg_file = 1,
> .pollin = 1,
> .buffer_select = 1,
> + .buffer_ring_commit = 1,
> .needs_async_setup = 1,
> .plug = 1,
> .audit_skip = 1,
> @@ -1239,6 +1242,7 @@ static const struct io_op_def io_op_defs[] = {
> .unbound_nonreg_file = 1,
> .pollin = 1,
> .buffer_select = 1,
> + .buffer_ring_commit = 1,
> .plug = 1,
> .audit_skip = 1,
> .ioprio = 1,
This way we also commit the buffer for read(sockfd) unconditionally.
Would it be better to commit buffer only for read(reg/blk fd) ?
> @@ -3836,7 +3840,8 @@ static void __user *io_ring_buffer_select(struct io_kiocb *req, size_t *len,
> req->buf_list = bl;
> req->buf_index = buf->bid;
>
> - if (issue_flags & IO_URING_F_UNLOCKED) {
> + if (issue_flags & IO_URING_F_UNLOCKED ||
> + io_op_defs[req->opcode].buffer_ring_commit) {
> /*
> * If we came in unlocked, we have no choice but to consume the
> * buffer here. This does mean it'll be pinned until the IO
>
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] io_uring: read/readv must commit ring mapped buffers upfront
2022-06-16 4:41 ` Hao Xu
@ 2022-06-16 11:14 ` Jens Axboe
0 siblings, 0 replies; 3+ messages in thread
From: Jens Axboe @ 2022-06-16 11:14 UTC (permalink / raw)
To: Hao Xu, io-uring; +Cc: Pavel Begunkov
On 6/15/22 10:41 PM, Hao Xu wrote:
> On 6/16/22 09:55, Jens Axboe wrote:
>> For recv/recvmsg, IO either completes immediately or gets queued for a
>> retry. This isn't the case for read/readv, if eg a normal file or a block
>> device is used. Here, an operation can get queued with the block layer.
>> If this happens, ring mapped buffers must get committed immediately to
>> avoid that the next read can consume the same buffer.
>>
>> Add an io_op_def flag for this, buffer_ring_commit. If set, when a mapped
>> buffer is selected, it is immediately committed.
>>
>> Fixes: c7fb19428d67 ("io_uring: add support for ring mapped supplied buffers")
>> Signed-off-by: Jens Axboe <[email protected]>
>>
>> ---
>>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index 5d479428d8e5..05703bcf73fd 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -1098,6 +1098,8 @@ struct io_op_def {
>> unsigned poll_exclusive : 1;
>> /* op supports buffer selection */
>> unsigned buffer_select : 1;
>> + /* op needs immediate commit of ring mapped buffers */
>> + unsigned buffer_ring_commit : 1;
>> /* do prep async if is going to be punted */
>> unsigned needs_async_setup : 1;
>> /* opcode is not supported by this kernel */
>> @@ -1122,6 +1124,7 @@ static const struct io_op_def io_op_defs[] = {
>> .unbound_nonreg_file = 1,
>> .pollin = 1,
>> .buffer_select = 1,
>> + .buffer_ring_commit = 1,
>> .needs_async_setup = 1,
>> .plug = 1,
>> .audit_skip = 1,
>> @@ -1239,6 +1242,7 @@ static const struct io_op_def io_op_defs[] = {
>> .unbound_nonreg_file = 1,
>> .pollin = 1,
>> .buffer_select = 1,
>> + .buffer_ring_commit = 1,
>> .plug = 1,
>> .audit_skip = 1,
>> .ioprio = 1,
>
>
> This way we also commit the buffer for read(sockfd) unconditionally.
> Would it be better to commit buffer only for read(reg/blk fd) ?
Right, it will. We could potentially make it look at if the file is
pollable or not, downside is that then the logic needs to go into
io_read() - or if not, the buffer selection will need implied knowledge
of what the consumer might do.
I don't worry too much about sockets, generally I expect people to use
recv/recvmsg for those. But we do have other pollable file types, like
pipes, so it would be nice to make it consistent with the model that is
"don't consume if you did no IO and will retry".
--
Jens Axboe
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2022-06-16 11:14 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-06-16 1:55 [PATCH] io_uring: read/readv must commit ring mapped buffers upfront Jens Axboe
2022-06-16 4:41 ` Hao Xu
2022-06-16 11:14 ` Jens Axboe
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox