* [PATCH for-6.1] io_uring/net: don't skip notifs for failed requests
From: Pavel Begunkov @ 2022-09-27 23:51 UTC
To: io-uring; +Cc: Jens Axboe, asml.silence, Stefan Metzmacher
We currently only add a notification CQE when the send succeeded, i.e.
cqe.res >= 0. However, it'd be more robust to do buffer notifications
for failed requests as well in case drivers decide to do something funky.
Always return a buffer notification after initial prep, don't hide it.
This behaviour is better aligned with documentation and the patch also
helps the userspace to respect it.
Cc: [email protected] # 6.0
Suggested-by: Stefan Metzmacher <[email protected]>
Signed-off-by: Pavel Begunkov <[email protected]>
---
We need it as soon as possible, and it's likely almost time
for 6.1 rcs.
io_uring/net.c | 29 ++++++++---------------------
1 file changed, 8 insertions(+), 21 deletions(-)
diff --git a/io_uring/net.c b/io_uring/net.c
index 6b69eff6887e..5058a9fc9e9c 100644
--- a/io_uring/net.c
+++ b/io_uring/net.c
@@ -916,7 +916,6 @@ void io_send_zc_cleanup(struct io_kiocb *req)
kfree(io->free_iov);
}
if (zc->notif) {
- zc->notif->flags |= REQ_F_CQE_SKIP;
io_notif_flush(zc->notif);
zc->notif = NULL;
}
@@ -1047,7 +1046,7 @@ int io_send_zc(struct io_kiocb *req, unsigned int issue_flags)
struct msghdr msg;
struct iovec iov;
struct socket *sock;
- unsigned msg_flags, cflags;
+ unsigned msg_flags;
int ret, min_ret = 0;
sock = sock_from_file(req->file);
@@ -1115,8 +1114,6 @@ int io_send_zc(struct io_kiocb *req, unsigned int issue_flags)
req->flags |= REQ_F_PARTIAL_IO;
return io_setup_async_addr(req, &__address, issue_flags);
}
- if (ret < 0 && !zc->done_io)
- zc->notif->flags |= REQ_F_CQE_SKIP;
if (ret == -ERESTARTSYS)
ret = -EINTR;
req_set_fail(req);
@@ -1129,8 +1126,7 @@ int io_send_zc(struct io_kiocb *req, unsigned int issue_flags)
io_notif_flush(zc->notif);
req->flags &= ~REQ_F_NEED_CLEANUP;
- cflags = ret >= 0 ? IORING_CQE_F_MORE : 0;
- io_req_set_res(req, ret, cflags);
+ io_req_set_res(req, ret, IORING_CQE_F_MORE);
return IOU_OK;
}
@@ -1139,7 +1135,7 @@ int io_sendmsg_zc(struct io_kiocb *req, unsigned int issue_flags)
struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg);
struct io_async_msghdr iomsg, *kmsg;
struct socket *sock;
- unsigned flags, cflags;
+ unsigned flags;
int ret, min_ret = 0;
sock = sock_from_file(req->file);
@@ -1178,8 +1174,6 @@ int io_sendmsg_zc(struct io_kiocb *req, unsigned int issue_flags)
req->flags |= REQ_F_PARTIAL_IO;
return io_setup_async_msg(req, kmsg, issue_flags);
}
- if (ret < 0 && !sr->done_io)
- sr->notif->flags |= REQ_F_CQE_SKIP;
if (ret == -ERESTARTSYS)
ret = -EINTR;
req_set_fail(req);
@@ -1196,27 +1190,20 @@ int io_sendmsg_zc(struct io_kiocb *req, unsigned int issue_flags)
io_notif_flush(sr->notif);
req->flags &= ~REQ_F_NEED_CLEANUP;
- cflags = ret >= 0 ? IORING_CQE_F_MORE : 0;
- io_req_set_res(req, ret, cflags);
+ io_req_set_res(req, ret, IORING_CQE_F_MORE);
return IOU_OK;
}
void io_sendrecv_fail(struct io_kiocb *req)
{
struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg);
- int res = req->cqe.res;
if (req->flags & REQ_F_PARTIAL_IO)
- res = sr->done_io;
+ req->cqe.res = sr->done_io;
+
if ((req->flags & REQ_F_NEED_CLEANUP) &&
- (req->opcode == IORING_OP_SEND_ZC || req->opcode == IORING_OP_SENDMSG_ZC)) {
- /* preserve notification for partial I/O */
- if (res < 0)
- sr->notif->flags |= REQ_F_CQE_SKIP;
- io_notif_flush(sr->notif);
- sr->notif = NULL;
- }
- io_req_set_res(req, res, req->cqe.flags);
+ (req->opcode == IORING_OP_SEND_ZC || req->opcode == IORING_OP_SENDMSG_ZC))
+ req->cqe.flags |= IORING_CQE_F_MORE;
}
int io_accept_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
--
2.37.2
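
For illustration, the userspace side of this rule can be sketched as
follows. This is a minimal sketch, not code from the patch: struct toy_cqe,
struct toy_zc_send and toy_zc_handle_cqe() are hypothetical names, and the
two TOY_CQE_F_* constants are assumed to mirror IORING_CQE_F_MORE and
IORING_CQE_F_NOTIF from the uapi header. The helper keys buffer reuse on
the MORE flag rather than on cqe.res, which is the behaviour the patch is
meant to make reliable for failed sends.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror IORING_CQE_F_MORE / IORING_CQE_F_NOTIF (illustrative copies). */
#define TOY_CQE_F_MORE  (1U << 1)
#define TOY_CQE_F_NOTIF (1U << 3)

/* Hypothetical stand-in for the fields of struct io_uring_cqe used here. */
struct toy_cqe {
	uint64_t user_data;
	int32_t res;
	uint32_t flags;
};

/* Hypothetical per-request bookkeeping kept by the application. */
struct toy_zc_send {
	bool have_result;	/* the request completion CQE has been seen */
	bool buffer_busy;	/* the kernel may still reference the buffer */
	int32_t result;		/* cqe.res of the request completion */
};

static void toy_zc_send_start(struct toy_zc_send *s)
{
	assert(s != NULL);
	s->have_result = false;
	s->buffer_busy = true;	/* assume busy until told otherwise */
	s->result = 0;
}

/*
 * Feed one CQE belonging to this send. Returns true once the result is
 * known and the buffer may be reused or freed.
 */
static bool toy_zc_handle_cqe(struct toy_zc_send *s, const struct toy_cqe *cqe)
{
	assert(s != NULL && cqe != NULL);

	if (cqe->flags & TOY_CQE_F_NOTIF) {
		/* Buffer notification: the kernel is done with the memory. */
		s->buffer_busy = false;
	} else {
		/* Request completion, successful or failed. */
		s->result = cqe->res;
		s->have_result = true;
		/* Without MORE no notification will follow. */
		if (!(cqe->flags & TOY_CQE_F_MORE))
			s->buffer_busy = false;
	}
	return s->have_result && !s->buffer_busy;
}
```

Under these assumptions, a send that fails after prep completes with a
negative res and MORE set, so the helper keeps the buffer reserved until
the notification arrives; a completion without MORE releases the buffer
immediately.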
* Re: [PATCH for-6.1] io_uring/net: don't skip notifs for failed requests
From: Jens Axboe @ 2022-09-28 13:53 UTC
To: io-uring, Pavel Begunkov; +Cc: Stefan Metzmacher
On Wed, 28 Sep 2022 00:51:49 +0100, Pavel Begunkov wrote:
> We currently only add a notification CQE when the send succeeded, i.e.
> cqe.res >= 0. However, it'd be more robust to do buffer notifications
> for failed requests as well in case drivers decide to do something funky.
>
> Always return a buffer notification after initial prep, don't hide it.
> This behaviour is better aligned with documentation and the patch also
> helps the userspace to respect it.
>
> [...]
Applied, thanks!
[1/1] io_uring/net: don't skip notifs for failed requests
commit: 6ae91ac9a6aa7d6005c3c6d0f4d263fbab9f377f
Best regards,
--
Jens Axboe
* Re: [PATCH for-6.1] io_uring/net: don't skip notifs for failed requests
From: Stefan Metzmacher @ 2022-09-28 15:23 UTC
To: Pavel Begunkov, io-uring; +Cc: Jens Axboe
Hi Pavel,
> We currently only add a notification CQE when the send succeeded, i.e.
> cqe.res >= 0. However, it'd be more robust to do buffer notifications
> for failed requests as well in case drivers decide to do something funky.
>
> Always return a buffer notification after initial prep, don't hide it.
> This behaviour is better aligned with documentation and the patch also
> helps the userspace to respect it.
Just as reference, this was the version I was testing with:
https://git.samba.org/?p=metze/linux/wip.git;a=commitdiff;h=7ffb896cdb8ccd55065f7ffae9fb8050e39211c7
> void io_sendrecv_fail(struct io_kiocb *req)
> {
> struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg);
> - int res = req->cqe.res;
>
> if (req->flags & REQ_F_PARTIAL_IO)
> - res = sr->done_io;
> + req->cqe.res = sr->done_io;
> +
> if ((req->flags & REQ_F_NEED_CLEANUP) &&
> - (req->opcode == IORING_OP_SEND_ZC || req->opcode == IORING_OP_SENDMSG_ZC)) {
> - /* preserve notification for partial I/O */
> - if (res < 0)
> - sr->notif->flags |= REQ_F_CQE_SKIP;
> - io_notif_flush(sr->notif);
> - sr->notif = NULL;
Here we rely on io_send_zc_cleanup(), correct?
Note that I hit a very bad problem during my tests of SENDMSG_ZC.
BUG(); in first_iovec_segment() triggered very easily.
The problem is io_setup_async_msg() in the partial retry case,
which seems to happen more often with _ZC.
if (!async_msg->free_iov)
async_msg->msg.msg_iter.iov = async_msg->fast_iov;
This is wrong; it needs to be something like this:
+ if (!kmsg->free_iov) {
+ size_t fast_idx = kmsg->msg.msg_iter.iov - kmsg->fast_iov;
+ async_msg->msg.msg_iter.iov = &async_msg->fast_iov[fast_idx];
+ }
This is because iov_iter_iovec_advance() may change i->iov so that
i->iov_offset is relative only to the first remaining element.
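
To illustrate the point, here is a minimal userspace sketch, not kernel
code. struct toy_msg, toy_advance() and both copy helpers are hypothetical
stand-ins for the iterator and the inline fast_iov array; they only model
the pointer arithmetic. After a partial send the cursor no longer points at
element 0, so a copy has to carry the cursor index over rather than reset
it to the start of the array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_FAST_IOV 8

/* Hypothetical stand-in for struct iovec. */
struct toy_iovec {
	void *base;
	size_t len;
};

/* Hypothetical stand-in for the message state: cursor plus inline array. */
struct toy_msg {
	const struct toy_iovec *iov;	/* cursor: first remaining segment */
	size_t nr_segs;			/* segments left from the cursor */
	size_t iov_offset;		/* offset into the segment at the cursor */
	struct toy_iovec fast_iov[TOY_FAST_IOV];
};

static void toy_init(struct toy_msg *m, const size_t *lens, size_t n)
{
	assert(n <= TOY_FAST_IOV);
	memset(m, 0, sizeof(*m));
	for (size_t i = 0; i < n; i++)
		m->fast_iov[i].len = lens[i];
	m->iov = m->fast_iov;
	m->nr_segs = n;
}

/* Skip fully consumed segments so the offset refers to the cursor segment. */
static void toy_advance(struct toy_msg *m, size_t bytes)
{
	bytes += m->iov_offset;
	while (m->nr_segs && bytes >= m->iov->len) {
		bytes -= m->iov->len;
		m->iov++;
		m->nr_segs--;
	}
	m->iov_offset = m->nr_segs ? bytes : 0;
}

/* Models the problem: the cursor is reset, but offset and nr_segs are not. */
static void toy_copy_buggy(struct toy_msg *dst, const struct toy_msg *src)
{
	memcpy(dst, src, sizeof(*dst));
	dst->iov = dst->fast_iov;
}

/* Models the proposed fix: rebase the cursor by its index in fast_iov. */
static void toy_copy_rebased(struct toy_msg *dst, const struct toy_msg *src)
{
	size_t fast_idx = (size_t)(src->iov - src->fast_iov);

	memcpy(dst, src, sizeof(*dst));
	dst->iov = &dst->fast_iov[fast_idx];
}
```

With segment lengths 4, 0, 64, 32, 8388608 and 2000000 bytes consumed, the
cursor sits at index 4. The rebased copy keeps it there, while the buggy
copy pairs the old offset with the 4-byte first segment.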
I'm not sure about the 'kmsg->free_iov' case: do we reuse the
caller's memory or should we make a copy?
I initially used this
https://git.samba.org/?p=metze/linux/wip.git;a=commitdiff;h=e1d3a9f5c7708a37172d258753ed7377eaac9e33
But I didn't test with the non-fast_iov case.
BTW: I tested with 5 vectors with lengths like this: 4, 0, 64, 32, 8388608
and got a short write of about 2000000.
I'm not sure if it was already a problem before:
commit 257e84a5377fbbc336ff563833a8712619acce56
io_uring: refactor sendmsg/recvmsg iov managing
But I guess it was a potential problem before starting with
7ba89d2af17aa879dda30f5d5d3f152e587fc551 where io_net_retry()
was introduced.
metze
* Re: [PATCH for-6.1] io_uring/net: don't skip notifs for failed requests
From: Pavel Begunkov @ 2022-09-28 16:56 UTC
To: Stefan Metzmacher, io-uring; +Cc: Jens Axboe
On 9/28/22 16:23, Stefan Metzmacher wrote:
>
> Hi Pavel,
>
>> We currently only add a notification CQE when the send succeeded, i.e.
>> cqe.res >= 0. However, it'd be more robust to do buffer notifications
>> for failed requests as well in case drivers decide to do something funky.
>>
>> Always return a buffer notification after initial prep, don't hide it.
>> This behaviour is better aligned with documentation and the patch also
>> helps the userspace to respect it.
>
> Just as reference, this was the version I was testing with:
> https://git.samba.org/?p=metze/linux/wip.git;a=commitdiff;h=7ffb896cdb8ccd55065f7ffae9fb8050e39211c7
>
>> void io_sendrecv_fail(struct io_kiocb *req)
>> {
>> struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg);
>> - int res = req->cqe.res;
>> if (req->flags & REQ_F_PARTIAL_IO)
>> - res = sr->done_io;
>> + req->cqe.res = sr->done_io;
>> +
>> if ((req->flags & REQ_F_NEED_CLEANUP) &&
>> - (req->opcode == IORING_OP_SEND_ZC || req->opcode == IORING_OP_SENDMSG_ZC)) {
>> - /* preserve notification for partial I/O */
>> - if (res < 0)
>> - sr->notif->flags |= REQ_F_CQE_SKIP;
>> - io_notif_flush(sr->notif);
>> - sr->notif = NULL;
>
> Here we rely on io_send_zc_cleanup(), correct?
>
> Note that I hit a very bad problem during my tests of SENDMSG_ZC.
> BUG(); in first_iovec_segment() triggered very easily.
> The problem is io_setup_async_msg() in the partial retry case,
> which seems to happen more often with _ZC.
>
> if (!async_msg->free_iov)
> async_msg->msg.msg_iter.iov = async_msg->fast_iov;
>
> This is wrong; it needs to be something like this:
>
> + if (!kmsg->free_iov) {
> + size_t fast_idx = kmsg->msg.msg_iter.iov - kmsg->fast_iov;
> + async_msg->msg.msg_iter.iov = &async_msg->fast_iov[fast_idx];
> + }
I agree, it doesn't look right. It indeed needs something like
io_uring/rw.c:io_req_map_rw()
> This is because iov_iter_iovec_advance() may change i->iov so that
> i->iov_offset is relative only to the first remaining element.
>
> I'm not sure about the 'kmsg->free_iov' case: do we reuse the
> caller's memory or should we make a copy?
> I initially used this
> https://git.samba.org/?p=metze/linux/wip.git;a=commitdiff;h=e1d3a9f5c7708a37172d258753ed7377eaac9e33
> But I didn't test with the non-fast_iov case.
>
> BTW: I tested with 5 vectors with lengths like this: 4, 0, 64, 32, 8388608
> and got a short write of about 2000000.
>
> I'm not sure if it was already a problem before:
>
> commit 257e84a5377fbbc336ff563833a8712619acce56
> io_uring: refactor sendmsg/recvmsg iov managing
>
> But I guess it was a potential problem before starting with
> 7ba89d2af17aa879dda30f5d5d3f152e587fc551 where io_net_retry()
> was introduced.
>
> metze
--
Pavel Begunkov
* Re: [PATCH for-6.1] io_uring/net: don't skip notifs for failed requests
From: Pavel Begunkov @ 2022-09-28 18:58 UTC
To: Stefan Metzmacher, io-uring; +Cc: Jens Axboe
On 9/28/22 17:56, Pavel Begunkov wrote:
> On 9/28/22 16:23, Stefan Metzmacher wrote:
>> Just as reference, this was the version I was testing with:
>> https://git.samba.org/?p=metze/linux/wip.git;a=commitdiff;h=7ffb896cdb8ccd55065f7ffae9fb8050e39211c7
>>
>>> void io_sendrecv_fail(struct io_kiocb *req)
>>> {
>>> struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg);
>>> - int res = req->cqe.res;
>>> if (req->flags & REQ_F_PARTIAL_IO)
>>> - res = sr->done_io;
>>> + req->cqe.res = sr->done_io;
>>> +
>>> if ((req->flags & REQ_F_NEED_CLEANUP) &&
>>> - (req->opcode == IORING_OP_SEND_ZC || req->opcode == IORING_OP_SENDMSG_ZC)) {
>>> - /* preserve notification for partial I/O */
>>> - if (res < 0)
>>> - sr->notif->flags |= REQ_F_CQE_SKIP;
>>> - io_notif_flush(sr->notif);
>>> - sr->notif = NULL;
>>
>> Here we rely on io_send_zc_cleanup(), correct?
Right
>> Note that I hit a very bad problem during my tests of SENDMSG_ZC.
>> BUG(); in first_iovec_segment() triggered very easily.
>> The problem is io_setup_async_msg() in the partial retry case,
>> which seems to happen more often with _ZC.
>>
>> if (!async_msg->free_iov)
>> async_msg->msg.msg_iter.iov = async_msg->fast_iov;
>>
>> This is wrong; it needs to be something like this:
>>
>> + if (!kmsg->free_iov) {
>> + size_t fast_idx = kmsg->msg.msg_iter.iov - kmsg->fast_iov;
>> + async_msg->msg.msg_iter.iov = &async_msg->fast_iov[fast_idx];
>> + }
>
> I agree, it doesn't look right. It indeed needs something like
> io_uring/rw.c:io_req_map_rw()
I took a closer look: that chunk above looks good and matches
io_req_map_rw() apart from non-essential differences. Can you
send a patch?
>> This is because iov_iter_iovec_advance() may change i->iov so that
>> i->iov_offset is relative only to the first remaining element.
>>
>> I'm not sure about the 'kmsg->free_iov' case: do we reuse the
>> caller's memory or should we make a copy?
We can reuse it, we own it and it's immutable from
the iter perspective.
>> BTW: I tested with 5 vectors with lengths like this: 4, 0, 64, 32, 8388608
>> and got a short write of about 2000000.
That's interesting to know. What does 2M mean here? Does it
consistently retry when sending more than 2M bytes?
--
Pavel Begunkov