public inbox for [email protected]
 help / color / mirror / Atom feed
* [PATCHSET v2 0/2] Fix MSG_WAITALL for IORING_OP_RECV/RECVMSG
@ 2022-03-23 22:41 Jens Axboe
  2022-03-23 22:41 ` [PATCH 1/2] io_uring: ensure recv and recvmsg handle MSG_WAITALL correctly Jens Axboe
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Jens Axboe @ 2022-03-23 22:41 UTC (permalink / raw)
  To: io-uring; +Cc: constantine.gavrilov

Hi,

If we get a partial receive, we don't retry even if MSG_WAITALL is set.
Ensure that we retry for the remainder in that case.

The ordering of patches may look a bit odd here, but it's done this way
to make it easier to handle for the stable backport.

v2:
- Only do it for SOCK_STREAM/SOCK_SEQPACKET

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/2] io_uring: ensure recv and recvmsg handle MSG_WAITALL correctly
  2022-03-23 22:41 [PATCHSET v2 0/2] Fix MSG_WAITALL for IORING_OP_RECV/RECVMSG Jens Axboe
@ 2022-03-23 22:41 ` Jens Axboe
  2022-03-24  0:32   ` Stefan Metzmacher
  2022-03-23 22:41 ` [PATCH 2/2] io_uring: add flag for disabling provided buffer recycling Jens Axboe
  2022-03-25  3:16 ` [PATCHSET v2 0/2] Fix MSG_WAITALL for IORING_OP_RECV/RECVMSG Constantine Gavrilov
  2 siblings, 1 reply; 7+ messages in thread
From: Jens Axboe @ 2022-03-23 22:41 UTC (permalink / raw)
  To: io-uring; +Cc: constantine.gavrilov, Jens Axboe, stable

We currently don't attempt to get the full asked for length even if
MSG_WAITALL is set, if we get a partial receive. If we do see a partial
receive, then just note how many bytes we did and return -EAGAIN to
get it retried.

The iov is advanced appropriately for the vector based case, and we
manually bump the buffer and remainder for the non-vector case.

Cc: [email protected]
Reported-by: Constantine Gavrilov <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
---
 fs/io_uring.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index f41d91ce1fd0..a70de170aea1 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -612,6 +612,7 @@ struct io_sr_msg {
 	int				msg_flags;
 	int				bgid;
 	size_t				len;
+	size_t				done_io;
 };
 
 struct io_open {
@@ -5417,12 +5418,21 @@ static int io_recvmsg_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 	if (req->ctx->compat)
 		sr->msg_flags |= MSG_CMSG_COMPAT;
 #endif
+	sr->done_io = 0;
 	return 0;
 }
 
+static bool io_net_retry(struct socket *sock, int flags)
+{
+	if (!(flags & MSG_WAITALL))
+		return false;
+	return sock->type == SOCK_STREAM || sock->type == SOCK_SEQPACKET;
+}
+
 static int io_recvmsg(struct io_kiocb *req, unsigned int issue_flags)
 {
 	struct io_async_msghdr iomsg, *kmsg;
+	struct io_sr_msg *sr = &req->sr_msg;
 	struct socket *sock;
 	struct io_buffer *kbuf;
 	unsigned flags;
@@ -5465,6 +5475,10 @@ static int io_recvmsg(struct io_kiocb *req, unsigned int issue_flags)
 			return io_setup_async_msg(req, kmsg);
 		if (ret == -ERESTARTSYS)
 			ret = -EINTR;
+		if (ret > 0 && io_net_retry(sock, flags)) {
+			sr->done_io += ret;
+			return io_setup_async_msg(req, kmsg);
+		}
 		req_set_fail(req);
 	} else if ((flags & MSG_WAITALL) && (kmsg->msg.msg_flags & (MSG_TRUNC | MSG_CTRUNC))) {
 		req_set_fail(req);
@@ -5474,6 +5488,10 @@ static int io_recvmsg(struct io_kiocb *req, unsigned int issue_flags)
 	if (kmsg->free_iov)
 		kfree(kmsg->free_iov);
 	req->flags &= ~REQ_F_NEED_CLEANUP;
+	if (ret >= 0)
+		ret += sr->done_io;
+	else if (sr->done_io)
+		ret = sr->done_io;
 	__io_req_complete(req, issue_flags, ret, io_put_kbuf(req, issue_flags));
 	return 0;
 }
@@ -5524,12 +5542,22 @@ static int io_recv(struct io_kiocb *req, unsigned int issue_flags)
 			return -EAGAIN;
 		if (ret == -ERESTARTSYS)
 			ret = -EINTR;
+		if (ret > 0 && io_net_retry(sock, flags)) {
+			sr->len -= ret;
+			sr->buf += ret;
+			sr->done_io += ret;
+			return -EAGAIN;
+		}
 		req_set_fail(req);
 	} else if ((flags & MSG_WAITALL) && (msg.msg_flags & (MSG_TRUNC | MSG_CTRUNC))) {
 out_free:
 		req_set_fail(req);
 	}
 
+	if (ret >= 0)
+		ret += sr->done_io;
+	else if (sr->done_io)
+		ret = sr->done_io;
 	__io_req_complete(req, issue_flags, ret, io_put_kbuf(req, issue_flags));
 	return 0;
 }
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/2] io_uring: add flag for disabling provided buffer recycling
  2022-03-23 22:41 [PATCHSET v2 0/2] Fix MSG_WAITALL for IORING_OP_RECV/RECVMSG Jens Axboe
  2022-03-23 22:41 ` [PATCH 1/2] io_uring: ensure recv and recvmsg handle MSG_WAITALL correctly Jens Axboe
@ 2022-03-23 22:41 ` Jens Axboe
  2022-03-25  3:16 ` [PATCHSET v2 0/2] Fix MSG_WAITALL for IORING_OP_RECV/RECVMSG Constantine Gavrilov
  2 siblings, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2022-03-23 22:41 UTC (permalink / raw)
  To: io-uring; +Cc: constantine.gavrilov, Jens Axboe

If we need to continue doing this IO, then we don't want a potentially
selected buffer recycled. Add a flag for that.

Set this for recv/recvmsg if they do partial IO.

Signed-off-by: Jens Axboe <[email protected]>
---
 fs/io_uring.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index a70de170aea1..88556e654c5a 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -783,6 +783,7 @@ enum {
 	REQ_F_SKIP_LINK_CQES_BIT,
 	REQ_F_SINGLE_POLL_BIT,
 	REQ_F_DOUBLE_POLL_BIT,
+	REQ_F_PARTIAL_IO_BIT,
 	/* keep async read/write and isreg together and in order */
 	REQ_F_SUPPORT_NOWAIT_BIT,
 	REQ_F_ISREG_BIT,
@@ -845,6 +846,8 @@ enum {
 	REQ_F_SINGLE_POLL	= BIT(REQ_F_SINGLE_POLL_BIT),
 	/* double poll may active */
 	REQ_F_DOUBLE_POLL	= BIT(REQ_F_DOUBLE_POLL_BIT),
+	/* request has already done partial IO */
+	REQ_F_PARTIAL_IO	= BIT(REQ_F_PARTIAL_IO_BIT),
 };
 
 struct async_poll {
@@ -1392,6 +1395,9 @@ static void io_kbuf_recycle(struct io_kiocb *req, unsigned issue_flags)
 
 	if (likely(!(req->flags & REQ_F_BUFFER_SELECTED)))
 		return;
+	/* don't recycle if we already did IO to this buffer */
+	if (req->flags & REQ_F_PARTIAL_IO)
+		return;
 
 	if (issue_flags & IO_URING_F_UNLOCKED)
 		mutex_lock(&ctx->uring_lock);
@@ -5477,6 +5483,7 @@ static int io_recvmsg(struct io_kiocb *req, unsigned int issue_flags)
 			ret = -EINTR;
 		if (ret > 0 && io_net_retry(sock, flags)) {
 			sr->done_io += ret;
+			req->flags |= REQ_F_PARTIAL_IO;
 			return io_setup_async_msg(req, kmsg);
 		}
 		req_set_fail(req);
@@ -5546,6 +5553,7 @@ static int io_recv(struct io_kiocb *req, unsigned int issue_flags)
 			sr->len -= ret;
 			sr->buf += ret;
 			sr->done_io += ret;
+			req->flags |= REQ_F_PARTIAL_IO;
 			return -EAGAIN;
 		}
 		req_set_fail(req);
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] io_uring: ensure recv and recvmsg handle MSG_WAITALL correctly
  2022-03-23 22:41 ` [PATCH 1/2] io_uring: ensure recv and recvmsg handle MSG_WAITALL correctly Jens Axboe
@ 2022-03-24  0:32   ` Stefan Metzmacher
  2022-03-24  0:35     ` Jens Axboe
  0 siblings, 1 reply; 7+ messages in thread
From: Stefan Metzmacher @ 2022-03-24  0:32 UTC (permalink / raw)
  To: Jens Axboe, io-uring; +Cc: constantine.gavrilov, stable

Hi Jens,

> @@ -5524,12 +5542,22 @@ static int io_recv(struct io_kiocb *req, unsigned int issue_flags)
>   			return -EAGAIN;
>   		if (ret == -ERESTARTSYS)
>   			ret = -EINTR;
> +		if (ret > 0 && io_net_retry(sock, flags)) {
> +			sr->len -= ret;
> +			sr->buf += ret;
> +			sr->done_io += ret;
> +			return -EAGAIN;
> +		}
>   		req_set_fail(req);
>   	} else if ((flags & MSG_WAITALL) && (msg.msg_flags & (MSG_TRUNC | MSG_CTRUNC))) {
>   out_free:
>   		req_set_fail(req);

The change only affects retry based socket io in the main thread, correct?

The truncated mesages still trigger req_set_fail if MSG_WAITALL was set?

metze


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] io_uring: ensure recv and recvmsg handle MSG_WAITALL correctly
  2022-03-24  0:32   ` Stefan Metzmacher
@ 2022-03-24  0:35     ` Jens Axboe
  0 siblings, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2022-03-24  0:35 UTC (permalink / raw)
  To: Stefan Metzmacher, io-uring; +Cc: constantine.gavrilov, stable

On 3/23/22 6:32 PM, Stefan Metzmacher wrote:
> Hi Jens,
> 
>> @@ -5524,12 +5542,22 @@ static int io_recv(struct io_kiocb *req, unsigned int issue_flags)
>>               return -EAGAIN;
>>           if (ret == -ERESTARTSYS)
>>               ret = -EINTR;
>> +        if (ret > 0 && io_net_retry(sock, flags)) {
>> +            sr->len -= ret;
>> +            sr->buf += ret;
>> +            sr->done_io += ret;
>> +            return -EAGAIN;
>> +        }
>>           req_set_fail(req);
>>       } else if ((flags & MSG_WAITALL) && (msg.msg_flags & (MSG_TRUNC | MSG_CTRUNC))) {
>>   out_free:
>>           req_set_fail(req);
> 
> The change only affects retry based socket io in the main thread, correct?

Not sure I follow - it affects retries for streams based sockets, where
previously they could be broken into two pieces when they should not be.

> The truncated mesages still trigger req_set_fail if MSG_WAITALL was set?

If we don't retry, regardless of flags, then we'll set failure on the
request to break links. If it ends up transferring the whole amount,
regardless of whether or not it happens in one go or not, it will not
fail links.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCHSET v2 0/2] Fix MSG_WAITALL for IORING_OP_RECV/RECVMSG
  2022-03-23 22:41 [PATCHSET v2 0/2] Fix MSG_WAITALL for IORING_OP_RECV/RECVMSG Jens Axboe
  2022-03-23 22:41 ` [PATCH 1/2] io_uring: ensure recv and recvmsg handle MSG_WAITALL correctly Jens Axboe
  2022-03-23 22:41 ` [PATCH 2/2] io_uring: add flag for disabling provided buffer recycling Jens Axboe
@ 2022-03-25  3:16 ` Constantine Gavrilov
  2022-03-25 14:50   ` Jens Axboe
  2 siblings, 1 reply; 7+ messages in thread
From: Constantine Gavrilov @ 2022-03-25  3:16 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring

On Thu, Mar 24, 2022 at 12:41 AM Jens Axboe <[email protected]> wrote:
>
> Hi,
>
> If we get a partial receive, we don't retry even if MSG_WAITALL is set.
> Ensure that we retry for the remainder in that case.
>
> The ordering of patches may look a bit odd here, but it's done this way
> to make it easier to handle for the stable backport.
>
> v2:
> - Only do it for SOCK_STREAM/SOCK_SEQPACKET
>
> --
> Jens Axboe
>
>

Jens:

I was able to test the patch, after making some minor changes to apply
to Fedora kernel. I confirm that the patch works.

-- 
----------------------------------------
Constantine Gavrilov
Storage Architect
Master Inventor
Tel-Aviv IBM Storage Lab
1 Azrieli Center, Tel-Aviv
----------------------------------------

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCHSET v2 0/2] Fix MSG_WAITALL for IORING_OP_RECV/RECVMSG
  2022-03-25  3:16 ` [PATCHSET v2 0/2] Fix MSG_WAITALL for IORING_OP_RECV/RECVMSG Constantine Gavrilov
@ 2022-03-25 14:50   ` Jens Axboe
  0 siblings, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2022-03-25 14:50 UTC (permalink / raw)
  To: Constantine Gavrilov; +Cc: io-uring

On 3/24/22 9:16 PM, Constantine Gavrilov wrote:
> On Thu, Mar 24, 2022 at 12:41 AM Jens Axboe <[email protected]> wrote:
>>
>> Hi,
>>
>> If we get a partial receive, we don't retry even if MSG_WAITALL is set.
>> Ensure that we retry for the remainder in that case.
>>
>> The ordering of patches may look a bit odd here, but it's done this way
>> to make it easier to handle for the stable backport.
>>
>> v2:
>> - Only do it for SOCK_STREAM/SOCK_SEQPACKET
>>
>> --
>> Jens Axboe
>>
>>
> 
> Jens:
> 
> I was able to test the patch, after making some minor changes to apply
> to Fedora kernel. I confirm that the patch works.

Great, thanks for testing! It'll go into the current kernel tree next
week and make its way back to stable after that.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-03-25 14:50 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-03-23 22:41 [PATCHSET v2 0/2] Fix MSG_WAITALL for IORING_OP_RECV/RECVMSG Jens Axboe
2022-03-23 22:41 ` [PATCH 1/2] io_uring: ensure recv and recvmsg handle MSG_WAITALL correctly Jens Axboe
2022-03-24  0:32   ` Stefan Metzmacher
2022-03-24  0:35     ` Jens Axboe
2022-03-23 22:41 ` [PATCH 2/2] io_uring: add flag for disabling provided buffer recycling Jens Axboe
2022-03-25  3:16 ` [PATCHSET v2 0/2] Fix MSG_WAITALL for IORING_OP_RECV/RECVMSG Constantine Gavrilov
2022-03-25 14:50   ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox