public inbox for io-uring@vger.kernel.org
* [PATCH] io_uring/net: don't fail linked ops when done_io > 0
From: Hannes Furmans @ 2026-02-26 22:03 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring, linux-kernel, stable, Hannes Furmans

When io_uring recv/send with MSG_WAITALL accumulates partial data
through done_io and then encounters an error or EOF, req_set_fail()
sets REQ_F_FAIL despite the CQE result being positive (done_io bytes).
io_disarm_next() then sees REQ_F_FAIL and cancels all linked operations
with -ECANCELED, even though the user-visible result indicates success.

This manifests in two code paths:

1) Direct completion: io_recv/io_send fall through to req_set_fail()
   when ret < min_ret, even if done_io > 0. The CQE shows done_io
   (positive) but REQ_F_FAIL severs the link chain.

2) io-wq fallback: after APOLL_MAX_RETRY (128) poll retries, the
   request moves to io-wq. io_recv returns IOU_RETRY from the
   MSG_WAITALL retry path, io-wq fails the request with -EAGAIN, and
   io_req_defer_failed -> io_sendrecv_fail overwrites cqe.res with
   done_io but leaves REQ_F_FAIL set.

Fix this by:
- Not calling req_set_fail() when done_io > 0 in io_recv, io_recvmsg,
  io_send, io_sendmsg, io_send_zc, io_sendmsg_zc
- Clearing REQ_F_FAIL in io_sendrecv_fail() when done_io > 0

This makes MSG_WAITALL partial completions consistent with
non-MSG_WAITALL behavior, where a positive result never severs an
IOSQE_IO_LINK chain.
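
The decision described above can be sketched as a small userspace model.
This is an illustration only, not kernel code: struct model_req,
MODEL_F_FAIL, model_complete() and model_linked_res() are hypothetical
names invented here, and the model assumes the short-transfer branch
(ret < min_ret) has already been taken.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for REQ_F_FAIL; not the kernel's definition. */
#define MODEL_F_FAIL 0x1u

/* Hypothetical, heavily reduced view of a send/recv request. */
struct model_req {
	unsigned int flags;
	int done_io;	/* bytes accumulated by earlier partial transfers */
	int res;	/* value that ends up in the CQE */
};

/*
 * Model of completing a short MSG_WAITALL transfer.
 * patched == false: always mark the request failed (old behavior).
 * patched == true:  mark it failed only if nothing was transferred.
 */
void model_complete(struct model_req *req, int ret, bool patched)
{
	if (!patched || !req->done_io)
		req->flags |= MODEL_F_FAIL;

	/* The CQE result reports accumulated bytes whenever there are any. */
	if (ret >= 0)
		ret += req->done_io;
	else if (req->done_io)
		ret = req->done_io;
	req->res = ret;
}

/* Model of the link: a failed head cancels the linked request. */
int model_linked_res(const struct model_req *head, int linked_ok_res)
{
	return (head->flags & MODEL_F_FAIL) ? -ECANCELED : linked_ok_res;
}
```

With done_io = 4 and ret = 0 (EOF), the unpatched model posts res = 4 yet
cancels the linked request, while the patched model posts res = 4 and lets
the link proceed. With done_io = 0 and a negative ret, both variants still
fail the request and cancel the link.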

Reproducer: link a MSG_WAITALL recv to a write with IOSQE_IO_LINK on a
UNIX socketpair whose sender closes after sending only part of the
requested data. The recv CQE shows a positive byte count, but the linked
write completes with -ECANCELED.
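
For comparison, the socket setup of the reproducer can be sketched with
plain recv(2) instead of io_uring. This is not the reproducer itself and
does not exercise the link chain. It only shows, assuming a SOCK_STREAM
UNIX socketpair, that a MSG_WAITALL read cut short by EOF reports a
positive byte count. The function name waitall_partial_recv() is
hypothetical.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Ask for 16 bytes with MSG_WAITALL while the peer sends only 4 bytes and
 * then closes. Returns what recv() reports, or -1 if the setup fails.
 */
long waitall_partial_recv(void)
{
	int sv[2];
	char buf[16];
	ssize_t n;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;

	/* Sender: partial data, then close so the receiver sees EOF. */
	if (write(sv[1], "abcd", 4) != 4) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}
	close(sv[1]);

	/* Receiver: MSG_WAITALL cannot be satisfied, so the read is short. */
	n = recv(sv[0], buf, sizeof(buf), MSG_WAITALL);
	close(sv[0]);
	return (long)n;
}
```

The short count is a success from the caller's point of view, which is the
result the recv CQE reports in the io_uring case as well.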

Fixes: 0031275d119e ("io_uring: call req_set_fail_links() on short send[msg]()/recv[msg]() with MSG_WAITALL")
Cc: stable@vger.kernel.org
Signed-off-by: Hannes Furmans <hannes@stillwind.ai>
---
 io_uring/net.c | 22 +++++++++++++++-------
 1 file changed, 15 insertions(+), 7 deletions(-)

diff --git a/io_uring/net.c b/io_uring/net.c
index 8576c6cb2236..ebe51db34af8 100644
--- a/io_uring/net.c
+++ b/io_uring/net.c
@@ -576,7 +576,8 @@ int io_sendmsg(struct io_kiocb *req, unsigned int issue_flags)
 		}
 		if (ret == -ERESTARTSYS)
 			ret = -EINTR;
-		req_set_fail(req);
+		if (!sr->done_io)
+			req_set_fail(req);
 	}
 	io_req_msg_cleanup(req, issue_flags);
 	if (ret >= 0)
@@ -688,7 +689,8 @@ int io_send(struct io_kiocb *req, unsigned int issue_flags)
 		}
 		if (ret == -ERESTARTSYS)
 			ret = -EINTR;
-		req_set_fail(req);
+		if (!sr->done_io)
+			req_set_fail(req);
 	}
 	if (ret >= 0)
 		ret += sr->done_io;
@@ -1074,7 +1076,8 @@ int io_recvmsg(struct io_kiocb *req, unsigned int issue_flags)
 		}
 		if (ret == -ERESTARTSYS)
 			ret = -EINTR;
-		req_set_fail(req);
+		if (!sr->done_io)
+			req_set_fail(req);
 	} else if ((flags & MSG_WAITALL) && (kmsg->msg.msg_flags & (MSG_TRUNC | MSG_CTRUNC))) {
 		req_set_fail(req);
 	}
@@ -1220,7 +1223,8 @@ int io_recv(struct io_kiocb *req, unsigned int issue_flags)
 		}
 		if (ret == -ERESTARTSYS)
 			ret = -EINTR;
-		req_set_fail(req);
+		if (!sr->done_io)
+			req_set_fail(req);
 	} else if ((flags & MSG_WAITALL) && (kmsg->msg.msg_flags & (MSG_TRUNC | MSG_CTRUNC))) {
 out_free:
 		req_set_fail(req);
@@ -1498,7 +1502,8 @@ int io_send_zc(struct io_kiocb *req, unsigned int issue_flags)
 		}
 		if (ret == -ERESTARTSYS)
 			ret = -EINTR;
-		req_set_fail(req);
+		if (!zc->done_io)
+			req_set_fail(req);
 	}
 
 	if (ret >= 0)
@@ -1570,7 +1575,8 @@ int io_sendmsg_zc(struct io_kiocb *req, unsigned int issue_flags)
 		}
 		if (ret == -ERESTARTSYS)
 			ret = -EINTR;
-		req_set_fail(req);
+		if (!sr->done_io)
+			req_set_fail(req);
 	}
 
 	if (ret >= 0)
@@ -1595,8 +1601,10 @@ void io_sendrecv_fail(struct io_kiocb *req)
 {
 	struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg);
 
-	if (sr->done_io)
+	if (sr->done_io) {
 		req->cqe.res = sr->done_io;
+		req->flags &= ~REQ_F_FAIL;
+	}
 
 	if ((req->flags & REQ_F_NEED_CLEANUP) &&
 	    (req->opcode == IORING_OP_SEND_ZC || req->opcode == IORING_OP_SENDMSG_ZC))
-- 
2.53.0

