public inbox for [email protected]
* [PATCH v2 0/2] async hybrid for pollable requests
@ 2021-10-18 11:29 Hao Xu
  2021-10-18 11:29 ` [PATCH 1/2] io_uring: split logic of force_nonblock Hao Xu
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Hao Xu @ 2021-10-18 11:29 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring, Pavel Begunkov, Joseph Qi

1/2 is a prep patch, 2/2 is the main one.
The good thing: see the commit message.
The side effect: for the normal io-worker path, this adds two ifs and two
local variables; for the FORCE_ASYNC path, it adds three ifs and several
dereferences.

I think that is fine since the io-worker path is not the fast path, and
the benefit of this patchset is worth it.

Btw, we need to tweak io-cancel.c a bit, but that's not a big problem.
liburing tests will come later.

v1-->v2:
 - split the logic of force_nonblock
 - tweak the added code in io_wq_submit_work to reduce overhead,
   based on Pavel's comments.

Hao Xu (2):
  io_uring: split logic of force_nonblock
  io_uring: implement async hybrid mode for pollable requests

 fs/io_uring.c                 | 85 ++++++++++++++++++++++++++---------
 include/uapi/linux/io_uring.h |  4 +-
 2 files changed, 66 insertions(+), 23 deletions(-)

-- 
2.24.4



* [PATCH 1/2] io_uring: split logic of force_nonblock
  2021-10-18 11:29 [PATCH v2 0/2] async hybrid for pollable requests Hao Xu
@ 2021-10-18 11:29 ` Hao Xu
  2021-10-18 12:13   ` Pavel Begunkov
  2021-10-18 12:27   ` Pavel Begunkov
  2021-10-18 11:29 ` [PATCH 2/2] io_uring: implement async hybrid mode for pollable requests Hao Xu
  2021-10-18 12:31 ` [PATCH v2 0/2] async hybrid " Pavel Begunkov
  2 siblings, 2 replies; 12+ messages in thread
From: Hao Xu @ 2021-10-18 11:29 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring, Pavel Begunkov, Joseph Qi

Currently force_nonblock stands for two meanings:
 - nowait or not
 - in an io-worker or not (i.e. whether uring_lock is held)

Let's split the logic into two flags, IO_URING_F_NONBLOCK and
IO_URING_F_UNLOCKED, for the convenience of the next patch.
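To make the intended use concrete, here is a condensed sketch of how an
opcode handler reads the two flags after the split (illustrative only, not a
complete kernel function; example_op() is a made-up name):

static int example_op(struct io_kiocb *req, unsigned int issue_flags)
{
        /* nowait semantics: return -EAGAIN instead of blocking */
        bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
        /* ->uring_lock is not held (e.g. we run from an io-worker) */
        bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;

        io_ring_submit_lock(req->ctx, needs_lock);
        /* ... do the actual work, honouring force_nonblock ... */
        io_ring_submit_unlock(req->ctx, needs_lock);
        return 0;
}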

Signed-off-by: Hao Xu <[email protected]>
---
 fs/io_uring.c | 50 ++++++++++++++++++++++++++++----------------------
 1 file changed, 28 insertions(+), 22 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index b6da03c26122..727cad6c36fc 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -199,6 +199,7 @@ struct io_rings {
 
 enum io_uring_cmd_flags {
 	IO_URING_F_COMPLETE_DEFER	= 1,
+	IO_URING_F_UNLOCKED		= 2,
 	/* int's last bit, sign checks are usually faster than a bit test */
 	IO_URING_F_NONBLOCK		= INT_MIN,
 };
@@ -2926,7 +2927,7 @@ static void kiocb_done(struct kiocb *kiocb, ssize_t ret,
 			struct io_ring_ctx *ctx = req->ctx;
 
 			req_set_fail(req);
-			if (!(issue_flags & IO_URING_F_NONBLOCK)) {
+			if (issue_flags & IO_URING_F_UNLOCKED) {
 				mutex_lock(&ctx->uring_lock);
 				__io_req_complete(req, issue_flags, ret, cflags);
 				mutex_unlock(&ctx->uring_lock);
@@ -3036,7 +3037,7 @@ static struct io_buffer *io_buffer_select(struct io_kiocb *req, size_t *len,
 {
 	struct io_buffer *kbuf = req->kbuf;
 	struct io_buffer *head;
-	bool needs_lock = !(issue_flags & IO_URING_F_NONBLOCK);
+	bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
 
 	if (req->flags & REQ_F_BUFFER_SELECTED)
 		return kbuf;
@@ -3341,7 +3342,7 @@ static inline int io_rw_prep_async(struct io_kiocb *req, int rw)
 	int ret;
 
 	/* submission path, ->uring_lock should already be taken */
-	ret = io_import_iovec(rw, req, &iov, &iorw->s, IO_URING_F_NONBLOCK);
+	ret = io_import_iovec(rw, req, &iov, &iorw->s, 0);
 	if (unlikely(ret < 0))
 		return ret;
 
@@ -3452,6 +3453,7 @@ static int io_read(struct io_kiocb *req, unsigned int issue_flags)
 	struct iovec *iovec;
 	struct kiocb *kiocb = &req->rw.kiocb;
 	bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
+	bool in_worker = issue_flags & IO_URING_F_UNLOCKED;
 	struct io_async_rw *rw;
 	ssize_t ret, ret2;
 
@@ -3495,7 +3497,7 @@ static int io_read(struct io_kiocb *req, unsigned int issue_flags)
 	if (ret == -EAGAIN || (req->flags & REQ_F_REISSUE)) {
 		req->flags &= ~REQ_F_REISSUE;
 		/* IOPOLL retry should happen for io-wq threads */
-		if (!force_nonblock && !(req->ctx->flags & IORING_SETUP_IOPOLL))
+		if (in_worker && !(req->ctx->flags & IORING_SETUP_IOPOLL))
 			goto done;
 		/* no retry on NONBLOCK nor RWF_NOWAIT */
 		if (req->flags & REQ_F_NOWAIT)
@@ -3503,7 +3505,7 @@ static int io_read(struct io_kiocb *req, unsigned int issue_flags)
 		ret = 0;
 	} else if (ret == -EIOCBQUEUED) {
 		goto out_free;
-	} else if (ret == req->result || ret <= 0 || !force_nonblock ||
+	} else if (ret == req->result || ret <= 0 || in_worker ||
 		   (req->flags & REQ_F_NOWAIT) || !need_read_all(req)) {
 		/* read all, failed, already did sync or don't want to retry */
 		goto done;
@@ -3581,6 +3583,7 @@ static int io_write(struct io_kiocb *req, unsigned int issue_flags)
 	struct iovec *iovec;
 	struct kiocb *kiocb = &req->rw.kiocb;
 	bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
+	bool in_worker = issue_flags & IO_URING_F_UNLOCKED;
 	ssize_t ret, ret2;
 
 	if (!req_has_async_data(req)) {
@@ -3651,7 +3654,7 @@ static int io_write(struct io_kiocb *req, unsigned int issue_flags)
 	/* no retry on NONBLOCK nor RWF_NOWAIT */
 	if (ret2 == -EAGAIN && (req->flags & REQ_F_NOWAIT))
 		goto done;
-	if (!force_nonblock || ret2 != -EAGAIN) {
+	if (in_worker || ret2 != -EAGAIN) {
 		/* IOPOLL retry should happen for io-wq threads */
 		if ((req->ctx->flags & IORING_SETUP_IOPOLL) && ret2 == -EAGAIN)
 			goto copy_iov;
@@ -4314,9 +4317,9 @@ static int io_remove_buffers(struct io_kiocb *req, unsigned int issue_flags)
 	struct io_ring_ctx *ctx = req->ctx;
 	struct io_buffer *head;
 	int ret = 0;
-	bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
+	bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
 
-	io_ring_submit_lock(ctx, !force_nonblock);
+	io_ring_submit_lock(ctx, needs_lock);
 
 	lockdep_assert_held(&ctx->uring_lock);
 
@@ -4329,7 +4332,7 @@ static int io_remove_buffers(struct io_kiocb *req, unsigned int issue_flags)
 
 	/* complete before unlock, IOPOLL may need the lock */
 	__io_req_complete(req, issue_flags, ret, 0);
-	io_ring_submit_unlock(ctx, !force_nonblock);
+	io_ring_submit_unlock(ctx, needs_lock);
 	return 0;
 }
 
@@ -4401,9 +4404,9 @@ static int io_provide_buffers(struct io_kiocb *req, unsigned int issue_flags)
 	struct io_ring_ctx *ctx = req->ctx;
 	struct io_buffer *head, *list;
 	int ret = 0;
-	bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
+	bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
 
-	io_ring_submit_lock(ctx, !force_nonblock);
+	io_ring_submit_lock(ctx, needs_lock);
 
 	lockdep_assert_held(&ctx->uring_lock);
 
@@ -4419,7 +4422,7 @@ static int io_provide_buffers(struct io_kiocb *req, unsigned int issue_flags)
 		req_set_fail(req);
 	/* complete before unlock, IOPOLL may need the lock */
 	__io_req_complete(req, issue_flags, ret, 0);
-	io_ring_submit_unlock(ctx, !force_nonblock);
+	io_ring_submit_unlock(ctx, needs_lock);
 	return 0;
 }
 
@@ -6279,6 +6282,7 @@ static int io_async_cancel(struct io_kiocb *req, unsigned int issue_flags)
 {
 	struct io_ring_ctx *ctx = req->ctx;
 	u64 sqe_addr = req->cancel.addr;
+	bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
 	struct io_tctx_node *node;
 	int ret;
 
@@ -6287,7 +6291,7 @@ static int io_async_cancel(struct io_kiocb *req, unsigned int issue_flags)
 		goto done;
 
 	/* slow path, try all io-wq's */
-	io_ring_submit_lock(ctx, !(issue_flags & IO_URING_F_NONBLOCK));
+	io_ring_submit_lock(ctx, needs_lock);
 	ret = -ENOENT;
 	list_for_each_entry(node, &ctx->tctx_list, ctx_node) {
 		struct io_uring_task *tctx = node->task->io_uring;
@@ -6296,7 +6300,7 @@ static int io_async_cancel(struct io_kiocb *req, unsigned int issue_flags)
 		if (ret != -ENOENT)
 			break;
 	}
-	io_ring_submit_unlock(ctx, !(issue_flags & IO_URING_F_NONBLOCK));
+	io_ring_submit_unlock(ctx, needs_lock);
 done:
 	if (ret < 0)
 		req_set_fail(req);
@@ -6323,6 +6327,7 @@ static int io_rsrc_update_prep(struct io_kiocb *req,
 static int io_files_update(struct io_kiocb *req, unsigned int issue_flags)
 {
 	struct io_ring_ctx *ctx = req->ctx;
+	bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
 	struct io_uring_rsrc_update2 up;
 	int ret;
 
@@ -6332,10 +6337,10 @@ static int io_files_update(struct io_kiocb *req, unsigned int issue_flags)
 	up.tags = 0;
 	up.resv = 0;
 
-	io_ring_submit_lock(ctx, !(issue_flags & IO_URING_F_NONBLOCK));
+	io_ring_submit_lock(ctx, needs_lock);
 	ret = __io_register_rsrc_update(ctx, IORING_RSRC_FILE,
 					&up, req->rsrc_update.nr_args);
-	io_ring_submit_unlock(ctx, !(issue_flags & IO_URING_F_NONBLOCK));
+	io_ring_submit_unlock(ctx, needs_lock);
 
 	if (ret < 0)
 		req_set_fail(req);
@@ -6745,7 +6750,7 @@ static void io_wq_submit_work(struct io_wq_work *work)
 
 	if (!ret) {
 		do {
-			ret = io_issue_sqe(req, 0);
+			ret = io_issue_sqe(req, IO_URING_F_UNLOCKED);
 			/*
 			 * We can get EAGAIN for polled IO even though we're
 			 * forcing a sync submission from here, since we can't
@@ -8333,12 +8338,12 @@ static int io_install_fixed_file(struct io_kiocb *req, struct file *file,
 				 unsigned int issue_flags, u32 slot_index)
 {
 	struct io_ring_ctx *ctx = req->ctx;
-	bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
+	bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
 	bool needs_switch = false;
 	struct io_fixed_file *file_slot;
 	int ret = -EBADF;
 
-	io_ring_submit_lock(ctx, !force_nonblock);
+	io_ring_submit_lock(ctx, needs_lock);
 	if (file->f_op == &io_uring_fops)
 		goto err;
 	ret = -ENXIO;
@@ -8379,7 +8384,7 @@ static int io_install_fixed_file(struct io_kiocb *req, struct file *file,
 err:
 	if (needs_switch)
 		io_rsrc_node_switch(ctx, ctx->file_data);
-	io_ring_submit_unlock(ctx, !force_nonblock);
+	io_ring_submit_unlock(ctx, needs_lock);
 	if (ret)
 		fput(file);
 	return ret;
@@ -8389,11 +8394,12 @@ static int io_close_fixed(struct io_kiocb *req, unsigned int issue_flags)
 {
 	unsigned int offset = req->close.file_slot - 1;
 	struct io_ring_ctx *ctx = req->ctx;
+	bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
 	struct io_fixed_file *file_slot;
 	struct file *file;
 	int ret, i;
 
-	io_ring_submit_lock(ctx, !(issue_flags & IO_URING_F_NONBLOCK));
+	io_ring_submit_lock(ctx, needs_lock);
 	ret = -ENXIO;
 	if (unlikely(!ctx->file_data))
 		goto out;
@@ -8419,7 +8425,7 @@ static int io_close_fixed(struct io_kiocb *req, unsigned int issue_flags)
 	io_rsrc_node_switch(ctx, ctx->file_data);
 	ret = 0;
 out:
-	io_ring_submit_unlock(ctx, !(issue_flags & IO_URING_F_NONBLOCK));
+	io_ring_submit_unlock(ctx, needs_lock);
 	return ret;
 }
 
-- 
2.24.4



* [PATCH 2/2] io_uring: implement async hybrid mode for pollable requests
  2021-10-18 11:29 [PATCH v2 0/2] async hybrid for pollable requests Hao Xu
  2021-10-18 11:29 ` [PATCH 1/2] io_uring: split logic of force_nonblock Hao Xu
@ 2021-10-18 11:29 ` Hao Xu
  2021-10-18 11:34   ` Hao Xu
  2021-10-18 12:31 ` [PATCH v2 0/2] async hybrid " Pavel Begunkov
  2 siblings, 1 reply; 12+ messages in thread
From: Hao Xu @ 2021-10-18 11:29 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring, Pavel Begunkov, Joseph Qi

The current logic for requests with IOSQE_ASYNC is to first queue them to
an io-worker and then execute them there synchronously. For unbound work
like pollable requests (e.g. reads/writes on a socket fd), the io-worker
may get stuck waiting for events for a long time, and thus other work in
the list waits for a long time too.
Let's introduce a new mode for unbound work (currently pollable requests):
a request is still queued to an io-worker first, but is then issued with a
nonblock attempt rather than synchronously. If that fails, the request
arms the poll infrastructure and the worker can move on to other work.
The detailed flow for this kind of request is:

step1: original context:
           queue it to io-worker
step2: io-worker context:
           nonblock try (the old logic is a synchronous try here)
               |
               |--fail--> arm poll
                            |
                            |--(fail/ready)-->synchronous issue
                            |
                            |--(succeed)-->worker finishes its job, tw
                                           takes over the req

This works much better than the old IOSQE_ASYNC logic in cases where the
unbound max_worker limit is relatively small. In that case, the number of
io-workers easily climbs to max_worker, new workers cannot be created, and
the running workers are stuck handling old work in IOSQE_ASYNC mode.

On my 64-core machine, with unbound max_worker set to 20, an echo-server
run shows:
(arguments: register_file, connection number is 1000, message size is 12
bytes)
original IOSQE_ASYNC: 76664.151 tps
after this patch: 166934.985 tps
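For reference, here is a rough user-space sketch of the scenario benchmarked
above. It is not part of this series: it assumes liburing's
io_uring_register_iowq_max_workers() helper and uses the IOSQE_ASYNC_HYBRID
flag added by this patch, so treat it as illustrative only.

#include <liburing.h>

/* Cap unbound io-workers and submit a recv on a socket fd with the
 * proposed IOSQE_ASYNC_HYBRID flag. Illustrative sketch only. */
static int setup_ring(struct io_uring *ring)
{
        /* values[0]: bounded workers (0 = leave unchanged),
         * values[1]: unbounded workers */
        unsigned int max_workers[2] = { 0, 20 };
        int ret;

        ret = io_uring_queue_init(256, ring, 0);
        if (ret)
                return ret;
        /* limit unbound io-workers to 20, as in the numbers above */
        return io_uring_register_iowq_max_workers(ring, max_workers);
}

static int queue_hybrid_recv(struct io_uring *ring, int sockfd,
                             void *buf, size_t len)
{
        struct io_uring_sqe *sqe = io_uring_get_sqe(ring);

        if (!sqe)
                return -EBUSY;
        io_uring_prep_recv(sqe, sockfd, buf, len, 0);
        /* the request still goes to io-wq first, but the worker does a
         * nonblock try and arms poll instead of blocking on the socket */
        sqe->flags |= IOSQE_ASYNC_HYBRID;
        return io_uring_submit(ring);
}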

Suggested-by: Jens Axboe <[email protected]>
Signed-off-by: Hao Xu <[email protected]>
---
 fs/io_uring.c                 | 39 +++++++++++++++++++++++++++++++++--
 include/uapi/linux/io_uring.h |  4 +++-
 2 files changed, 40 insertions(+), 3 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 727cad6c36fc..a3247a5dafc9 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -3497,7 +3497,8 @@ static int io_read(struct io_kiocb *req, unsigned int issue_flags)
 	if (ret == -EAGAIN || (req->flags & REQ_F_REISSUE)) {
 		req->flags &= ~REQ_F_REISSUE;
 		/* IOPOLL retry should happen for io-wq threads */
-		if (in_worker && !(req->ctx->flags & IORING_SETUP_IOPOLL))
+		if (in_worker && !force_nonblock &&
+		    !(req->ctx->flags & IORING_SETUP_IOPOLL))
 			goto done;
 		/* no retry on NONBLOCK nor RWF_NOWAIT */
 		if (req->flags & REQ_F_NOWAIT)
@@ -6749,8 +6750,18 @@ static void io_wq_submit_work(struct io_wq_work *work)
 		ret = -ECANCELED;
 
 	if (!ret) {
+		bool needs_poll = false;
+		unsigned int issue_flags = IO_URING_F_UNLOCKED;
+
+		if (req->flags & REQ_F_FORCE_ASYNC) {
+			needs_poll = req->file && file_can_poll(req->file);
+			if (needs_poll)
+				issue_flags |= IO_URING_F_NONBLOCK;
+		}
+
 		do {
-			ret = io_issue_sqe(req, IO_URING_F_UNLOCKED);
+issue_sqe:
+			ret = io_issue_sqe(req, issue_flags);
 			/*
 			 * We can get EAGAIN for polled IO even though we're
 			 * forcing a sync submission from here, since we can't
@@ -6758,6 +6769,30 @@ static void io_wq_submit_work(struct io_wq_work *work)
 			 */
 			if (ret != -EAGAIN)
 				break;
+			if (needs_poll) {
+				bool armed = false;
+
+				ret = 0;
+				needs_poll = false;
+				issue_flags &= ~IO_URING_F_NONBLOCK;
+
+				switch (io_arm_poll_handler(req)) {
+				case IO_APOLL_READY:
+					goto issue_sqe;
+				case IO_APOLL_ABORTED:
+					/*
+					 * somehow we failed to arm the poll infra,
+					 * fallback it to a normal async worker try.
+					 */
+					break;
+				case IO_APOLL_OK:
+					armed = true;
+					break;
+				}
+
+				if (armed)
+					break;
+			}
 			cond_resched();
 		} while (1);
 	}
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index c45b5e9a9387..3e49a7dbe636 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -70,6 +70,7 @@ enum {
 	IOSQE_IO_HARDLINK_BIT,
 	IOSQE_ASYNC_BIT,
 	IOSQE_BUFFER_SELECT_BIT,
+	IOSQE_ASYNC_HYBRID_BIT,
 };
 
 /*
@@ -87,7 +88,8 @@ enum {
 #define IOSQE_ASYNC		(1U << IOSQE_ASYNC_BIT)
 /* select buffer from sqe->buf_group */
 #define IOSQE_BUFFER_SELECT	(1U << IOSQE_BUFFER_SELECT_BIT)
-
+/* first force async then arm poll */
+#define IOSQE_ASYNC_HYBRID	(1U << IOSQE_ASYNC_HYBRID_BIT)
 /*
  * io_uring_setup() flags
  */
-- 
2.24.4



* Re: [PATCH 2/2] io_uring: implement async hybrid mode for pollable requests
  2021-10-18 11:29 ` [PATCH 2/2] io_uring: implement async hybrid mode for pollable requests Hao Xu
@ 2021-10-18 11:34   ` Hao Xu
  2021-10-18 12:10     ` Pavel Begunkov
  0 siblings, 1 reply; 12+ messages in thread
From: Hao Xu @ 2021-10-18 11:34 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring, Pavel Begunkov, Joseph Qi

On 2021/10/18 7:29 PM, Hao Xu wrote:
> The current logic for requests with IOSQE_ASYNC is to first queue them to
> an io-worker and then execute them there synchronously. For unbound work
> like pollable requests (e.g. reads/writes on a socket fd), the io-worker
> may get stuck waiting for events for a long time, and thus other work in
> the list waits for a long time too.
> Let's introduce a new mode for unbound work (currently pollable requests):
> a request is still queued to an io-worker first, but is then issued with a
> nonblock attempt rather than synchronously. If that fails, the request
> arms the poll infrastructure and the worker can move on to other work.
> The detailed flow for this kind of request is:
> 
> step1: original context:
>            queue it to io-worker
> step2: io-worker context:
>            nonblock try (the old logic is a synchronous try here)
>                |
>                |--fail--> arm poll
>                             |
>                             |--(fail/ready)-->synchronous issue
>                             |
>                             |--(succeed)-->worker finishes its job, tw
>                                            takes over the req
> 
> This works much better than the old IOSQE_ASYNC logic in cases where the
> unbound max_worker limit is relatively small. In that case, the number of
> io-workers easily climbs to max_worker, new workers cannot be created, and
> the running workers are stuck handling old work in IOSQE_ASYNC mode.
> 
> On my 64-core machine, with unbound max_worker set to 20, an echo-server
> run shows:
> (arguments: register_file, connection number is 1000, message size is 12
> bytes)
> original IOSQE_ASYNC: 76664.151 tps
> after this patch: 166934.985 tps
> 
> Suggested-by: Jens Axboe <[email protected]>
> Signed-off-by: Hao Xu <[email protected]>
An unrelated question: why do we do the linked timeout logic in
io_wq_submit_work() again, given that we've already done it in
io_queue_async_work()?


* Re: [PATCH 2/2] io_uring: implement async hybrid mode for pollable requests
  2021-10-18 11:34   ` Hao Xu
@ 2021-10-18 12:10     ` Pavel Begunkov
  2021-10-18 12:20       ` Hao Xu
  0 siblings, 1 reply; 12+ messages in thread
From: Pavel Begunkov @ 2021-10-18 12:10 UTC (permalink / raw)
  To: Hao Xu, Jens Axboe; +Cc: io-uring, Joseph Qi

On 10/18/21 11:34, Hao Xu wrote:
> On 2021/10/18 7:29 PM, Hao Xu wrote:
>> The current logic for requests with IOSQE_ASYNC is to first queue them to
>> an io-worker and then execute them there synchronously. For unbound work
>> like pollable requests (e.g. reads/writes on a socket fd), the io-worker
>> may get stuck waiting for events for a long time, and thus other work in
>> the list waits for a long time too.
>> Let's introduce a new mode for unbound work (currently pollable requests):
>> a request is still queued to an io-worker first, but is then issued with a
>> nonblock attempt rather than synchronously. If that fails, the request
>> arms the poll infrastructure and the worker can move on to other work.
>> The detailed flow for this kind of request is:
>>
>> step1: original context:
>>            queue it to io-worker
>> step2: io-worker context:
>>            nonblock try (the old logic is a synchronous try here)
>>                |
>>                |--fail--> arm poll
>>                             |
>>                             |--(fail/ready)-->synchronous issue
>>                             |
>>                             |--(succeed)-->worker finishes its job, tw
>>                                            takes over the req
>>
>> This works much better than the old IOSQE_ASYNC logic in cases where the
>> unbound max_worker limit is relatively small. In that case, the number of
>> io-workers easily climbs to max_worker, new workers cannot be created, and
>> the running workers are stuck handling old work in IOSQE_ASYNC mode.
>>
>> On my 64-core machine, with unbound max_worker set to 20, an echo-server
>> run shows:
>> (arguments: register_file, connection number is 1000, message size is 12
>> bytes)
>> original IOSQE_ASYNC: 76664.151 tps
>> after this patch: 166934.985 tps
>>
>> Suggested-by: Jens Axboe <[email protected]>
>> Signed-off-by: Hao Xu <[email protected]>
> An unrelated question: why do we do the linked timeout logic in
> io_wq_submit_work() again, given that we've already done it in
> io_queue_async_work()?

Because io_wq_free_work() may enqueue new work (by returning it)
without going through io_queue_async_work(), and we don't care
enough to split those cases.
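In io-wq terms that is roughly the following (a heavily simplified sketch of
the worker loop, not the actual io-wq.c code; get_next_work() is a made-up
placeholder):

        for (;;) {
                struct io_wq_work *work = get_next_work(worker); /* placeholder */

                if (!work)
                        break;
                do {
                        wq->do_work(work);              /* -> io_wq_submit_work() */
                        work = wq->free_work(work);     /* -> io_wq_free_work(): may hand
                                                         * back a linked request as the
                                                         * next work item, which is run
                                                         * right here without ever going
                                                         * through io_queue_async_work() */
                } while (work);
        }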

-- 
Pavel Begunkov


* Re: [PATCH 1/2] io_uring: split logic of force_nonblock
  2021-10-18 11:29 ` [PATCH 1/2] io_uring: split logic of force_nonblock Hao Xu
@ 2021-10-18 12:13   ` Pavel Begunkov
  2021-10-18 12:27   ` Pavel Begunkov
  1 sibling, 0 replies; 12+ messages in thread
From: Pavel Begunkov @ 2021-10-18 12:13 UTC (permalink / raw)
  To: Hao Xu, Jens Axboe; +Cc: io-uring, Joseph Qi

On 10/18/21 11:29, Hao Xu wrote:
> Currently force_nonblock stands for two meanings:
>   - nowait or not
>   - in an io-worker or not (i.e. whether uring_lock is held)

We should have done it long ago. You can send it separately if
it'd help.

One more recently added spot is missing: io_iopoll_req_issued()


> Let's split the logic into two flags, IO_URING_F_NONBLOCK and
> IO_URING_F_UNLOCKED, for the convenience of the next patch.
> 
> Signed-off-by: Hao Xu <[email protected]>
> ---
>   fs/io_uring.c | 50 ++++++++++++++++++++++++++++----------------------
>   1 file changed, 28 insertions(+), 22 deletions(-)
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index b6da03c26122..727cad6c36fc 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -199,6 +199,7 @@ struct io_rings {
>   
>   enum io_uring_cmd_flags {
>   	IO_URING_F_COMPLETE_DEFER	= 1,
> +	IO_URING_F_UNLOCKED		= 2,
>   	/* int's last bit, sign checks are usually faster than a bit test */
>   	IO_URING_F_NONBLOCK		= INT_MIN,
>   };
> @@ -2926,7 +2927,7 @@ static void kiocb_done(struct kiocb *kiocb, ssize_t ret,
>   			struct io_ring_ctx *ctx = req->ctx;
>   
>   			req_set_fail(req);
> -			if (!(issue_flags & IO_URING_F_NONBLOCK)) {
> +			if (issue_flags & IO_URING_F_UNLOCKED) {
>   				mutex_lock(&ctx->uring_lock);
>   				__io_req_complete(req, issue_flags, ret, cflags);
>   				mutex_unlock(&ctx->uring_lock);
> @@ -3036,7 +3037,7 @@ static struct io_buffer *io_buffer_select(struct io_kiocb *req, size_t *len,
>   {
>   	struct io_buffer *kbuf = req->kbuf;
>   	struct io_buffer *head;
> -	bool needs_lock = !(issue_flags & IO_URING_F_NONBLOCK);
> +	bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
>   
>   	if (req->flags & REQ_F_BUFFER_SELECTED)
>   		return kbuf;
> @@ -3341,7 +3342,7 @@ static inline int io_rw_prep_async(struct io_kiocb *req, int rw)
>   	int ret;
>   
>   	/* submission path, ->uring_lock should already be taken */
> -	ret = io_import_iovec(rw, req, &iov, &iorw->s, IO_URING_F_NONBLOCK);
> +	ret = io_import_iovec(rw, req, &iov, &iorw->s, 0);
>   	if (unlikely(ret < 0))
>   		return ret;
>   
> @@ -3452,6 +3453,7 @@ static int io_read(struct io_kiocb *req, unsigned int issue_flags)
>   	struct iovec *iovec;
>   	struct kiocb *kiocb = &req->rw.kiocb;
>   	bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
> +	bool in_worker = issue_flags & IO_URING_F_UNLOCKED;
>   	struct io_async_rw *rw;
>   	ssize_t ret, ret2;
>   
> @@ -3495,7 +3497,7 @@ static int io_read(struct io_kiocb *req, unsigned int issue_flags)
>   	if (ret == -EAGAIN || (req->flags & REQ_F_REISSUE)) {
>   		req->flags &= ~REQ_F_REISSUE;
>   		/* IOPOLL retry should happen for io-wq threads */
> -		if (!force_nonblock && !(req->ctx->flags & IORING_SETUP_IOPOLL))
> +		if (in_worker && !(req->ctx->flags & IORING_SETUP_IOPOLL))
>   			goto done;
>   		/* no retry on NONBLOCK nor RWF_NOWAIT */
>   		if (req->flags & REQ_F_NOWAIT)
> @@ -3503,7 +3505,7 @@ static int io_read(struct io_kiocb *req, unsigned int issue_flags)
>   		ret = 0;
>   	} else if (ret == -EIOCBQUEUED) {
>   		goto out_free;
> -	} else if (ret == req->result || ret <= 0 || !force_nonblock ||
> +	} else if (ret == req->result || ret <= 0 || in_worker ||
>   		   (req->flags & REQ_F_NOWAIT) || !need_read_all(req)) {
>   		/* read all, failed, already did sync or don't want to retry */
>   		goto done;
> @@ -3581,6 +3583,7 @@ static int io_write(struct io_kiocb *req, unsigned int issue_flags)
>   	struct iovec *iovec;
>   	struct kiocb *kiocb = &req->rw.kiocb;
>   	bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
> +	bool in_worker = issue_flags & IO_URING_F_UNLOCKED;
>   	ssize_t ret, ret2;
>   
>   	if (!req_has_async_data(req)) {
> @@ -3651,7 +3654,7 @@ static int io_write(struct io_kiocb *req, unsigned int issue_flags)
>   	/* no retry on NONBLOCK nor RWF_NOWAIT */
>   	if (ret2 == -EAGAIN && (req->flags & REQ_F_NOWAIT))
>   		goto done;
> -	if (!force_nonblock || ret2 != -EAGAIN) {
> +	if (in_worker || ret2 != -EAGAIN) {
>   		/* IOPOLL retry should happen for io-wq threads */
>   		if ((req->ctx->flags & IORING_SETUP_IOPOLL) && ret2 == -EAGAIN)
>   			goto copy_iov;
> @@ -4314,9 +4317,9 @@ static int io_remove_buffers(struct io_kiocb *req, unsigned int issue_flags)
>   	struct io_ring_ctx *ctx = req->ctx;
>   	struct io_buffer *head;
>   	int ret = 0;
> -	bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
> +	bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
>   
> -	io_ring_submit_lock(ctx, !force_nonblock);
> +	io_ring_submit_lock(ctx, needs_lock);
>   
>   	lockdep_assert_held(&ctx->uring_lock);
>   
> @@ -4329,7 +4332,7 @@ static int io_remove_buffers(struct io_kiocb *req, unsigned int issue_flags)
>   
>   	/* complete before unlock, IOPOLL may need the lock */
>   	__io_req_complete(req, issue_flags, ret, 0);
> -	io_ring_submit_unlock(ctx, !force_nonblock);
> +	io_ring_submit_unlock(ctx, needs_lock);
>   	return 0;
>   }
>   
> @@ -4401,9 +4404,9 @@ static int io_provide_buffers(struct io_kiocb *req, unsigned int issue_flags)
>   	struct io_ring_ctx *ctx = req->ctx;
>   	struct io_buffer *head, *list;
>   	int ret = 0;
> -	bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
> +	bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
>   
> -	io_ring_submit_lock(ctx, !force_nonblock);
> +	io_ring_submit_lock(ctx, needs_lock);
>   
>   	lockdep_assert_held(&ctx->uring_lock);
>   
> @@ -4419,7 +4422,7 @@ static int io_provide_buffers(struct io_kiocb *req, unsigned int issue_flags)
>   		req_set_fail(req);
>   	/* complete before unlock, IOPOLL may need the lock */
>   	__io_req_complete(req, issue_flags, ret, 0);
> -	io_ring_submit_unlock(ctx, !force_nonblock);
> +	io_ring_submit_unlock(ctx, needs_lock);
>   	return 0;
>   }
>   
> @@ -6279,6 +6282,7 @@ static int io_async_cancel(struct io_kiocb *req, unsigned int issue_flags)
>   {
>   	struct io_ring_ctx *ctx = req->ctx;
>   	u64 sqe_addr = req->cancel.addr;
> +	bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
>   	struct io_tctx_node *node;
>   	int ret;
>   
> @@ -6287,7 +6291,7 @@ static int io_async_cancel(struct io_kiocb *req, unsigned int issue_flags)
>   		goto done;
>   
>   	/* slow path, try all io-wq's */
> -	io_ring_submit_lock(ctx, !(issue_flags & IO_URING_F_NONBLOCK));
> +	io_ring_submit_lock(ctx, needs_lock);
>   	ret = -ENOENT;
>   	list_for_each_entry(node, &ctx->tctx_list, ctx_node) {
>   		struct io_uring_task *tctx = node->task->io_uring;
> @@ -6296,7 +6300,7 @@ static int io_async_cancel(struct io_kiocb *req, unsigned int issue_flags)
>   		if (ret != -ENOENT)
>   			break;
>   	}
> -	io_ring_submit_unlock(ctx, !(issue_flags & IO_URING_F_NONBLOCK));
> +	io_ring_submit_unlock(ctx, needs_lock);
>   done:
>   	if (ret < 0)
>   		req_set_fail(req);
> @@ -6323,6 +6327,7 @@ static int io_rsrc_update_prep(struct io_kiocb *req,
>   static int io_files_update(struct io_kiocb *req, unsigned int issue_flags)
>   {
>   	struct io_ring_ctx *ctx = req->ctx;
> +	bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
>   	struct io_uring_rsrc_update2 up;
>   	int ret;
>   
> @@ -6332,10 +6337,10 @@ static int io_files_update(struct io_kiocb *req, unsigned int issue_flags)
>   	up.tags = 0;
>   	up.resv = 0;
>   
> -	io_ring_submit_lock(ctx, !(issue_flags & IO_URING_F_NONBLOCK));
> +	io_ring_submit_lock(ctx, needs_lock);
>   	ret = __io_register_rsrc_update(ctx, IORING_RSRC_FILE,
>   					&up, req->rsrc_update.nr_args);
> -	io_ring_submit_unlock(ctx, !(issue_flags & IO_URING_F_NONBLOCK));
> +	io_ring_submit_unlock(ctx, needs_lock);
>   
>   	if (ret < 0)
>   		req_set_fail(req);
> @@ -6745,7 +6750,7 @@ static void io_wq_submit_work(struct io_wq_work *work)
>   
>   	if (!ret) {
>   		do {
> -			ret = io_issue_sqe(req, 0);
> +			ret = io_issue_sqe(req, IO_URING_F_UNLOCKED);
>   			/*
>   			 * We can get EAGAIN for polled IO even though we're
>   			 * forcing a sync submission from here, since we can't
> @@ -8333,12 +8338,12 @@ static int io_install_fixed_file(struct io_kiocb *req, struct file *file,
>   				 unsigned int issue_flags, u32 slot_index)
>   {
>   	struct io_ring_ctx *ctx = req->ctx;
> -	bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
> +	bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
>   	bool needs_switch = false;
>   	struct io_fixed_file *file_slot;
>   	int ret = -EBADF;
>   
> -	io_ring_submit_lock(ctx, !force_nonblock);
> +	io_ring_submit_lock(ctx, needs_lock);
>   	if (file->f_op == &io_uring_fops)
>   		goto err;
>   	ret = -ENXIO;
> @@ -8379,7 +8384,7 @@ static int io_install_fixed_file(struct io_kiocb *req, struct file *file,
>   err:
>   	if (needs_switch)
>   		io_rsrc_node_switch(ctx, ctx->file_data);
> -	io_ring_submit_unlock(ctx, !force_nonblock);
> +	io_ring_submit_unlock(ctx, needs_lock);
>   	if (ret)
>   		fput(file);
>   	return ret;
> @@ -8389,11 +8394,12 @@ static int io_close_fixed(struct io_kiocb *req, unsigned int issue_flags)
>   {
>   	unsigned int offset = req->close.file_slot - 1;
>   	struct io_ring_ctx *ctx = req->ctx;
> +	bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
>   	struct io_fixed_file *file_slot;
>   	struct file *file;
>   	int ret, i;
>   
> -	io_ring_submit_lock(ctx, !(issue_flags & IO_URING_F_NONBLOCK));
> +	io_ring_submit_lock(ctx, needs_lock);
>   	ret = -ENXIO;
>   	if (unlikely(!ctx->file_data))
>   		goto out;
> @@ -8419,7 +8425,7 @@ static int io_close_fixed(struct io_kiocb *req, unsigned int issue_flags)
>   	io_rsrc_node_switch(ctx, ctx->file_data);
>   	ret = 0;
>   out:
> -	io_ring_submit_unlock(ctx, !(issue_flags & IO_URING_F_NONBLOCK));
> +	io_ring_submit_unlock(ctx, needs_lock);
>   	return ret;
>   }
>   
> 

-- 
Pavel Begunkov


* Re: [PATCH 2/2] io_uring: implement async hybrid mode for pollable requests
  2021-10-18 12:10     ` Pavel Begunkov
@ 2021-10-18 12:20       ` Hao Xu
  0 siblings, 0 replies; 12+ messages in thread
From: Hao Xu @ 2021-10-18 12:20 UTC (permalink / raw)
  To: Pavel Begunkov, Jens Axboe; +Cc: io-uring, Joseph Qi

On 2021/10/18 8:10 PM, Pavel Begunkov wrote:
> On 10/18/21 11:34, Hao Xu wrote:
>> On 2021/10/18 7:29 PM, Hao Xu wrote:
>>> The current logic for requests with IOSQE_ASYNC is to first queue them to
>>> an io-worker and then execute them there synchronously. For unbound work
>>> like pollable requests (e.g. reads/writes on a socket fd), the io-worker
>>> may get stuck waiting for events for a long time, and thus other work in
>>> the list waits for a long time too.
>>> Let's introduce a new mode for unbound work (currently pollable requests):
>>> a request is still queued to an io-worker first, but is then issued with a
>>> nonblock attempt rather than synchronously. If that fails, the request
>>> arms the poll infrastructure and the worker can move on to other work.
>>> The detailed flow for this kind of request is:
>>>
>>> step1: original context:
>>>            queue it to io-worker
>>> step2: io-worker context:
>>>            nonblock try (the old logic is a synchronous try here)
>>>                |
>>>                |--fail--> arm poll
>>>                             |
>>>                             |--(fail/ready)-->synchronous issue
>>>                             |
>>>                             |--(succeed)-->worker finishes its job, tw
>>>                                            takes over the req
>>>
>>> This works much better than the old IOSQE_ASYNC logic in cases where the
>>> unbound max_worker limit is relatively small. In that case, the number of
>>> io-workers easily climbs to max_worker, new workers cannot be created, and
>>> the running workers are stuck handling old work in IOSQE_ASYNC mode.
>>>
>>> On my 64-core machine, with unbound max_worker set to 20, an echo-server
>>> run shows:
>>> (arguments: register_file, connection number is 1000, message size is 12
>>> bytes)
>>> original IOSQE_ASYNC: 76664.151 tps
>>> after this patch: 166934.985 tps
>>>
>>> Suggested-by: Jens Axboe <[email protected]>
>>> Signed-off-by: Hao Xu <[email protected]>
>> An unrelated question: why do we do the linked timeout logic in
>> io_wq_submit_work() again, given that we've already done it in
>> io_queue_async_work()?
> 
> Because io_wq_free_work() may enqueue new work (by returning it)
> without going through io_queue_async_work(), and we don't care
> enough to split those cases.
Makes sense. Thanks.
> 



* Re: [PATCH 1/2] io_uring: split logic of force_nonblock
  2021-10-18 11:29 ` [PATCH 1/2] io_uring: split logic of force_nonblock Hao Xu
  2021-10-18 12:13   ` Pavel Begunkov
@ 2021-10-18 12:27   ` Pavel Begunkov
  2021-10-18 13:00     ` Hao Xu
  1 sibling, 1 reply; 12+ messages in thread
From: Pavel Begunkov @ 2021-10-18 12:27 UTC (permalink / raw)
  To: Hao Xu, Jens Axboe; +Cc: io-uring, Joseph Qi

On 10/18/21 11:29, Hao Xu wrote:
> Currently force_nonblock stands for two meanings:
>   - nowait or not
>   - in an io-worker or not (i.e. whether uring_lock is held)
> 
> Let's split the logic into two flags, IO_URING_F_NONBLOCK and
> IO_URING_F_UNLOCKED, for the convenience of the next patch.
> 
> Signed-off-by: Hao Xu <[email protected]>
> ---
>   fs/io_uring.c | 50 ++++++++++++++++++++++++++++----------------------
>   1 file changed, 28 insertions(+), 22 deletions(-)
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index b6da03c26122..727cad6c36fc 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -199,6 +199,7 @@ struct io_rings {
>   
>   enum io_uring_cmd_flags {
>   	IO_URING_F_COMPLETE_DEFER	= 1,
> +	IO_URING_F_UNLOCKED		= 2,
>   	/* int's last bit, sign checks are usually faster than a bit test */
>   	IO_URING_F_NONBLOCK		= INT_MIN,
>   };
> @@ -2926,7 +2927,7 @@ static void kiocb_done(struct kiocb *kiocb, ssize_t ret,
>   			struct io_ring_ctx *ctx = req->ctx;
>   
>   			req_set_fail(req);
> -			if (!(issue_flags & IO_URING_F_NONBLOCK)) {
> +			if (issue_flags & IO_URING_F_UNLOCKED) {
>   				mutex_lock(&ctx->uring_lock);
>   				__io_req_complete(req, issue_flags, ret, cflags);
>   				mutex_unlock(&ctx->uring_lock);
> @@ -3036,7 +3037,7 @@ static struct io_buffer *io_buffer_select(struct io_kiocb *req, size_t *len,
>   {
>   	struct io_buffer *kbuf = req->kbuf;
>   	struct io_buffer *head;
> -	bool needs_lock = !(issue_flags & IO_URING_F_NONBLOCK);
> +	bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
>   
>   	if (req->flags & REQ_F_BUFFER_SELECTED)
>   		return kbuf;
> @@ -3341,7 +3342,7 @@ static inline int io_rw_prep_async(struct io_kiocb *req, int rw)
>   	int ret;
>   
>   	/* submission path, ->uring_lock should already be taken */
> -	ret = io_import_iovec(rw, req, &iov, &iorw->s, IO_URING_F_NONBLOCK);
> +	ret = io_import_iovec(rw, req, &iov, &iorw->s, 0);
>   	if (unlikely(ret < 0))
>   		return ret;
>   
> @@ -3452,6 +3453,7 @@ static int io_read(struct io_kiocb *req, unsigned int issue_flags)
>   	struct iovec *iovec;
>   	struct kiocb *kiocb = &req->rw.kiocb;
>   	bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
> +	bool in_worker = issue_flags & IO_URING_F_UNLOCKED;

io_read() shouldn't have a notion of workers or whatever. I'd say let's
leave only force_nonblock here.

I assume 2/2 relies on it, but if so you can make sure it ends up doing a
sync (!force_nonblock) attempt at some point if all other ways fail.
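One way to read that suggestion, as a sketch of the io-wq side only (under
the assumption that the fallback lives in io_wq_submit_work() rather than in
io_read()/io_write()):

        /* sketch: on -EAGAIN from the nonblock try (after poll arming has
         * failed), drop IO_URING_F_NONBLOCK and redo the issue synchronously,
         * so the rw handlers only ever need to look at force_nonblock */
        do {
                ret = io_issue_sqe(req, issue_flags);
                if (ret != -EAGAIN)
                        break;
                if (issue_flags & IO_URING_F_NONBLOCK) {
                        issue_flags &= ~IO_URING_F_NONBLOCK;
                        continue;
                }
                cond_resched();
        } while (1);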


>   	struct io_async_rw *rw;
>   	ssize_t ret, ret2;
>   
> @@ -3495,7 +3497,7 @@ static int io_read(struct io_kiocb *req, unsigned int issue_flags)
>   	if (ret == -EAGAIN || (req->flags & REQ_F_REISSUE)) {
>   		req->flags &= ~REQ_F_REISSUE;
>   		/* IOPOLL retry should happen for io-wq threads */
> -		if (!force_nonblock && !(req->ctx->flags & IORING_SETUP_IOPOLL))
> +		if (in_worker && !(req->ctx->flags & IORING_SETUP_IOPOLL))
>   			goto done;
>   		/* no retry on NONBLOCK nor RWF_NOWAIT */
>   		if (req->flags & REQ_F_NOWAIT)
> @@ -3503,7 +3505,7 @@ static int io_read(struct io_kiocb *req, unsigned int issue_flags)
>   		ret = 0;
>   	} else if (ret == -EIOCBQUEUED) {
>   		goto out_free;
> -	} else if (ret == req->result || ret <= 0 || !force_nonblock ||
> +	} else if (ret == req->result || ret <= 0 || in_worker ||
>   		   (req->flags & REQ_F_NOWAIT) || !need_read_all(req)) {
>   		/* read all, failed, already did sync or don't want to retry */
>   		goto done;
> @@ -3581,6 +3583,7 @@ static int io_write(struct io_kiocb *req, unsigned int issue_flags)
>   	struct iovec *iovec;
>   	struct kiocb *kiocb = &req->rw.kiocb;
>   	bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
> +	bool in_worker = issue_flags & IO_URING_F_UNLOCKED;
>   	ssize_t ret, ret2;
>   
>   	if (!req_has_async_data(req)) {
> @@ -3651,7 +3654,7 @@ static int io_write(struct io_kiocb *req, unsigned int issue_flags)
>   	/* no retry on NONBLOCK nor RWF_NOWAIT */
>   	if (ret2 == -EAGAIN && (req->flags & REQ_F_NOWAIT))
>   		goto done;
> -	if (!force_nonblock || ret2 != -EAGAIN) {
> +	if (in_worker || ret2 != -EAGAIN) {
>   		/* IOPOLL retry should happen for io-wq threads */
>   		if ((req->ctx->flags & IORING_SETUP_IOPOLL) && ret2 == -EAGAIN)
>   			goto copy_iov;
> @@ -4314,9 +4317,9 @@ static int io_remove_buffers(struct io_kiocb *req, unsigned int issue_flags)
>   	struct io_ring_ctx *ctx = req->ctx;
>   	struct io_buffer *head;
>   	int ret = 0;
> -	bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
> +	bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
>   
> -	io_ring_submit_lock(ctx, !force_nonblock);
> +	io_ring_submit_lock(ctx, needs_lock);
>   
>   	lockdep_assert_held(&ctx->uring_lock);
>   
> @@ -4329,7 +4332,7 @@ static int io_remove_buffers(struct io_kiocb *req, unsigned int issue_flags)
>   
>   	/* complete before unlock, IOPOLL may need the lock */
>   	__io_req_complete(req, issue_flags, ret, 0);
> -	io_ring_submit_unlock(ctx, !force_nonblock);
> +	io_ring_submit_unlock(ctx, needs_lock);
>   	return 0;
>   }
>   
> @@ -4401,9 +4404,9 @@ static int io_provide_buffers(struct io_kiocb *req, unsigned int issue_flags)
>   	struct io_ring_ctx *ctx = req->ctx;
>   	struct io_buffer *head, *list;
>   	int ret = 0;
> -	bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
> +	bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
>   
> -	io_ring_submit_lock(ctx, !force_nonblock);
> +	io_ring_submit_lock(ctx, needs_lock);
>   
>   	lockdep_assert_held(&ctx->uring_lock);
>   
> @@ -4419,7 +4422,7 @@ static int io_provide_buffers(struct io_kiocb *req, unsigned int issue_flags)
>   		req_set_fail(req);
>   	/* complete before unlock, IOPOLL may need the lock */
>   	__io_req_complete(req, issue_flags, ret, 0);
> -	io_ring_submit_unlock(ctx, !force_nonblock);
> +	io_ring_submit_unlock(ctx, needs_lock);
>   	return 0;
>   }
>   
> @@ -6279,6 +6282,7 @@ static int io_async_cancel(struct io_kiocb *req, unsigned int issue_flags)
>   {
>   	struct io_ring_ctx *ctx = req->ctx;
>   	u64 sqe_addr = req->cancel.addr;
> +	bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
>   	struct io_tctx_node *node;
>   	int ret;
>   
> @@ -6287,7 +6291,7 @@ static int io_async_cancel(struct io_kiocb *req, unsigned int issue_flags)
>   		goto done;
>   
>   	/* slow path, try all io-wq's */
> -	io_ring_submit_lock(ctx, !(issue_flags & IO_URING_F_NONBLOCK));
> +	io_ring_submit_lock(ctx, needs_lock);
>   	ret = -ENOENT;
>   	list_for_each_entry(node, &ctx->tctx_list, ctx_node) {
>   		struct io_uring_task *tctx = node->task->io_uring;
> @@ -6296,7 +6300,7 @@ static int io_async_cancel(struct io_kiocb *req, unsigned int issue_flags)
>   		if (ret != -ENOENT)
>   			break;
>   	}
> -	io_ring_submit_unlock(ctx, !(issue_flags & IO_URING_F_NONBLOCK));
> +	io_ring_submit_unlock(ctx, needs_lock);
>   done:
>   	if (ret < 0)
>   		req_set_fail(req);
> @@ -6323,6 +6327,7 @@ static int io_rsrc_update_prep(struct io_kiocb *req,
>   static int io_files_update(struct io_kiocb *req, unsigned int issue_flags)
>   {
>   	struct io_ring_ctx *ctx = req->ctx;
> +	bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
>   	struct io_uring_rsrc_update2 up;
>   	int ret;
>   
> @@ -6332,10 +6337,10 @@ static int io_files_update(struct io_kiocb *req, unsigned int issue_flags)
>   	up.tags = 0;
>   	up.resv = 0;
>   
> -	io_ring_submit_lock(ctx, !(issue_flags & IO_URING_F_NONBLOCK));
> +	io_ring_submit_lock(ctx, needs_lock);
>   	ret = __io_register_rsrc_update(ctx, IORING_RSRC_FILE,
>   					&up, req->rsrc_update.nr_args);
> -	io_ring_submit_unlock(ctx, !(issue_flags & IO_URING_F_NONBLOCK));
> +	io_ring_submit_unlock(ctx, needs_lock);
>   
>   	if (ret < 0)
>   		req_set_fail(req);
> @@ -6745,7 +6750,7 @@ static void io_wq_submit_work(struct io_wq_work *work)
>   
>   	if (!ret) {
>   		do {
> -			ret = io_issue_sqe(req, 0);
> +			ret = io_issue_sqe(req, IO_URING_F_UNLOCKED);
>   			/*
>   			 * We can get EAGAIN for polled IO even though we're
>   			 * forcing a sync submission from here, since we can't
> @@ -8333,12 +8338,12 @@ static int io_install_fixed_file(struct io_kiocb *req, struct file *file,
>   				 unsigned int issue_flags, u32 slot_index)
>   {
>   	struct io_ring_ctx *ctx = req->ctx;
> -	bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
> +	bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
>   	bool needs_switch = false;
>   	struct io_fixed_file *file_slot;
>   	int ret = -EBADF;
>   
> -	io_ring_submit_lock(ctx, !force_nonblock);
> +	io_ring_submit_lock(ctx, needs_lock);
>   	if (file->f_op == &io_uring_fops)
>   		goto err;
>   	ret = -ENXIO;
> @@ -8379,7 +8384,7 @@ static int io_install_fixed_file(struct io_kiocb *req, struct file *file,
>   err:
>   	if (needs_switch)
>   		io_rsrc_node_switch(ctx, ctx->file_data);
> -	io_ring_submit_unlock(ctx, !force_nonblock);
> +	io_ring_submit_unlock(ctx, needs_lock);
>   	if (ret)
>   		fput(file);
>   	return ret;
> @@ -8389,11 +8394,12 @@ static int io_close_fixed(struct io_kiocb *req, unsigned int issue_flags)
>   {
>   	unsigned int offset = req->close.file_slot - 1;
>   	struct io_ring_ctx *ctx = req->ctx;
> +	bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
>   	struct io_fixed_file *file_slot;
>   	struct file *file;
>   	int ret, i;
>   
> -	io_ring_submit_lock(ctx, !(issue_flags & IO_URING_F_NONBLOCK));
> +	io_ring_submit_lock(ctx, needs_lock);
>   	ret = -ENXIO;
>   	if (unlikely(!ctx->file_data))
>   		goto out;
> @@ -8419,7 +8425,7 @@ static int io_close_fixed(struct io_kiocb *req, unsigned int issue_flags)
>   	io_rsrc_node_switch(ctx, ctx->file_data);
>   	ret = 0;
>   out:
> -	io_ring_submit_unlock(ctx, !(issue_flags & IO_URING_F_NONBLOCK));
> +	io_ring_submit_unlock(ctx, needs_lock);
>   	return ret;
>   }
>   
> 

-- 
Pavel Begunkov


* Re: [PATCH v2 0/2] async hybrid for pollable requests
  2021-10-18 11:29 [PATCH v2 0/2] async hybrid for pollable requests Hao Xu
  2021-10-18 11:29 ` [PATCH 1/2] io_uring: split logic of force_nonblock Hao Xu
  2021-10-18 11:29 ` [PATCH 2/2] io_uring: implement async hybrid mode for pollable requests Hao Xu
@ 2021-10-18 12:31 ` Pavel Begunkov
  2021-10-18 12:35   ` Hao Xu
  2021-10-18 13:17   ` Hao Xu
  2 siblings, 2 replies; 12+ messages in thread
From: Pavel Begunkov @ 2021-10-18 12:31 UTC (permalink / raw)
  To: Hao Xu, Jens Axboe; +Cc: io-uring, Joseph Qi

On 10/18/21 11:29, Hao Xu wrote:
> 1/2 is a prep patch, 2/2 is the main one.
> The good thing: see the commit message.
> The side effect: for the normal io-worker path, this adds two ifs and two
> local variables; for the FORCE_ASYNC path, it adds three ifs and several
> dereferences.
> 
> I think that is fine since the io-worker path is not the fast path, and
> the benefit of this patchset is worth it.

We don't care about overhead in io-wq, but it would be better to get rid
of the in_worker in io_read(). See the comments on 1/2.

Btw, you said that it performs better compared to normal
IOSQE_ASYNC. I'm confused, didn't we agree that it can be
merged into IOSQE_ASYNC without extra flags?

> 
> Btw, we need to tweak io-cancel.c a bit, but that's not a big problem.
> liburing tests will come later.
> 
> v1-->v2:
>   - split the logic of force_nonblock
>   - tweak the added code in io_wq_submit_work to reduce overhead,
>     based on Pavel's comments.
> 
> Hao Xu (2):
>    io_uring: split logic of force_nonblock
>    io_uring: implement async hybrid mode for pollable requests
> 
>   fs/io_uring.c                 | 85 ++++++++++++++++++++++++++---------
>   include/uapi/linux/io_uring.h |  4 +-
>   2 files changed, 66 insertions(+), 23 deletions(-)
> 

-- 
Pavel Begunkov


* Re: [PATCH v2 0/2] async hybrid for pollable requests
  2021-10-18 12:31 ` [PATCH v2 0/2] async hybrid " Pavel Begunkov
@ 2021-10-18 12:35   ` Hao Xu
  2021-10-18 13:17   ` Hao Xu
  1 sibling, 0 replies; 12+ messages in thread
From: Hao Xu @ 2021-10-18 12:35 UTC (permalink / raw)
  To: Pavel Begunkov, Jens Axboe; +Cc: io-uring, Joseph Qi

On 2021/10/18 8:31 PM, Pavel Begunkov wrote:
> On 10/18/21 11:29, Hao Xu wrote:
>> 1/2 is a prep patch, 2/2 is the main one.
>> The good thing: see the commit message.
>> The side effect: for the normal io-worker path, this adds two ifs and two
>> local variables; for the FORCE_ASYNC path, it adds three ifs and several
>> dereferences.
>>
>> I think that is fine since the io-worker path is not the fast path, and
>> the benefit of this patchset is worth it.
> 
> We don't care about overhead in io-wq, but it would be better to get rid
> of the in_worker in io_read(). See the comments on 1/2.
> 
> Btw, you said that it performs better compared to normal
> IOSQE_ASYNC. I'm confused, didn't we agree that it can be
> merged into IOSQE_ASYNC without extra flags?
I said 'better than the old IOSQE_ASYNC logic for pollable cases'..
> 
>>
>> Btw, we need to tweak io-cancel.c a bit, but that's not a big problem.
>> liburing tests will come later.
>>
>> v1-->v2:
>>   - split the logic of force_nonblock
>>   - tweak the added code in io_wq_submit_work to reduce overhead,
>>     based on Pavel's comments.
>>
>> Hao Xu (2):
>>    io_uring: split logic of force_nonblock
>>    io_uring: implement async hybrid mode for pollable requests
>>
>>   fs/io_uring.c                 | 85 ++++++++++++++++++++++++++---------
>>   include/uapi/linux/io_uring.h |  4 +-
>>   2 files changed, 66 insertions(+), 23 deletions(-)
>>
> 



* Re: [PATCH 1/2] io_uring: split logic of force_nonblock
  2021-10-18 12:27   ` Pavel Begunkov
@ 2021-10-18 13:00     ` Hao Xu
  0 siblings, 0 replies; 12+ messages in thread
From: Hao Xu @ 2021-10-18 13:00 UTC (permalink / raw)
  To: Pavel Begunkov, Jens Axboe; +Cc: io-uring, Joseph Qi

On 2021/10/18 8:27 PM, Pavel Begunkov wrote:
> On 10/18/21 11:29, Hao Xu wrote:
>> Currently force_nonblock stands for two meanings:
>>   - nowait or not
>>   - in an io-worker or not (i.e. whether uring_lock is held)
>>
>> Let's split the logic into two flags, IO_URING_F_NONBLOCK and
>> IO_URING_F_UNLOCKED, for the convenience of the next patch.
>>
>> Signed-off-by: Hao Xu <[email protected]>
>> ---
>>   fs/io_uring.c | 50 ++++++++++++++++++++++++++++----------------------
>>   1 file changed, 28 insertions(+), 22 deletions(-)
>>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index b6da03c26122..727cad6c36fc 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -199,6 +199,7 @@ struct io_rings {
>>   enum io_uring_cmd_flags {
>>       IO_URING_F_COMPLETE_DEFER    = 1,
>> +    IO_URING_F_UNLOCKED        = 2,
>>       /* int's last bit, sign checks are usually faster than a bit 
>> test */
>>       IO_URING_F_NONBLOCK        = INT_MIN,
>>   };
>> @@ -2926,7 +2927,7 @@ static void kiocb_done(struct kiocb *kiocb, 
>> ssize_t ret,
>>               struct io_ring_ctx *ctx = req->ctx;
>>               req_set_fail(req);
>> -            if (!(issue_flags & IO_URING_F_NONBLOCK)) {
>> +            if (issue_flags & IO_URING_F_UNLOCKED) {
>>                   mutex_lock(&ctx->uring_lock);
>>                   __io_req_complete(req, issue_flags, ret, cflags);
>>                   mutex_unlock(&ctx->uring_lock);
>> @@ -3036,7 +3037,7 @@ static struct io_buffer *io_buffer_select(struct 
>> io_kiocb *req, size_t *len,
>>   {
>>       struct io_buffer *kbuf = req->kbuf;
>>       struct io_buffer *head;
>> -    bool needs_lock = !(issue_flags & IO_URING_F_NONBLOCK);
>> +    bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
>>       if (req->flags & REQ_F_BUFFER_SELECTED)
>>           return kbuf;
>> @@ -3341,7 +3342,7 @@ static inline int io_rw_prep_async(struct 
>> io_kiocb *req, int rw)
>>       int ret;
>>       /* submission path, ->uring_lock should already be taken */
>> -    ret = io_import_iovec(rw, req, &iov, &iorw->s, IO_URING_F_NONBLOCK);
>> +    ret = io_import_iovec(rw, req, &iov, &iorw->s, 0);
>>       if (unlikely(ret < 0))
>>           return ret;
>> @@ -3452,6 +3453,7 @@ static int io_read(struct io_kiocb *req, 
>> unsigned int issue_flags)
>>       struct iovec *iovec;
>>       struct kiocb *kiocb = &req->rw.kiocb;
>>       bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
>> +    bool in_worker = issue_flags & IO_URING_F_UNLOCKED;
> 
> io_read() shouldn't have a notion of workers or whatever. I'd say let's
> leave only force_nonblock here.
> 
> I assume 2/2 relies on it, but if so you can make sure it ends up doing a
> sync (!force_nonblock) attempt at some point if all other ways fail.
I re-read the code and found you're right, will send v3.


* Re: [PATCH v2 0/2] async hybrid for pollable requests
  2021-10-18 12:31 ` [PATCH v2 0/2] async hybrid " Pavel Begunkov
  2021-10-18 12:35   ` Hao Xu
@ 2021-10-18 13:17   ` Hao Xu
  1 sibling, 0 replies; 12+ messages in thread
From: Hao Xu @ 2021-10-18 13:17 UTC (permalink / raw)
  To: Pavel Begunkov, Jens Axboe; +Cc: io-uring, Joseph Qi

On 2021/10/18 8:31 PM, Pavel Begunkov wrote:
> On 10/18/21 11:29, Hao Xu wrote:
>> 1/2 is a prep patch, 2/2 is the main one.
>> The good thing: see the commit message.
>> The side effect: for the normal io-worker path, this adds two ifs and two
>> local variables; for the FORCE_ASYNC path, it adds three ifs and several
>> dereferences.
>>
>> I think that is fine since the io-worker path is not the fast path, and
>> the benefit of this patchset is worth it.
> 
> We don't care about overhead in io-wq, but it would be better to get rid
> of the in_worker in io_read(). See the comments on 1/2.
> 
> Btw, you said that it performs better compared to normal
> IOSQE_ASYNC. I'm confused, didn't we agree that it can be
> merged into IOSQE_ASYNC without extra flags?
I see what you are saying; I forgot to remove the flag
stuff. Will be more careful.
> 
>>
>> Btw, we need to tweak io-cancel.c a bit, but that's not a big problem.
>> liburing tests will come later.
>>
>> v1-->v2:
>>   - split the logic of force_nonblock
>>   - tweak the added code in io_wq_submit_work to reduce overhead,
>>     based on Pavel's comments.
>>
>> Hao Xu (2):
>>    io_uring: split logic of force_nonblock
>>    io_uring: implement async hybrid mode for pollable requests
>>
>>   fs/io_uring.c                 | 85 ++++++++++++++++++++++++++---------
>>   include/uapi/linux/io_uring.h |  4 +-
>>   2 files changed, 66 insertions(+), 23 deletions(-)
>>
> 



end of thread, other threads:[~2021-10-18 13:17 UTC | newest]

Thread overview: 12+ messages
2021-10-18 11:29 [PATCH v2 0/2] async hybrid for pollable requests Hao Xu
2021-10-18 11:29 ` [PATCH 1/2] io_uring: split logic of force_nonblock Hao Xu
2021-10-18 12:13   ` Pavel Begunkov
2021-10-18 12:27   ` Pavel Begunkov
2021-10-18 13:00     ` Hao Xu
2021-10-18 11:29 ` [PATCH 2/2] io_uring: implement async hybrid mode for pollable requests Hao Xu
2021-10-18 11:34   ` Hao Xu
2021-10-18 12:10     ` Pavel Begunkov
2021-10-18 12:20       ` Hao Xu
2021-10-18 12:31 ` [PATCH v2 0/2] async hybrid " Pavel Begunkov
2021-10-18 12:35   ` Hao Xu
2021-10-18 13:17   ` Hao Xu
