* [PATCH for-5.16 0/2] async hybrid, a new way for pollable requests @ 2021-10-08 12:36 Hao Xu 2021-10-08 12:36 ` [PATCH 1/2] io_uring: add IOSQE_ASYNC_HYBRID flag " Hao Xu ` (2 more replies) 0 siblings, 3 replies; 13+ messages in thread From: Hao Xu @ 2021-10-08 12:36 UTC (permalink / raw) To: Jens Axboe; +Cc: io-uring, Pavel Begunkov, Joseph Qi this is a new feature for pollable requests, see detail in commit message. Hao Xu (2): io_uring: add IOSQE_ASYNC_HYBRID flag for pollable requests io_uring: implementation of IOSQE_ASYNC_HYBRID logic fs/io_uring.c | 48 +++++++++++++++++++++++++++++++---- include/uapi/linux/io_uring.h | 4 ++- 2 files changed, 46 insertions(+), 6 deletions(-) -- 2.24.4 ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 1/2] io_uring: add IOSQE_ASYNC_HYBRID flag for pollable requests 2021-10-08 12:36 [PATCH for-5.16 0/2] async hybrid, a new way for pollable requests Hao Xu @ 2021-10-08 12:36 ` Hao Xu 2021-10-08 12:36 ` [PATCH 2/2] io_uring: implementation of IOSQE_ASYNC_HYBRID logic Hao Xu 2021-10-09 12:51 ` [PATCH for-5.16 0/2] async hybrid, a new way for pollable requests Pavel Begunkov 2 siblings, 0 replies; 13+ messages in thread From: Hao Xu @ 2021-10-08 12:36 UTC (permalink / raw) To: Jens Axboe; +Cc: io-uring, Pavel Begunkov, Joseph Qi The current logic of requests with IOSQE_FORCE_ASYNC is first queueing it to io-worker, then execute it in a synchronous way. For unbound works like pollable requests(e.g. read/write a socketfd), the io-worker may stuck there waiting for events for a long time. And thus other works wait in the list for a long time too. Let's introduce a new sqe flag IOSQE_ASYNC_HIBRID for unbound works (currently pollable requests), with this a request will first be queued to io-worker, then executed in a nonblock try rather than a synchronous way. Failure of that leads it to arm poll stuff and then the worker can begin to handle other works. Suggested-by: Jens Axboe <[email protected]> Signed-off-by: Hao Xu <[email protected]> --- fs/io_uring.c | 6 +++++- include/uapi/linux/io_uring.h | 4 +++- 2 files changed, 8 insertions(+), 2 deletions(-) diff --git a/fs/io_uring.c b/fs/io_uring.c index 73135c5c6168..a99f7f46e6d4 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -104,7 +104,8 @@ #define IORING_MAX_REG_BUFFERS (1U << 14) #define SQE_COMMON_FLAGS (IOSQE_FIXED_FILE | IOSQE_IO_LINK | \ - IOSQE_IO_HARDLINK | IOSQE_ASYNC) + IOSQE_IO_HARDLINK | IOSQE_ASYNC | \ + IOSQE_ASYNC_HYBRID) #define SQE_VALID_FLAGS (SQE_COMMON_FLAGS|IOSQE_BUFFER_SELECT|IOSQE_IO_DRAIN) @@ -709,6 +710,7 @@ enum { REQ_F_HARDLINK_BIT = IOSQE_IO_HARDLINK_BIT, REQ_F_FORCE_ASYNC_BIT = IOSQE_ASYNC_BIT, REQ_F_BUFFER_SELECT_BIT = IOSQE_BUFFER_SELECT_BIT, + REQ_F_ASYNC_HYBRID_BIT = IOSQE_ASYNC_HYBRID_BIT, /* first byte is taken by user flags, shift it to not overlap */ REQ_F_FAIL_BIT = 8, @@ -747,6 +749,8 @@ enum { REQ_F_FORCE_ASYNC = BIT(REQ_F_FORCE_ASYNC_BIT), /* IOSQE_BUFFER_SELECT */ REQ_F_BUFFER_SELECT = BIT(REQ_F_BUFFER_SELECT_BIT), + /* IOSQE_ASYNC_HYBRID */ + REQ_F_ASYNC_HYBRID = BIT(REQ_F_ASYNC_HYBRID_BIT), /* fail rest of links */ REQ_F_FAIL = BIT(REQ_F_FAIL_BIT), diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h index c45b5e9a9387..3e49a7dbe636 100644 --- a/include/uapi/linux/io_uring.h +++ b/include/uapi/linux/io_uring.h @@ -70,6 +70,7 @@ enum { IOSQE_IO_HARDLINK_BIT, IOSQE_ASYNC_BIT, IOSQE_BUFFER_SELECT_BIT, + IOSQE_ASYNC_HYBRID_BIT, }; /* @@ -87,7 +88,8 @@ enum { #define IOSQE_ASYNC (1U << IOSQE_ASYNC_BIT) /* select buffer from sqe->buf_group */ #define IOSQE_BUFFER_SELECT (1U << IOSQE_BUFFER_SELECT_BIT) - +/* first force async then arm poll */ +#define IOSQE_ASYNC_HYBRID (1U << IOSQE_ASYNC_HYBRID_BIT) /* * io_uring_setup() flags */ -- 2.24.4 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 2/2] io_uring: implementation of IOSQE_ASYNC_HYBRID logic 2021-10-08 12:36 [PATCH for-5.16 0/2] async hybrid, a new way for pollable requests Hao Xu 2021-10-08 12:36 ` [PATCH 1/2] io_uring: add IOSQE_ASYNC_HYBRID flag " Hao Xu @ 2021-10-08 12:36 ` Hao Xu 2021-10-09 12:46 ` Pavel Begunkov 2021-10-09 12:51 ` [PATCH for-5.16 0/2] async hybrid, a new way for pollable requests Pavel Begunkov 2 siblings, 1 reply; 13+ messages in thread From: Hao Xu @ 2021-10-08 12:36 UTC (permalink / raw) To: Jens Axboe; +Cc: io-uring, Pavel Begunkov, Joseph Qi The process of this kind of requests is: step1: original context: queue it to io-worker step2: io-worker context: nonblock try(the old logic is a synchronous try here) | |--fail--> arm poll | |--(fail/ready)-->synchronous issue | |--(succeed)-->worker finish it's job, tw take over the req This works much better than IOSQE_ASYNC in cases where cpu resources are scarce or unbound max_worker is small. In these cases, number of io-worker eazily increments to max_worker, new worker cannot be created and running workers stuck there handling old works in IOSQE_ASYNC mode. In my machine, set unbound max_worker to 20, run echo-server, turns out: (arguments: register_file, connetion number is 1000, message size is 12 Byte) IOSQE_ASYNC: 76664.151 tps IOSQE_ASYNC_HYBRID: 166934.985 tps Suggested-by: Jens Axboe <[email protected]> Signed-off-by: Hao Xu <[email protected]> --- fs/io_uring.c | 42 ++++++++++++++++++++++++++++++++++++++---- 1 file changed, 38 insertions(+), 4 deletions(-) diff --git a/fs/io_uring.c b/fs/io_uring.c index a99f7f46e6d4..024cef09bc12 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -1409,7 +1409,7 @@ static void io_prep_async_work(struct io_kiocb *req) req->work.list.next = NULL; req->work.flags = 0; - if (req->flags & REQ_F_FORCE_ASYNC) + if (req->flags & (REQ_F_FORCE_ASYNC | REQ_F_ASYNC_HYBRID)) req->work.flags |= IO_WQ_WORK_CONCURRENT; if (req->flags & REQ_F_ISREG) { @@ -5575,7 +5575,13 @@ static int io_arm_poll_handler(struct io_kiocb *req) req->apoll = apoll; req->flags |= REQ_F_POLLED; ipt.pt._qproc = io_async_queue_proc; - io_req_set_refcount(req); + /* + * REQ_F_REFCOUNT set indicate we are in io-worker context, where we + * already explicitly set the submittion and completion ref. So no + * need to set refcount here if that is the case. + */ + if (!(req->flags & REQ_F_REFCOUNT)) + io_req_set_refcount(req); ret = __io_arm_poll_handler(req, &apoll->poll, &ipt, mask, io_async_wake); @@ -6704,8 +6710,11 @@ static void io_wq_submit_work(struct io_wq_work *work) ret = -ECANCELED; if (!ret) { + bool need_poll = req->flags & REQ_F_ASYNC_HYBRID; + do { - ret = io_issue_sqe(req, 0); +issue_sqe: + ret = io_issue_sqe(req, need_poll ? IO_URING_F_NONBLOCK : 0); /* * We can get EAGAIN for polled IO even though we're * forcing a sync submission from here, since we can't @@ -6713,6 +6722,30 @@ static void io_wq_submit_work(struct io_wq_work *work) */ if (ret != -EAGAIN) break; + if (need_poll) { + bool armed = false; + + ret = 0; + need_poll = false; + + switch (io_arm_poll_handler(req)) { + case IO_APOLL_READY: + goto issue_sqe; + case IO_APOLL_ABORTED: + /* + * somehow we failed to arm the poll infra, + * fallback it to a normal async worker try. + */ + break; + case IO_APOLL_OK: + armed = true; + break; + } + + if (armed) + break; + + } cond_resched(); } while (1); } @@ -6928,7 +6961,8 @@ static void io_queue_sqe_fallback(struct io_kiocb *req) static inline void io_queue_sqe(struct io_kiocb *req) __must_hold(&req->ctx->uring_lock) { - if (likely(!(req->flags & (REQ_F_FORCE_ASYNC | REQ_F_FAIL)))) + if (likely(!(req->flags & (REQ_F_FORCE_ASYNC | REQ_F_ASYNC_HYBRID | + REQ_F_FAIL)))) __io_queue_sqe(req); else io_queue_sqe_fallback(req); -- 2.24.4 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 2/2] io_uring: implementation of IOSQE_ASYNC_HYBRID logic 2021-10-08 12:36 ` [PATCH 2/2] io_uring: implementation of IOSQE_ASYNC_HYBRID logic Hao Xu @ 2021-10-09 12:46 ` Pavel Begunkov 2021-10-11 8:55 ` Hao Xu 0 siblings, 1 reply; 13+ messages in thread From: Pavel Begunkov @ 2021-10-09 12:46 UTC (permalink / raw) To: Hao Xu, Jens Axboe; +Cc: io-uring, Joseph Qi On 10/8/21 13:36, Hao Xu wrote: > The process of this kind of requests is: > > step1: original context: > queue it to io-worker > step2: io-worker context: > nonblock try(the old logic is a synchronous try here) > | > |--fail--> arm poll > | > |--(fail/ready)-->synchronous issue > | > |--(succeed)-->worker finish it's job, tw > take over the req > > This works much better than IOSQE_ASYNC in cases where cpu resources > are scarce or unbound max_worker is small. In these cases, number of > io-worker eazily increments to max_worker, new worker cannot be created > and running workers stuck there handling old works in IOSQE_ASYNC mode. > > In my machine, set unbound max_worker to 20, run echo-server, turns out: > (arguments: register_file, connetion number is 1000, message size is 12 > Byte) > IOSQE_ASYNC: 76664.151 tps > IOSQE_ASYNC_HYBRID: 166934.985 tps > > Suggested-by: Jens Axboe <[email protected]> > Signed-off-by: Hao Xu <[email protected]> > --- > fs/io_uring.c | 42 ++++++++++++++++++++++++++++++++++++++---- > 1 file changed, 38 insertions(+), 4 deletions(-) > > diff --git a/fs/io_uring.c b/fs/io_uring.c > index a99f7f46e6d4..024cef09bc12 100644 > --- a/fs/io_uring.c > +++ b/fs/io_uring.c > @@ -1409,7 +1409,7 @@ static void io_prep_async_work(struct io_kiocb *req) > > req->work.list.next = NULL; > req->work.flags = 0; > - if (req->flags & REQ_F_FORCE_ASYNC) > + if (req->flags & (REQ_F_FORCE_ASYNC | REQ_F_ASYNC_HYBRID)) > req->work.flags |= IO_WQ_WORK_CONCURRENT; > > if (req->flags & REQ_F_ISREG) { > @@ -5575,7 +5575,13 @@ static int io_arm_poll_handler(struct io_kiocb *req) > req->apoll = apoll; > req->flags |= REQ_F_POLLED; > ipt.pt._qproc = io_async_queue_proc; > - io_req_set_refcount(req); > + /* > + * REQ_F_REFCOUNT set indicate we are in io-worker context, where we Nope, it indicates that needs more complex refcounting. It includes linked timeouts but also poll because of req_ref_get for double poll. fwiw, with some work it can be removed for polls, harder (and IMHO not necessary) to do for timeouts. > + * already explicitly set the submittion and completion ref. So no I'd say there is no notion of submission vs completion refs anymore. > + * need to set refcount here if that is the case. > + */ > + if (!(req->flags & REQ_F_REFCOUNT)) Compare it with io_req_set_refcount(), that "if" is a a no-op > + io_req_set_refcount(req); > > ret = __io_arm_poll_handler(req, &apoll->poll, &ipt, mask, > io_async_wake); > @@ -6704,8 +6710,11 @@ static void io_wq_submit_work(struct io_wq_work *work) > ret = -ECANCELED; > > if (!ret) { > + bool need_poll = req->flags & REQ_F_ASYNC_HYBRID; > + > do { > - ret = io_issue_sqe(req, 0); > +issue_sqe: > + ret = io_issue_sqe(req, need_poll ? IO_URING_F_NONBLOCK : 0); It's buggy, you will get all kinds of kernel crashes and leaks. Currently IO_URING_F_NONBLOCK has dual meaning: obvious nonblock but also whether we hold uring_lock or not. You'd need to split the flag into two, i.e. IO_URING_F_LOCKED -- Pavel Begunkov ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 2/2] io_uring: implementation of IOSQE_ASYNC_HYBRID logic 2021-10-09 12:46 ` Pavel Begunkov @ 2021-10-11 8:55 ` Hao Xu 2021-10-11 8:58 ` Hao Xu 0 siblings, 1 reply; 13+ messages in thread From: Hao Xu @ 2021-10-11 8:55 UTC (permalink / raw) To: Pavel Begunkov, Jens Axboe; +Cc: io-uring, Joseph Qi 在 2021/10/9 下午8:46, Pavel Begunkov 写道: > On 10/8/21 13:36, Hao Xu wrote: >> The process of this kind of requests is: >> >> step1: original context: >> queue it to io-worker >> step2: io-worker context: >> nonblock try(the old logic is a synchronous try here) >> | >> |--fail--> arm poll >> | >> |--(fail/ready)-->synchronous issue >> | >> |--(succeed)-->worker finish it's job, tw >> take over the req >> >> This works much better than IOSQE_ASYNC in cases where cpu resources >> are scarce or unbound max_worker is small. In these cases, number of >> io-worker eazily increments to max_worker, new worker cannot be created >> and running workers stuck there handling old works in IOSQE_ASYNC mode. >> >> In my machine, set unbound max_worker to 20, run echo-server, turns out: >> (arguments: register_file, connetion number is 1000, message size is 12 >> Byte) >> IOSQE_ASYNC: 76664.151 tps >> IOSQE_ASYNC_HYBRID: 166934.985 tps >> >> Suggested-by: Jens Axboe <[email protected]> >> Signed-off-by: Hao Xu <[email protected]> >> --- >> fs/io_uring.c | 42 ++++++++++++++++++++++++++++++++++++++---- >> 1 file changed, 38 insertions(+), 4 deletions(-) >> >> diff --git a/fs/io_uring.c b/fs/io_uring.c >> index a99f7f46e6d4..024cef09bc12 100644 >> --- a/fs/io_uring.c >> +++ b/fs/io_uring.c >> @@ -1409,7 +1409,7 @@ static void io_prep_async_work(struct io_kiocb >> *req) >> req->work.list.next = NULL; >> req->work.flags = 0; >> - if (req->flags & REQ_F_FORCE_ASYNC) >> + if (req->flags & (REQ_F_FORCE_ASYNC | REQ_F_ASYNC_HYBRID)) >> req->work.flags |= IO_WQ_WORK_CONCURRENT; >> if (req->flags & REQ_F_ISREG) { >> @@ -5575,7 +5575,13 @@ static int io_arm_poll_handler(struct io_kiocb >> *req) >> req->apoll = apoll; >> req->flags |= REQ_F_POLLED; >> ipt.pt._qproc = io_async_queue_proc; >> - io_req_set_refcount(req); >> + /* >> + * REQ_F_REFCOUNT set indicate we are in io-worker context, where we > > Nope, it indicates that needs more complex refcounting. It includes linked > timeouts but also poll because of req_ref_get for double poll. fwiw, with > some work it can be removed for polls, harder (and IMHO not necessary) > to do > for timeouts.Agree, I now realize that the explanation I put here is not good at all, I actually want to say that the io-worker already set refs = 2 (also possible that prep_link_out set 1, and io-worker adds the other 1, previously I miss this situation). One will be put at completion time, the other one will be put in io_wq_free_work(). So no need to set the refcount here again. I looked into io_req_set_refcount(), since it does nothing if refcount is already not zero, I should be ok to keep this one as it was. > >> + * already explicitly set the submittion and completion ref. So no > > I'd say there is no notion of submission vs completion refs anymore. > >> + * need to set refcount here if that is the case. >> + */ >> + if (!(req->flags & REQ_F_REFCOUNT)) > > Compare it with io_req_set_refcount(), that "if" is a a no-op > >> + io_req_set_refcount(req); >> ret = __io_arm_poll_handler(req, &apoll->poll, &ipt, mask, >> io_async_wake); >> @@ -6704,8 +6710,11 @@ static void io_wq_submit_work(struct io_wq_work >> *work) >> ret = -ECANCELED; >> if (!ret) { >> + bool need_poll = req->flags & REQ_F_ASYNC_HYBRID; >> + >> do { >> - ret = io_issue_sqe(req, 0); >> +issue_sqe: >> + ret = io_issue_sqe(req, need_poll ? IO_URING_F_NONBLOCK : >> 0); > > It's buggy, you will get all kinds of kernel crashes and leaks. > Currently IO_URING_F_NONBLOCK has dual meaning: obvious nonblock but > also whether we hold uring_lock or not. You'd need to split the flag > into two, i.e. IO_URING_F_LOCKED I'll look into it. I was thinking about to do the first nowait try in the original context, but then I thought it doesn't make sense to bring up a worker just for poll infra arming since thread creating and scheduling has its overhead. > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 2/2] io_uring: implementation of IOSQE_ASYNC_HYBRID logic 2021-10-11 8:55 ` Hao Xu @ 2021-10-11 8:58 ` Hao Xu 0 siblings, 0 replies; 13+ messages in thread From: Hao Xu @ 2021-10-11 8:58 UTC (permalink / raw) To: Pavel Begunkov, Jens Axboe; +Cc: io-uring, Joseph Qi 在 2021/10/11 下午4:55, Hao Xu 写道: > 在 2021/10/9 下午8:46, Pavel Begunkov 写道: >> On 10/8/21 13:36, Hao Xu wrote: >>> The process of this kind of requests is: >>> >>> step1: original context: >>> queue it to io-worker >>> step2: io-worker context: >>> nonblock try(the old logic is a synchronous try here) >>> | >>> |--fail--> arm poll >>> | >>> |--(fail/ready)-->synchronous issue >>> | >>> |--(succeed)-->worker finish it's job, tw >>> take over the req >>> >>> This works much better than IOSQE_ASYNC in cases where cpu resources >>> are scarce or unbound max_worker is small. In these cases, number of >>> io-worker eazily increments to max_worker, new worker cannot be created >>> and running workers stuck there handling old works in IOSQE_ASYNC mode. >>> >>> In my machine, set unbound max_worker to 20, run echo-server, turns out: >>> (arguments: register_file, connetion number is 1000, message size is 12 >>> Byte) >>> IOSQE_ASYNC: 76664.151 tps >>> IOSQE_ASYNC_HYBRID: 166934.985 tps >>> >>> Suggested-by: Jens Axboe <[email protected]> >>> Signed-off-by: Hao Xu <[email protected]> >>> --- >>> fs/io_uring.c | 42 ++++++++++++++++++++++++++++++++++++++---- >>> 1 file changed, 38 insertions(+), 4 deletions(-) >>> >>> diff --git a/fs/io_uring.c b/fs/io_uring.c >>> index a99f7f46e6d4..024cef09bc12 100644 >>> --- a/fs/io_uring.c >>> +++ b/fs/io_uring.c >>> @@ -1409,7 +1409,7 @@ static void io_prep_async_work(struct io_kiocb >>> *req) >>> req->work.list.next = NULL; >>> req->work.flags = 0; >>> - if (req->flags & REQ_F_FORCE_ASYNC) >>> + if (req->flags & (REQ_F_FORCE_ASYNC | REQ_F_ASYNC_HYBRID)) >>> req->work.flags |= IO_WQ_WORK_CONCURRENT; >>> if (req->flags & REQ_F_ISREG) { >>> @@ -5575,7 +5575,13 @@ static int io_arm_poll_handler(struct io_kiocb >>> *req) >>> req->apoll = apoll; >>> req->flags |= REQ_F_POLLED; >>> ipt.pt._qproc = io_async_queue_proc; >>> - io_req_set_refcount(req); >>> + /* >>> + * REQ_F_REFCOUNT set indicate we are in io-worker context, >>> where we >> >> Nope, it indicates that needs more complex refcounting. It includes >> linked >> timeouts but also poll because of req_ref_get for double poll. fwiw, with >> some work it can be removed for polls, harder (and IMHO not necessary) >> to do >> for timeouts.Agree, I now realize that the explanation I put here is ^ it is messed up here.. >> not good at all, > I actually want to say that the io-worker already set refs = 2 (also > possible that prep_link_out set 1, and io-worker adds the other 1, > previously I miss this situation). One will be put at completion time, > the other one will be put in io_wq_free_work(). So no need to set the > refcount here again. I looked into io_req_set_refcount(), since it does > nothing if refcount is already not zero, I should be ok to keep this one > as it was. >> >>> + * already explicitly set the submittion and completion ref. So no >> >> I'd say there is no notion of submission vs completion refs anymore. >> >>> + * need to set refcount here if that is the case. >>> + */ >>> + if (!(req->flags & REQ_F_REFCOUNT)) >> >> Compare it with io_req_set_refcount(), that "if" is a a no-op >> >>> + io_req_set_refcount(req); >>> ret = __io_arm_poll_handler(req, &apoll->poll, &ipt, mask, >>> io_async_wake); >>> @@ -6704,8 +6710,11 @@ static void io_wq_submit_work(struct >>> io_wq_work *work) >>> ret = -ECANCELED; >>> if (!ret) { >>> + bool need_poll = req->flags & REQ_F_ASYNC_HYBRID; >>> + >>> do { >>> - ret = io_issue_sqe(req, 0); >>> +issue_sqe: >>> + ret = io_issue_sqe(req, need_poll ? IO_URING_F_NONBLOCK >>> : 0); >> >> It's buggy, you will get all kinds of kernel crashes and leaks. >> Currently IO_URING_F_NONBLOCK has dual meaning: obvious nonblock but >> also whether we hold uring_lock or not. You'd need to split the flag >> into two, i.e. IO_URING_F_LOCKED > I'll look into it. I was thinking about to do the first nowait try in > the original context, but then I thought it doesn't make sense to bring > up a worker just for poll infra arming since thread creating and > scheduling has its overhead. >> ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH for-5.16 0/2] async hybrid, a new way for pollable requests 2021-10-08 12:36 [PATCH for-5.16 0/2] async hybrid, a new way for pollable requests Hao Xu 2021-10-08 12:36 ` [PATCH 1/2] io_uring: add IOSQE_ASYNC_HYBRID flag " Hao Xu 2021-10-08 12:36 ` [PATCH 2/2] io_uring: implementation of IOSQE_ASYNC_HYBRID logic Hao Xu @ 2021-10-09 12:51 ` Pavel Begunkov 2021-10-11 3:08 ` Hao Xu 2 siblings, 1 reply; 13+ messages in thread From: Pavel Begunkov @ 2021-10-09 12:51 UTC (permalink / raw) To: Hao Xu, Jens Axboe; +Cc: io-uring, Joseph Qi On 10/8/21 13:36, Hao Xu wrote: > this is a new feature for pollable requests, see detail in commit > message. It really sounds we should do it as a part of IOSQE_ASYNC, so what are the cons and compromises? > Hao Xu (2): > io_uring: add IOSQE_ASYNC_HYBRID flag for pollable requests btw, it doesn't make sense to split it into two patches > io_uring: implementation of IOSQE_ASYNC_HYBRID logic > > fs/io_uring.c | 48 +++++++++++++++++++++++++++++++---- > include/uapi/linux/io_uring.h | 4 ++- > 2 files changed, 46 insertions(+), 6 deletions(-) > -- Pavel Begunkov ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH for-5.16 0/2] async hybrid, a new way for pollable requests 2021-10-09 12:51 ` [PATCH for-5.16 0/2] async hybrid, a new way for pollable requests Pavel Begunkov @ 2021-10-11 3:08 ` Hao Xu 2021-10-12 11:39 ` Pavel Begunkov 0 siblings, 1 reply; 13+ messages in thread From: Hao Xu @ 2021-10-11 3:08 UTC (permalink / raw) To: Pavel Begunkov, Jens Axboe; +Cc: io-uring, Joseph Qi 在 2021/10/9 下午8:51, Pavel Begunkov 写道: > On 10/8/21 13:36, Hao Xu wrote: >> this is a new feature for pollable requests, see detail in commit >> message. > > It really sounds we should do it as a part of IOSQE_ASYNC, so > what are the cons and compromises? I wrote the pros and cons here: https://github.com/axboe/liburing/issues/426#issuecomment-939221300 > >> Hao Xu (2): >> io_uring: add IOSQE_ASYNC_HYBRID flag for pollable requests > > btw, it doesn't make sense to split it into two patches Hmm, I thought we should make adding a new flag as a separate patch. Could you give me more hints about the considerration here? > >> io_uring: implementation of IOSQE_ASYNC_HYBRID logic >> >> fs/io_uring.c | 48 +++++++++++++++++++++++++++++++---- >> include/uapi/linux/io_uring.h | 4 ++- >> 2 files changed, 46 insertions(+), 6 deletions(-) >> > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH for-5.16 0/2] async hybrid, a new way for pollable requests 2021-10-11 3:08 ` Hao Xu @ 2021-10-12 11:39 ` Pavel Begunkov 2021-10-14 8:53 ` Hao Xu 0 siblings, 1 reply; 13+ messages in thread From: Pavel Begunkov @ 2021-10-12 11:39 UTC (permalink / raw) To: Hao Xu, Jens Axboe; +Cc: io-uring, Joseph Qi On 10/11/21 04:08, Hao Xu wrote: > 在 2021/10/9 下午8:51, Pavel Begunkov 写道: >> On 10/8/21 13:36, Hao Xu wrote: >>> this is a new feature for pollable requests, see detail in commit >>> message. >> >> It really sounds we should do it as a part of IOSQE_ASYNC, so >> what are the cons and compromises? > I wrote the pros and cons here: > https://github.com/axboe/liburing/issues/426#issuecomment-939221300 I see. The problem is as always, adding extra knobs, which users should tune and it's not exactly clear where to use what. Not specific to the new flag, there is enough confusion around IOSQE_ASYNC, but it only makes it worse. It would be nice to have it applied "automatically". Say, with IOSQE_ASYNC the copy is always (almost) done by io-wq but there is that polling optimisation on top. Do we care enough about copying specifically in task context to have a different flag? a quick question, what is "tps" in "IOSQE_ASYNC: 76664.151 tps"? >>> Hao Xu (2): >>> io_uring: add IOSQE_ASYNC_HYBRID flag for pollable requests >> >> btw, it doesn't make sense to split it into two patches > Hmm, I thought we should make adding a new flag as a separate patch. > Could you give me more hints about the considerration here? You can easily ignore it, just looked weird to me. Let's try to phrase it: 1) 1/2 doesn't do anything useful w/o 2/2, iow it doesn't feel like an atomic change. And it would be breaking the userspace, if it's not just a hint flag. 2) it's harder to read, you search the git history, find the implementation (and the flag is already there), you think what's happening here, where the flag was used and so to find out that it was added separately a commit ago. 3) sometimes it's done similarly because the API change is not simple, but it's not the case here. By similarly I mean the other way around, first implement it internally, but not exposing any mean to use it, and adding the userspace API in next commits. >>> io_uring: implementation of IOSQE_ASYNC_HYBRID logic >>> >>> fs/io_uring.c | 48 +++++++++++++++++++++++++++++++---- >>> include/uapi/linux/io_uring.h | 4 ++- >>> 2 files changed, 46 insertions(+), 6 deletions(-) >>> >> > -- Pavel Begunkov ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH for-5.16 0/2] async hybrid, a new way for pollable requests 2021-10-12 11:39 ` Pavel Begunkov @ 2021-10-14 8:53 ` Hao Xu 2021-10-14 9:20 ` Hao Xu ` (2 more replies) 0 siblings, 3 replies; 13+ messages in thread From: Hao Xu @ 2021-10-14 8:53 UTC (permalink / raw) To: Pavel Begunkov, Jens Axboe; +Cc: io-uring, Joseph Qi 在 2021/10/12 下午7:39, Pavel Begunkov 写道: > On 10/11/21 04:08, Hao Xu wrote: >> 在 2021/10/9 下午8:51, Pavel Begunkov 写道: >>> On 10/8/21 13:36, Hao Xu wrote: >>>> this is a new feature for pollable requests, see detail in commit >>>> message. >>> >>> It really sounds we should do it as a part of IOSQE_ASYNC, so >>> what are the cons and compromises? >> I wrote the pros and cons here: >> https://github.com/axboe/liburing/issues/426#issuecomment-939221300 > > I see. The problem is as always, adding extra knobs, which users > should tune and it's not exactly clear where to use what. Not specific > to the new flag, there is enough confusion around IOSQE_ASYNC, but it > only makes it worse. It would be nice to have it applied > "automatically". > > Say, with IOSQE_ASYNC the copy is always (almost) done by io-wq but > there is that polling optimisation on top. Do we care enough about > copying specifically in task context to have a different flag? > I did more tests in a 64 cores machine. test command is: nice -n -15 taskset -c 10-20 ./io_uring_echo_server -p 8084 -f -n con_nr -l 1024 where -n means the number of connections, -l means size of load. the results of tps and cpu usage under different IO pressure is: (didn't find the way to make it look better, you may need a markdown renderer :) ) tps | feature | 1 | 2 | 1000 | 2000 | 3000 | 5800 | | -------- | -------- | -------- | -------- | -------- | -------- | -------- | | ASYNC | 123.000 | 295.203 | 67390.432 | 132686.361 | 186084.114 | 319550.540 | | ASYNC_HYBRID | 122.000 | 299.401 | 168321.092 | 188870.283 | 226427.166 | 371580.062 | cpu | feature | 1 | 2 | 1000 | 2000 | 3000 | 5800 | | -------- | -------- | -------- | -------- | -------- | -------- | -------- | | ASYNC | 0.3% | 1.0% | 62.5% | 111.3% | 198.3% | 420.9% | | ASYNC_HYBRID | 0.3% | 1.0% | 360% | 435.5% | 516.6% | 1100% | when con_nr is 1000 or more, we leveraged all the 10 cpus. hybrid is surely better than async. when con_nr is 1 or 2, in theory async should be better since it use more cpu resource, but it didn't, it is because the copying in tw is not a bottleneck. So I tried bigger workload, no difference. So I think it should be ok to just apply this feature on top of IOSQE_ASYNC, for all pollable requests in all condition. Regards, Hao > a quick question, what is "tps" in "IOSQE_ASYNC: 76664.151 tps"? > >>>> Hao Xu (2): >>>> io_uring: add IOSQE_ASYNC_HYBRID flag for pollable requests >>> >>> btw, it doesn't make sense to split it into two patches >> Hmm, I thought we should make adding a new flag as a separate patch. >> Could you give me more hints about the considerration here? > > You can easily ignore it, just looked weird to me. Let's try to > phrase it: > > 1) 1/2 doesn't do anything useful w/o 2/2, iow it doesn't feel like > an atomic change. And it would be breaking the userspace, if it's > not just a hint flag. > > 2) it's harder to read, you search the git history, find the > implementation (and the flag is already there), you think what's > happening here, where the flag was used and so to find out that > it was added separately a commit ago. > > 3) sometimes it's done similarly because the API change is not > simple, but it's not the case here. > By similarly I mean the other way around, first implement it > internally, but not exposing any mean to use it, and adding > the userspace API in next commits. > >>>> io_uring: implementation of IOSQE_ASYNC_HYBRID logic >>>> >>>> fs/io_uring.c | 48 >>>> +++++++++++++++++++++++++++++++---- >>>> include/uapi/linux/io_uring.h | 4 ++- >>>> 2 files changed, 46 insertions(+), 6 deletions(-) >>>> >>> >> > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH for-5.16 0/2] async hybrid, a new way for pollable requests 2021-10-14 8:53 ` Hao Xu @ 2021-10-14 9:20 ` Hao Xu 2021-10-14 13:53 ` Hao Xu 2021-10-14 14:17 ` Pavel Begunkov 2 siblings, 0 replies; 13+ messages in thread From: Hao Xu @ 2021-10-14 9:20 UTC (permalink / raw) To: Pavel Begunkov, Jens Axboe; +Cc: io-uring, Joseph Qi 在 2021/10/14 下午4:53, Hao Xu 写道: > 在 2021/10/12 下午7:39, Pavel Begunkov 写道: >> On 10/11/21 04:08, Hao Xu wrote: >>> 在 2021/10/9 下午8:51, Pavel Begunkov 写道: >>>> On 10/8/21 13:36, Hao Xu wrote: >>>>> this is a new feature for pollable requests, see detail in commit >>>>> message. >>>> >>>> It really sounds we should do it as a part of IOSQE_ASYNC, so >>>> what are the cons and compromises? >>> I wrote the pros and cons here: >>> https://github.com/axboe/liburing/issues/426#issuecomment-939221300 >> >> I see. The problem is as always, adding extra knobs, which users >> should tune and it's not exactly clear where to use what. Not specific >> to the new flag, there is enough confusion around IOSQE_ASYNC, but it >> only makes it worse. It would be nice to have it applied >> "automatically". >> >> Say, with IOSQE_ASYNC the copy is always (almost) done by io-wq but >> there is that polling optimisation on top. Do we care enough about >> copying specifically in task context to have a different flag? >> > I did more tests in a 64 cores machine. > test command is: nice -n -15 taskset -c 10-20 ./io_uring_echo_server -p > 8084 -f -n con_nr -l 1024 > where -n means the number of connections, -l means size of load. > the results of tps and cpu usage under different IO pressure is: > (didn't find the way to make it look better, you may need a markdown > renderer :) ) > tps > > | feature | 1 | 2 | 1000 | 2000 | 3000 | 5800 | > | -------- | -------- | -------- | -------- | -------- | -------- | > -------- | > | ASYNC | 123.000 | 295.203 | 67390.432 | 132686.361 | > 186084.114 | 319550.540 | > | ASYNC_HYBRID | 122.000 | 299.401 | 168321.092 | > 188870.283 | 226427.166 | 371580.062 | > > > cpu > > | feature | 1 | 2 | 1000 | 2000 | 3000 | 5800 | > | -------- | -------- | -------- | -------- | -------- | -------- | > -------- | > | ASYNC | 0.3% | 1.0% | 62.5% | 111.3% | 198.3% | > 420.9% | > | ASYNC_HYBRID | 0.3% | 1.0% | 360% | 435.5% | 516.6% > | 1100% | the data changes from time to time, so maybe not precise, but the comparison between ASYNC and ASYNC_HYBRID shows something. > > when con_nr is 1000 or more, we leveraged all the 10 cpus. hybrid is > surely better than async. when con_nr is 1 or 2, in theory async should > be better since it use more cpu resource, but it didn't, it is because > the copying in tw is not a bottleneck. So I tried bigger workload, no > difference. So I think it should be ok to just apply this feature on top > of IOSQE_ASYNC, for all pollable requests in all condition. > > Regards, > Hao >> a quick question, what is "tps" in "IOSQE_ASYNC: 76664.151 tps"? >> >>>>> Hao Xu (2): >>>>> io_uring: add IOSQE_ASYNC_HYBRID flag for pollable requests >>>> >>>> btw, it doesn't make sense to split it into two patches >>> Hmm, I thought we should make adding a new flag as a separate patch. >>> Could you give me more hints about the considerration here? >> >> You can easily ignore it, just looked weird to me. Let's try to >> phrase it: >> >> 1) 1/2 doesn't do anything useful w/o 2/2, iow it doesn't feel like >> an atomic change. And it would be breaking the userspace, if it's >> not just a hint flag. >> >> 2) it's harder to read, you search the git history, find the >> implementation (and the flag is already there), you think what's >> happening here, where the flag was used and so to find out that >> it was added separately a commit ago. >> >> 3) sometimes it's done similarly because the API change is not >> simple, but it's not the case here. >> By similarly I mean the other way around, first implement it >> internally, but not exposing any mean to use it, and adding >> the userspace API in next commits. >> >>>>> io_uring: implementation of IOSQE_ASYNC_HYBRID logic >>>>> >>>>> fs/io_uring.c | 48 >>>>> +++++++++++++++++++++++++++++++---- >>>>> include/uapi/linux/io_uring.h | 4 ++- >>>>> 2 files changed, 46 insertions(+), 6 deletions(-) >>>>> >>>> >>> >> ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH for-5.16 0/2] async hybrid, a new way for pollable requests 2021-10-14 8:53 ` Hao Xu 2021-10-14 9:20 ` Hao Xu @ 2021-10-14 13:53 ` Hao Xu 2021-10-14 14:17 ` Pavel Begunkov 2 siblings, 0 replies; 13+ messages in thread From: Hao Xu @ 2021-10-14 13:53 UTC (permalink / raw) To: Pavel Begunkov, Jens Axboe; +Cc: io-uring, Joseph Qi 在 2021/10/14 下午4:53, Hao Xu 写道: > 在 2021/10/12 下午7:39, Pavel Begunkov 写道: >> On 10/11/21 04:08, Hao Xu wrote: >>> 在 2021/10/9 下午8:51, Pavel Begunkov 写道: >>>> On 10/8/21 13:36, Hao Xu wrote: >>>>> this is a new feature for pollable requests, see detail in commit >>>>> message. >>>> >>>> It really sounds we should do it as a part of IOSQE_ASYNC, so >>>> what are the cons and compromises? >>> I wrote the pros and cons here: >>> https://github.com/axboe/liburing/issues/426#issuecomment-939221300 >> >> I see. The problem is as always, adding extra knobs, which users >> should tune and it's not exactly clear where to use what. Not specific >> to the new flag, there is enough confusion around IOSQE_ASYNC, but it >> only makes it worse. It would be nice to have it applied >> "automatically". >> >> Say, with IOSQE_ASYNC the copy is always (almost) done by io-wq but >> there is that polling optimisation on top. Do we care enough about >> copying specifically in task context to have a different flag? >> > I did more tests in a 64 cores machine. > test command is: nice -n -15 taskset -c 10-20 ./io_uring_echo_server -p > 8084 -f -n con_nr -l 1024 > where -n means the number of connections, -l means size of load. > the results of tps and cpu usage under different IO pressure is: > (didn't find the way to make it look better, you may need a markdown > renderer :) ) > tps > > | feature | 1 | 2 | 1000 | 2000 | 3000 | 5800 | > | -------- | -------- | -------- | -------- | -------- | -------- | > -------- | > | ASYNC | 123.000 | 295.203 | 67390.432 | 132686.361 | > 186084.114 | 319550.540 | > | ASYNC_HYBRID | 122.000 | 299.401 | 168321.092 | > 188870.283 | 226427.166 | 371580.062 | > > > cpu > > | feature | 1 | 2 | 1000 | 2000 | 3000 | 5800 | > | -------- | -------- | -------- | -------- | -------- | -------- | > -------- | > | ASYNC | 0.3% | 1.0% | 62.5% | 111.3% | 198.3% | > 420.9% | > | ASYNC_HYBRID | 0.3% | 1.0% | 360% | 435.5% | 516.6% > | 1100% | > > when con_nr is 1000 or more, we leveraged all the 10 cpus. hybrid is > surely better than async. when con_nr is 1 or 2, in theory async should > be better since it use more cpu resource, but it didn't, it is because > the copying in tw is not a bottleneck. So I tried bigger workload, no > difference. So I think it should be ok to just apply this feature on top > of IOSQE_ASYNC, for all pollable requests in all condition. > > Regards, > Hao >> a quick question, what is "tps" in "IOSQE_ASYNC: 76664.151 tps"? Forgot this one.. tps = transaction per second, here a transaction means the client send the workload to the server and receive it from the server. >> >>>>> Hao Xu (2): >>>>> io_uring: add IOSQE_ASYNC_HYBRID flag for pollable requests >>>> >>>> btw, it doesn't make sense to split it into two patches >>> Hmm, I thought we should make adding a new flag as a separate patch. >>> Could you give me more hints about the considerration here? >> >> You can easily ignore it, just looked weird to me. Let's try to >> phrase it: >> >> 1) 1/2 doesn't do anything useful w/o 2/2, iow it doesn't feel like >> an atomic change. And it would be breaking the userspace, if it's >> not just a hint flag. >> >> 2) it's harder to read, you search the git history, find the >> implementation (and the flag is already there), you think what's >> happening here, where the flag was used and so to find out that >> it was added separately a commit ago. >> >> 3) sometimes it's done similarly because the API change is not >> simple, but it's not the case here. >> By similarly I mean the other way around, first implement it >> internally, but not exposing any mean to use it, and adding >> the userspace API in next commits. >> >>>>> io_uring: implementation of IOSQE_ASYNC_HYBRID logic >>>>> >>>>> fs/io_uring.c | 48 >>>>> +++++++++++++++++++++++++++++++---- >>>>> include/uapi/linux/io_uring.h | 4 ++- >>>>> 2 files changed, 46 insertions(+), 6 deletions(-) >>>>> >>>> >>> >> ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH for-5.16 0/2] async hybrid, a new way for pollable requests 2021-10-14 8:53 ` Hao Xu 2021-10-14 9:20 ` Hao Xu 2021-10-14 13:53 ` Hao Xu @ 2021-10-14 14:17 ` Pavel Begunkov 2 siblings, 0 replies; 13+ messages in thread From: Pavel Begunkov @ 2021-10-14 14:17 UTC (permalink / raw) To: Hao Xu, Jens Axboe; +Cc: io-uring, Joseph Qi On 10/14/21 09:53, Hao Xu wrote: > 在 2021/10/12 下午7:39, Pavel Begunkov 写道: >> On 10/11/21 04:08, Hao Xu wrote: >>> 在 2021/10/9 下午8:51, Pavel Begunkov 写道: >>>> On 10/8/21 13:36, Hao Xu wrote: >>>>> this is a new feature for pollable requests, see detail in commit >>>>> message. >>>> >>>> It really sounds we should do it as a part of IOSQE_ASYNC, so >>>> what are the cons and compromises? >>> I wrote the pros and cons here: >>> https://github.com/axboe/liburing/issues/426#issuecomment-939221300 >> >> I see. The problem is as always, adding extra knobs, which users >> should tune and it's not exactly clear where to use what. Not specific >> to the new flag, there is enough confusion around IOSQE_ASYNC, but it >> only makes it worse. It would be nice to have it applied >> "automatically". >> >> Say, with IOSQE_ASYNC the copy is always (almost) done by io-wq but >> there is that polling optimisation on top. Do we care enough about >> copying specifically in task context to have a different flag? >> > I did more tests in a 64 cores machine. > test command is: nice -n -15 taskset -c 10-20 ./io_uring_echo_server -p 8084 -f -n con_nr -l 1024 > where -n means the number of connections, -l means size of load. > the results of tps and cpu usage under different IO pressure is: > (didn't find the way to make it look better, you may need a markdown > renderer :) ) > tps > > | feature | 1 | 2 | 1000 | 2000 | 3000 | 5800 | > | -------- | -------- | -------- | -------- | -------- | -------- | -------- | > | ASYNC | 123.000 | 295.203 | 67390.432 | 132686.361 | 186084.114 | 319550.540 | > | ASYNC_HYBRID | 122.000 | 299.401 | 168321.092 | 188870.283 | 226427.166 | 371580.062 | > > > cpu > > | feature | 1 | 2 | 1000 | 2000 | 3000 | 5800 | > | -------- | -------- | -------- | -------- | -------- | -------- | -------- | > | ASYNC | 0.3% | 1.0% | 62.5% | 111.3% | 198.3% | 420.9% | > | ASYNC_HYBRID | 0.3% | 1.0% | 360% | 435.5% | 516.6% | 1100% | > > when con_nr is 1000 or more, we leveraged all the 10 cpus. hybrid is > surely better than async. when con_nr is 1 or 2, in theory async should > be better since it use more cpu resource, but it didn't, it is because > the copying in tw is not a bottleneck. So I tried bigger workload, no > difference. So I think it should be ok to just apply this feature on top > of IOSQE_ASYNC, for all pollable requests in all condition. Sounds great. And IOSQE_ASYNC is a hint flag, so if things change we can return it back the behaviour of IOSQE_ASYNC and add that new flag (or do something else). -- Pavel Begunkov ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2021-10-14 14:18 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2021-10-08 12:36 [PATCH for-5.16 0/2] async hybrid, a new way for pollable requests Hao Xu 2021-10-08 12:36 ` [PATCH 1/2] io_uring: add IOSQE_ASYNC_HYBRID flag " Hao Xu 2021-10-08 12:36 ` [PATCH 2/2] io_uring: implementation of IOSQE_ASYNC_HYBRID logic Hao Xu 2021-10-09 12:46 ` Pavel Begunkov 2021-10-11 8:55 ` Hao Xu 2021-10-11 8:58 ` Hao Xu 2021-10-09 12:51 ` [PATCH for-5.16 0/2] async hybrid, a new way for pollable requests Pavel Begunkov 2021-10-11 3:08 ` Hao Xu 2021-10-12 11:39 ` Pavel Begunkov 2021-10-14 8:53 ` Hao Xu 2021-10-14 9:20 ` Hao Xu 2021-10-14 13:53 ` Hao Xu 2021-10-14 14:17 ` Pavel Begunkov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox