* [PATCH 0/2] cancel_hash per entry lock
@ 2022-05-29 16:19 Hao Xu
  2022-05-29 16:19 ` [PATCH 1/2] io_uring: add an argument for io_poll_disarm() Hao Xu
  2022-05-29 16:20 ` [PATCH 2/2] io_uring: switch cancel_hash to use per list spinlock Hao Xu
  0 siblings, 2 replies; 15+ messages in thread

From: Hao Xu @ 2022-05-29 16:19 UTC (permalink / raw)
  To: io-uring; +Cc: Jens Axboe, Pavel Begunkov

From: Hao Xu <[email protected]>

Add a per-entry lock for the cancel_hash array. This reduces usage of
completion_lock and contention between cancel_hash entries.

Hao Xu (2):
  io_uring: add an argument for io_poll_disarm()
  io_uring: switch cancel_hash to use per list spinlock

 io_uring/cancel.c         | 12 ++++++++++--
 io_uring/cancel.h         |  1 +
 io_uring/io_uring.c       |  9 +++++++++
 io_uring/io_uring_types.h |  1 +
 io_uring/poll.c           | 38 +++++++++++++++++++++-----------------
 5 files changed, 42 insertions(+), 19 deletions(-)

base-commit: cbbf9a4865526c55a6a984ef69578def9944f319
-- 
2.25.1

^ permalink raw reply	[flat|nested] 15+ messages in thread
* [PATCH 1/2] io_uring: add an argument for io_poll_disarm() 2022-05-29 16:19 [PATCH 0/2] cancel_hash per entry lock Hao Xu @ 2022-05-29 16:19 ` Hao Xu 2022-05-29 16:20 ` [PATCH 2/2] io_uring: switch cancel_hash to use per list spinlock Hao Xu 1 sibling, 0 replies; 15+ messages in thread From: Hao Xu @ 2022-05-29 16:19 UTC (permalink / raw) To: io-uring; +Cc: Jens Axboe, Pavel Begunkov From: Hao Xu <[email protected]> From: Hao Xu <[email protected]> Add an argument for io_poll_disarm() for later use. Signed-off-by: Hao Xu <[email protected]> --- io_uring/poll.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/io_uring/poll.c b/io_uring/poll.c index 728f6e7b47c5..c8982c5ef0fa 100644 --- a/io_uring/poll.c +++ b/io_uring/poll.c @@ -561,8 +561,9 @@ static struct io_kiocb *io_poll_find(struct io_ring_ctx *ctx, bool poll_only, { struct hlist_head *list; struct io_kiocb *req; + u32 index = hash_long(cd->data, ctx->cancel_hash_bits); - list = &ctx->cancel_hash[hash_long(cd->data, ctx->cancel_hash_bits)]; + list = &ctx->cancel_hash[index]; hlist_for_each_entry(req, list, hash_node) { if (cd->data != req->cqe.user_data) continue; @@ -573,6 +574,7 @@ static struct io_kiocb *io_poll_find(struct io_ring_ctx *ctx, bool poll_only, continue; req->work.cancel_seq = cd->seq; } + cd->flags = index; return req; } return NULL; @@ -602,7 +604,7 @@ static struct io_kiocb *io_poll_file_find(struct io_ring_ctx *ctx, return NULL; } -static bool io_poll_disarm(struct io_kiocb *req) +static bool io_poll_disarm(struct io_kiocb *req, u32 index) __must_hold(&ctx->completion_lock) { if (!io_poll_get_ownership(req)) @@ -724,7 +726,7 @@ int io_poll_remove(struct io_kiocb *req, unsigned int issue_flags) spin_lock(&ctx->completion_lock); preq = io_poll_find(ctx, true, &cd); - if (!preq || !io_poll_disarm(preq)) { + if (!preq || !io_poll_disarm(preq, cd.flags)) { spin_unlock(&ctx->completion_lock); ret = preq ? -EALREADY : -ENOENT; goto out; -- 2.25.1 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 2/2] io_uring: switch cancel_hash to use per list spinlock 2022-05-29 16:19 [PATCH 0/2] cancel_hash per entry lock Hao Xu 2022-05-29 16:19 ` [PATCH 1/2] io_uring: add an argument for io_poll_disarm() Hao Xu @ 2022-05-29 16:20 ` Hao Xu 2022-05-29 16:25 ` Jens Axboe 1 sibling, 1 reply; 15+ messages in thread From: Hao Xu @ 2022-05-29 16:20 UTC (permalink / raw) To: io-uring; +Cc: Jens Axboe, Pavel Begunkov From: Hao Xu <[email protected]> From: Hao Xu <[email protected]> Use per list lock for cancel_hash, this removes some completion lock invocation and remove contension between different cancel_hash entries Signed-off-by: Hao Xu <[email protected]> --- io_uring/cancel.c | 12 ++++++++++-- io_uring/cancel.h | 1 + io_uring/io_uring.c | 9 +++++++++ io_uring/io_uring_types.h | 1 + io_uring/poll.c | 30 ++++++++++++++++-------------- 5 files changed, 37 insertions(+), 16 deletions(-) diff --git a/io_uring/cancel.c b/io_uring/cancel.c index 83cceb52d82d..0b1aa3ab7664 100644 --- a/io_uring/cancel.c +++ b/io_uring/cancel.c @@ -93,14 +93,14 @@ int io_try_cancel(struct io_kiocb *req, struct io_cancel_data *cd) if (!ret) return 0; - spin_lock(&ctx->completion_lock); ret = io_poll_cancel(ctx, cd); if (ret != -ENOENT) goto out; + spin_lock(&ctx->completion_lock); if (!(cd->flags & IORING_ASYNC_CANCEL_FD)) ret = io_timeout_cancel(ctx, cd); -out: spin_unlock(&ctx->completion_lock); +out: return ret; } @@ -192,3 +192,11 @@ int io_async_cancel(struct io_kiocb *req, unsigned int issue_flags) io_req_set_res(req, ret, 0); return IOU_OK; } + +inline void init_cancel_hash_locks(spinlock_t *cancel_hash_locks, unsigned size) +{ + int i; + + for (i = 0; i < size; i++) + spin_lock_init(&cancel_hash_locks[i]); +} diff --git a/io_uring/cancel.h b/io_uring/cancel.h index 4f35d8696325..fdec2595797e 100644 --- a/io_uring/cancel.h +++ b/io_uring/cancel.h @@ -4,3 +4,4 @@ int io_async_cancel_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe); int io_async_cancel(struct io_kiocb *req, unsigned int issue_flags); int io_try_cancel(struct io_kiocb *req, struct io_cancel_data *cd); +inline void init_cancel_hash_locks(spinlock_t *cancel_hash_locks, unsigned size); diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index f31d3446dcbf..6eaa27aea197 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -706,7 +706,14 @@ static __cold struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p) GFP_KERNEL); if (!ctx->cancel_hash) goto err; + ctx->cancel_hash_locks = + kmalloc((1U << hash_bits) * sizeof(spinlock_t), + GFP_KERNEL); + if (!ctx->cancel_hash_locks) + goto err; + __hash_init(ctx->cancel_hash, 1U << hash_bits); + init_cancel_hash_locks(ctx->cancel_hash_locks, 1U << hash_bits); ctx->dummy_ubuf = kzalloc(sizeof(*ctx->dummy_ubuf), GFP_KERNEL); if (!ctx->dummy_ubuf) @@ -749,6 +756,7 @@ static __cold struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p) err: kfree(ctx->dummy_ubuf); kfree(ctx->cancel_hash); + kfree(ctx->cancel_hash_locks); kfree(ctx->io_bl); xa_destroy(&ctx->io_bl_xa); kfree(ctx); @@ -3045,6 +3053,7 @@ static __cold void io_ring_ctx_free(struct io_ring_ctx *ctx) if (ctx->hash_map) io_wq_put_hash(ctx->hash_map); kfree(ctx->cancel_hash); + kfree(ctx->cancel_hash_locks); kfree(ctx->dummy_ubuf); kfree(ctx->io_bl); xa_destroy(&ctx->io_bl_xa); diff --git a/io_uring/io_uring_types.h b/io_uring/io_uring_types.h index 7c22cf35a7e2..4619a46f7ecd 100644 --- a/io_uring/io_uring_types.h +++ b/io_uring/io_uring_types.h @@ -231,6 +231,7 @@ struct io_ring_ctx { */ struct io_wq_work_list 
iopoll_list; struct hlist_head *cancel_hash; + spinlock_t *cancel_hash_locks; unsigned cancel_hash_bits; bool poll_multi_queue; diff --git a/io_uring/poll.c b/io_uring/poll.c index c8982c5ef0fa..e1b6dd282860 100644 --- a/io_uring/poll.c +++ b/io_uring/poll.c @@ -73,10 +73,11 @@ static struct io_poll *io_poll_get_single(struct io_kiocb *req) static void io_poll_req_insert(struct io_kiocb *req) { struct io_ring_ctx *ctx = req->ctx; - struct hlist_head *list; + u32 index = hash_long(req->cqe.user_data, ctx->cancel_hash_bits); - list = &ctx->cancel_hash[hash_long(req->cqe.user_data, ctx->cancel_hash_bits)]; - hlist_add_head(&req->hash_node, list); + spin_lock(&ctx->cancel_hash_locks[index]); + hlist_add_head(&req->hash_node, &ctx->cancel_hash[index]); + spin_unlock(&ctx->cancel_hash_locks[index]); } static void io_init_poll_iocb(struct io_poll *poll, __poll_t events, @@ -439,9 +440,7 @@ static int __io_arm_poll_handler(struct io_kiocb *req, return 0; } - spin_lock(&ctx->completion_lock); io_poll_req_insert(req); - spin_unlock(&ctx->completion_lock); if (mask && (poll->events & EPOLLET)) { /* can't multishot if failed, just queue the event we've got */ @@ -538,10 +537,10 @@ __cold bool io_poll_remove_all(struct io_ring_ctx *ctx, struct task_struct *tsk, bool found = false; int i; - spin_lock(&ctx->completion_lock); for (i = 0; i < (1U << ctx->cancel_hash_bits); i++) { struct hlist_head *list; + spin_lock(&ctx->cancel_hash_locks[i]); list = &ctx->cancel_hash[i]; hlist_for_each_entry_safe(req, tmp, list, hash_node) { if (io_match_task_safe(req, tsk, cancel_all)) { @@ -550,19 +549,19 @@ __cold bool io_poll_remove_all(struct io_ring_ctx *ctx, struct task_struct *tsk, found = true; } } + spin_unlock(&ctx->cancel_hash_locks[i]); } - spin_unlock(&ctx->completion_lock); return found; } static struct io_kiocb *io_poll_find(struct io_ring_ctx *ctx, bool poll_only, struct io_cancel_data *cd) - __must_hold(&ctx->completion_lock) { struct hlist_head *list; struct io_kiocb *req; u32 index = hash_long(cd->data, ctx->cancel_hash_bits); + spin_lock(&ctx->cancel_hash_locks[index]); list = &ctx->cancel_hash[index]; hlist_for_each_entry(req, list, hash_node) { if (cd->data != req->cqe.user_data) @@ -574,15 +573,16 @@ static struct io_kiocb *io_poll_find(struct io_ring_ctx *ctx, bool poll_only, continue; req->work.cancel_seq = cd->seq; } + spin_unlock(&ctx->cancel_hash_locks[index]); cd->flags = index; return req; } + spin_unlock(&ctx->cancel_hash_locks[index]); return NULL; } static struct io_kiocb *io_poll_file_find(struct io_ring_ctx *ctx, struct io_cancel_data *cd) - __must_hold(&ctx->completion_lock) { struct io_kiocb *req; int i; @@ -590,6 +590,7 @@ static struct io_kiocb *io_poll_file_find(struct io_ring_ctx *ctx, for (i = 0; i < (1U << ctx->cancel_hash_bits); i++) { struct hlist_head *list; + spin_lock(&ctx->cancel_hash_locks[i]); list = &ctx->cancel_hash[i]; hlist_for_each_entry(req, list, hash_node) { if (!(cd->flags & IORING_ASYNC_CANCEL_ANY) && @@ -598,24 +599,28 @@ static struct io_kiocb *io_poll_file_find(struct io_ring_ctx *ctx, if (cd->seq == req->work.cancel_seq) continue; req->work.cancel_seq = cd->seq; + spin_unlock(&ctx->cancel_hash_locks[i]); return req; } + spin_unlock(&ctx->cancel_hash_locks[i]); } return NULL; } static bool io_poll_disarm(struct io_kiocb *req, u32 index) - __must_hold(&ctx->completion_lock) { + struct io_ring_ctx *ctx = req->ctx; + if (!io_poll_get_ownership(req)) return false; io_poll_remove_entries(req); + spin_lock(&ctx->cancel_hash_locks[index]); 
hash_del(&req->hash_node); + spin_unlock(&ctx->cancel_hash_locks[index]); return true; } int io_poll_cancel(struct io_ring_ctx *ctx, struct io_cancel_data *cd) - __must_hold(&ctx->completion_lock) { struct io_kiocb *req; @@ -724,14 +729,11 @@ int io_poll_remove(struct io_kiocb *req, unsigned int issue_flags) int ret2, ret = 0; bool locked; - spin_lock(&ctx->completion_lock); preq = io_poll_find(ctx, true, &cd); if (!preq || !io_poll_disarm(preq, cd.flags)) { - spin_unlock(&ctx->completion_lock); ret = preq ? -EALREADY : -ENOENT; goto out; } - spin_unlock(&ctx->completion_lock); if (poll_update->update_events || poll_update->update_user_data) { /* only mask one event flags, keep behavior flags */ -- 2.25.1 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] io_uring: switch cancel_hash to use per list spinlock 2022-05-29 16:20 ` [PATCH 2/2] io_uring: switch cancel_hash to use per list spinlock Hao Xu @ 2022-05-29 16:25 ` Jens Axboe 2022-05-29 18:07 ` Hao Xu 2022-05-30 13:33 ` Hao Xu 0 siblings, 2 replies; 15+ messages in thread From: Jens Axboe @ 2022-05-29 16:25 UTC (permalink / raw) To: Hao Xu, io-uring; +Cc: Pavel Begunkov On 5/29/22 10:20 AM, Hao Xu wrote: > From: Hao Xu <[email protected]> > > From: Hao Xu <[email protected]> > > Use per list lock for cancel_hash, this removes some completion lock > invocation and remove contension between different cancel_hash entries Interesting, do you have any numbers on this? Also, I'd make a hash bucket struct: struct io_hash_bucket { spinlock_t lock; struct hlist_head list; }; rather than two separate structs, that'll have nicer memory locality too and should further improve it. Could be done as a prep patch with the old locking in place, making the end patch doing the per-bucket lock simpler as well. Hmm? -- Jens Axboe ^ permalink raw reply [flat|nested] 15+ messages in thread
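[For a concrete picture of the prep patch Jens describes, a per-bucket version of the helpers from the patches above might look roughly like the sketch below. This is illustrative only, not a patch from this thread; it assumes ctx->cancel_hash becomes an array of io_hash_bucket and reuses the helper names from patch 2.]

struct io_hash_bucket {
	spinlock_t		lock;
	struct hlist_head	list;
};

/* Sketch: insertion only takes the lock of the bucket the request hashes to. */
static void io_poll_req_insert(struct io_kiocb *req)
{
	struct io_ring_ctx *ctx = req->ctx;
	u32 index = hash_long(req->cqe.user_data, ctx->cancel_hash_bits);
	struct io_hash_bucket *hb = &ctx->cancel_hash[index];

	spin_lock(&hb->lock);
	hlist_add_head(&req->hash_node, &hb->list);
	spin_unlock(&hb->lock);
}

/* Sketch: lookup walks a single bucket under that bucket's lock. */
static struct io_kiocb *io_poll_find(struct io_ring_ctx *ctx, bool poll_only,
				     struct io_cancel_data *cd)
{
	u32 index = hash_long(cd->data, ctx->cancel_hash_bits);
	struct io_hash_bucket *hb = &ctx->cancel_hash[index];
	struct io_kiocb *req;

	spin_lock(&hb->lock);
	hlist_for_each_entry(req, &hb->list, hash_node) {
		if (cd->data != req->cqe.user_data)
			continue;
		/* ... same poll_only/cancel_seq checks as in the patch above ... */
		cd->flags = index;
		spin_unlock(&hb->lock);
		return req;
	}
	spin_unlock(&hb->lock);
	return NULL;
}

[Keeping the lock next to the list it protects means a lookup touches one cache line for both, which is the memory-locality argument Jens makes above.]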
* Re: [PATCH 2/2] io_uring: switch cancel_hash to use per list spinlock 2022-05-29 16:25 ` Jens Axboe @ 2022-05-29 18:07 ` Hao Xu 2022-05-29 18:40 ` Jens Axboe 2022-05-30 13:33 ` Hao Xu 1 sibling, 1 reply; 15+ messages in thread From: Hao Xu @ 2022-05-29 18:07 UTC (permalink / raw) To: Jens Axboe, io-uring; +Cc: Pavel Begunkov On 5/30/22 00:25, Jens Axboe wrote: > On 5/29/22 10:20 AM, Hao Xu wrote: >> From: Hao Xu <[email protected]> >> >> From: Hao Xu <[email protected]> >> >> Use per list lock for cancel_hash, this removes some completion lock >> invocation and remove contension between different cancel_hash entries > > Interesting, do you have any numbers on this? Just Theoretically for now, I'll do some tests tomorrow. This is actually RFC, forgot to change the subject. > > Also, I'd make a hash bucket struct: > > struct io_hash_bucket { > spinlock_t lock; > struct hlist_head list; > }; > > rather than two separate structs, that'll have nicer memory locality too > and should further improve it. Could be done as a prep patch with the > old locking in place, making the end patch doing the per-bucket lock > simpler as well. Sure, if the test number make sense, I'll send v2. I'll test the hlist_bl list as well(the comment of it says it is much slower than normal spin_lock, but we may not care the efficiency of poll cancellation very much?). > > Hmm? > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] io_uring: switch cancel_hash to use per list spinlock 2022-05-29 18:07 ` Hao Xu @ 2022-05-29 18:40 ` Jens Axboe 2022-05-29 22:50 ` Pavel Begunkov 0 siblings, 1 reply; 15+ messages in thread From: Jens Axboe @ 2022-05-29 18:40 UTC (permalink / raw) To: Hao Xu, io-uring; +Cc: Pavel Begunkov On 5/29/22 12:07 PM, Hao Xu wrote: > On 5/30/22 00:25, Jens Axboe wrote: >> On 5/29/22 10:20 AM, Hao Xu wrote: >>> From: Hao Xu <[email protected]> >>> >>> From: Hao Xu <[email protected]> >>> >>> Use per list lock for cancel_hash, this removes some completion lock >>> invocation and remove contension between different cancel_hash entries >> >> Interesting, do you have any numbers on this? > > Just Theoretically for now, I'll do some tests tomorrow. This is > actually RFC, forgot to change the subject. > >> >> Also, I'd make a hash bucket struct: >> >> struct io_hash_bucket { >> spinlock_t lock; >> struct hlist_head list; >> }; >> >> rather than two separate structs, that'll have nicer memory locality too >> and should further improve it. Could be done as a prep patch with the >> old locking in place, making the end patch doing the per-bucket lock >> simpler as well. > > Sure, if the test number make sense, I'll send v2. I'll test the > hlist_bl list as well(the comment of it says it is much slower than > normal spin_lock, but we may not care the efficiency of poll > cancellation very much?). I don't think the bit spinlocks are going to be useful, we should stick with a spinlock for this. They are indeed slower and generally not used for that reason. For a use case where you need a ton of locks and saving the 4 bytes for a spinlock would make sense (or maybe not changing some struct?), maybe they have a purpose. But not for this. -- Jens Axboe ^ permalink raw reply [flat|nested] 15+ messages in thread
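[For reference, the hlist_bl alternative Hao mentions packs the lock into bit 0 of the bucket's head pointer, so per-bucket serialization would look roughly like the illustration below. This is only an illustration of the existing list_bl API, not proposed code; io_kiocb would also need an hlist_bl_node instead of its hlist_node for it to apply.]

#include <linux/list_bl.h>

/* Illustration: per-bucket serialization via a bit spinlock in the head pointer. */
static void hash_insert_bl(struct hlist_bl_head *bucket,
			   struct hlist_bl_node *node)
{
	hlist_bl_lock(bucket);		/* bit_spin_lock() on bit 0 of the head */
	hlist_bl_add_head(node, bucket);
	hlist_bl_unlock(bucket);
}

[It saves the spinlock_t per bucket, but as Jens notes the bit spinlock is slower under contention, which is why it was not pursued here.]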
* Re: [PATCH 2/2] io_uring: switch cancel_hash to use per list spinlock 2022-05-29 18:40 ` Jens Axboe @ 2022-05-29 22:50 ` Pavel Begunkov 2022-05-29 23:34 ` Jens Axboe 2022-05-30 6:38 ` Hao Xu 0 siblings, 2 replies; 15+ messages in thread From: Pavel Begunkov @ 2022-05-29 22:50 UTC (permalink / raw) To: Jens Axboe, Hao Xu, io-uring On 5/29/22 19:40, Jens Axboe wrote: > On 5/29/22 12:07 PM, Hao Xu wrote: >> On 5/30/22 00:25, Jens Axboe wrote: >>> On 5/29/22 10:20 AM, Hao Xu wrote: >>>> From: Hao Xu <[email protected]> >>>> >>>> From: Hao Xu <[email protected]> >>>> >>>> Use per list lock for cancel_hash, this removes some completion lock >>>> invocation and remove contension between different cancel_hash entries >>> >>> Interesting, do you have any numbers on this? >> >> Just Theoretically for now, I'll do some tests tomorrow. This is >> actually RFC, forgot to change the subject. >> >>> >>> Also, I'd make a hash bucket struct: >>> >>> struct io_hash_bucket { >>> spinlock_t lock; >>> struct hlist_head list; >>> }; >>> >>> rather than two separate structs, that'll have nicer memory locality too >>> and should further improve it. Could be done as a prep patch with the >>> old locking in place, making the end patch doing the per-bucket lock >>> simpler as well. >> >> Sure, if the test number make sense, I'll send v2. I'll test the >> hlist_bl list as well(the comment of it says it is much slower than >> normal spin_lock, but we may not care the efficiency of poll >> cancellation very much?). > > I don't think the bit spinlocks are going to be useful, we should > stick with a spinlock for this. They are indeed slower and generally not > used for that reason. For a use case where you need a ton of locks and > saving the 4 bytes for a spinlock would make sense (or maybe not > changing some struct?), maybe they have a purpose. But not for this. We can put the cancel hashes under uring_lock and completely kill the hash spinlocking (2 lock/unlock pairs per single-shot). The code below won't even compile and missing cancellation bits, I'll pick it up in a week. Even better would be to have two hash tables, and auto-magically apply the feature to SINGLE_SUBMITTER, SQPOLL (both will have uring_lock held) and apoll (need uring_lock after anyway). 
diff --git a/fs/io_uring.c b/fs/io_uring.c index 6be21967959d..191fa7f31610 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -7120,12 +7120,20 @@ static void io_poll_task_func(struct io_kiocb *req, bool *locked) } io_poll_remove_entries(req); - spin_lock(&ctx->completion_lock); - hash_del(&req->hash_node); - __io_req_complete_post(req, req->cqe.res, 0); - io_commit_cqring(ctx); - spin_unlock(&ctx->completion_lock); - io_cqring_ev_posted(ctx); + + if (ctx->flags & IORING_MUTEX_HASH) { + io_tw_lock(ctx, locked); + hash_del(&req->hash_node); + io_req_complete_state(req, req->cqe.res, 0); + io_req_add_compl_list(req); + } else { + spin_lock(&ctx->completion_lock); + hash_del(&req->hash_node); + __io_req_complete_post(req, req->cqe.res, 0); + io_commit_cqring(ctx); + spin_unlock(&ctx->completion_lock); + io_cqring_ev_posted(ctx); + } } static void io_apoll_task_func(struct io_kiocb *req, bool *locked) @@ -7138,9 +7146,14 @@ static void io_apoll_task_func(struct io_kiocb *req, bool *locked) return; io_poll_remove_entries(req); - spin_lock(&ctx->completion_lock); - hash_del(&req->hash_node); - spin_unlock(&ctx->completion_lock); + if (ctx->flags & IORING_MUTEX_HASH) { + io_tw_lock(ctx, locked); + hash_del(&req->hash_node); + } else { + spin_lock(&ctx->completion_lock); + hash_del(&req->hash_node); + spin_unlock(&ctx->completion_lock); + } if (!ret) io_req_task_submit(req, locked); @@ -7332,9 +7345,13 @@ static int __io_arm_poll_handler(struct io_kiocb *req, return 0; } - spin_lock(&ctx->completion_lock); - io_poll_req_insert(req); - spin_unlock(&ctx->completion_lock); + if (ctx->flags & IORING_MUTEX_HASH) { + io_poll_req_insert(req); + } else { + spin_lock(&ctx->completion_lock); + io_poll_req_insert(req); + spin_unlock(&ctx->completion_lock); + } if (mask) { /* can't multishot if failed, just queue the event we've got */ ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] io_uring: switch cancel_hash to use per list spinlock 2022-05-29 22:50 ` Pavel Begunkov @ 2022-05-29 23:34 ` Jens Axboe 2022-05-30 0:18 ` Pavel Begunkov 2022-05-30 6:38 ` Hao Xu 1 sibling, 1 reply; 15+ messages in thread From: Jens Axboe @ 2022-05-29 23:34 UTC (permalink / raw) To: Pavel Begunkov, Hao Xu, io-uring On 5/29/22 4:50 PM, Pavel Begunkov wrote: > On 5/29/22 19:40, Jens Axboe wrote: >> On 5/29/22 12:07 PM, Hao Xu wrote: >>> On 5/30/22 00:25, Jens Axboe wrote: >>>> On 5/29/22 10:20 AM, Hao Xu wrote: >>>>> From: Hao Xu <[email protected]> >>>>> >>>>> From: Hao Xu <[email protected]> >>>>> >>>>> Use per list lock for cancel_hash, this removes some completion lock >>>>> invocation and remove contension between different cancel_hash entries >>>> >>>> Interesting, do you have any numbers on this? >>> >>> Just Theoretically for now, I'll do some tests tomorrow. This is >>> actually RFC, forgot to change the subject. >>> >>>> >>>> Also, I'd make a hash bucket struct: >>>> >>>> struct io_hash_bucket { >>>> spinlock_t lock; >>>> struct hlist_head list; >>>> }; >>>> >>>> rather than two separate structs, that'll have nicer memory locality too >>>> and should further improve it. Could be done as a prep patch with the >>>> old locking in place, making the end patch doing the per-bucket lock >>>> simpler as well. >>> >>> Sure, if the test number make sense, I'll send v2. I'll test the >>> hlist_bl list as well(the comment of it says it is much slower than >>> normal spin_lock, but we may not care the efficiency of poll >>> cancellation very much?). >> >> I don't think the bit spinlocks are going to be useful, we should >> stick with a spinlock for this. They are indeed slower and generally not >> used for that reason. For a use case where you need a ton of locks and >> saving the 4 bytes for a spinlock would make sense (or maybe not >> changing some struct?), maybe they have a purpose. But not for this. > > We can put the cancel hashes under uring_lock and completely kill > the hash spinlocking (2 lock/unlock pairs per single-shot). The code > below won't even compile and missing cancellation bits, I'll pick it > up in a week. > > Even better would be to have two hash tables, and auto-magically apply > the feature to SINGLE_SUBMITTER, SQPOLL (both will have uring_lock held) > and apoll (need uring_lock after anyway). My hope was that it'd take us closer to being able to use more granular locking for hashing in general. I don't care too much about the cancelation, but the normal hash locking would be useful to do. However, for cancelations, under uring_lock would indeed be preferable to doing per-bucket locks there. Guess I'll wait and see what your final patch looks like, not sure why it'd be a ctx conditional? What about io_poll_remove_all()? -- Jens Axboe ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] io_uring: switch cancel_hash to use per list spinlock 2022-05-29 23:34 ` Jens Axboe @ 2022-05-30 0:18 ` Pavel Begunkov 2022-05-30 6:52 ` Hao Xu 0 siblings, 1 reply; 15+ messages in thread From: Pavel Begunkov @ 2022-05-30 0:18 UTC (permalink / raw) To: Jens Axboe, Hao Xu, io-uring On 5/30/22 00:34, Jens Axboe wrote: > On 5/29/22 4:50 PM, Pavel Begunkov wrote: >> On 5/29/22 19:40, Jens Axboe wrote: >>> On 5/29/22 12:07 PM, Hao Xu wrote: >>>> On 5/30/22 00:25, Jens Axboe wrote: >>>>> On 5/29/22 10:20 AM, Hao Xu wrote: >>>>>> From: Hao Xu <[email protected]> >>>>>> >>>>>> From: Hao Xu <[email protected]> >>>>>> >>>>>> Use per list lock for cancel_hash, this removes some completion lock >>>>>> invocation and remove contension between different cancel_hash entries >>>>> >>>>> Interesting, do you have any numbers on this? >>>> >>>> Just Theoretically for now, I'll do some tests tomorrow. This is >>>> actually RFC, forgot to change the subject. >>>> >>>>> >>>>> Also, I'd make a hash bucket struct: >>>>> >>>>> struct io_hash_bucket { >>>>> spinlock_t lock; >>>>> struct hlist_head list; >>>>> }; >>>>> >>>>> rather than two separate structs, that'll have nicer memory locality too >>>>> and should further improve it. Could be done as a prep patch with the >>>>> old locking in place, making the end patch doing the per-bucket lock >>>>> simpler as well. >>>> >>>> Sure, if the test number make sense, I'll send v2. I'll test the >>>> hlist_bl list as well(the comment of it says it is much slower than >>>> normal spin_lock, but we may not care the efficiency of poll >>>> cancellation very much?). >>> >>> I don't think the bit spinlocks are going to be useful, we should >>> stick with a spinlock for this. They are indeed slower and generally not >>> used for that reason. For a use case where you need a ton of locks and >>> saving the 4 bytes for a spinlock would make sense (or maybe not >>> changing some struct?), maybe they have a purpose. But not for this. >> >> We can put the cancel hashes under uring_lock and completely kill >> the hash spinlocking (2 lock/unlock pairs per single-shot). The code >> below won't even compile and missing cancellation bits, I'll pick it >> up in a week. >> >> Even better would be to have two hash tables, and auto-magically apply >> the feature to SINGLE_SUBMITTER, SQPOLL (both will have uring_lock held) >> and apoll (need uring_lock after anyway). > > My hope was that it'd take us closer to being able to use more granular > locking for hashing in general. I don't care too much about the > cancelation, but the normal hash locking would be useful to do. > > However, for cancelations, under uring_lock would indeed be preferable > to doing per-bucket locks there. Guess I'll wait and see what your final > patch looks like, not sure why it'd be a ctx conditional? It replaces 2 spin lock/unlock with one io_tw_lock() in the completion path, which is done once per tw batch and grabbed anyway if there is no contention (see handle_tw_list()). It could be unconditional, but I'd say for those 3 cases we have non-existing chance to regress perf/latency, but I can think of some cases where it might screw latencies, all share io_uring b/w threads. Should benefit the cancellation path as well, but I don't care about it as well. > What about io_poll_remove_all()? As mentioned, it's not handled in the diff, but easily doable, it should just traverse both hash tables. -- Pavel Begunkov ^ permalink raw reply [flat|nested] 15+ messages in thread
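[For context, io_tw_lock() in this tree is roughly a conditional grab of uring_lock that is remembered for the rest of the task_work batch, along these lines:]

static void io_tw_lock(struct io_ring_ctx *ctx, bool *locked)
{
	/* take uring_lock at most once per task_work batch; the
	 * task_work runner drops it after the batch when *locked is set */
	if (!*locked) {
		mutex_lock(&ctx->uring_lock);
		*locked = true;
	}
}

[So replacing the two completion_lock round trips with it costs at most one mutex acquisition per batch, and nothing extra when the batch already needed uring_lock, which is the point Pavel makes above.]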
* Re: [PATCH 2/2] io_uring: switch cancel_hash to use per list spinlock 2022-05-30 0:18 ` Pavel Begunkov @ 2022-05-30 6:52 ` Hao Xu 2022-05-30 9:35 ` Pavel Begunkov 0 siblings, 1 reply; 15+ messages in thread From: Hao Xu @ 2022-05-30 6:52 UTC (permalink / raw) To: Pavel Begunkov, Jens Axboe, io-uring On 5/30/22 08:18, Pavel Begunkov wrote: > On 5/30/22 00:34, Jens Axboe wrote: >> On 5/29/22 4:50 PM, Pavel Begunkov wrote: >>> On 5/29/22 19:40, Jens Axboe wrote: >>>> On 5/29/22 12:07 PM, Hao Xu wrote: >>>>> On 5/30/22 00:25, Jens Axboe wrote: >>>>>> On 5/29/22 10:20 AM, Hao Xu wrote: >>>>>>> From: Hao Xu <[email protected]> >>>>>>> >>>>>>> From: Hao Xu <[email protected]> >>>>>>> >>>>>>> Use per list lock for cancel_hash, this removes some completion lock >>>>>>> invocation and remove contension between different cancel_hash >>>>>>> entries >>>>>> >>>>>> Interesting, do you have any numbers on this? >>>>> >>>>> Just Theoretically for now, I'll do some tests tomorrow. This is >>>>> actually RFC, forgot to change the subject. >>>>> >>>>>> >>>>>> Also, I'd make a hash bucket struct: >>>>>> >>>>>> struct io_hash_bucket { >>>>>> spinlock_t lock; >>>>>> struct hlist_head list; >>>>>> }; >>>>>> >>>>>> rather than two separate structs, that'll have nicer memory >>>>>> locality too >>>>>> and should further improve it. Could be done as a prep patch with the >>>>>> old locking in place, making the end patch doing the per-bucket lock >>>>>> simpler as well. >>>>> >>>>> Sure, if the test number make sense, I'll send v2. I'll test the >>>>> hlist_bl list as well(the comment of it says it is much slower than >>>>> normal spin_lock, but we may not care the efficiency of poll >>>>> cancellation very much?). >>>> >>>> I don't think the bit spinlocks are going to be useful, we should >>>> stick with a spinlock for this. They are indeed slower and generally >>>> not >>>> used for that reason. For a use case where you need a ton of locks and >>>> saving the 4 bytes for a spinlock would make sense (or maybe not >>>> changing some struct?), maybe they have a purpose. But not for this. >>> >>> We can put the cancel hashes under uring_lock and completely kill >>> the hash spinlocking (2 lock/unlock pairs per single-shot). The code >>> below won't even compile and missing cancellation bits, I'll pick it >>> up in a week. >>> >>> Even better would be to have two hash tables, and auto-magically apply >>> the feature to SINGLE_SUBMITTER, SQPOLL (both will have uring_lock held) >>> and apoll (need uring_lock after anyway). >> >> My hope was that it'd take us closer to being able to use more granular >> locking for hashing in general. I don't care too much about the >> cancelation, but the normal hash locking would be useful to do. >> >> However, for cancelations, under uring_lock would indeed be preferable >> to doing per-bucket locks there. Guess I'll wait and see what your final >> patch looks like, not sure why it'd be a ctx conditional? > > It replaces 2 spin lock/unlock with one io_tw_lock() in the completion > path, which is done once per tw batch and grabbed anyway if > there is no contention (see handle_tw_list()). > > It could be unconditional, but I'd say for those 3 cases we have > non-existing chance to regress perf/latency, but I can think of > some cases where it might screw latencies, all share io_uring > b/w threads. > > Should benefit the cancellation path as well, but I don't care > about it as well. > >> What about io_poll_remove_all()? 
> > As mentioned, it's not handled in the diff, but easily doable, > it should just traverse both hash tables. > Two hash tables looks good to me. If I don't get you wrong, one table under uring_lock, the other one for normal handling(like per bucket locking)? ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] io_uring: switch cancel_hash to use per list spinlock 2022-05-30 6:52 ` Hao Xu @ 2022-05-30 9:35 ` Pavel Begunkov 0 siblings, 0 replies; 15+ messages in thread From: Pavel Begunkov @ 2022-05-30 9:35 UTC (permalink / raw) To: Hao Xu, Jens Axboe, io-uring On 5/30/22 07:52, Hao Xu wrote: > On 5/30/22 08:18, Pavel Begunkov wrote: >> On 5/30/22 00:34, Jens Axboe wrote: >>> On 5/29/22 4:50 PM, Pavel Begunkov wrote: >>>> On 5/29/22 19:40, Jens Axboe wrote: >>>>> On 5/29/22 12:07 PM, Hao Xu wrote: >>>>>> On 5/30/22 00:25, Jens Axboe wrote: >>>>>>> On 5/29/22 10:20 AM, Hao Xu wrote: >>>>>>>> From: Hao Xu <[email protected]> >>>>>>>> >>>>>>>> From: Hao Xu <[email protected]> >>>>>>>> >>>>>>>> Use per list lock for cancel_hash, this removes some completion lock >>>>>>>> invocation and remove contension between different cancel_hash entries >>>>>>> >>>>>>> Interesting, do you have any numbers on this? >>>>>> >>>>>> Just Theoretically for now, I'll do some tests tomorrow. This is >>>>>> actually RFC, forgot to change the subject. >>>>>> >>>>>>> >>>>>>> Also, I'd make a hash bucket struct: >>>>>>> >>>>>>> struct io_hash_bucket { >>>>>>> spinlock_t lock; >>>>>>> struct hlist_head list; >>>>>>> }; >>>>>>> >>>>>>> rather than two separate structs, that'll have nicer memory locality too >>>>>>> and should further improve it. Could be done as a prep patch with the >>>>>>> old locking in place, making the end patch doing the per-bucket lock >>>>>>> simpler as well. >>>>>> >>>>>> Sure, if the test number make sense, I'll send v2. I'll test the >>>>>> hlist_bl list as well(the comment of it says it is much slower than >>>>>> normal spin_lock, but we may not care the efficiency of poll >>>>>> cancellation very much?). >>>>> >>>>> I don't think the bit spinlocks are going to be useful, we should >>>>> stick with a spinlock for this. They are indeed slower and generally not >>>>> used for that reason. For a use case where you need a ton of locks and >>>>> saving the 4 bytes for a spinlock would make sense (or maybe not >>>>> changing some struct?), maybe they have a purpose. But not for this. >>>> >>>> We can put the cancel hashes under uring_lock and completely kill >>>> the hash spinlocking (2 lock/unlock pairs per single-shot). The code >>>> below won't even compile and missing cancellation bits, I'll pick it >>>> up in a week. >>>> >>>> Even better would be to have two hash tables, and auto-magically apply >>>> the feature to SINGLE_SUBMITTER, SQPOLL (both will have uring_lock held) >>>> and apoll (need uring_lock after anyway). >>> >>> My hope was that it'd take us closer to being able to use more granular >>> locking for hashing in general. I don't care too much about the >>> cancelation, but the normal hash locking would be useful to do. >>> >>> However, for cancelations, under uring_lock would indeed be preferable >>> to doing per-bucket locks there. Guess I'll wait and see what your final >>> patch looks like, not sure why it'd be a ctx conditional? >> >> It replaces 2 spin lock/unlock with one io_tw_lock() in the completion >> path, which is done once per tw batch and grabbed anyway if >> there is no contention (see handle_tw_list()). >> >> It could be unconditional, but I'd say for those 3 cases we have >> non-existing chance to regress perf/latency, but I can think of >> some cases where it might screw latencies, all share io_uring >> b/w threads. >> >> Should benefit the cancellation path as well, but I don't care >> about it as well. >> >>> What about io_poll_remove_all()? 
>> >> As mentioned, it's not handled in the diff, but easily doable, >> it should just traverse both hash tables. >> > > Two hash tables looks good to me. If I don't get you wrong, one table > under uring_lock, the other one for normal handling(like per bucket > locking)? Right, that's the plan -- Pavel Begunkov ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] io_uring: switch cancel_hash to use per list spinlock 2022-05-29 22:50 ` Pavel Begunkov 2022-05-29 23:34 ` Jens Axboe @ 2022-05-30 6:38 ` Hao Xu 2022-05-30 6:59 ` Hao Xu 2022-05-30 9:39 ` Pavel Begunkov 1 sibling, 2 replies; 15+ messages in thread From: Hao Xu @ 2022-05-30 6:38 UTC (permalink / raw) To: Pavel Begunkov, Jens Axboe, io-uring On 5/30/22 06:50, Pavel Begunkov wrote: > On 5/29/22 19:40, Jens Axboe wrote: >> On 5/29/22 12:07 PM, Hao Xu wrote: >>> On 5/30/22 00:25, Jens Axboe wrote: >>>> On 5/29/22 10:20 AM, Hao Xu wrote: >>>>> From: Hao Xu <[email protected]> >>>>> >>>>> From: Hao Xu <[email protected]> >>>>> >>>>> Use per list lock for cancel_hash, this removes some completion lock >>>>> invocation and remove contension between different cancel_hash entries >>>> >>>> Interesting, do you have any numbers on this? >>> >>> Just Theoretically for now, I'll do some tests tomorrow. This is >>> actually RFC, forgot to change the subject. >>> >>>> >>>> Also, I'd make a hash bucket struct: >>>> >>>> struct io_hash_bucket { >>>> spinlock_t lock; >>>> struct hlist_head list; >>>> }; >>>> >>>> rather than two separate structs, that'll have nicer memory locality >>>> too >>>> and should further improve it. Could be done as a prep patch with the >>>> old locking in place, making the end patch doing the per-bucket lock >>>> simpler as well. >>> >>> Sure, if the test number make sense, I'll send v2. I'll test the >>> hlist_bl list as well(the comment of it says it is much slower than >>> normal spin_lock, but we may not care the efficiency of poll >>> cancellation very much?). >> >> I don't think the bit spinlocks are going to be useful, we should >> stick with a spinlock for this. They are indeed slower and generally not >> used for that reason. For a use case where you need a ton of locks and >> saving the 4 bytes for a spinlock would make sense (or maybe not >> changing some struct?), maybe they have a purpose. But not for this. > > We can put the cancel hashes under uring_lock and completely kill > the hash spinlocking (2 lock/unlock pairs per single-shot). The code > below won't even compile and missing cancellation bits, I'll pick it > up in a week. > > Even better would be to have two hash tables, and auto-magically apply > the feature to SINGLE_SUBMITTER, SQPOLL (both will have uring_lock held) > and apoll (need uring_lock after anyway). 
> > > diff --git a/fs/io_uring.c b/fs/io_uring.c > index 6be21967959d..191fa7f31610 100644 > --- a/fs/io_uring.c > +++ b/fs/io_uring.c > @@ -7120,12 +7120,20 @@ static void io_poll_task_func(struct io_kiocb > *req, bool *locked) > } > > io_poll_remove_entries(req); > - spin_lock(&ctx->completion_lock); > - hash_del(&req->hash_node); > - __io_req_complete_post(req, req->cqe.res, 0); > - io_commit_cqring(ctx); > - spin_unlock(&ctx->completion_lock); > - io_cqring_ev_posted(ctx); > + > + if (ctx->flags & IORING_MUTEX_HASH) { > + io_tw_lock(ctx, locked); > + hash_del(&req->hash_node); > + io_req_complete_state(req, req->cqe.res, 0); > + io_req_add_compl_list(req); > + } else { > + spin_lock(&ctx->completion_lock); > + hash_del(&req->hash_node); > + __io_req_complete_post(req, req->cqe.res, 0); > + io_commit_cqring(ctx); > + spin_unlock(&ctx->completion_lock); > + io_cqring_ev_posted(ctx); > + } > } > > static void io_apoll_task_func(struct io_kiocb *req, bool *locked) > @@ -7138,9 +7146,14 @@ static void io_apoll_task_func(struct io_kiocb > *req, bool *locked) > return; > > io_poll_remove_entries(req); > - spin_lock(&ctx->completion_lock); > - hash_del(&req->hash_node); > - spin_unlock(&ctx->completion_lock); > + if (ctx->flags & IORING_MUTEX_HASH) { > + io_tw_lock(ctx, locked); > + hash_del(&req->hash_node); > + } else { > + spin_lock(&ctx->completion_lock); > + hash_del(&req->hash_node); > + spin_unlock(&ctx->completion_lock); > + } > > if (!ret) > io_req_task_submit(req, locked); > @@ -7332,9 +7345,13 @@ static int __io_arm_poll_handler(struct io_kiocb > *req, > return 0; > } > > - spin_lock(&ctx->completion_lock); > - io_poll_req_insert(req); > - spin_unlock(&ctx->completion_lock); > + if (ctx->flags & IORING_MUTEX_HASH) { Does IORING_MUTEX_HASH exclude IOSQE_ASYNC as well? though IOSQE_ASYNC is uncommon but users may do that. > + io_poll_req_insert(req); > + } else { > + spin_lock(&ctx->completion_lock); > + io_poll_req_insert(req); > + spin_unlock(&ctx->completion_lock); > + } > > if (mask) { > /* can't multishot if failed, just queue the event we've got */ ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] io_uring: switch cancel_hash to use per list spinlock 2022-05-30 6:38 ` Hao Xu @ 2022-05-30 6:59 ` Hao Xu 2022-05-30 9:39 ` Pavel Begunkov 1 sibling, 0 replies; 15+ messages in thread From: Hao Xu @ 2022-05-30 6:59 UTC (permalink / raw) To: Pavel Begunkov, Jens Axboe, io-uring On 5/30/22 14:38, Hao Xu wrote: > On 5/30/22 06:50, Pavel Begunkov wrote: >> On 5/29/22 19:40, Jens Axboe wrote: >>> On 5/29/22 12:07 PM, Hao Xu wrote: >>>> On 5/30/22 00:25, Jens Axboe wrote: >>>>> On 5/29/22 10:20 AM, Hao Xu wrote: >>>>>> From: Hao Xu <[email protected]> >>>>>> >>>>>> From: Hao Xu <[email protected]> >>>>>> >>>>>> Use per list lock for cancel_hash, this removes some completion lock >>>>>> invocation and remove contension between different cancel_hash >>>>>> entries >>>>> >>>>> Interesting, do you have any numbers on this? >>>> >>>> Just Theoretically for now, I'll do some tests tomorrow. This is >>>> actually RFC, forgot to change the subject. >>>> >>>>> >>>>> Also, I'd make a hash bucket struct: >>>>> >>>>> struct io_hash_bucket { >>>>> spinlock_t lock; >>>>> struct hlist_head list; >>>>> }; >>>>> >>>>> rather than two separate structs, that'll have nicer memory >>>>> locality too >>>>> and should further improve it. Could be done as a prep patch with the >>>>> old locking in place, making the end patch doing the per-bucket lock >>>>> simpler as well. >>>> >>>> Sure, if the test number make sense, I'll send v2. I'll test the >>>> hlist_bl list as well(the comment of it says it is much slower than >>>> normal spin_lock, but we may not care the efficiency of poll >>>> cancellation very much?). >>> >>> I don't think the bit spinlocks are going to be useful, we should >>> stick with a spinlock for this. They are indeed slower and generally not >>> used for that reason. For a use case where you need a ton of locks and >>> saving the 4 bytes for a spinlock would make sense (or maybe not >>> changing some struct?), maybe they have a purpose. But not for this. >> >> We can put the cancel hashes under uring_lock and completely kill >> the hash spinlocking (2 lock/unlock pairs per single-shot). The code >> below won't even compile and missing cancellation bits, I'll pick it >> up in a week. >> >> Even better would be to have two hash tables, and auto-magically apply >> the feature to SINGLE_SUBMITTER, SQPOLL (both will have uring_lock held) >> and apoll (need uring_lock after anyway). 
>> >> >> diff --git a/fs/io_uring.c b/fs/io_uring.c >> index 6be21967959d..191fa7f31610 100644 >> --- a/fs/io_uring.c >> +++ b/fs/io_uring.c >> @@ -7120,12 +7120,20 @@ static void io_poll_task_func(struct io_kiocb >> *req, bool *locked) >> } >> >> io_poll_remove_entries(req); >> - spin_lock(&ctx->completion_lock); >> - hash_del(&req->hash_node); >> - __io_req_complete_post(req, req->cqe.res, 0); >> - io_commit_cqring(ctx); >> - spin_unlock(&ctx->completion_lock); >> - io_cqring_ev_posted(ctx); >> + >> + if (ctx->flags & IORING_MUTEX_HASH) { >> + io_tw_lock(ctx, locked); >> + hash_del(&req->hash_node); >> + io_req_complete_state(req, req->cqe.res, 0); >> + io_req_add_compl_list(req); >> + } else { >> + spin_lock(&ctx->completion_lock); >> + hash_del(&req->hash_node); >> + __io_req_complete_post(req, req->cqe.res, 0); >> + io_commit_cqring(ctx); >> + spin_unlock(&ctx->completion_lock); >> + io_cqring_ev_posted(ctx); >> + } >> } >> >> static void io_apoll_task_func(struct io_kiocb *req, bool *locked) >> @@ -7138,9 +7146,14 @@ static void io_apoll_task_func(struct io_kiocb >> *req, bool *locked) >> return; >> >> io_poll_remove_entries(req); >> - spin_lock(&ctx->completion_lock); >> - hash_del(&req->hash_node); >> - spin_unlock(&ctx->completion_lock); >> + if (ctx->flags & IORING_MUTEX_HASH) { >> + io_tw_lock(ctx, locked); >> + hash_del(&req->hash_node); >> + } else { >> + spin_lock(&ctx->completion_lock); >> + hash_del(&req->hash_node); >> + spin_unlock(&ctx->completion_lock); >> + } >> >> if (!ret) >> io_req_task_submit(req, locked); >> @@ -7332,9 +7345,13 @@ static int __io_arm_poll_handler(struct >> io_kiocb *req, >> return 0; >> } >> >> - spin_lock(&ctx->completion_lock); >> - io_poll_req_insert(req); >> - spin_unlock(&ctx->completion_lock); >> + if (ctx->flags & IORING_MUTEX_HASH) { > > Does IORING_MUTEX_HASH exclude IOSQE_ASYNC as well? though IOSQE_ASYNC > is uncommon but users may do that. I see, cases like IOSQE_ASYNC definitely goes into the else branch since it's in iowq context. > >> + io_poll_req_insert(req); >> + } else { >> + spin_lock(&ctx->completion_lock); >> + io_poll_req_insert(req); >> + spin_unlock(&ctx->completion_lock); >> + } >> >> if (mask) { >> /* can't multishot if failed, just queue the event we've got */ > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] io_uring: switch cancel_hash to use per list spinlock 2022-05-30 6:38 ` Hao Xu 2022-05-30 6:59 ` Hao Xu @ 2022-05-30 9:39 ` Pavel Begunkov 1 sibling, 0 replies; 15+ messages in thread From: Pavel Begunkov @ 2022-05-30 9:39 UTC (permalink / raw) To: Hao Xu, Jens Axboe, io-uring On 5/30/22 07:38, Hao Xu wrote: > On 5/30/22 06:50, Pavel Begunkov wrote: >> On 5/29/22 19:40, Jens Axboe wrote: >>> On 5/29/22 12:07 PM, Hao Xu wrote: >>>> On 5/30/22 00:25, Jens Axboe wrote: >>>>> On 5/29/22 10:20 AM, Hao Xu wrote: >>>>>> From: Hao Xu <[email protected]> >>>>>> >>>>>> From: Hao Xu <[email protected]> >>>>>> >>>>>> Use per list lock for cancel_hash, this removes some completion lock >>>>>> invocation and remove contension between different cancel_hash entries >>>>> >>>>> Interesting, do you have any numbers on this? >>>> >>>> Just Theoretically for now, I'll do some tests tomorrow. This is >>>> actually RFC, forgot to change the subject. >>>> >>>>> >>>>> Also, I'd make a hash bucket struct: >>>>> >>>>> struct io_hash_bucket { >>>>> spinlock_t lock; >>>>> struct hlist_head list; >>>>> }; >>>>> >>>>> rather than two separate structs, that'll have nicer memory locality too >>>>> and should further improve it. Could be done as a prep patch with the >>>>> old locking in place, making the end patch doing the per-bucket lock >>>>> simpler as well. >>>> >>>> Sure, if the test number make sense, I'll send v2. I'll test the >>>> hlist_bl list as well(the comment of it says it is much slower than >>>> normal spin_lock, but we may not care the efficiency of poll >>>> cancellation very much?). >>> >>> I don't think the bit spinlocks are going to be useful, we should >>> stick with a spinlock for this. They are indeed slower and generally not >>> used for that reason. For a use case where you need a ton of locks and >>> saving the 4 bytes for a spinlock would make sense (or maybe not >>> changing some struct?), maybe they have a purpose. But not for this. >> >> We can put the cancel hashes under uring_lock and completely kill >> the hash spinlocking (2 lock/unlock pairs per single-shot). The code >> below won't even compile and missing cancellation bits, I'll pick it >> up in a week. >> >> Even better would be to have two hash tables, and auto-magically apply >> the feature to SINGLE_SUBMITTER, SQPOLL (both will have uring_lock held) >> and apoll (need uring_lock after anyway). >> >> @@ -7332,9 +7345,13 @@ static int __io_arm_poll_handler(struct io_kiocb *req, >> return 0; >> } >> >> - spin_lock(&ctx->completion_lock); >> - io_poll_req_insert(req); >> - spin_unlock(&ctx->completion_lock); >> + if (ctx->flags & IORING_MUTEX_HASH) { > > Does IORING_MUTEX_HASH exclude IOSQE_ASYNC as well? though IOSQE_ASYNC > is uncommon but users may do that. It should, io-wq -> arm_poll() doesn't hold uring_lock, good point -- Pavel Begunkov ^ permalink raw reply [flat|nested] 15+ messages in thread
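[Putting the pieces of that plan together, submission-side insertion might hypothetically look like the sketch below. IORING_MUTEX_HASH comes from Pavel's diff; cancel_hash_locked (the uring_lock-protected table), the io_hash_bucket layout, and the extra "locked" argument are made-up names for illustration only.]

/* Hypothetical sketch of the two-table idea discussed above. */
static void io_poll_req_insert(struct io_kiocb *req, bool locked)
{
	struct io_ring_ctx *ctx = req->ctx;
	u32 index = hash_long(req->cqe.user_data, ctx->cancel_hash_bits);

	if ((ctx->flags & IORING_MUTEX_HASH) && locked) {
		/* caller holds uring_lock: single submitter, SQPOLL, or
		 * apoll from task_work -- but not io-wq issue (IOSQE_ASYNC) */
		hlist_add_head(&req->hash_node,
			       &ctx->cancel_hash_locked[index]);
	} else {
		struct io_hash_bucket *hb = &ctx->cancel_hash[index];

		spin_lock(&hb->lock);
		hlist_add_head(&req->hash_node, &hb->list);
		spin_unlock(&hb->lock);
	}
}

[Cancellation and io_poll_remove_all() would then have to walk both tables, and, per the io-wq point above, requests armed without uring_lock held (e.g. IOSQE_ASYNC going through io-wq) would have to fall back to the spinlocked table.]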
* Re: [PATCH 2/2] io_uring: switch cancel_hash to use per list spinlock 2022-05-29 16:25 ` Jens Axboe 2022-05-29 18:07 ` Hao Xu @ 2022-05-30 13:33 ` Hao Xu 1 sibling, 0 replies; 15+ messages in thread From: Hao Xu @ 2022-05-30 13:33 UTC (permalink / raw) To: Jens Axboe, io-uring; +Cc: Pavel Begunkov On 5/30/22 00:25, Jens Axboe wrote: > On 5/29/22 10:20 AM, Hao Xu wrote: >> From: Hao Xu <[email protected]> >> >> From: Hao Xu <[email protected]> >> >> Use per list lock for cancel_hash, this removes some completion lock >> invocation and remove contension between different cancel_hash entries > > Interesting, do you have any numbers on this? > > Also, I'd make a hash bucket struct: > > struct io_hash_bucket { > spinlock_t lock; > struct hlist_head list; > }; > > rather than two separate structs, that'll have nicer memory locality too > and should further improve it. Could be done as a prep patch with the > old locking in place, making the end patch doing the per-bucket lock > simpler as well. > > Hmm? > I've done a v2 here, also a test which issues async poll densely to make high frequency cancel_hash[] visits. But I won't have a real box with big number of cpu processors which is suitable for testing until tomorrow, so I'd test it tomorrow. https://github.com/HowHsu/linux/commits/for-5.20/io_uring_hash_lock https://github.com/HowHsu/liburing/commit/b9fb4d20a5dfe7c7bd62fe36c37aea3b261d4499 ^ permalink raw reply [flat|nested] 15+ messages in thread