From: Jens Axboe <[email protected]>
To: Matthew Wilcox <[email protected]>
Cc: yangerkun <[email protected]>,
Pavel Begunkov <[email protected]>,
[email protected], Stefan Metzmacher <[email protected]>,
[email protected]
Subject: Re: [PATCH 5.12] io_uring: Convert personality_idr to XArray
Date: Sat, 13 Mar 2021 13:22:24 -0700
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
On 3/13/21 1:13 PM, Jens Axboe wrote:
> On 3/13/21 12:54 PM, Matthew Wilcox wrote:
>> On Sat, Mar 13, 2021 at 12:30:14PM -0700, Jens Axboe wrote:
>>> @@ -2851,7 +2852,7 @@ static struct io_buffer *io_buffer_select(struct io_kiocb *req, size_t *len,
>>> list_del(&kbuf->list);
>>> } else {
>>> kbuf = head;
>>> - idr_remove(&req->ctx->io_buffer_idr, bgid);
>>> + __xa_erase(&req->ctx->io_buffer, bgid);
>>
>> Umm ... __xa_erase()? Did you enable all the lockdep infrastructure?
>> This should have tripped some of the debugging code because I don't think
>> you're holding the xa_lock.
>
> Not run with lockdep - and probably my misunderstanding, do we need xa_lock()
> if we provide our own locking?
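(For reference, a minimal sketch of the locking contract in question - not
code from the patch, and the helper names are illustrative only: xa_erase()
takes the array's internal spinlock itself, while __xa_erase() expects the
caller to already hold it via xa_lock().)

#include <linux/xarray.h>

/* Locked API: no external locking required, xa_erase() locks internally. */
static void erase_locked_api(struct xarray *xa, unsigned long index)
{
	xa_erase(xa, index);
}

/* Unlocked API: the caller holds xa_lock, e.g. to combine several
 * operations on the same array atomically. */
static void erase_unlocked_api(struct xarray *xa, unsigned long index)
{
	xa_lock(xa);
	__xa_erase(xa, index);
	xa_unlock(xa);
}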
>
>>> @@ -3993,21 +3994,20 @@ static int io_provide_buffers(struct io_kiocb *req, unsigned int issue_flags)
>>>
>>> lockdep_assert_held(&ctx->uring_lock);
>>>
>>> - list = head = idr_find(&ctx->io_buffer_idr, p->bgid);
>>> + list = head = xa_load(&ctx->io_buffer, p->bgid);
>>>
>>> ret = io_add_buffers(p, &head);
>>> - if (ret < 0)
>>> - goto out;
>>> + if (ret >= 0 && !list) {
>>> + u32 id = -1U;
>>>
>>> - if (!list) {
>>> - ret = idr_alloc(&ctx->io_buffer_idr, head, p->bgid, p->bgid + 1,
>>> - GFP_KERNEL);
>>> - if (ret < 0) {
>>> + ret = __xa_alloc_cyclic(&ctx->io_buffer, &id, head,
>>> + XA_LIMIT(0, USHRT_MAX),
>>> + &ctx->io_buffer_next, GFP_KERNEL);
>>
>> I don't understand why this works. The equivalent transformation here
>> would have been:
>>
>> ret = xa_insert(&ctx->io_buffers, p->bgid, head, GFP_KERNEL);
>>
>> with various options to handle it differently.
>
> True, that does look kinda weird (and wrong). I'll fix that up.
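(As an aside, a sketch of the plain xa_insert() semantics - not code from
the patch: the entry is stored only if the slot at that index is empty,
returning -EBUSY if something is already present and -ENOMEM if allocating
internal nodes fails.)

	ret = xa_insert(&ctx->io_buffer, p->bgid, head, GFP_KERNEL);
	if (ret == -EBUSY) {
		/* a buffer list already exists under p->bgid */
	} else if (ret == -ENOMEM) {
		/* allocation of internal nodes failed */
	}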
>
>>> static void io_destroy_buffers(struct io_ring_ctx *ctx)
>>> {
>>> - idr_for_each(&ctx->io_buffer_idr, __io_destroy_buffers, ctx);
>>> - idr_destroy(&ctx->io_buffer_idr);
>>> + struct io_buffer *buf;
>>> + unsigned long index;
>>> +
>>> + xa_for_each(&ctx->io_buffer, index, buf)
>>> + __io_remove_buffers(ctx, buf, index, -1U);
>>> + xa_destroy(&ctx->io_buffer);
>>
>> Honestly, I'd do BUG_ON(!xa_empty(&ctx->io_buffers)) if anything. If that
>> loop didn't empty the array, something is terribly wrong and we should
>> know about it somehow instead of making the memory leak harder to find.
>
> Probably also my misunderstanding - do I not need to call xa_destroy()
> if I prune all the members? Assumed we needed it to free some internal
> state, but maybe that's not the case?
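(A sketch of the pattern suggested above, assuming __io_remove_buffers()
erases its bgid from the array as in the patch below: once the loop has
removed every entry, the XArray holds no internal state, so xa_destroy()
is unnecessary, and an emptiness check would flag anything left behind.)

	struct io_buffer *buf;
	unsigned long index;

	xa_for_each(&ctx->io_buffer, index, buf)
		__io_remove_buffers(ctx, buf, index, -1U);
	/* every entry was xa_erase()d via __io_remove_buffers() */
	BUG_ON(!xa_empty(&ctx->io_buffer));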
Here's a v2. I verified that there are no leaks with xa_destroy() removed,
and that lockdep is happy. BTW, it's a much better API, which is evident
from the fact that a conversion like this ends up with the below diffstat:
io_uring.c | 43 +++++++++++++++----------------------------
1 file changed, 15 insertions(+), 28 deletions(-)
commit 51c681e3487d091b447175088bcf546f5ce1bf35
Author: Jens Axboe <[email protected]>
Date: Sat Mar 13 12:29:43 2021 -0700
io_uring: convert io_buffer_idr to XArray
Like we did for the personality idr, convert the IO buffer idr to use
XArray. This avoids a use-after-free on removal of entries, since idr
doesn't like doing so from inside an iterator.
Fixes: 5a2e745d4d43 ("io_uring: buffer registration infrastructure")
Cc: [email protected]
Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 05adc4887ef3..642ad08d8964 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -402,7 +402,7 @@ struct io_ring_ctx {
struct socket *ring_sock;
#endif
- struct idr io_buffer_idr;
+ struct xarray io_buffer;
struct xarray personalities;
u32 pers_next;
@@ -1135,7 +1135,7 @@ static struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p)
init_waitqueue_head(&ctx->cq_wait);
INIT_LIST_HEAD(&ctx->cq_overflow_list);
init_completion(&ctx->ref_comp);
- idr_init(&ctx->io_buffer_idr);
+ xa_init_flags(&ctx->io_buffer, XA_FLAGS_ALLOC1);
xa_init_flags(&ctx->personalities, XA_FLAGS_ALLOC1);
mutex_init(&ctx->uring_lock);
init_waitqueue_head(&ctx->wait);
@@ -2843,7 +2843,7 @@ static struct io_buffer *io_buffer_select(struct io_kiocb *req, size_t *len,
lockdep_assert_held(&req->ctx->uring_lock);
- head = idr_find(&req->ctx->io_buffer_idr, bgid);
+ head = xa_load(&req->ctx->io_buffer, bgid);
if (head) {
if (!list_empty(&head->list)) {
kbuf = list_last_entry(&head->list, struct io_buffer,
@@ -2851,7 +2851,7 @@ static struct io_buffer *io_buffer_select(struct io_kiocb *req, size_t *len,
list_del(&kbuf->list);
} else {
kbuf = head;
- idr_remove(&req->ctx->io_buffer_idr, bgid);
+ xa_erase(&req->ctx->io_buffer, bgid);
}
if (*len > kbuf->len)
*len = kbuf->len;
@@ -3892,7 +3892,7 @@ static int __io_remove_buffers(struct io_ring_ctx *ctx, struct io_buffer *buf,
}
i++;
kfree(buf);
- idr_remove(&ctx->io_buffer_idr, bgid);
+ xa_erase(&ctx->io_buffer, bgid);
return i;
}
@@ -3910,7 +3910,7 @@ static int io_remove_buffers(struct io_kiocb *req, unsigned int issue_flags)
lockdep_assert_held(&ctx->uring_lock);
ret = -ENOENT;
- head = idr_find(&ctx->io_buffer_idr, p->bgid);
+ head = xa_load(&ctx->io_buffer, p->bgid);
if (head)
ret = __io_remove_buffers(ctx, head, p->bgid, p->nbufs);
if (ret < 0)
@@ -3993,21 +3993,14 @@ static int io_provide_buffers(struct io_kiocb *req, unsigned int issue_flags)
lockdep_assert_held(&ctx->uring_lock);
- list = head = idr_find(&ctx->io_buffer_idr, p->bgid);
+ list = head = xa_load(&ctx->io_buffer, p->bgid);
ret = io_add_buffers(p, &head);
- if (ret < 0)
- goto out;
-
- if (!list) {
- ret = idr_alloc(&ctx->io_buffer_idr, head, p->bgid, p->bgid + 1,
- GFP_KERNEL);
- if (ret < 0) {
+ if (ret >= 0 && !list) {
+ ret = xa_insert(&ctx->io_buffer, p->bgid, head, GFP_KERNEL);
+ if (ret < 0)
__io_remove_buffers(ctx, head, p->bgid, -1U);
- goto out;
- }
}
-out:
if (ret < 0)
req_set_fail_links(req);
@@ -8333,19 +8326,13 @@ static int io_eventfd_unregister(struct io_ring_ctx *ctx)
return -ENXIO;
}
-static int __io_destroy_buffers(int id, void *p, void *data)
-{
- struct io_ring_ctx *ctx = data;
- struct io_buffer *buf = p;
-
- __io_remove_buffers(ctx, buf, id, -1U);
- return 0;
-}
-
static void io_destroy_buffers(struct io_ring_ctx *ctx)
{
- idr_for_each(&ctx->io_buffer_idr, __io_destroy_buffers, ctx);
- idr_destroy(&ctx->io_buffer_idr);
+ struct io_buffer *buf;
+ unsigned long index;
+
+ xa_for_each(&ctx->io_buffer, index, buf)
+ __io_remove_buffers(ctx, buf, index, -1U);
}
static void io_req_cache_free(struct list_head *list, struct task_struct *tsk)
--
Jens Axboe
Thread overview: 12+ messages
2021-03-08 14:16 [PATCH 5.12] io_uring: Convert personality_idr to XArray Pavel Begunkov
2021-03-08 14:22 ` Pavel Begunkov
2021-03-08 16:16 ` Matthew Wilcox
2021-03-09 11:23 ` yangerkun
2021-03-13 8:02 ` yangerkun
2021-03-13 15:34 ` Jens Axboe
2021-03-13 19:01 ` Jens Axboe
2021-03-13 19:30 ` Jens Axboe
2021-03-13 19:54 ` Matthew Wilcox
2021-03-13 20:13 ` Jens Axboe
2021-03-13 20:22 ` Jens Axboe [this message]
2021-03-09 20:53 ` Jens Axboe