From: Olivier Langlois <[email protected]>
To: Jens Axboe <[email protected]>,
Pavel Begunkov <[email protected]>,
[email protected]
Subject: Re: [PATCH] io_uring/napi: remove duplicate io_napi_entry timeout assignation
Date: Mon, 12 Aug 2024 17:39:43 -0400 [thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
On Mon, 2024-08-12 at 14:40 -0600, Jens Axboe wrote:
>
> >
> > 1. in __io_napi_remove_stale(),
> >
> > is it ok to use hash_for_each() instead of hash_for_each_safe()?
> >
> > it might be ok because it is a hash_del_rcu() and not a simple
> > hash_del() but I have little experience with possible RCU shortcuts
> > so
> > I am unsure on this one...
>
> Should use hash_for_each_rcu(), and I think for good measure, we
> sould
> just keep it inside the RCU region. Then we know for a fact that the
> deletion doesn't run.
ok, after confirming that it was NOT paranoia, I did push further the
investigation.
hash_for_each_rcu() is meant for the readers. A writer can modify the
container as long as it protects the container against concurrent
writes with locks.
in include/linux/hashtable.h:
static inline void hash_del_rcu(struct hlist_node *node)
{
hlist_del_init_rcu(node);
}
in include/linux/rculist.h:
static inline void hlist_del_init_rcu(struct hlist_node *n)
{
if (!hlist_unhashed(n)) {
__hlist_del(n);
WRITE_ONCE(n->pprev, NULL);
}
}
in include/linux/list.h:
static inline void __hlist_del(struct hlist_node *n)
{
struct hlist_node *next = n->next;
struct hlist_node **pprev = n->pprev;
WRITE_ONCE(*pprev, next);
if (next)
WRITE_ONCE(next->pprev, pprev);
}
it seems we are ok since the deleted node next pointer is not reset.
the problem might arise if the next action after removing the node is
to kfree it. we are fine because kfree_rcu() must be defering the
actual free.
That being said, I recommend to replace hash_for_each_rcu() with
hash_for_each_safe() to make everyone sleep better at night...
based on this new info, I'll let you decide which macro to use and I
volunteer to make that change.
>
> > 2. in io_napi_free()
> >
> > list_del(&e->list); is not called. Can the only reason be that
> > io_napi_free() is called as part of the ring destruction so it is
> > an
> > optimization to not clear the list since it is not expected to be
> > reused?
> >
> > would calling INIT_LIST_HEAD() before exiting as an extra
> > precaution to
> > make the function is future proof in case it is reused in another
> > context than the ring destruction be a good idea?
>
> I think that's just an oversight, and doesn't matter since it's all
> going away anyway. But it would be prudent to delete it regardless!
ok. I did ask because I am in the process of adding a call to
io_napi_free() in the context of a super new cool patch that I am
working on.. I'll call INIT_LIST_HEAD() to reset the list (or its RCU
variant if it exists).
>
> > 3. I am surprised to notice that in __io_napi_do_busy_loop(),
> > list_for_each_entry_rcu() is called to traverse the list but the
> > regular methods list_del() and list_add_tail() are called to update
> > the
> > list instead of their RCU variant.
>
> Should all just use rcu variants.
>
> Here's a mashup of the changes. Would be great if you can test - I'll
> do
> some too, but always good with more than one person testing as it
> tends
> to hit more cases.
I am so glag to have asked. It is going to be a pleasure to test this
and report back the result.
but it appears that my napi_list is very *static* (wink, wink)
next prev parent reply other threads:[~2024-08-12 21:39 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-12 0:34 [PATCH] io_uring/napi: remove duplicate io_napi_entry timeout assignation Olivier Langlois
2024-08-12 1:00 ` Olivier Langlois
2024-08-12 18:10 ` Jens Axboe
2024-08-12 18:11 ` Jens Axboe
2024-08-12 20:15 ` Olivier Langlois
2024-08-12 20:40 ` Jens Axboe
2024-08-12 21:39 ` Olivier Langlois [this message]
2024-08-12 21:45 ` Olivier Langlois
2024-08-12 21:50 ` Jens Axboe
2024-08-13 17:22 ` Olivier Langlois
2024-08-13 18:35 ` Jens Axboe
2024-08-14 0:09 ` Olivier Langlois
2024-08-14 0:31 ` Jens Axboe
2024-08-14 0:44 ` Pavel Begunkov
2024-08-12 18:11 ` Jens Axboe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=01e64ae408b55f2c35e2ae7229e2af8ddde220d7.camel@trillion01.com \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox