public inbox for [email protected]
 help / color / mirror / Atom feed
From: Jens Axboe <[email protected]>
To: Stefan Roesch <[email protected]>, [email protected]
Cc: [email protected], [email protected],
	[email protected], [email protected]
Subject: Re: [PATCH v5 1/3] io_uring: add napi busy polling support
Date: Mon, 21 Nov 2022 16:59:26 -0700	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>

On 11/21/22 12:45?PM, Jens Axboe wrote:
> On 11/21/22 12:14?PM, Stefan Roesch wrote:
>> +/*
>> + * io_napi_add() - Add napi id to the busy poll list
>> + * @file: file pointer for socket
>> + * @ctx:  io-uring context
>> + *
>> + * Add the napi id of the socket to the napi busy poll list.
>> + */
>> +void io_napi_add(struct file *file, struct io_ring_ctx *ctx)
>> +{
>> +	unsigned int napi_id;
>> +	struct socket *sock;
>> +	struct sock *sk;
>> +	struct io_napi_entry *ne;
>> +
>> +	if (!io_napi_busy_loop_on(ctx))
>> +		return;
>> +
>> +	sock = sock_from_file(file);
>> +	if (!sock)
>> +		return;
>> +
>> +	sk = sock->sk;
>> +	if (!sk)
>> +		return;
>> +
>> +	napi_id = READ_ONCE(sk->sk_napi_id);
>> +
>> +	/* Non-NAPI IDs can be rejected */
>> +	if (napi_id < MIN_NAPI_ID)
>> +		return;
>> +
>> +	spin_lock(&ctx->napi_lock);
>> +	list_for_each_entry(ne, &ctx->napi_list, list) {
>> +		if (ne->napi_id == napi_id) {
>> +			ne->timeout = jiffies + NAPI_TIMEOUT;
>> +			goto out;
>> +		}
>> +	}
>> +
>> +	ne = kmalloc(sizeof(*ne), GFP_NOWAIT);
>> +	if (!ne)
>> +		goto out;
>> +
>> +	ne->napi_id = napi_id;
>> +	ne->timeout = jiffies + NAPI_TIMEOUT;
>> +	list_add_tail(&ne->list, &ctx->napi_list);
>> +
>> +out:
>> +	spin_unlock(&ctx->napi_lock);
>> +}
> 
> I think this all looks good now, just one minor comment on the above. Is
> the expectation here that we'll basically always add to the napi list?
> If so, then I think allocating 'ne' outside the spinlock would be a lot
> saner, and then just kfree() it for the unlikely case where we find a
> duplicate.

After thinking about this a bit more, I don't think this is done in the
most optimal fashion. If the list is longer than a few entries, this
check (or check-alloc-insert) is pretty expensive and it'll add
substantial overhead to the poll path for sockets if napi is enabled.

I think we should do something ala:

1) When arming poll AND napi has been enabled for the ring, then
    alloc io_napi_entry upfront and store it in ->async_data.

2) Maintain the state in the io_napi_entry. If we're on the list,
    that can be checked with just list_empty(), for example. If not
    on the list, assign timeout and add.

3) Have regular request cleanup free it.

This could be combined with an alloc cache, I would not do that for the
first iteration though.

This would make io_napi_add() cheap - no more list iteration, and no
more allocations. And that is arguably the most important part, as that
is called everytime the poll is woken up. Particularly for multishot
that makes a big difference.

It's also designed much better imho, moving the more expensive bits to
the setup side.

-- 
Jens Axboe

  reply	other threads:[~2022-11-21 23:59 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-21 19:14 [PATCH v5 0/3] io_uring: add napi busy polling support Stefan Roesch
2022-11-21 19:14 ` [PATCH v5 1/3] " Stefan Roesch
2022-11-21 19:45   ` Jens Axboe
2022-11-21 23:59     ` Jens Axboe [this message]
2022-11-21 19:14 ` [PATCH v5 2/3] io_uring: add api to set / get napi configuration Stefan Roesch
2022-11-21 19:46   ` Jens Axboe
2022-11-22 13:13     ` Ammar Faizi
2022-11-22 13:19       ` Jens Axboe
2022-11-25 21:43   ` Ammar Faizi
2022-11-28 20:22     ` Stefan Roesch
2022-11-21 19:14 ` [PATCH v5 3/3] io_uring: add api to set napi prefer busy poll Stefan Roesch

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox