From: Olivier Langlois <[email protected]>
To: Hao Xu <[email protected]>, Jens Axboe <[email protected]>,
	Pavel Begunkov <[email protected]>
Cc: io-uring <[email protected]>,
	linux-kernel <[email protected]>
Subject: Re: [PATCH v4 2/2] io_uring: Add support for napi_busy_poll
Date: Tue, 01 Mar 2022 15:06:48 -0500
Message-ID: <4f01857ca757ab4f0995420e6b1a6e3668a40da5.camel@trillion01.com>
In-Reply-To: <[email protected]>

On Wed, 2022-03-02 at 02:31 +0800, Hao Xu wrote:
> 
> > +       ne = kmalloc(sizeof(*ne), GFP_NOWAIT);
> > +       if (!ne)
> > +               goto out;
> 
> IMHO, we need to handle -ENOMEM here, I cut off the error handling
> when I did the quick coding. Sorry for misleading.

If you are correct, I would be shocked.

I went back to my 'Linux Device Drivers' book, and nowhere does it
mention that kmalloc() can return anything other than a pointer.

There is no mention at all of the return value in the following:

man page:
https://www.kernel.org/doc/htmldocs/kernel-api/API-kmalloc.html
API doc:

https://www.kernel.org/doc/html/latest/core-api/mm-api.html?highlight=kmalloc#c.kmalloc

header file:
https://elixir.bootlin.com/linux/latest/source/include/linux/slab.h#L522

I browsed through the kmalloc() code. There are a lot of paths to
cover, but from a preliminary reading it seems that kmalloc() returns
only a valid pointer or NULL...

/**
 * kmem_cache_alloc - Allocate an object
 * @cachep: The cache to allocate from.
 * @flags: See kmalloc().
 *
 * Allocate an object from this cache.  The flags are only relevant
 * if the cache has no available objects.
 *
 * Return: pointer to the new object or %NULL in case of error
 */

/**
 * __do_kmalloc - allocate memory
 * @size: how many bytes of memory are required.
 * @flags: the type of memory to allocate (see kmalloc).
 * @caller: function caller for debug tracking of the caller
 *
 * Return: pointer to the allocated memory or %NULL in case of error
 */

I'll need someone else to confirm the possible kmalloc() return
values, perhaps with an example.

I am a bit skeptical that something special needs to be done here...
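
For illustration, here is a userspace sketch of the two failure
conventions being contrasted: a kmalloc()-style allocator that reports
failure with NULL, and the kernel's ERR_PTR() convention, where an errno
such as -ENOMEM is encoded in the pointer itself. The ERR_PTR(),
PTR_ERR() and IS_ERR() helpers below are simplified re-creations of the
ones in include/linux/err.h, and alloc_null_on_failure() and
alloc_errptr_on_failure() are hypothetical names with a simulated
failure switch; nothing here is taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified userspace re-creation of include/linux/err.h. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True only for the top MAX_ERRNO addresses; NULL is not an error pointer. */
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical kmalloc()-style allocator: valid pointer or NULL. */
static void *alloc_null_on_failure(size_t size, int simulate_failure)
{
	assert(size > 0);
	return simulate_failure ? NULL : malloc(size);
}

/* Hypothetical ERR_PTR-style wrapper: valid pointer or ERR_PTR(-ENOMEM). */
static void *alloc_errptr_on_failure(size_t size, int simulate_failure)
{
	void *p = alloc_null_on_failure(size, simulate_failure);

	return p ? p : ERR_PTR(-ENOMEM);
}
```

With the first convention, `if (!ne)` is the complete check. IS_ERR()
would only be needed if the allocator followed the second convention,
which, going by the kernel-doc comments quoted above, kmalloc() does not.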

Or perhaps you are suggesting that io_add_napi() should return an error
code when the allocation fails, as is done here:
https://elixir.bootlin.com/linux/latest/source/arch/alpha/kernel/core_marvel.c#L867

If that is what you suggest, what would this info do for the caller?

IMHO, it wouldn't help in any way...
> 
> > 
> > @@ -7519,7 +7633,11 @@ static int __io_sq_thread(struct io_ring_ctx
> > *ctx, bool cap_entries)
> >                     !(ctx->flags & IORING_SETUP_R_DISABLED))
> >                         ret = io_submit_sqes(ctx, to_submit);
> >                 mutex_unlock(&ctx->uring_lock);
> > -
> > +#ifdef CONFIG_NET_RX_BUSY_POLL
> > +               if (!list_empty(&ctx->napi_list) &&
> > +                   io_napi_busy_loop(&ctx->napi_list))
> 
> I'm afraid we may need lock for sqpoll too, since io_add_napi() could
> be in iowq context.
> 
> I'll take a look at the lock stuff of this patch tomorrow, too late
> now in my timezone.

Ok, please do. I'm not a big user of io workers, so I may have
overlooked this possibility.

If that is the case, I think that this would be very easy to fix by
locking the spinlock while __io_sq_thread() is using napi_list.
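
A minimal userspace sketch of that idea follows, assuming a writer that
adds entries (as an io_add_napi()-like path might, possibly from an io
worker) and a reader that walks the whole list (as an sqpoll-like path
would). Every name here is hypothetical, a C11 atomic_flag stands in for
the kernel spinlock, and a hand-rolled singly linked list stands in for
napi_list. Whether holding a lock across the actual polling is
acceptable is a separate question that this sketch does not address.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for one napi_list entry; not the patch's struct. */
struct napi_entry_sketch {
	unsigned int napi_id;
	struct napi_entry_sketch *next;
};

struct ctx_sketch {
	atomic_flag lock;               /* stand-in for a spinlock_t */
	struct napi_entry_sketch *head; /* stand-in for ctx->napi_list */
};

static void sketch_init(struct ctx_sketch *ctx)
{
	atomic_flag_clear(&ctx->lock);
	ctx->head = NULL;
}

static void sketch_lock(struct ctx_sketch *ctx)
{
	while (atomic_flag_test_and_set_explicit(&ctx->lock,
						 memory_order_acquire))
		; /* spin until the flag was previously clear */
}

static void sketch_unlock(struct ctx_sketch *ctx)
{
	atomic_flag_clear_explicit(&ctx->lock, memory_order_release);
}

/* Writer side: push a new entry while holding the lock. */
static void sketch_add(struct ctx_sketch *ctx, struct napi_entry_sketch *ne)
{
	assert(ne != NULL);
	sketch_lock(ctx);
	ne->next = ctx->head;
	ctx->head = ne;
	sketch_unlock(ctx);
}

/* Reader side: hold the same lock for the whole traversal. */
static unsigned int sketch_poll_all(struct ctx_sketch *ctx)
{
	unsigned int visited = 0;

	sketch_lock(ctx);
	for (struct napi_entry_sketch *ne = ctx->head; ne; ne = ne->next)
		visited++; /* real code would busy poll ne->napi_id here */
	sketch_unlock(ctx);
	return visited;
}
```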
> 
> How about:
> 
> if (list is singular) {
>      do something;
>      return;
> }
> 
> while (!io_busy_loop_end() && io_napi_busy_loop())
>      ;
> 

Is there a concern with the current code?
What would be the benefit of your suggestion over the current code?

To me, it seems that if io_blocking_napi_busy_loop() is called, a
reasonable expectation would be that some busy looping is done.
Otherwise the function could return without doing anything, which
would, IMHO, be misleading.

By definition, napi_busy_loop() is not blocking, and if you want the
device to stay in busy poll mode, you need to call it once in a while;
otherwise, after a certain time, the device will go back to its
interrupt mode.

IOW, io_blocking_napi_busy_loop() follows the same logic used by
napi_busy_loop(), which does not call loop_end() before having
performed one loop iteration.
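
The difference between the two loop shapes can be sketched in userspace
as follows. The helpers are hypothetical mocks: sketch_busy_loop()
merely counts an iteration, and sketch_loop_end() reports the end once a
budget of iterations is used up. Neither is the io_uring or NAPI code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mock state: iterations done so far and the allowed budget. */
struct loop_state_sketch {
	int iterations;
	int budget;
};

/* Mock loop_end(): true once the budget is exhausted. */
static bool sketch_loop_end(const struct loop_state_sketch *s)
{
	return s->iterations >= s->budget;
}

/* Mock busy poll: records one iteration and asks to keep going. */
static bool sketch_busy_loop(struct loop_state_sketch *s)
{
	s->iterations++;
	return true;
}

/* Poll first, test afterwards: at least one iteration always runs. */
static int poll_then_check(int budget)
{
	struct loop_state_sketch s = { 0, budget };

	assert(budget >= 0);
	do {
		if (!sketch_busy_loop(&s))
			break;
	} while (!sketch_loop_end(&s));
	return s.iterations;
}

/* Test first: may return without polling at all. */
static int check_then_poll(int budget)
{
	struct loop_state_sketch s = { 0, budget };

	assert(budget >= 0);
	while (!sketch_loop_end(&s) && sketch_busy_loop(&s))
		;
	return s.iterations;
}
```

When the end condition already holds on entry (a budget of 0 here), the
first shape still polls once while the second polls zero times. For any
positive budget, both perform the same number of iterations.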

> Btw, start_time seems not used in singular branch.

I know. This is why it is conditionally initialized.

Greetings,

