From: Pavel Begunkov <[email protected]>
To: Mina Almasry <[email protected]>, David Wei <[email protected]>
Cc: [email protected], [email protected],
	Jens Axboe <[email protected]>, Jakub Kicinski <[email protected]>,
	Paolo Abeni <[email protected]>,
	"David S. Miller" <[email protected]>,
	Eric Dumazet <[email protected]>,
	Jesper Dangaard Brouer <[email protected]>,
	David Ahern <[email protected]>,
	Stanislav Fomichev <[email protected]>,
	Joe Damato <[email protected]>,
	Pedro Tammela <[email protected]>
Subject: Re: [PATCH net-next v9 14/20] io_uring/zcrx: dma-map area for the device
Date: Thu, 9 Jan 2025 16:35:22 +0000
Message-ID: <[email protected]>
In-Reply-To: <CAHS8izMKM_if=jZj3Cw0XAaKrfhX31EoqzRR9Dh+7MbiUkUS1w@mail.gmail.com>

On 1/7/25 20:23, Mina Almasry wrote:
> On Tue, Dec 17, 2024 at 4:38 PM David Wei <[email protected]> wrote:
...
>> +
>> +               if (unlikely(niov_idx >= area->nia.num_niovs))
>> +                       continue;
>> +               niov_idx = array_index_nospec(niov_idx, area->nia.num_niovs);
>> +
>> +               niov = &area->nia.niovs[niov_idx];
>> +               if (!io_zcrx_put_niov_uref(niov))
>> +                       continue;
> 
> I have a suspicion that uref is now redundant in this series, although

It's not. You can't lose track of buffers given to the user. It plays
a similar role to devmem's ->sk_user_frags; think about what happens if
you don't have it and don't track buffers in any other way.
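
To illustrate one reading of that point, here is a minimal userspace
sketch, not the kernel code. All names (model_niov, model_give_to_user,
model_put_uref, model_refill_one) are hypothetical and plain ints stand
in for the real atomics. It assumes the user counter exists so that a
refill-queue entry naming a buffer the user never received is ignored
instead of dropping a pool reference the user does not own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model, not the kernel structures. */
struct model_niov {
	int uref;    /* references currently held by user space */
	int pp_ref;  /* page-pool style reference count */
};

/* Hand a buffer to the user: take both references in tandem. */
static void model_give_to_user(struct model_niov *niov)
{
	niov->uref++;
	niov->pp_ref++;
}

/* Drop one user reference; fail if the user does not hold any. */
static bool model_put_uref(struct model_niov *niov)
{
	if (niov->uref == 0)
		return false;
	niov->uref--;
	return true;
}

/*
 * Process one refill-queue entry supplied by user space.
 * Returns true if the buffer became free (pp_ref dropped to zero).
 */
static bool model_refill_one(struct model_niov *niovs, size_t num, size_t idx)
{
	struct model_niov *niov;

	if (idx >= num)
		return false;   /* out-of-range index: ignore the entry */
	niov = &niovs[idx];
	if (!model_put_uref(niov))
		return false;   /* user never got this buffer: ignore */
	assert(niov->pp_ref > 0);
	return --niov->pp_ref == 0;
}
```

In this model, a buffer that is still owned elsewhere keeps its pp_ref
even if user space posts its index, and returning the same buffer twice
only takes effect once.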

> I'm not 100% sure. You seem to acquire a uref and pp_ref in tandem in
> io_zcrx_recv_frag and drop both in tandem in this function, which
> makes me think the uref maybe is redundant now.
> 
> io_zcrx_copy_chunk acquires a uref but not a pp_ref. I wonder if
> copy_chunk can do a page_pool_ref_netmem() instead of a uref, maybe

It takes both references.

> you would be able to make do without urefs at all. I have not looked
> at the copy fallback code closely.

If we're talking about optimisations, there is a way I described
and am going to pursue, but that's not for the initial set.

>> +
>> +               netmem = net_iov_to_netmem(niov);
>> +               if (page_pool_unref_netmem(netmem, 1) != 0)
>> +                       continue;
>> +
>> +               if (unlikely(niov->pp != pp)) {
> 
> From niov->pp != pp I surmise in this iteration one io_zcrx_area can
> serve niovs to multiple RX queues?

It should, but the main goal was rather to support multiple pools
per queue because of queue api shortcomings, even if it almost
never happens.

> The last 5 lines or so is basically doing  what page_pool_put_netmem()
> does, except there is a pp != niov->pp check in the middle. Can we
> call page_pool_put_netmem() directly if pp != niov->pp? It would just
> reduce the code duplication a bit and reduce the amount of custom
> reffing code we need to add for this mp.

Right, that sub path is basically page_pool_put_netmem(). Can be
replaced, but it's not going to de-duplicate code as the path is
shared with page_pool_mp_return_in_cache(). And it'd likely bloat
the binary a bit, though it's not that important.

>> +                       continue;
>> +               }
>> +
>> +               page_pool_mp_return_in_cache(pp, netmem);
> 
> So if niov->pp != pp, we end up basically doing a
> page_pool_put_netmem(), which is the 'correct' way to return a netmem
> to the page_pool, or at least is the way to return a netmem that all
> the other devmem/pages memory types uses. However if niov->pp == pp,
> we end up page_pool_mp_return_in_cache(), which is basically the same
> as page_pool_put_unrefed_netmem but skips the ptr_ring, so it's
> slightly faster and less overhead.

Jumping through hoops is surely not great, but there are bigger
semantic reasons. page_pool_put_netmem() has always been called by
users from the outside, whereas this one is called from the page pool
allocation path. For example, it'd nest io_uring with ptr_ring, which
is not a bug and not so bad, but that's something you'd always need to
consider while patching generic page pool. On top of that, with the
ptr_ring path in there, either providers would need to make implicit
assumptions that it'd never happen, which is shabby, or the code should
be prepared for that. I'd say it should be more convenient to have a
separate and simple helper for all that.

We can play with the API more, hopefully after it's merged, but
just replacing it with page_pool_put_netmem() would do a disservice.
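
As a rough illustration of the distinction, here is a userspace sketch
under stated assumptions. The names (model_pool, model_return_in_cache,
model_put_via_ring) and the in_alloc_path flag are hypothetical and do
not mirror the real page pool internals. It only models the idea that a
return made from within the allocation path can go straight into the
allocation cache, while returns from outside callers go through a
shared ring.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_CACHE_SIZE 4
#define MODEL_RING_SIZE  4

/* Hypothetical userspace model of a pool with two return paths. */
struct model_pool {
	int cache[MODEL_CACHE_SIZE];  /* touched only from the allocation path */
	size_t cache_count;
	int ring[MODEL_RING_SIZE];    /* shared with outside callers */
	size_t ring_count;
	bool in_alloc_path;           /* set while the allocator is refilling */
};

/* Return path for code already running inside the allocation path. */
static bool model_return_in_cache(struct model_pool *pool, int buf)
{
	assert(pool->cache_count <= MODEL_CACHE_SIZE);
	if (!pool->in_alloc_path || pool->cache_count == MODEL_CACHE_SIZE)
		return false;
	pool->cache[pool->cache_count++] = buf;
	return true;
}

/* Return path for outside callers: goes through the shared ring. */
static bool model_put_via_ring(struct model_pool *pool, int buf)
{
	if (pool->ring_count == MODEL_RING_SIZE)
		return false;
	pool->ring[pool->ring_count++] = buf;
	return true;
}
```

In this model the cache helper never touches the ring, so code on the
allocation path does not have to reason about the ring at all.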

> I would honestly elect to page_pool_put_netmem() regardless of
> niov->pp/pp check. Sure it would be a bit more overhead than the code
> here, but it would reduce the custom pp refing code we need to add for
> this mp and it will replenish the ptr_ring in both cases, which may be
> even faster by reducing the number of times we need to replenish. We
> can always add micro optimizations like skipping the ptr_ring for
> slightly faster code if there is evidence there is significant perf
> improvement.
> 
>> +       } while (--entries);
>> +
>> +       smp_store_release(&ifq->rq_ring->head, ifq->cached_rq_head);
>> +       spin_unlock_bh(&ifq->rq_lock);
>> +}
>> +
...
>> +static bool io_pp_zc_release_netmem(struct page_pool *pp, netmem_ref netmem)
>> +{
>> +       if (WARN_ON_ONCE(!netmem_is_net_iov(netmem)))
>> +               return false;
>> +
>> +       if (page_pool_unref_netmem(netmem, 1) == 0)
> 
> Check is redundant, AFAICT. pp would never release a netmem unless the
> pp refcount is 1.

Good catch, it was applied in v10.

-- 
Pavel Begunkov

