From: Pavel Begunkov <[email protected]>
To: Mina Almasry <[email protected]>, David Wei <[email protected]>
Cc: [email protected], [email protected],
Jens Axboe <[email protected]>, Jakub Kicinski <[email protected]>,
Paolo Abeni <[email protected]>,
"David S. Miller" <[email protected]>,
Eric Dumazet <[email protected]>,
Jesper Dangaard Brouer <[email protected]>,
David Ahern <[email protected]>,
Stanislav Fomichev <[email protected]>,
Joe Damato <[email protected]>,
Pedro Tammela <[email protected]>
Subject: Re: [PATCH net-next v13 04/11] io_uring/zcrx: implement zerocopy receive pp memory provider
Date: Thu, 13 Feb 2025 22:37:07 +0000 [thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <CAHS8izMOrPWx5X_i+xxjJ8XJyP0Kn-WEcgvK096-WEw1afQ75w@mail.gmail.com>
On 2/13/25 20:57, Mina Almasry wrote:
...
>> +static void io_zcrx_scrub(struct io_zcrx_ifq *ifq)
>> +{
>> + struct io_zcrx_area *area = ifq->area;
>> + int i;
>> +
>> + if (!area)
>> + return;
>> +
>> + /* Reclaim back all buffers given to the user space. */
>> + for (i = 0; i < area->nia.num_niovs; i++) {
>> + struct net_iov *niov = &area->nia.niovs[i];
>> + int nr;
>> +
>> + if (!atomic_read(io_get_user_counter(niov)))
>> + continue;
>> + nr = atomic_xchg(io_get_user_counter(niov), 0);
>> + if (nr && !page_pool_unref_netmem(net_iov_to_netmem(niov), nr))
>> + io_zcrx_return_niov(niov);
>
> I assume nr can be > 1?
Right
> If it's always 1, then page_pool_put_netmem()
> does the page_pool_unref_netmem() + page_pool_put_unrefed_netmem() a
> bit more succinctly.
...
>> + entries = io_zcrx_rqring_entries(ifq);
>> + entries = min_t(unsigned, entries, PP_ALLOC_CACHE_REFILL - pp->alloc.count);
>> + if (unlikely(!entries)) {
>> + spin_unlock_bh(&ifq->rq_lock);
>> + return;
>> + }
>> +
>> + do {
>> + struct io_uring_zcrx_rqe *rqe = io_zcrx_get_rqe(ifq, mask);
>> + struct io_zcrx_area *area;
>> + struct net_iov *niov;
>> + unsigned niov_idx, area_idx;
>> +
>> + area_idx = rqe->off >> IORING_ZCRX_AREA_SHIFT;
>> + niov_idx = (rqe->off & ~IORING_ZCRX_AREA_MASK) >> PAGE_SHIFT;
>> +
>> + if (unlikely(rqe->__pad || area_idx))
>> + continue;
>
> nit: I believe a lot of the unlikely in the file are redundant. AFAIU
> the compiler always treats the condition inside the if as unlikely by
> default if there is no else statement.
That'd be too presumptuous of the compiler. Sections can be reshuffled,
but even without that, the code generation often looks different. The
annotation is in the right place.
...
>> +static netmem_ref io_pp_zc_alloc_netmems(struct page_pool *pp, gfp_t gfp)
>> +{
>> + struct io_zcrx_ifq *ifq = pp->mp_priv;
>> +
>> + /* pp should already be ensuring that */
>> + if (unlikely(pp->alloc.count))
>> + goto out_return;
>> +
>
> As the comment notes, this is a very defensive check that can be
> removed. The pp should never invoke alloc_netmems if it has items in
> the cache.
Maybe I'll kill it in the future, but it might be a good idea to
leave it be as even page_pool.c itself doesn't trust it too much,
see __page_pool_alloc_pages_slow().
>> + io_zcrx_ring_refill(pp, ifq);
>> + if (likely(pp->alloc.count))
>> + goto out_return;
>> +
>> + io_zcrx_refill_slow(pp, ifq);
>> + if (!pp->alloc.count)
>> + return 0;
>> +out_return:
>> + return pp->alloc.cache[--pp->alloc.count];
>> +}
>> +
>> +static bool io_pp_zc_release_netmem(struct page_pool *pp, netmem_ref netmem)
>> +{
>> + struct net_iov *niov;
>> +
>> + if (WARN_ON_ONCE(!netmem_is_net_iov(netmem)))
>> + return false;
>> +
>
> Also a very defensive check that can be removed. There should be no
> way for the pp to release a netmem to the provider that didn't come
Agree, but it's a warning and I don't care about performance
of this chunk to that extent. Maybe we'll remove it later.
> from this provider. netmem should be guaranteed to be a net_iov, and
Not like it matters for now, but I wouldn't say it should be
net_iov, those callbacks were initially proposed for huge pages.
> also an io_uring net_iov (not dma-buf one), and specifically be a
> net_iov from this particular memory provider.
>
>> + niov = netmem_to_net_iov(netmem);
>> + net_mp_niov_clear_page_pool(niov);
>> + io_zcrx_return_niov_freelist(niov);
>> + return false;
>> +}
>> +
>> +static int io_pp_zc_init(struct page_pool *pp)
>> +{
>> + struct io_zcrx_ifq *ifq = pp->mp_priv;
>> +
>> + if (WARN_ON_ONCE(!ifq))
>> + return -EINVAL;
>> + if (pp->dma_map)
>> + return -EOPNOTSUPP;
>
> This condition should be flipped actually. pp->dma_map should be true,
> otherwise the provider isn't supported.
It's not implemented in this patch, which is why it's rejected.
You can think of it as an unconditional failure, even though
io_pp_zc_init is not reachable just yet.
> From the netmem.rst docs we require that netmem page_pools are
> configured with PP_FLAG_DMA_MAP.
>
> And we actually check that pp->dma_map == true before invoking
> mp_ops->init(). This code shouldn't be working unless I missed
> something.
>
> Also arguably this check is defensive. The pp should confirm that
Sure, and I have nothing against defensive checks in cold paths
> pp->dma_map is true before invoking any memory provider, you should
> assume it is true here (and the devmem provider doesn't check it
> IIUC).
--
Pavel Begunkov
Thread overview: 22+ messages
2025-02-12 18:57 [PATCH v13 00/11] io_uring zero copy rx David Wei
2025-02-12 18:57 ` [PATCH net-next v13 01/11] io_uring/zcrx: add interface queue and refill queue David Wei
2025-02-12 18:57 ` [PATCH net-next v13 02/11] io_uring/zcrx: add io_zcrx_area David Wei
2025-02-12 18:57 ` [PATCH net-next v13 03/11] io_uring/zcrx: grab a net device David Wei
2025-02-12 18:57 ` [PATCH net-next v13 04/11] io_uring/zcrx: implement zerocopy receive pp memory provider David Wei
2025-02-13 20:57 ` Mina Almasry
2025-02-13 22:37 ` Pavel Begunkov [this message]
2025-02-13 22:46 ` Mina Almasry
2025-02-13 23:05 ` Pavel Begunkov
2025-02-12 18:57 ` [PATCH net-next v13 05/11] io_uring/zcrx: dma-map area for the device David Wei
2025-02-12 18:57 ` [PATCH net-next v13 06/11] io_uring/zcrx: add io_recvzc request David Wei
2025-02-12 18:57 ` [PATCH net-next v13 07/11] io_uring/zcrx: set pp memory provider for an rx queue David Wei
2025-02-12 18:57 ` [PATCH net-next v13 08/11] io_uring/zcrx: throttle receive requests David Wei
2025-02-12 18:57 ` [PATCH net-next v13 09/11] io_uring/zcrx: add copy fallback David Wei
2025-02-12 18:58 ` [PATCH net-next v13 10/11] net: add documentation for io_uring zcrx David Wei
2025-02-12 18:58 ` [PATCH net-next v13 11/11] io_uring/zcrx: add selftest David Wei
2025-02-14 2:27 ` lizetao
2025-02-14 15:52 ` David Wei
2025-02-14 21:57 ` David Wei
2025-02-14 22:31 ` David Wei
2025-02-13 17:44 ` [PATCH v13 00/11] io_uring zero copy rx Jens Axboe
2025-02-13 23:02 ` Pavel Begunkov