* [PATCH net-next v12 01/10] net: page_pool: don't cast mp param to devmem
2025-01-17 16:11 [PATCH net-next v12 00/10] io_uring zero copy rx Pavel Begunkov
@ 2025-01-17 16:11 ` Pavel Begunkov
2025-01-17 16:11 ` [PATCH net-next v12 02/10] net: prefix devmem specific helpers Pavel Begunkov
` (9 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Pavel Begunkov @ 2025-01-17 16:11 UTC (permalink / raw)
To: io-uring, netdev
Cc: asml.silence, Jens Axboe, Jakub Kicinski, Paolo Abeni,
David S . Miller, Eric Dumazet, Jesper Dangaard Brouer,
David Ahern, Mina Almasry, Stanislav Fomichev, Joe Damato,
Pedro Tammela, David Wei
page_pool_check_memory_provider() is a generic path and shouldn't assume
anything about the actual type of the memory provider argument. It's
fine while devmem is the only provider, but cast away the devmem
specific binding types to avoid confusion.
Reviewed-by: Jakub Kicinski <[email protected]>
Reviewed-by: Mina Almasry <[email protected]>
Signed-off-by: Pavel Begunkov <[email protected]>
---
net/core/page_pool_user.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/core/page_pool_user.c b/net/core/page_pool_user.c
index 48335766c1bf..8d31c71bea1a 100644
--- a/net/core/page_pool_user.c
+++ b/net/core/page_pool_user.c
@@ -353,7 +353,7 @@ void page_pool_unlist(struct page_pool *pool)
int page_pool_check_memory_provider(struct net_device *dev,
struct netdev_rx_queue *rxq)
{
- struct net_devmem_dmabuf_binding *binding = rxq->mp_params.mp_priv;
+ void *binding = rxq->mp_params.mp_priv;
struct page_pool *pool;
struct hlist_node *n;
--
2.47.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH net-next v12 02/10] net: prefix devmem specific helpers
2025-01-17 16:11 [PATCH net-next v12 00/10] io_uring zero copy rx Pavel Begunkov
2025-01-17 16:11 ` [PATCH net-next v12 01/10] net: page_pool: don't cast mp param to devmem Pavel Begunkov
@ 2025-01-17 16:11 ` Pavel Begunkov
2025-01-17 16:11 ` [PATCH net-next v12 03/10] net: generalise net_iov chunk owners Pavel Begunkov
` (8 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Pavel Begunkov @ 2025-01-17 16:11 UTC (permalink / raw)
To: io-uring, netdev
Cc: asml.silence, Jens Axboe, Jakub Kicinski, Paolo Abeni,
David S . Miller, Eric Dumazet, Jesper Dangaard Brouer,
David Ahern, Mina Almasry, Stanislav Fomichev, Joe Damato,
Pedro Tammela, David Wei
Add prefixes to all helpers that are specific to devmem TCP, i.e.
net_iov_binding[_id].
Reviewed-by: Jakub Kicinski <[email protected]>
Reviewed-by: Mina Almasry <[email protected]>
Signed-off-by: Pavel Begunkov <[email protected]>
---
net/core/devmem.c | 2 +-
net/core/devmem.h | 14 +++++++-------
net/ipv4/tcp.c | 2 +-
3 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/net/core/devmem.c b/net/core/devmem.c
index c971b8aceac8..acd3e390a3da 100644
--- a/net/core/devmem.c
+++ b/net/core/devmem.c
@@ -94,7 +94,7 @@ net_devmem_alloc_dmabuf(struct net_devmem_dmabuf_binding *binding)
void net_devmem_free_dmabuf(struct net_iov *niov)
{
- struct net_devmem_dmabuf_binding *binding = net_iov_binding(niov);
+ struct net_devmem_dmabuf_binding *binding = net_devmem_iov_binding(niov);
unsigned long dma_addr = net_devmem_get_dma_addr(niov);
if (WARN_ON(!gen_pool_has_addr(binding->chunk_pool, dma_addr,
diff --git a/net/core/devmem.h b/net/core/devmem.h
index 76099ef9c482..99782ddeca40 100644
--- a/net/core/devmem.h
+++ b/net/core/devmem.h
@@ -86,11 +86,16 @@ static inline unsigned int net_iov_idx(const struct net_iov *niov)
}
static inline struct net_devmem_dmabuf_binding *
-net_iov_binding(const struct net_iov *niov)
+net_devmem_iov_binding(const struct net_iov *niov)
{
return net_iov_owner(niov)->binding;
}
+static inline u32 net_devmem_iov_binding_id(const struct net_iov *niov)
+{
+ return net_devmem_iov_binding(niov)->id;
+}
+
static inline unsigned long net_iov_virtual_addr(const struct net_iov *niov)
{
struct dmabuf_genpool_chunk_owner *owner = net_iov_owner(niov);
@@ -99,11 +104,6 @@ static inline unsigned long net_iov_virtual_addr(const struct net_iov *niov)
((unsigned long)net_iov_idx(niov) << PAGE_SHIFT);
}
-static inline u32 net_iov_binding_id(const struct net_iov *niov)
-{
- return net_iov_owner(niov)->binding->id;
-}
-
static inline void
net_devmem_dmabuf_binding_get(struct net_devmem_dmabuf_binding *binding)
{
@@ -171,7 +171,7 @@ static inline unsigned long net_iov_virtual_addr(const struct net_iov *niov)
return 0;
}
-static inline u32 net_iov_binding_id(const struct net_iov *niov)
+static inline u32 net_devmem_iov_binding_id(const struct net_iov *niov)
{
return 0;
}
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 0d704bda6c41..b872de9a8271 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2494,7 +2494,7 @@ static int tcp_recvmsg_dmabuf(struct sock *sk, const struct sk_buff *skb,
/* Will perform the exchange later */
dmabuf_cmsg.frag_token = tcp_xa_pool.tokens[tcp_xa_pool.idx];
- dmabuf_cmsg.dmabuf_id = net_iov_binding_id(niov);
+ dmabuf_cmsg.dmabuf_id = net_devmem_iov_binding_id(niov);
offset += copy;
remaining_len -= copy;
--
2.47.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH net-next v12 03/10] net: generalise net_iov chunk owners
2025-01-17 16:11 [PATCH net-next v12 00/10] io_uring zero copy rx Pavel Begunkov
2025-01-17 16:11 ` [PATCH net-next v12 01/10] net: page_pool: don't cast mp param to devmem Pavel Begunkov
2025-01-17 16:11 ` [PATCH net-next v12 02/10] net: prefix devmem specific helpers Pavel Begunkov
@ 2025-01-17 16:11 ` Pavel Begunkov
2025-01-17 16:11 ` [PATCH net-next v12 04/10] net: page_pool: create hooks for custom memory providers Pavel Begunkov
` (7 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Pavel Begunkov @ 2025-01-17 16:11 UTC (permalink / raw)
To: io-uring, netdev
Cc: asml.silence, Jens Axboe, Jakub Kicinski, Paolo Abeni,
David S . Miller, Eric Dumazet, Jesper Dangaard Brouer,
David Ahern, Mina Almasry, Stanislav Fomichev, Joe Damato,
Pedro Tammela, David Wei
Currently net_iov stores a pointer to struct dmabuf_genpool_chunk_owner,
which serves as a useful abstraction to share data and provide a
context. However, it's too devmem specific, and we want to reuse it for
other memory providers, and for that we need to decouple net_iov from
devmem. Make net_iov to point to a new base structure called
net_iov_area, which dmabuf_genpool_chunk_owner extends.
Reviewed-by: Mina Almasry <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: Pavel Begunkov <[email protected]>
---
include/net/netmem.h | 21 ++++++++++++++++++++-
net/core/devmem.c | 25 +++++++++++++------------
net/core/devmem.h | 25 +++++++++----------------
3 files changed, 42 insertions(+), 29 deletions(-)
diff --git a/include/net/netmem.h b/include/net/netmem.h
index 1b58faa4f20f..c61d5b21e7b4 100644
--- a/include/net/netmem.h
+++ b/include/net/netmem.h
@@ -24,11 +24,20 @@ struct net_iov {
unsigned long __unused_padding;
unsigned long pp_magic;
struct page_pool *pp;
- struct dmabuf_genpool_chunk_owner *owner;
+ struct net_iov_area *owner;
unsigned long dma_addr;
atomic_long_t pp_ref_count;
};
+struct net_iov_area {
+ /* Array of net_iovs for this area. */
+ struct net_iov *niovs;
+ size_t num_niovs;
+
+ /* Offset into the dma-buf where this chunk starts. */
+ unsigned long base_virtual;
+};
+
/* These fields in struct page are used by the page_pool and net stack:
*
* struct {
@@ -54,6 +63,16 @@ NET_IOV_ASSERT_OFFSET(dma_addr, dma_addr);
NET_IOV_ASSERT_OFFSET(pp_ref_count, pp_ref_count);
#undef NET_IOV_ASSERT_OFFSET
+static inline struct net_iov_area *net_iov_owner(const struct net_iov *niov)
+{
+ return niov->owner;
+}
+
+static inline unsigned int net_iov_idx(const struct net_iov *niov)
+{
+ return niov - net_iov_owner(niov)->niovs;
+}
+
/* netmem */
/**
diff --git a/net/core/devmem.c b/net/core/devmem.c
index acd3e390a3da..3d91fba2bd26 100644
--- a/net/core/devmem.c
+++ b/net/core/devmem.c
@@ -33,14 +33,15 @@ static void net_devmem_dmabuf_free_chunk_owner(struct gen_pool *genpool,
{
struct dmabuf_genpool_chunk_owner *owner = chunk->owner;
- kvfree(owner->niovs);
+ kvfree(owner->area.niovs);
kfree(owner);
}
static dma_addr_t net_devmem_get_dma_addr(const struct net_iov *niov)
{
- struct dmabuf_genpool_chunk_owner *owner = net_iov_owner(niov);
+ struct dmabuf_genpool_chunk_owner *owner;
+ owner = net_devmem_iov_to_chunk_owner(niov);
return owner->base_dma_addr +
((dma_addr_t)net_iov_idx(niov) << PAGE_SHIFT);
}
@@ -83,7 +84,7 @@ net_devmem_alloc_dmabuf(struct net_devmem_dmabuf_binding *binding)
offset = dma_addr - owner->base_dma_addr;
index = offset / PAGE_SIZE;
- niov = &owner->niovs[index];
+ niov = &owner->area.niovs[index];
niov->pp_magic = 0;
niov->pp = NULL;
@@ -261,9 +262,9 @@ net_devmem_bind_dmabuf(struct net_device *dev, unsigned int dmabuf_fd,
goto err_free_chunks;
}
- owner->base_virtual = virtual;
+ owner->area.base_virtual = virtual;
owner->base_dma_addr = dma_addr;
- owner->num_niovs = len / PAGE_SIZE;
+ owner->area.num_niovs = len / PAGE_SIZE;
owner->binding = binding;
err = gen_pool_add_owner(binding->chunk_pool, dma_addr,
@@ -275,17 +276,17 @@ net_devmem_bind_dmabuf(struct net_device *dev, unsigned int dmabuf_fd,
goto err_free_chunks;
}
- owner->niovs = kvmalloc_array(owner->num_niovs,
- sizeof(*owner->niovs),
- GFP_KERNEL);
- if (!owner->niovs) {
+ owner->area.niovs = kvmalloc_array(owner->area.num_niovs,
+ sizeof(*owner->area.niovs),
+ GFP_KERNEL);
+ if (!owner->area.niovs) {
err = -ENOMEM;
goto err_free_chunks;
}
- for (i = 0; i < owner->num_niovs; i++) {
- niov = &owner->niovs[i];
- niov->owner = owner;
+ for (i = 0; i < owner->area.num_niovs; i++) {
+ niov = &owner->area.niovs[i];
+ niov->owner = &owner->area;
page_pool_set_dma_addr_netmem(net_iov_to_netmem(niov),
net_devmem_get_dma_addr(niov));
}
diff --git a/net/core/devmem.h b/net/core/devmem.h
index 99782ddeca40..a2b9913e9a17 100644
--- a/net/core/devmem.h
+++ b/net/core/devmem.h
@@ -10,6 +10,8 @@
#ifndef _NET_DEVMEM_H
#define _NET_DEVMEM_H
+#include <net/netmem.h>
+
struct netlink_ext_ack;
struct net_devmem_dmabuf_binding {
@@ -51,17 +53,11 @@ struct net_devmem_dmabuf_binding {
* allocations from this chunk.
*/
struct dmabuf_genpool_chunk_owner {
- /* Offset into the dma-buf where this chunk starts. */
- unsigned long base_virtual;
+ struct net_iov_area area;
+ struct net_devmem_dmabuf_binding *binding;
/* dma_addr of the start of the chunk. */
dma_addr_t base_dma_addr;
-
- /* Array of net_iovs for this chunk. */
- struct net_iov *niovs;
- size_t num_niovs;
-
- struct net_devmem_dmabuf_binding *binding;
};
void __net_devmem_dmabuf_binding_free(struct net_devmem_dmabuf_binding *binding);
@@ -75,20 +71,17 @@ int net_devmem_bind_dmabuf_to_queue(struct net_device *dev, u32 rxq_idx,
void dev_dmabuf_uninstall(struct net_device *dev);
static inline struct dmabuf_genpool_chunk_owner *
-net_iov_owner(const struct net_iov *niov)
+net_devmem_iov_to_chunk_owner(const struct net_iov *niov)
{
- return niov->owner;
-}
+ struct net_iov_area *owner = net_iov_owner(niov);
-static inline unsigned int net_iov_idx(const struct net_iov *niov)
-{
- return niov - net_iov_owner(niov)->niovs;
+ return container_of(owner, struct dmabuf_genpool_chunk_owner, area);
}
static inline struct net_devmem_dmabuf_binding *
net_devmem_iov_binding(const struct net_iov *niov)
{
- return net_iov_owner(niov)->binding;
+ return net_devmem_iov_to_chunk_owner(niov)->binding;
}
static inline u32 net_devmem_iov_binding_id(const struct net_iov *niov)
@@ -98,7 +91,7 @@ static inline u32 net_devmem_iov_binding_id(const struct net_iov *niov)
static inline unsigned long net_iov_virtual_addr(const struct net_iov *niov)
{
- struct dmabuf_genpool_chunk_owner *owner = net_iov_owner(niov);
+ struct net_iov_area *owner = net_iov_owner(niov);
return owner->base_virtual +
((unsigned long)net_iov_idx(niov) << PAGE_SHIFT);
--
2.47.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH net-next v12 04/10] net: page_pool: create hooks for custom memory providers
2025-01-17 16:11 [PATCH net-next v12 00/10] io_uring zero copy rx Pavel Begunkov
` (2 preceding siblings ...)
2025-01-17 16:11 ` [PATCH net-next v12 03/10] net: generalise net_iov chunk owners Pavel Begunkov
@ 2025-01-17 16:11 ` Pavel Begunkov
2025-01-17 16:11 ` [PATCH net-next v12 05/10] netdev: add io_uring memory provider info Pavel Begunkov
` (6 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Pavel Begunkov @ 2025-01-17 16:11 UTC (permalink / raw)
To: io-uring, netdev
Cc: asml.silence, Jens Axboe, Jakub Kicinski, Paolo Abeni,
David S . Miller, Eric Dumazet, Jesper Dangaard Brouer,
David Ahern, Mina Almasry, Stanislav Fomichev, Joe Damato,
Pedro Tammela, David Wei
A spin off from the original page pool memory providers patch by Jakub,
which allows extending page pools with custom allocators. One of such
providers is devmem TCP, and the other is io_uring zerocopy added in
following patches.
Link: https://lore.kernel.org/netdev/[email protected]/
Co-developed-by: Jakub Kicinski <[email protected]> # initial mp proposal
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Pavel Begunkov <[email protected]>
---
include/net/page_pool/memory_provider.h | 15 +++++++++++++++
include/net/page_pool/types.h | 4 ++++
net/core/devmem.c | 15 ++++++++++++++-
net/core/page_pool.c | 23 +++++++++++++++--------
4 files changed, 48 insertions(+), 9 deletions(-)
create mode 100644 include/net/page_pool/memory_provider.h
diff --git a/include/net/page_pool/memory_provider.h b/include/net/page_pool/memory_provider.h
new file mode 100644
index 000000000000..e49d0a52629d
--- /dev/null
+++ b/include/net/page_pool/memory_provider.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _NET_PAGE_POOL_MEMORY_PROVIDER_H
+#define _NET_PAGE_POOL_MEMORY_PROVIDER_H
+
+#include <net/netmem.h>
+#include <net/page_pool/types.h>
+
+struct memory_provider_ops {
+ netmem_ref (*alloc_netmems)(struct page_pool *pool, gfp_t gfp);
+ bool (*release_netmem)(struct page_pool *pool, netmem_ref netmem);
+ int (*init)(struct page_pool *pool);
+ void (*destroy)(struct page_pool *pool);
+};
+
+#endif
diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h
index ed4cd114180a..88f65c3e2ad9 100644
--- a/include/net/page_pool/types.h
+++ b/include/net/page_pool/types.h
@@ -152,8 +152,11 @@ struct page_pool_stats {
*/
#define PAGE_POOL_FRAG_GROUP_ALIGN (4 * sizeof(long))
+struct memory_provider_ops;
+
struct pp_memory_provider_params {
void *mp_priv;
+ const struct memory_provider_ops *mp_ops;
};
struct page_pool {
@@ -216,6 +219,7 @@ struct page_pool {
struct ptr_ring ring;
void *mp_priv;
+ const struct memory_provider_ops *mp_ops;
#ifdef CONFIG_PAGE_POOL_STATS
/* recycle stats are per-cpu to avoid locking */
diff --git a/net/core/devmem.c b/net/core/devmem.c
index 3d91fba2bd26..1a88ab6faf06 100644
--- a/net/core/devmem.c
+++ b/net/core/devmem.c
@@ -16,6 +16,7 @@
#include <net/netdev_queues.h>
#include <net/netdev_rx_queue.h>
#include <net/page_pool/helpers.h>
+#include <net/page_pool/memory_provider.h>
#include <trace/events/page_pool.h>
#include "devmem.h"
@@ -27,6 +28,8 @@
/* Protected by rtnl_lock() */
static DEFINE_XARRAY_FLAGS(net_devmem_dmabuf_bindings, XA_FLAGS_ALLOC1);
+static const struct memory_provider_ops dmabuf_devmem_ops;
+
static void net_devmem_dmabuf_free_chunk_owner(struct gen_pool *genpool,
struct gen_pool_chunk *chunk,
void *not_used)
@@ -118,6 +121,7 @@ void net_devmem_unbind_dmabuf(struct net_devmem_dmabuf_binding *binding)
WARN_ON(rxq->mp_params.mp_priv != binding);
rxq->mp_params.mp_priv = NULL;
+ rxq->mp_params.mp_ops = NULL;
rxq_idx = get_netdev_rx_queue_index(rxq);
@@ -153,7 +157,7 @@ int net_devmem_bind_dmabuf_to_queue(struct net_device *dev, u32 rxq_idx,
}
rxq = __netif_get_rx_queue(dev, rxq_idx);
- if (rxq->mp_params.mp_priv) {
+ if (rxq->mp_params.mp_ops) {
NL_SET_ERR_MSG(extack, "designated queue already memory provider bound");
return -EEXIST;
}
@@ -171,6 +175,7 @@ int net_devmem_bind_dmabuf_to_queue(struct net_device *dev, u32 rxq_idx,
return err;
rxq->mp_params.mp_priv = binding;
+ rxq->mp_params.mp_ops = &dmabuf_devmem_ops;
err = netdev_rx_queue_restart(dev, rxq_idx);
if (err)
@@ -180,6 +185,7 @@ int net_devmem_bind_dmabuf_to_queue(struct net_device *dev, u32 rxq_idx,
err_xa_erase:
rxq->mp_params.mp_priv = NULL;
+ rxq->mp_params.mp_ops = NULL;
xa_erase(&binding->bound_rxqs, xa_idx);
return err;
@@ -399,3 +405,10 @@ bool mp_dmabuf_devmem_release_page(struct page_pool *pool, netmem_ref netmem)
/* We don't want the page pool put_page()ing our net_iovs. */
return false;
}
+
+static const struct memory_provider_ops dmabuf_devmem_ops = {
+ .init = mp_dmabuf_devmem_init,
+ .destroy = mp_dmabuf_devmem_destroy,
+ .alloc_netmems = mp_dmabuf_devmem_alloc_netmems,
+ .release_netmem = mp_dmabuf_devmem_release_page,
+};
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index a3de752c5178..199564b03533 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -13,6 +13,7 @@
#include <net/netdev_rx_queue.h>
#include <net/page_pool/helpers.h>
+#include <net/page_pool/memory_provider.h>
#include <net/xdp.h>
#include <linux/dma-direction.h>
@@ -285,13 +286,19 @@ static int page_pool_init(struct page_pool *pool,
rxq = __netif_get_rx_queue(pool->slow.netdev,
pool->slow.queue_idx);
pool->mp_priv = rxq->mp_params.mp_priv;
+ pool->mp_ops = rxq->mp_params.mp_ops;
}
- if (pool->mp_priv) {
+ if (pool->mp_ops) {
if (!pool->dma_map || !pool->dma_sync)
return -EOPNOTSUPP;
- err = mp_dmabuf_devmem_init(pool);
+ if (WARN_ON(!is_kernel_rodata((unsigned long)pool->mp_ops))) {
+ err = -EFAULT;
+ goto free_ptr_ring;
+ }
+
+ err = pool->mp_ops->init(pool);
if (err) {
pr_warn("%s() mem-provider init failed %d\n", __func__,
err);
@@ -588,8 +595,8 @@ netmem_ref page_pool_alloc_netmems(struct page_pool *pool, gfp_t gfp)
return netmem;
/* Slow-path: cache empty, do real allocation */
- if (static_branch_unlikely(&page_pool_mem_providers) && pool->mp_priv)
- netmem = mp_dmabuf_devmem_alloc_netmems(pool, gfp);
+ if (static_branch_unlikely(&page_pool_mem_providers) && pool->mp_ops)
+ netmem = pool->mp_ops->alloc_netmems(pool, gfp);
else
netmem = __page_pool_alloc_pages_slow(pool, gfp);
return netmem;
@@ -680,8 +687,8 @@ void page_pool_return_page(struct page_pool *pool, netmem_ref netmem)
bool put;
put = true;
- if (static_branch_unlikely(&page_pool_mem_providers) && pool->mp_priv)
- put = mp_dmabuf_devmem_release_page(pool, netmem);
+ if (static_branch_unlikely(&page_pool_mem_providers) && pool->mp_ops)
+ put = pool->mp_ops->release_netmem(pool, netmem);
else
__page_pool_release_page_dma(pool, netmem);
@@ -1049,8 +1056,8 @@ static void __page_pool_destroy(struct page_pool *pool)
page_pool_unlist(pool);
page_pool_uninit(pool);
- if (pool->mp_priv) {
- mp_dmabuf_devmem_destroy(pool);
+ if (pool->mp_ops) {
+ pool->mp_ops->destroy(pool);
static_branch_dec(&page_pool_mem_providers);
}
--
2.47.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH net-next v12 05/10] netdev: add io_uring memory provider info
2025-01-17 16:11 [PATCH net-next v12 00/10] io_uring zero copy rx Pavel Begunkov
` (3 preceding siblings ...)
2025-01-17 16:11 ` [PATCH net-next v12 04/10] net: page_pool: create hooks for custom memory providers Pavel Begunkov
@ 2025-01-17 16:11 ` Pavel Begunkov
2025-01-17 16:11 ` [PATCH net-next v12 06/10] net: page_pool: add callback for mp info printing Pavel Begunkov
` (5 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Pavel Begunkov @ 2025-01-17 16:11 UTC (permalink / raw)
To: io-uring, netdev
Cc: asml.silence, Jens Axboe, Jakub Kicinski, Paolo Abeni,
David S . Miller, Eric Dumazet, Jesper Dangaard Brouer,
David Ahern, Mina Almasry, Stanislav Fomichev, Joe Damato,
Pedro Tammela, David Wei
From: David Wei <[email protected]>
Add a nested attribute for io_uring memory provider info. For now it is
empty and its presence indicates that a particular page pool or queue
has an io_uring memory provider attached.
$ ./cli.py --spec netlink/specs/netdev.yaml --dump page-pool-get
[{'id': 80,
'ifindex': 2,
'inflight': 64,
'inflight-mem': 262144,
'napi-id': 525},
{'id': 79,
'ifindex': 2,
'inflight': 320,
'inflight-mem': 1310720,
'io_uring': {},
'napi-id': 525},
...
$ ./cli.py --spec netlink/specs/netdev.yaml --dump queue-get
[{'id': 0, 'ifindex': 1, 'type': 'rx'},
{'id': 0, 'ifindex': 1, 'type': 'tx'},
{'id': 0, 'ifindex': 2, 'napi-id': 513, 'type': 'rx'},
{'id': 1, 'ifindex': 2, 'napi-id': 514, 'type': 'rx'},
...
{'id': 12, 'ifindex': 2, 'io_uring': {}, 'napi-id': 525, 'type': 'rx'},
...
Reviewed-by: Jakub Kicinski <[email protected]>
Signed-off-by: Pavel Begunkov <[email protected]>
Signed-off-by: David Wei <[email protected]>
---
Documentation/netlink/specs/netdev.yaml | 15 +++++++++++++++
include/uapi/linux/netdev.h | 8 ++++++++
tools/include/uapi/linux/netdev.h | 8 ++++++++
3 files changed, 31 insertions(+)
diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml
index cbb544bd6c84..288923e965ae 100644
--- a/Documentation/netlink/specs/netdev.yaml
+++ b/Documentation/netlink/specs/netdev.yaml
@@ -114,6 +114,9 @@ attribute-sets:
doc: Bitmask of enabled AF_XDP features.
type: u64
enum: xsk-flags
+ -
+ name: io-uring-provider-info
+ attributes: []
-
name: page-pool
attributes:
@@ -171,6 +174,11 @@ attribute-sets:
name: dmabuf
doc: ID of the dmabuf this page-pool is attached to.
type: u32
+ -
+ name: io-uring
+ doc: io-uring memory provider information.
+ type: nest
+ nested-attributes: io-uring-provider-info
-
name: page-pool-info
subset-of: page-pool
@@ -296,6 +304,11 @@ attribute-sets:
name: dmabuf
doc: ID of the dmabuf attached to this queue, if any.
type: u32
+ -
+ name: io-uring
+ doc: io_uring memory provider information.
+ type: nest
+ nested-attributes: io-uring-provider-info
-
name: qstats
@@ -572,6 +585,7 @@ operations:
- inflight-mem
- detach-time
- dmabuf
+ - io-uring
dump:
reply: *pp-reply
config-cond: page-pool
@@ -637,6 +651,7 @@ operations:
- napi-id
- ifindex
- dmabuf
+ - io-uring
dump:
request:
attributes:
diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h
index e4be227d3ad6..684090732068 100644
--- a/include/uapi/linux/netdev.h
+++ b/include/uapi/linux/netdev.h
@@ -86,6 +86,12 @@ enum {
NETDEV_A_DEV_MAX = (__NETDEV_A_DEV_MAX - 1)
};
+enum {
+
+ __NETDEV_A_IO_URING_PROVIDER_INFO_MAX,
+ NETDEV_A_IO_URING_PROVIDER_INFO_MAX = (__NETDEV_A_IO_URING_PROVIDER_INFO_MAX - 1)
+};
+
enum {
NETDEV_A_PAGE_POOL_ID = 1,
NETDEV_A_PAGE_POOL_IFINDEX,
@@ -94,6 +100,7 @@ enum {
NETDEV_A_PAGE_POOL_INFLIGHT_MEM,
NETDEV_A_PAGE_POOL_DETACH_TIME,
NETDEV_A_PAGE_POOL_DMABUF,
+ NETDEV_A_PAGE_POOL_IO_URING,
__NETDEV_A_PAGE_POOL_MAX,
NETDEV_A_PAGE_POOL_MAX = (__NETDEV_A_PAGE_POOL_MAX - 1)
@@ -136,6 +143,7 @@ enum {
NETDEV_A_QUEUE_TYPE,
NETDEV_A_QUEUE_NAPI_ID,
NETDEV_A_QUEUE_DMABUF,
+ NETDEV_A_QUEUE_IO_URING,
__NETDEV_A_QUEUE_MAX,
NETDEV_A_QUEUE_MAX = (__NETDEV_A_QUEUE_MAX - 1)
diff --git a/tools/include/uapi/linux/netdev.h b/tools/include/uapi/linux/netdev.h
index e4be227d3ad6..684090732068 100644
--- a/tools/include/uapi/linux/netdev.h
+++ b/tools/include/uapi/linux/netdev.h
@@ -86,6 +86,12 @@ enum {
NETDEV_A_DEV_MAX = (__NETDEV_A_DEV_MAX - 1)
};
+enum {
+
+ __NETDEV_A_IO_URING_PROVIDER_INFO_MAX,
+ NETDEV_A_IO_URING_PROVIDER_INFO_MAX = (__NETDEV_A_IO_URING_PROVIDER_INFO_MAX - 1)
+};
+
enum {
NETDEV_A_PAGE_POOL_ID = 1,
NETDEV_A_PAGE_POOL_IFINDEX,
@@ -94,6 +100,7 @@ enum {
NETDEV_A_PAGE_POOL_INFLIGHT_MEM,
NETDEV_A_PAGE_POOL_DETACH_TIME,
NETDEV_A_PAGE_POOL_DMABUF,
+ NETDEV_A_PAGE_POOL_IO_URING,
__NETDEV_A_PAGE_POOL_MAX,
NETDEV_A_PAGE_POOL_MAX = (__NETDEV_A_PAGE_POOL_MAX - 1)
@@ -136,6 +143,7 @@ enum {
NETDEV_A_QUEUE_TYPE,
NETDEV_A_QUEUE_NAPI_ID,
NETDEV_A_QUEUE_DMABUF,
+ NETDEV_A_QUEUE_IO_URING,
__NETDEV_A_QUEUE_MAX,
NETDEV_A_QUEUE_MAX = (__NETDEV_A_QUEUE_MAX - 1)
--
2.47.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH net-next v12 06/10] net: page_pool: add callback for mp info printing
2025-01-17 16:11 [PATCH net-next v12 00/10] io_uring zero copy rx Pavel Begunkov
` (4 preceding siblings ...)
2025-01-17 16:11 ` [PATCH net-next v12 05/10] netdev: add io_uring memory provider info Pavel Begunkov
@ 2025-01-17 16:11 ` Pavel Begunkov
2025-01-17 16:11 ` [PATCH net-next v12 07/10] net: page_pool: add a mp hook to unregister_netdevice* Pavel Begunkov
` (4 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Pavel Begunkov @ 2025-01-17 16:11 UTC (permalink / raw)
To: io-uring, netdev
Cc: asml.silence, Jens Axboe, Jakub Kicinski, Paolo Abeni,
David S . Miller, Eric Dumazet, Jesper Dangaard Brouer,
David Ahern, Mina Almasry, Stanislav Fomichev, Joe Damato,
Pedro Tammela, David Wei
Add a mandatory callback that prints information about the memory
provider to netlink.
Reviewed-by: Jakub Kicinski <[email protected]>
Signed-off-by: Pavel Begunkov <[email protected]>
---
include/net/page_pool/memory_provider.h | 5 +++++
net/core/devmem.c | 10 ++++++++++
net/core/netdev-genl.c | 11 ++++++-----
net/core/page_pool_user.c | 5 ++---
4 files changed, 23 insertions(+), 8 deletions(-)
diff --git a/include/net/page_pool/memory_provider.h b/include/net/page_pool/memory_provider.h
index e49d0a52629d..6d10a0959d00 100644
--- a/include/net/page_pool/memory_provider.h
+++ b/include/net/page_pool/memory_provider.h
@@ -5,11 +5,16 @@
#include <net/netmem.h>
#include <net/page_pool/types.h>
+struct netdev_rx_queue;
+struct sk_buff;
+
struct memory_provider_ops {
netmem_ref (*alloc_netmems)(struct page_pool *pool, gfp_t gfp);
bool (*release_netmem)(struct page_pool *pool, netmem_ref netmem);
int (*init)(struct page_pool *pool);
void (*destroy)(struct page_pool *pool);
+ int (*nl_fill)(void *mp_priv, struct sk_buff *rsp,
+ struct netdev_rx_queue *rxq);
};
#endif
diff --git a/net/core/devmem.c b/net/core/devmem.c
index 1a88ab6faf06..b33b978fa28f 100644
--- a/net/core/devmem.c
+++ b/net/core/devmem.c
@@ -406,9 +406,19 @@ bool mp_dmabuf_devmem_release_page(struct page_pool *pool, netmem_ref netmem)
return false;
}
+static int mp_dmabuf_devmem_nl_fill(void *mp_priv, struct sk_buff *rsp,
+ struct netdev_rx_queue *rxq)
+{
+ const struct net_devmem_dmabuf_binding *binding = mp_priv;
+ int type = rxq ? NETDEV_A_QUEUE_DMABUF : NETDEV_A_PAGE_POOL_DMABUF;
+
+ return nla_put_u32(rsp, type, binding->id);
+}
+
static const struct memory_provider_ops dmabuf_devmem_ops = {
.init = mp_dmabuf_devmem_init,
.destroy = mp_dmabuf_devmem_destroy,
.alloc_netmems = mp_dmabuf_devmem_alloc_netmems,
.release_netmem = mp_dmabuf_devmem_release_page,
+ .nl_fill = mp_dmabuf_devmem_nl_fill,
};
diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c
index 715f85c6b62e..5b459b4fef46 100644
--- a/net/core/netdev-genl.c
+++ b/net/core/netdev-genl.c
@@ -10,6 +10,7 @@
#include <net/sock.h>
#include <net/xdp.h>
#include <net/xdp_sock.h>
+#include <net/page_pool/memory_provider.h>
#include "dev.h"
#include "devmem.h"
@@ -368,7 +369,7 @@ static int
netdev_nl_queue_fill_one(struct sk_buff *rsp, struct net_device *netdev,
u32 q_idx, u32 q_type, const struct genl_info *info)
{
- struct net_devmem_dmabuf_binding *binding;
+ struct pp_memory_provider_params *params;
struct netdev_rx_queue *rxq;
struct netdev_queue *txq;
void *hdr;
@@ -385,15 +386,15 @@ netdev_nl_queue_fill_one(struct sk_buff *rsp, struct net_device *netdev,
switch (q_type) {
case NETDEV_QUEUE_TYPE_RX:
rxq = __netif_get_rx_queue(netdev, q_idx);
+
if (rxq->napi && nla_put_u32(rsp, NETDEV_A_QUEUE_NAPI_ID,
rxq->napi->napi_id))
goto nla_put_failure;
- binding = rxq->mp_params.mp_priv;
- if (binding &&
- nla_put_u32(rsp, NETDEV_A_QUEUE_DMABUF, binding->id))
+ params = &rxq->mp_params;
+ if (params->mp_ops &&
+ params->mp_ops->nl_fill(params->mp_priv, rsp, rxq))
goto nla_put_failure;
-
break;
case NETDEV_QUEUE_TYPE_TX:
txq = netdev_get_tx_queue(netdev, q_idx);
diff --git a/net/core/page_pool_user.c b/net/core/page_pool_user.c
index 8d31c71bea1a..bd017537fa80 100644
--- a/net/core/page_pool_user.c
+++ b/net/core/page_pool_user.c
@@ -7,9 +7,9 @@
#include <net/netdev_rx_queue.h>
#include <net/page_pool/helpers.h>
#include <net/page_pool/types.h>
+#include <net/page_pool/memory_provider.h>
#include <net/sock.h>
-#include "devmem.h"
#include "page_pool_priv.h"
#include "netdev-genl-gen.h"
@@ -214,7 +214,6 @@ static int
page_pool_nl_fill(struct sk_buff *rsp, const struct page_pool *pool,
const struct genl_info *info)
{
- struct net_devmem_dmabuf_binding *binding = pool->mp_priv;
size_t inflight, refsz;
void *hdr;
@@ -244,7 +243,7 @@ page_pool_nl_fill(struct sk_buff *rsp, const struct page_pool *pool,
pool->user.detach_time))
goto err_cancel;
- if (binding && nla_put_u32(rsp, NETDEV_A_PAGE_POOL_DMABUF, binding->id))
+ if (pool->mp_ops && pool->mp_ops->nl_fill(pool->mp_priv, rsp, NULL))
goto err_cancel;
genlmsg_end(rsp, hdr);
--
2.47.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH net-next v12 07/10] net: page_pool: add a mp hook to unregister_netdevice*
2025-01-17 16:11 [PATCH net-next v12 00/10] io_uring zero copy rx Pavel Begunkov
` (5 preceding siblings ...)
2025-01-17 16:11 ` [PATCH net-next v12 06/10] net: page_pool: add callback for mp info printing Pavel Begunkov
@ 2025-01-17 16:11 ` Pavel Begunkov
2025-01-17 16:11 ` [PATCH net-next v12 08/10] net: prepare for non devmem TCP memory providers Pavel Begunkov
` (3 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Pavel Begunkov @ 2025-01-17 16:11 UTC (permalink / raw)
To: io-uring, netdev
Cc: asml.silence, Jens Axboe, Jakub Kicinski, Paolo Abeni,
David S . Miller, Eric Dumazet, Jesper Dangaard Brouer,
David Ahern, Mina Almasry, Stanislav Fomichev, Joe Damato,
Pedro Tammela, David Wei
Devmem TCP needs a hook in unregister_netdevice_many_notify() to upkeep
the set tracking queues it's bound to, i.e. ->bound_rxqs. Instead of
devmem sticking directly out of the genetic path, add a mp function.
Reviewed-by: Jakub Kicinski <[email protected]>
Reviewed-by: Mina Almasry <[email protected]>
Signed-off-by: Pavel Begunkov <[email protected]>
---
include/net/page_pool/memory_provider.h | 1 +
net/core/dev.c | 16 ++++++++++-
net/core/devmem.c | 36 +++++++++++--------------
net/core/devmem.h | 5 ----
4 files changed, 32 insertions(+), 26 deletions(-)
diff --git a/include/net/page_pool/memory_provider.h b/include/net/page_pool/memory_provider.h
index 6d10a0959d00..36469a7e649f 100644
--- a/include/net/page_pool/memory_provider.h
+++ b/include/net/page_pool/memory_provider.h
@@ -15,6 +15,7 @@ struct memory_provider_ops {
void (*destroy)(struct page_pool *pool);
int (*nl_fill)(void *mp_priv, struct sk_buff *rsp,
struct netdev_rx_queue *rxq);
+ void (*uninstall)(void *mp_priv, struct netdev_rx_queue *rxq);
};
#endif
diff --git a/net/core/dev.c b/net/core/dev.c
index fe5f5855593d..e5a4ba3fc24f 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -158,6 +158,7 @@
#include <net/netdev_rx_queue.h>
#include <net/page_pool/types.h>
#include <net/page_pool/helpers.h>
+#include <net/page_pool/memory_provider.h>
#include <net/rps.h>
#include <linux/phy_link_topology.h>
@@ -11721,6 +11722,19 @@ void unregister_netdevice_queue(struct net_device *dev, struct list_head *head)
}
EXPORT_SYMBOL(unregister_netdevice_queue);
+static void dev_memory_provider_uninstall(struct net_device *dev)
+{
+ unsigned int i;
+
+ for (i = 0; i < dev->real_num_rx_queues; i++) {
+ struct netdev_rx_queue *rxq = &dev->_rx[i];
+ struct pp_memory_provider_params *p = &rxq->mp_params;
+
+ if (p->mp_ops && p->mp_ops->uninstall)
+ p->mp_ops->uninstall(rxq->mp_params.mp_priv, rxq);
+ }
+}
+
void unregister_netdevice_many_notify(struct list_head *head,
u32 portid, const struct nlmsghdr *nlh)
{
@@ -11777,7 +11791,7 @@ void unregister_netdevice_many_notify(struct list_head *head,
dev_tcx_uninstall(dev);
dev_xdp_uninstall(dev);
bpf_dev_bound_netdev_unregister(dev);
- dev_dmabuf_uninstall(dev);
+ dev_memory_provider_uninstall(dev);
netdev_offload_xstats_disable_all(dev);
diff --git a/net/core/devmem.c b/net/core/devmem.c
index b33b978fa28f..ebb77d2f30f4 100644
--- a/net/core/devmem.c
+++ b/net/core/devmem.c
@@ -320,26 +320,6 @@ net_devmem_bind_dmabuf(struct net_device *dev, unsigned int dmabuf_fd,
return ERR_PTR(err);
}
-void dev_dmabuf_uninstall(struct net_device *dev)
-{
- struct net_devmem_dmabuf_binding *binding;
- struct netdev_rx_queue *rxq;
- unsigned long xa_idx;
- unsigned int i;
-
- for (i = 0; i < dev->real_num_rx_queues; i++) {
- binding = dev->_rx[i].mp_params.mp_priv;
- if (!binding)
- continue;
-
- xa_for_each(&binding->bound_rxqs, xa_idx, rxq)
- if (rxq == &dev->_rx[i]) {
- xa_erase(&binding->bound_rxqs, xa_idx);
- break;
- }
- }
-}
-
/*** "Dmabuf devmem memory provider" ***/
int mp_dmabuf_devmem_init(struct page_pool *pool)
@@ -415,10 +395,26 @@ static int mp_dmabuf_devmem_nl_fill(void *mp_priv, struct sk_buff *rsp,
return nla_put_u32(rsp, type, binding->id);
}
+static void mp_dmabuf_devmem_uninstall(void *mp_priv,
+ struct netdev_rx_queue *rxq)
+{
+ struct net_devmem_dmabuf_binding *binding = mp_priv;
+ struct netdev_rx_queue *bound_rxq;
+ unsigned long xa_idx;
+
+ xa_for_each(&binding->bound_rxqs, xa_idx, bound_rxq) {
+ if (bound_rxq == rxq) {
+ xa_erase(&binding->bound_rxqs, xa_idx);
+ break;
+ }
+ }
+}
+
static const struct memory_provider_ops dmabuf_devmem_ops = {
.init = mp_dmabuf_devmem_init,
.destroy = mp_dmabuf_devmem_destroy,
.alloc_netmems = mp_dmabuf_devmem_alloc_netmems,
.release_netmem = mp_dmabuf_devmem_release_page,
.nl_fill = mp_dmabuf_devmem_nl_fill,
+ .uninstall = mp_dmabuf_devmem_uninstall,
};
diff --git a/net/core/devmem.h b/net/core/devmem.h
index a2b9913e9a17..8e999fe2ae67 100644
--- a/net/core/devmem.h
+++ b/net/core/devmem.h
@@ -68,7 +68,6 @@ void net_devmem_unbind_dmabuf(struct net_devmem_dmabuf_binding *binding);
int net_devmem_bind_dmabuf_to_queue(struct net_device *dev, u32 rxq_idx,
struct net_devmem_dmabuf_binding *binding,
struct netlink_ext_ack *extack);
-void dev_dmabuf_uninstall(struct net_device *dev);
static inline struct dmabuf_genpool_chunk_owner *
net_devmem_iov_to_chunk_owner(const struct net_iov *niov)
@@ -145,10 +144,6 @@ net_devmem_bind_dmabuf_to_queue(struct net_device *dev, u32 rxq_idx,
return -EOPNOTSUPP;
}
-static inline void dev_dmabuf_uninstall(struct net_device *dev)
-{
-}
-
static inline struct net_iov *
net_devmem_alloc_dmabuf(struct net_devmem_dmabuf_binding *binding)
{
--
2.47.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH net-next v12 08/10] net: prepare for non devmem TCP memory providers
2025-01-17 16:11 [PATCH net-next v12 00/10] io_uring zero copy rx Pavel Begunkov
` (6 preceding siblings ...)
2025-01-17 16:11 ` [PATCH net-next v12 07/10] net: page_pool: add a mp hook to unregister_netdevice* Pavel Begunkov
@ 2025-01-17 16:11 ` Pavel Begunkov
2025-01-17 16:11 ` [PATCH net-next v12 09/10] net: page_pool: add memory provider helpers Pavel Begunkov
` (2 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Pavel Begunkov @ 2025-01-17 16:11 UTC (permalink / raw)
To: io-uring, netdev
Cc: asml.silence, Jens Axboe, Jakub Kicinski, Paolo Abeni,
David S . Miller, Eric Dumazet, Jesper Dangaard Brouer,
David Ahern, Mina Almasry, Stanislav Fomichev, Joe Damato,
Pedro Tammela, David Wei
There is a good bunch of places in generic paths assuming that the only
page pool memory provider is devmem TCP. As we want to reuse the net_iov
and provider infrastructure, we need to patch it up and explicitly check
the provider type when we branch into devmem TCP code.
Reviewed-by: Mina Almasry <[email protected]>
Reviewed-by: Jakub Kicinski <[email protected]>
Signed-off-by: Pavel Begunkov <[email protected]>
---
net/core/devmem.c | 5 +++++
net/core/devmem.h | 7 +++++++
net/ipv4/tcp.c | 5 +++++
3 files changed, 17 insertions(+)
diff --git a/net/core/devmem.c b/net/core/devmem.c
index ebb77d2f30f4..2f46f271b80e 100644
--- a/net/core/devmem.c
+++ b/net/core/devmem.c
@@ -30,6 +30,11 @@ static DEFINE_XARRAY_FLAGS(net_devmem_dmabuf_bindings, XA_FLAGS_ALLOC1);
static const struct memory_provider_ops dmabuf_devmem_ops;
+bool net_is_devmem_iov(struct net_iov *niov)
+{
+ return niov->pp->mp_ops == &dmabuf_devmem_ops;
+}
+
static void net_devmem_dmabuf_free_chunk_owner(struct gen_pool *genpool,
struct gen_pool_chunk *chunk,
void *not_used)
diff --git a/net/core/devmem.h b/net/core/devmem.h
index 8e999fe2ae67..7fc158d52729 100644
--- a/net/core/devmem.h
+++ b/net/core/devmem.h
@@ -115,6 +115,8 @@ struct net_iov *
net_devmem_alloc_dmabuf(struct net_devmem_dmabuf_binding *binding);
void net_devmem_free_dmabuf(struct net_iov *ppiov);
+bool net_is_devmem_iov(struct net_iov *niov);
+
#else
struct net_devmem_dmabuf_binding;
@@ -163,6 +165,11 @@ static inline u32 net_devmem_iov_binding_id(const struct net_iov *niov)
{
return 0;
}
+
+static inline bool net_is_devmem_iov(struct net_iov *niov)
+{
+ return false;
+}
#endif
#endif /* _NET_DEVMEM_H */
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index b872de9a8271..7f43d31c9400 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2476,6 +2476,11 @@ static int tcp_recvmsg_dmabuf(struct sock *sk, const struct sk_buff *skb,
}
niov = skb_frag_net_iov(frag);
+ if (!net_is_devmem_iov(niov)) {
+ err = -ENODEV;
+ goto out;
+ }
+
end = start + skb_frag_size(frag);
copy = end - offset;
--
2.47.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH net-next v12 09/10] net: page_pool: add memory provider helpers
2025-01-17 16:11 [PATCH net-next v12 00/10] io_uring zero copy rx Pavel Begunkov
` (7 preceding siblings ...)
2025-01-17 16:11 ` [PATCH net-next v12 08/10] net: prepare for non devmem TCP memory providers Pavel Begunkov
@ 2025-01-17 16:11 ` Pavel Begunkov
2025-01-17 16:11 ` [PATCH net-next v12 10/10] net: add helpers for setting a memory provider on an rx queue Pavel Begunkov
2025-01-17 22:16 ` [PATCH net-next v12 00/10] io_uring zero copy rx Jakub Kicinski
10 siblings, 0 replies; 12+ messages in thread
From: Pavel Begunkov @ 2025-01-17 16:11 UTC (permalink / raw)
To: io-uring, netdev
Cc: asml.silence, Jens Axboe, Jakub Kicinski, Paolo Abeni,
David S . Miller, Eric Dumazet, Jesper Dangaard Brouer,
David Ahern, Mina Almasry, Stanislav Fomichev, Joe Damato,
Pedro Tammela, David Wei
Add helpers for memory providers to interact with page pools.
net_mp_niov_{set,clear}_page_pool() serve to [dis]associate a net_iov
with a page pool. If used, the memory provider is responsible to match
"set" calls with "clear" once a net_iov is not going to be used by a page
pool anymore, changing a page pool, etc.
Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: Pavel Begunkov <[email protected]>
---
include/net/page_pool/memory_provider.h | 19 +++++++++++++++++
net/core/page_pool.c | 28 +++++++++++++++++++++++++
2 files changed, 47 insertions(+)
diff --git a/include/net/page_pool/memory_provider.h b/include/net/page_pool/memory_provider.h
index 36469a7e649f..4f0ffb8f6a0a 100644
--- a/include/net/page_pool/memory_provider.h
+++ b/include/net/page_pool/memory_provider.h
@@ -18,4 +18,23 @@ struct memory_provider_ops {
void (*uninstall)(void *mp_priv, struct netdev_rx_queue *rxq);
};
+bool net_mp_niov_set_dma_addr(struct net_iov *niov, dma_addr_t addr);
+void net_mp_niov_set_page_pool(struct page_pool *pool, struct net_iov *niov);
+void net_mp_niov_clear_page_pool(struct net_iov *niov);
+
+/**
+ * net_mp_netmem_place_in_cache() - give a netmem to a page pool
+ * @pool: the page pool to place the netmem into
+ * @netmem: netmem to give
+ *
+ * Push an accounted netmem into the page pool's allocation cache. The caller
+ * must ensure that there is space in the cache. It should only be called off
+ * the mp_ops->alloc_netmems() path.
+ */
+static inline void net_mp_netmem_place_in_cache(struct page_pool *pool,
+ netmem_ref netmem)
+{
+ pool->alloc.cache[pool->alloc.count++] = netmem;
+}
+
#endif
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 199564b03533..c003b9263bd3 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -1196,3 +1196,31 @@ void page_pool_update_nid(struct page_pool *pool, int new_nid)
}
}
EXPORT_SYMBOL(page_pool_update_nid);
+
+bool net_mp_niov_set_dma_addr(struct net_iov *niov, dma_addr_t addr)
+{
+ return page_pool_set_dma_addr_netmem(net_iov_to_netmem(niov), addr);
+}
+
+/* Associate a niov with a page pool. Should follow with a matching
+ * net_mp_niov_clear_page_pool()
+ */
+void net_mp_niov_set_page_pool(struct page_pool *pool, struct net_iov *niov)
+{
+ netmem_ref netmem = net_iov_to_netmem(niov);
+
+ page_pool_set_pp_info(pool, netmem);
+
+ pool->pages_state_hold_cnt++;
+ trace_page_pool_state_hold(pool, netmem, pool->pages_state_hold_cnt);
+}
+
+/* Disassociate a niov from a page pool. Should only be used in the
+ * ->release_netmem() path.
+ */
+void net_mp_niov_clear_page_pool(struct net_iov *niov)
+{
+ netmem_ref netmem = net_iov_to_netmem(niov);
+
+ page_pool_clear_pp_info(netmem);
+}
--
2.47.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH net-next v12 10/10] net: add helpers for setting a memory provider on an rx queue
2025-01-17 16:11 [PATCH net-next v12 00/10] io_uring zero copy rx Pavel Begunkov
` (8 preceding siblings ...)
2025-01-17 16:11 ` [PATCH net-next v12 09/10] net: page_pool: add memory provider helpers Pavel Begunkov
@ 2025-01-17 16:11 ` Pavel Begunkov
2025-01-17 22:16 ` [PATCH net-next v12 00/10] io_uring zero copy rx Jakub Kicinski
10 siblings, 0 replies; 12+ messages in thread
From: Pavel Begunkov @ 2025-01-17 16:11 UTC (permalink / raw)
To: io-uring, netdev
Cc: asml.silence, Jens Axboe, Jakub Kicinski, Paolo Abeni,
David S . Miller, Eric Dumazet, Jesper Dangaard Brouer,
David Ahern, Mina Almasry, Stanislav Fomichev, Joe Damato,
Pedro Tammela, David Wei
From: David Wei <[email protected]>
Add helpers that properly prep or remove a memory provider for an rx
queue then restart the queue.
Reviewed-by: Jakub Kicinski <[email protected]>
Signed-off-by: Pavel Begunkov <[email protected]>
Signed-off-by: David Wei <[email protected]>
---
include/net/page_pool/memory_provider.h | 5 ++
net/core/netdev_rx_queue.c | 62 +++++++++++++++++++++++++
2 files changed, 67 insertions(+)
diff --git a/include/net/page_pool/memory_provider.h b/include/net/page_pool/memory_provider.h
index 4f0ffb8f6a0a..b3e665897767 100644
--- a/include/net/page_pool/memory_provider.h
+++ b/include/net/page_pool/memory_provider.h
@@ -22,6 +22,11 @@ bool net_mp_niov_set_dma_addr(struct net_iov *niov, dma_addr_t addr);
void net_mp_niov_set_page_pool(struct page_pool *pool, struct net_iov *niov);
void net_mp_niov_clear_page_pool(struct net_iov *niov);
+int net_mp_open_rxq(struct net_device *dev, unsigned ifq_idx,
+ struct pp_memory_provider_params *p);
+void net_mp_close_rxq(struct net_device *dev, unsigned ifq_idx,
+ struct pp_memory_provider_params *old_p);
+
/**
* net_mp_netmem_place_in_cache() - give a netmem to a page pool
* @pool: the page pool to place the netmem into
diff --git a/net/core/netdev_rx_queue.c b/net/core/netdev_rx_queue.c
index db82786fa0c4..a13deedf6fc1 100644
--- a/net/core/netdev_rx_queue.c
+++ b/net/core/netdev_rx_queue.c
@@ -3,6 +3,7 @@
#include <linux/netdevice.h>
#include <net/netdev_queues.h>
#include <net/netdev_rx_queue.h>
+#include <net/page_pool/memory_provider.h>
#include "page_pool_priv.h"
@@ -80,3 +81,64 @@ int netdev_rx_queue_restart(struct net_device *dev, unsigned int rxq_idx)
return err;
}
EXPORT_SYMBOL_NS_GPL(netdev_rx_queue_restart, "NETDEV_INTERNAL");
+
+static int __net_mp_open_rxq(struct net_device *dev, unsigned ifq_idx,
+ struct pp_memory_provider_params *p)
+{
+ struct netdev_rx_queue *rxq;
+ int ret;
+
+ if (ifq_idx >= dev->real_num_rx_queues)
+ return -EINVAL;
+ ifq_idx = array_index_nospec(ifq_idx, dev->real_num_rx_queues);
+
+ rxq = __netif_get_rx_queue(dev, ifq_idx);
+ if (rxq->mp_params.mp_ops)
+ return -EEXIST;
+
+ rxq->mp_params = *p;
+ ret = netdev_rx_queue_restart(dev, ifq_idx);
+ if (ret) {
+ rxq->mp_params.mp_ops = NULL;
+ rxq->mp_params.mp_priv = NULL;
+ }
+ return ret;
+}
+
+int net_mp_open_rxq(struct net_device *dev, unsigned ifq_idx,
+ struct pp_memory_provider_params *p)
+{
+ int ret;
+
+ rtnl_lock();
+ ret = __net_mp_open_rxq(dev, ifq_idx, p);
+ rtnl_unlock();
+ return ret;
+}
+
+static void __net_mp_close_rxq(struct net_device *dev, unsigned ifq_idx,
+ struct pp_memory_provider_params *old_p)
+{
+ struct netdev_rx_queue *rxq;
+
+ if (WARN_ON_ONCE(ifq_idx >= dev->real_num_rx_queues))
+ return;
+
+ rxq = __netif_get_rx_queue(dev, ifq_idx);
+
+ if (WARN_ON_ONCE(rxq->mp_params.mp_ops != old_p->mp_ops ||
+ rxq->mp_params.mp_priv != old_p->mp_priv))
+ return;
+
+ rxq->mp_params.mp_ops = NULL;
+ rxq->mp_params.mp_priv = NULL;
+ WARN_ON(netdev_rx_queue_restart(dev, ifq_idx));
+}
+
+void net_mp_close_rxq(struct net_device *dev, unsigned ifq_idx,
+ struct pp_memory_provider_params *old_p)
+{
+ rtnl_lock();
+ __net_mp_close_rxq(dev, ifq_idx, old_p);
+ rtnl_unlock();
+}
--
2.47.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH net-next v12 00/10] io_uring zero copy rx
2025-01-17 16:11 [PATCH net-next v12 00/10] io_uring zero copy rx Pavel Begunkov
` (9 preceding siblings ...)
2025-01-17 16:11 ` [PATCH net-next v12 10/10] net: add helpers for setting a memory provider on an rx queue Pavel Begunkov
@ 2025-01-17 22:16 ` Jakub Kicinski
10 siblings, 0 replies; 12+ messages in thread
From: Jakub Kicinski @ 2025-01-17 22:16 UTC (permalink / raw)
To: Pavel Begunkov
Cc: io-uring, netdev, Jens Axboe, Paolo Abeni, David S . Miller,
Eric Dumazet, Jesper Dangaard Brouer, David Ahern, Mina Almasry,
Stanislav Fomichev, Joe Damato, Pedro Tammela, David Wei
On Fri, 17 Jan 2025 16:11:38 +0000 Pavel Begunkov wrote:
> This patchset contains net/ patches needed by a new io_uring request
> implementing zero copy rx into userspace pages, eliminating a kernel
> to user copy.
>
> We configure a page pool that a driver uses to fill a hw rx queue to
> hand out user pages instead of kernel pages. Any data that ends up
> hitting this hw rx queue will thus be dma'd into userspace memory
> directly, without needing to be bounced through kernel memory. 'Reading'
> data out of a socket instead becomes a _notification_ mechanism, where
> the kernel tells userspace where the data is. The overall approach is
> similar to the devmem TCP proposal.
The YNL codegen is not clean on this series, so CI didn't run
the selftests. Plus we need to resolve the issue of calling
the ops on a dead netdev.
diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h
index 684090732068..6c6ee183802d 100644
--- a/include/uapi/linux/netdev.h
+++ b/include/uapi/linux/netdev.h
@@ -87,7 +87,6 @@ enum {
};
enum {
-
__NETDEV_A_IO_URING_PROVIDER_INFO_MAX,
NETDEV_A_IO_URING_PROVIDER_INFO_MAX = (__NETDEV_A_IO_URING_PROVIDER_INFO_MAX - 1)
};
diff --git a/tools/include/uapi/linux/netdev.h b/tools/include/uapi/linux/netdev.h
index 684090732068..6c6ee183802d 100644
--- a/tools/include/uapi/linux/netdev.h
+++ b/tools/include/uapi/linux/netdev.h
@@ -87,7 +87,6 @@ enum {
};
enum {
-
__NETDEV_A_IO_URING_PROVIDER_INFO_MAX,
NETDEV_A_IO_URING_PROVIDER_INFO_MAX = (__NETDEV_A_IO_URING_PROVIDER_INFO_MAX - 1)
};
--
pw-bot: cr
^ permalink raw reply related [flat|nested] 12+ messages in thread