* [PATCH net] net: Allow non parent devices to be used for ZC DMA @ 2025-07-09 12:40 Dragos Tatulea 2025-07-09 13:50 ` Pavel Begunkov ` (2 more replies) 0 siblings, 3 replies; 7+ messages in thread From: Dragos Tatulea @ 2025-07-09 12:40 UTC (permalink / raw) To: almasrymina, asml.silence, Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, Jens Axboe, Simona Vetter, Willem de Bruijn, Kaiyuan Zhang Cc: Dragos Tatulea, cratiu, parav, Tariq Toukan, netdev, linux-kernel, io-uring For zerocopy (io_uring, devmem), there is an assumption that the parent device can do DMA. However that is not always the case: ScalableFunction devices have the DMA device in the grandparent. This patch adds a helper for getting the DMA device for a netdev from its parent or grandparent if necessary. The NULL case is handled in the callers. devmem and io_uring are updated accordingly to use this helper instead of directly using the parent. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Fixes: 170aafe35cb9 ("netdev: support binding dma-buf to netdevice") Reviewed-by: Tariq Toukan <tariqt@nvidia.com> --- Changes in v1: - Upgraded from RFC status. - Dropped driver specific bits for generic solution. - Implemented single patch as a fix as requested in RFC. - Handling of multi-PF netdevs will be handled in a subsequent patch series. RFC: https://lore.kernel.org/all/20250702172433.1738947-2-dtatulea@nvidia.com/ --- include/linux/netdevice.h | 14 ++++++++++++++ io_uring/zcrx.c | 2 +- net/core/devmem.c | 10 +++++++++- 3 files changed, 24 insertions(+), 2 deletions(-) diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 5847c20994d3..1cbde7193c4d 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -5560,4 +5560,18 @@ extern struct net_device *blackhole_netdev; atomic_long_add((VAL), &(DEV)->stats.__##FIELD) #define DEV_STATS_READ(DEV, FIELD) atomic_long_read(&(DEV)->stats.__##FIELD) +static inline struct device *netdev_get_dma_dev(const struct net_device *dev) +{ + struct device *dma_dev = dev->dev.parent; + + if (!dma_dev) + return NULL; + + /* Some devices (e.g. SFs) have the dma device as a grandparent. */ + if (!dma_dev->dma_mask) + dma_dev = dma_dev->parent; + + return (dma_dev && dma_dev->dma_mask) ? dma_dev : NULL; +} + #endif /* _LINUX_NETDEVICE_H */ diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c index 797247a34cb7..93462e5b2207 100644 --- a/io_uring/zcrx.c +++ b/io_uring/zcrx.c @@ -584,7 +584,7 @@ int io_register_zcrx_ifq(struct io_ring_ctx *ctx, goto err; } - ifq->dev = ifq->netdev->dev.parent; + ifq->dev = netdev_get_dma_dev(ifq->netdev); if (!ifq->dev) { ret = -EOPNOTSUPP; goto err; diff --git a/net/core/devmem.c b/net/core/devmem.c index b3a62ca0df65..881044e0ae0e 100644 --- a/net/core/devmem.c +++ b/net/core/devmem.c @@ -183,6 +183,7 @@ net_devmem_bind_dmabuf(struct net_device *dev, { struct net_devmem_dmabuf_binding *binding; static u32 id_alloc_next; + struct device *dma_dev; struct scatterlist *sg; struct dma_buf *dmabuf; unsigned int sg_idx, i; @@ -193,6 +194,13 @@ net_devmem_bind_dmabuf(struct net_device *dev, if (IS_ERR(dmabuf)) return ERR_CAST(dmabuf); + dma_dev = netdev_get_dma_dev(dev); + if (!dma_dev) { + err = -EOPNOTSUPP; + NL_SET_ERR_MSG(extack, "Device doesn't support dma"); + goto err_put_dmabuf; + } + binding = kzalloc_node(sizeof(*binding), GFP_KERNEL, dev_to_node(&dev->dev)); if (!binding) { @@ -209,7 +217,7 @@ net_devmem_bind_dmabuf(struct net_device *dev, binding->dmabuf = dmabuf; - binding->attachment = dma_buf_attach(binding->dmabuf, dev->dev.parent); + binding->attachment = dma_buf_attach(binding->dmabuf, dma_dev); if (IS_ERR(binding->attachment)) { err = PTR_ERR(binding->attachment); NL_SET_ERR_MSG(extack, "Failed to bind dmabuf to device"); -- 2.50.0 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH net] net: Allow non parent devices to be used for ZC DMA 2025-07-09 12:40 [PATCH net] net: Allow non parent devices to be used for ZC DMA Dragos Tatulea @ 2025-07-09 13:50 ` Pavel Begunkov 2025-07-09 19:29 ` Mina Almasry 2025-07-10 23:49 ` Jakub Kicinski 2 siblings, 0 replies; 7+ messages in thread From: Pavel Begunkov @ 2025-07-09 13:50 UTC (permalink / raw) To: Dragos Tatulea, almasrymina, Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, Jens Axboe, Simona Vetter, Willem de Bruijn, Kaiyuan Zhang Cc: cratiu, parav, Tariq Toukan, netdev, linux-kernel, io-uring On 7/9/25 13:40, Dragos Tatulea wrote: > For zerocopy (io_uring, devmem), there is an assumption that the > parent device can do DMA. However that is not always the case: > ScalableFunction devices have the DMA device in the grandparent. > > This patch adds a helper for getting the DMA device for a netdev from > its parent or grandparent if necessary. The NULL case is handled in the > callers. > > devmem and io_uring are updated accordingly to use this helper instead > of directly using the parent. > > Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> > Fixes: 170aafe35cb9 ("netdev: support binding dma-buf to netdevice") > Reviewed-by: Tariq Toukan <tariqt@nvidia.com> > --- > Changes in v1: > - Upgraded from RFC status. > - Dropped driver specific bits for generic solution. > - Implemented single patch as a fix as requested in RFC. > - Handling of multi-PF netdevs will be handled in a subsequent patch > series. > > RFC: https://lore.kernel.org/all/20250702172433.1738947-2-dtatulea@nvidia.com/ I can't say anything about the walking to grand parent part, but the rest looks good. Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> -- Pavel Begunkov ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net] net: Allow non parent devices to be used for ZC DMA 2025-07-09 12:40 [PATCH net] net: Allow non parent devices to be used for ZC DMA Dragos Tatulea 2025-07-09 13:50 ` Pavel Begunkov @ 2025-07-09 19:29 ` Mina Almasry 2025-07-09 19:53 ` Dragos Tatulea 2025-07-10 23:49 ` Jakub Kicinski 2 siblings, 1 reply; 7+ messages in thread From: Mina Almasry @ 2025-07-09 19:29 UTC (permalink / raw) To: Dragos Tatulea Cc: asml.silence, Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, Jens Axboe, Simona Vetter, Willem de Bruijn, Kaiyuan Zhang, cratiu, parav, Tariq Toukan, netdev, linux-kernel, io-uring On Wed, Jul 9, 2025 at 5:46 AM Dragos Tatulea <dtatulea@nvidia.com> wrote: > > For zerocopy (io_uring, devmem), there is an assumption that the > parent device can do DMA. However that is not always the case: > ScalableFunction devices have the DMA device in the grandparent. > > This patch adds a helper for getting the DMA device for a netdev from > its parent or grandparent if necessary. The NULL case is handled in the > callers. > > devmem and io_uring are updated accordingly to use this helper instead > of directly using the parent. > > Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> > Fixes: 170aafe35cb9 ("netdev: support binding dma-buf to netdevice") nit: This doesn't seem like a fix? The current code supports all devices that are not SF well enough, right? And in the case of SF devices, I expect net_devmem_bind_dmabuf() to fail gracefully as the dma mapping of a device that doesn't support it, I think, would fail gracefully. So to me this seems like an improvement rather than a bug fix. > Reviewed-by: Tariq Toukan <tariqt@nvidia.com> > --- > Changes in v1: > - Upgraded from RFC status. > - Dropped driver specific bits for generic solution. > - Implemented single patch as a fix as requested in RFC. > - Handling of multi-PF netdevs will be handled in a subsequent patch > series. > > RFC: https://lore.kernel.org/all/20250702172433.1738947-2-dtatulea@nvidia.com/ > --- > include/linux/netdevice.h | 14 ++++++++++++++ > io_uring/zcrx.c | 2 +- > net/core/devmem.c | 10 +++++++++- > 3 files changed, 24 insertions(+), 2 deletions(-) > > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h > index 5847c20994d3..1cbde7193c4d 100644 > --- a/include/linux/netdevice.h > +++ b/include/linux/netdevice.h > @@ -5560,4 +5560,18 @@ extern struct net_device *blackhole_netdev; > atomic_long_add((VAL), &(DEV)->stats.__##FIELD) > #define DEV_STATS_READ(DEV, FIELD) atomic_long_read(&(DEV)->stats.__##FIELD) > > +static inline struct device *netdev_get_dma_dev(const struct net_device *dev) > +{ > + struct device *dma_dev = dev->dev.parent; > + > + if (!dma_dev) > + return NULL; > + > + /* Some devices (e.g. SFs) have the dma device as a grandparent. */ > + if (!dma_dev->dma_mask) I was able to confirm that !dev->dma_mask means "this device doesn't support dma". Multiple existing places in the code seem to use this check. > + dma_dev = dma_dev->parent; > + > + return (dma_dev && dma_dev->dma_mask) ? dma_dev : NULL; This may be a noob question, but are we sure that !dma_dev->dma_mask && dma_dev->parent->dma_mask != NULL means that the parent is the dma-device that we should use? I understand SF devices work that way but it's not immediately obvious to me that this is generically true. For example pavel came up with the case where for veth, netdev->dev.parent == NULL , I wonder if there are weird devices in the wild where netdev->dev.parent->dma_mask == NULL but that doesn't necessarily mean that the grandparent is the dma-device that we should use. I guess to keep my long question short: what makes you think this is generically safe to do? Or is it not, but we think most devices behave this way and we're going to handle more edge cases in follow up patches? > +} > + > #endif /* _LINUX_NETDEVICE_H */ > diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c > index 797247a34cb7..93462e5b2207 100644 > --- a/io_uring/zcrx.c > +++ b/io_uring/zcrx.c > @@ -584,7 +584,7 @@ int io_register_zcrx_ifq(struct io_ring_ctx *ctx, > goto err; > } > > - ifq->dev = ifq->netdev->dev.parent; > + ifq->dev = netdev_get_dma_dev(ifq->netdev); nit: this hunk will not apply when backporting this to trees that only have the Fixes commit... which makes it more weird that this is considered a fix for that, but I'm fine either way. -- Thanks, Mina ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net] net: Allow non parent devices to be used for ZC DMA 2025-07-09 19:29 ` Mina Almasry @ 2025-07-09 19:53 ` Dragos Tatulea 2025-07-09 22:56 ` Mina Almasry 0 siblings, 1 reply; 7+ messages in thread From: Dragos Tatulea @ 2025-07-09 19:53 UTC (permalink / raw) To: Mina Almasry Cc: asml.silence, Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, Jens Axboe, Simona Vetter, Willem de Bruijn, Kaiyuan Zhang, cratiu, parav, Tariq Toukan, netdev, linux-kernel, io-uring On Wed, Jul 09, 2025 at 12:29:22PM -0700, Mina Almasry wrote: > On Wed, Jul 9, 2025 at 5:46 AM Dragos Tatulea <dtatulea@nvidia.com> wrote: > > > > For zerocopy (io_uring, devmem), there is an assumption that the > > parent device can do DMA. However that is not always the case: > > ScalableFunction devices have the DMA device in the grandparent. > > > > This patch adds a helper for getting the DMA device for a netdev from > > its parent or grandparent if necessary. The NULL case is handled in the > > callers. > > > > devmem and io_uring are updated accordingly to use this helper instead > > of directly using the parent. > > > > Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> > > Fixes: 170aafe35cb9 ("netdev: support binding dma-buf to netdevice") > > nit: This doesn't seem like a fix? The current code supports all > devices that are not SF well enough, right? And in the case of SF > devices, I expect net_devmem_bind_dmabuf() to fail gracefully as the > dma mapping of a device that doesn't support it, I think, would fail > gracefully. So to me this seems like an improvement rather than a bug > fix. > dma_buf_map_attachment_unlocked() will return a sg_table with 0 nents. That is graceful. However this will result in page_pools that will always be returning errors further down the line which is very confusing regarding the motives that caused it. I am also fine to not make it a fix btw. Especially since the mlx5 devmem code was just accepted. > > Reviewed-by: Tariq Toukan <tariqt@nvidia.com> > > --- > > Changes in v1: > > - Upgraded from RFC status. > > - Dropped driver specific bits for generic solution. > > - Implemented single patch as a fix as requested in RFC. > > - Handling of multi-PF netdevs will be handled in a subsequent patch > > series. > > > > RFC: https://lore.kernel.org/all/20250702172433.1738947-2-dtatulea@nvidia.com/ > > --- > > include/linux/netdevice.h | 14 ++++++++++++++ > > io_uring/zcrx.c | 2 +- > > net/core/devmem.c | 10 +++++++++- > > 3 files changed, 24 insertions(+), 2 deletions(-) > > > > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h > > index 5847c20994d3..1cbde7193c4d 100644 > > --- a/include/linux/netdevice.h > > +++ b/include/linux/netdevice.h > > @@ -5560,4 +5560,18 @@ extern struct net_device *blackhole_netdev; > > atomic_long_add((VAL), &(DEV)->stats.__##FIELD) > > #define DEV_STATS_READ(DEV, FIELD) atomic_long_read(&(DEV)->stats.__##FIELD) > > > > +static inline struct device *netdev_get_dma_dev(const struct net_device *dev) > > +{ > > + struct device *dma_dev = dev->dev.parent; > > + > > + if (!dma_dev) > > + return NULL; > > + > > + /* Some devices (e.g. SFs) have the dma device as a grandparent. */ > > + if (!dma_dev->dma_mask) > > I was able to confirm that !dev->dma_mask means "this device doesn't > support dma". Multiple existing places in the code seem to use this > check. > Ack. That was my understanding as well. > > + dma_dev = dma_dev->parent; > > + > > + return (dma_dev && dma_dev->dma_mask) ? dma_dev : NULL; > > This may be a noob question, but are we sure that !dma_dev->dma_mask > && dma_dev->parent->dma_mask != NULL means that the parent is the > dma-device that we should use? I understand SF devices work that way > but it's not immediately obvious to me that this is generically true. > This is what I gathered from Parav's answer. > For example pavel came up with the case where for veth, > netdev->dev.parent == NULL , I wonder if there are weird devices in > the wild where netdev->dev.parent->dma_mask == NULL but that doesn't > necessarily mean that the grandparent is the dma-device that we should > use. > Yep. > I guess to keep my long question short: what makes you think this is > generically safe to do? Or is it not, but we think most devices behave > this way and we're going to handle more edge cases in follow up > patches? > It is just what we know so far about SFs. See end of mail. > > +} > > + > > #endif /* _LINUX_NETDEVICE_H */ > > diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c > > index 797247a34cb7..93462e5b2207 100644 > > --- a/io_uring/zcrx.c > > +++ b/io_uring/zcrx.c > > @@ -584,7 +584,7 @@ int io_register_zcrx_ifq(struct io_ring_ctx *ctx, > > goto err; > > } > > > > - ifq->dev = ifq->netdev->dev.parent; > > + ifq->dev = netdev_get_dma_dev(ifq->netdev); > > nit: this hunk will not apply when backporting this to trees that only > have the Fixes commit... which makes it more weird that this is > considered a fix for that, but I'm fine either way. > Ouch, indeed. Should have thought of that. Big picture view: Maybe after all it is more generic to have queue ops that can provide this info. For zcrx it is trivial (see below hunk). For devmem I was thinking of calling netdev_queue_get_dma_dev() for every bound queue (before the mapping) and return an error only when we find different . It will make netdev_nl_bind_rx_doit() a bit icky, but the idea is not complicated. What do you think? --- diff --git a/include/net/netdev_queues.h b/include/net/netdev_queues.h index 6e835972abd1..04c69f39558d 100644 --- a/include/net/netdev_queues.h +++ b/include/net/netdev_queues.h @@ -127,6 +127,9 @@ void netdev_stat_queue_sum(struct net_device *netdev, * @ndo_queue_stop: Stop the RX queue at the specified index. The stopped * queue's memory is written at the specified address. * + * @ndo_queue_get_dma_dev: When set, the driver can provide the DMA device to + * be used for the given queue. + * * Note that @ndo_queue_mem_alloc and @ndo_queue_mem_free may be called while * the interface is closed. @ndo_queue_start and @ndo_queue_stop will only * be called for an interface which is open. @@ -144,6 +147,8 @@ struct netdev_queue_mgmt_ops { int (*ndo_queue_stop)(struct net_device *dev, void *per_queue_mem, int idx); + struct device * (*ndo_queue_get_dma_dev)(const struct net_device *dev, + int idx); }; /** @@ -321,4 +326,15 @@ static inline void netif_subqueue_sent(const struct net_device *dev, get_desc, start_thrs); \ }) +static inline struct device *netdev_queue_get_dma_dev(const struct net_device *dev, + int idx) +{ + const struct netdev_queue_mgmt_ops *qops = dev->queue_mgmt_ops; + + if (qops && qops->ndo_queue_get_dma_dev) + return qops->ndo_queue_get_dma_dev(dev, idx); + + return netdev_get_dma_dev(dev); +} + #endif diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c index 93462e5b2207..478693a6d325 100644 --- a/io_uring/zcrx.c +++ b/io_uring/zcrx.c @@ -12,6 +12,7 @@ #include <net/page_pool/helpers.h> #include <net/page_pool/memory_provider.h> #include <net/netlink.h> +#include <net/netdev_queues.h> #include <net/netdev_rx_queue.h> #include <net/tcp.h> #include <net/rps.h> @@ -584,7 +585,7 @@ int io_register_zcrx_ifq(struct io_ring_ctx *ctx, goto err; } - ifq->dev = netdev_get_dma_dev(ifq->netdev); + ifq->dev = netdev_queue_get_dma_dev(ifq->netdev, reg.if_rxq); if (!ifq->dev) { ret = -EOPNOTSUPP; goto err; --- Thanks, Dragos ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH net] net: Allow non parent devices to be used for ZC DMA 2025-07-09 19:53 ` Dragos Tatulea @ 2025-07-09 22:56 ` Mina Almasry 2025-07-10 23:47 ` Jakub Kicinski 0 siblings, 1 reply; 7+ messages in thread From: Mina Almasry @ 2025-07-09 22:56 UTC (permalink / raw) To: Dragos Tatulea Cc: asml.silence, Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, Jens Axboe, Simona Vetter, Willem de Bruijn, Kaiyuan Zhang, cratiu, parav, Tariq Toukan, netdev, linux-kernel, io-uring On Wed, Jul 9, 2025 at 12:54 PM Dragos Tatulea <dtatulea@nvidia.com> wrote: > > On Wed, Jul 09, 2025 at 12:29:22PM -0700, Mina Almasry wrote: > > On Wed, Jul 9, 2025 at 5:46 AM Dragos Tatulea <dtatulea@nvidia.com> wrote: > > > > > > For zerocopy (io_uring, devmem), there is an assumption that the > > > parent device can do DMA. However that is not always the case: > > > ScalableFunction devices have the DMA device in the grandparent. > > > > > > This patch adds a helper for getting the DMA device for a netdev from > > > its parent or grandparent if necessary. The NULL case is handled in the > > > callers. > > > > > > devmem and io_uring are updated accordingly to use this helper instead > > > of directly using the parent. > > > > > > Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> > > > Fixes: 170aafe35cb9 ("netdev: support binding dma-buf to netdevice") > > > > nit: This doesn't seem like a fix? The current code supports all > > devices that are not SF well enough, right? And in the case of SF > > devices, I expect net_devmem_bind_dmabuf() to fail gracefully as the > > dma mapping of a device that doesn't support it, I think, would fail > > gracefully. So to me this seems like an improvement rather than a bug > > fix. > > > dma_buf_map_attachment_unlocked() will return a sg_table with 0 nents. > That is graceful. However this will result in page_pools that will > always be returning errors further down the line which is very confusing > regarding the motives that caused it. > > I am also fine to not make it a fix btw. Especially since the mlx5 > devmem code was just accepted. > If you submit another version I'd rather it be a non-fix, especially since applying the io_uring hunk will be challenging when backporting this patch, but I assume hunk can be dropped while backporting, so I'm fine either way. > > > Reviewed-by: Tariq Toukan <tariqt@nvidia.com> > > > --- > > > Changes in v1: > > > - Upgraded from RFC status. > > > - Dropped driver specific bits for generic solution. > > > - Implemented single patch as a fix as requested in RFC. > > > - Handling of multi-PF netdevs will be handled in a subsequent patch > > > series. > > > > > > RFC: https://lore.kernel.org/all/20250702172433.1738947-2-dtatulea@nvidia.com/ > > > --- > > > include/linux/netdevice.h | 14 ++++++++++++++ > > > io_uring/zcrx.c | 2 +- > > > net/core/devmem.c | 10 +++++++++- > > > 3 files changed, 24 insertions(+), 2 deletions(-) > > > > > > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h > > > index 5847c20994d3..1cbde7193c4d 100644 > > > --- a/include/linux/netdevice.h > > > +++ b/include/linux/netdevice.h > > > @@ -5560,4 +5560,18 @@ extern struct net_device *blackhole_netdev; > > > atomic_long_add((VAL), &(DEV)->stats.__##FIELD) > > > #define DEV_STATS_READ(DEV, FIELD) atomic_long_read(&(DEV)->stats.__##FIELD) > > > > > > +static inline struct device *netdev_get_dma_dev(const struct net_device *dev) > > > +{ > > > + struct device *dma_dev = dev->dev.parent; > > > + > > > + if (!dma_dev) > > > + return NULL; > > > + > > > + /* Some devices (e.g. SFs) have the dma device as a grandparent. */ > > > + if (!dma_dev->dma_mask) > > > > I was able to confirm that !dev->dma_mask means "this device doesn't > > support dma". Multiple existing places in the code seem to use this > > check. > > > Ack. That was my understanding as well. > > > > + dma_dev = dma_dev->parent; > > > + > > > + return (dma_dev && dma_dev->dma_mask) ? dma_dev : NULL; > > > > This may be a noob question, but are we sure that !dma_dev->dma_mask > > && dma_dev->parent->dma_mask != NULL means that the parent is the > > dma-device that we should use? I understand SF devices work that way > > but it's not immediately obvious to me that this is generically true. > > > This is what I gathered from Parav's answer. > > > For example pavel came up with the case where for veth, > > netdev->dev.parent == NULL , I wonder if there are weird devices in > > the wild where netdev->dev.parent->dma_mask == NULL but that doesn't > > necessarily mean that the grandparent is the dma-device that we should > > use. > > > Yep. > > > I guess to keep my long question short: what makes you think this is > > generically safe to do? Or is it not, but we think most devices behave > > this way and we're going to handle more edge cases in follow up > > patches? > > > It is just what we know so far about SFs. See end of mail. > I see. OK, even though this is 'just what we know so far', I'm still in favor of this simple approach, but I would say it would be good to communicate in the comments that this is a best-effort dma-device finding and doesn't handle every case under the sun. Something like (untested): static inline struct device *netdev_get_dma_dev(const struct net_device *dev) { struct device *parent = dev->dev.parent; if (!parent) return NULL; /* For most netdevs, the parent supports dma and is the correct * dma-device */ if (parent->dma_mask) return parent; /* For SF devices, the parent doesn't support dma, but the grandparent * does, and is the correct dma-device to use (link to docs that explain * this if any). */ if (parent->parent && parent->parent->dma_mask) return parent->parent; /* If neither the parent nor grandparent support dma, then we're not * sure what dma-device to use. Error out. Special handling for new * netdevs may need to be added in the future. */ return NULL; } With some comments explaining the logic a bit, you can add: Reviewed-by: Mina Almasry <almasrymina@google.com> And let's see if Jakub likes this. If not, we can always do the future proof approach with the queue API giving the driver the ability to tell us what exactly is the dma-device to use (or whatever approach he prefers). -- Thanks, Mina ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net] net: Allow non parent devices to be used for ZC DMA 2025-07-09 22:56 ` Mina Almasry @ 2025-07-10 23:47 ` Jakub Kicinski 0 siblings, 0 replies; 7+ messages in thread From: Jakub Kicinski @ 2025-07-10 23:47 UTC (permalink / raw) To: Mina Almasry Cc: Dragos Tatulea, asml.silence, Andrew Lunn, David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman, Jens Axboe, Simona Vetter, Willem de Bruijn, Kaiyuan Zhang, cratiu, parav, Tariq Toukan, netdev, linux-kernel, io-uring On Wed, 9 Jul 2025 15:56:16 -0700 Mina Almasry wrote: > > > nit: This doesn't seem like a fix? The current code supports all > > > devices that are not SF well enough, right? And in the case of SF > > > devices, I expect net_devmem_bind_dmabuf() to fail gracefully as the > > > dma mapping of a device that doesn't support it, I think, would fail > > > gracefully. So to me this seems like an improvement rather than a bug > > > fix. > > > > > dma_buf_map_attachment_unlocked() will return a sg_table with 0 nents. > > That is graceful. However this will result in page_pools that will > > always be returning errors further down the line which is very confusing > > regarding the motives that caused it. > > > > I am also fine to not make it a fix btw. Especially since the mlx5 > > devmem code was just accepted. > > If you submit another version I'd rather it be a non-fix, especially > since applying the io_uring hunk will be challenging when backporting > this patch, but I assume hunk can be dropped while backporting, so I'm > fine either way. +1 for non-fix. The core is working as expected, if we want a Fixes tag I'd aim it at mlx5 and make the patch reject queue binding to SFs _within mlx5_ until the core bits are figured out. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net] net: Allow non parent devices to be used for ZC DMA 2025-07-09 12:40 [PATCH net] net: Allow non parent devices to be used for ZC DMA Dragos Tatulea 2025-07-09 13:50 ` Pavel Begunkov 2025-07-09 19:29 ` Mina Almasry @ 2025-07-10 23:49 ` Jakub Kicinski 2 siblings, 0 replies; 7+ messages in thread From: Jakub Kicinski @ 2025-07-10 23:49 UTC (permalink / raw) To: Dragos Tatulea Cc: almasrymina, asml.silence, Andrew Lunn, David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman, Jens Axboe, Simona Vetter, Willem de Bruijn, Kaiyuan Zhang, cratiu, parav, Tariq Toukan, netdev, linux-kernel, io-uring On Wed, 9 Jul 2025 15:40:57 +0300 Dragos Tatulea wrote: > For zerocopy (io_uring, devmem), there is an assumption that the > parent device can do DMA. However that is not always the case: > ScalableFunction devices have the DMA device in the grandparent. > > This patch adds a helper for getting the DMA device for a netdev from > its parent or grandparent if necessary. The NULL case is handled in the > callers. > > devmem and io_uring are updated accordingly to use this helper instead > of directly using the parent. Sorry for the silence. I'll reply to Parav on the RFC, I don't think the question I was asking was answered there. ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2025-07-10 23:49 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-07-09 12:40 [PATCH net] net: Allow non parent devices to be used for ZC DMA Dragos Tatulea 2025-07-09 13:50 ` Pavel Begunkov 2025-07-09 19:29 ` Mina Almasry 2025-07-09 19:53 ` Dragos Tatulea 2025-07-09 22:56 ` Mina Almasry 2025-07-10 23:47 ` Jakub Kicinski 2025-07-10 23:49 ` Jakub Kicinski
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox