From: Mina Almasry <almasrymina@google.com>
To: Pavel Begunkov <asml.silence@gmail.com>
Cc: Jakub Kicinski <kuba@kernel.org>,
netdev@vger.kernel.org, Eric Dumazet <edumazet@google.com>,
Willem de Bruijn <willemb@google.com>,
Paolo Abeni <pabeni@redhat.com>,
andrew+netdev@lunn.ch, horms@kernel.org, davem@davemloft.net,
sdf@fomichev.me, dw@davidwei.uk, michael.chan@broadcom.com,
dtatulea@nvidia.com, ap420073@gmail.com,
linux-kernel@vger.kernel.org, io-uring@vger.kernel.org
Subject: Re: [PATCH net-next v3 12/23] net: allocate per-queue config structs and pass them thru the queue API
Date: Tue, 19 Aug 2025 18:32:52 -0700 [thread overview]
Message-ID: <CAHS8izMN+_9iWiZnc_FdkMZVDsRRG9FM8JYsz84V8gqDJU_GAA@mail.gmail.com> (raw)
In-Reply-To: <CAHS8izPgcjR-L=xfLvJDEVVt3uUOfavoEAWuCZ9F4DgpoAmHtQ@mail.gmail.com>
On Tue, Aug 19, 2025 at 2:29 PM Mina Almasry <almasrymina@google.com> wrote:
>
> On Mon, Aug 18, 2025 at 6:56 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
> >
> > From: Jakub Kicinski <kuba@kernel.org>
> >
> > Create an array of config structs to store per-queue config.
> > Pass these structs in the queue API. Drivers can also retrieve
> > the config for a single queue calling netdev_queue_config()
> > directly.
> >
> > Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> > [pavel: patch up mlx callbacks with unused qcfg]
> > Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> > ---
> > drivers/net/ethernet/broadcom/bnxt/bnxt.c | 8 ++-
> > drivers/net/ethernet/google/gve/gve_main.c | 9 ++-
> > .../net/ethernet/mellanox/mlx5/core/en_main.c | 9 +--
> > drivers/net/netdevsim/netdev.c | 6 +-
> > include/net/netdev_queues.h | 19 ++++++
> > net/core/dev.h | 3 +
> > net/core/netdev_config.c | 58 +++++++++++++++++++
> > net/core/netdev_rx_queue.c | 11 +++-
> > 8 files changed, 109 insertions(+), 14 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> > index d3d9b72ef313..48ff6f024e07 100644
> > --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> > +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> > @@ -15824,7 +15824,9 @@ static const struct netdev_stat_ops bnxt_stat_ops = {
> > .get_base_stats = bnxt_get_base_stats,
> > };
> >
> > -static int bnxt_queue_mem_alloc(struct net_device *dev, void *qmem, int idx)
> > +static int bnxt_queue_mem_alloc(struct net_device *dev,
> > + struct netdev_queue_config *qcfg,
> > + void *qmem, int idx)
> > {
> > struct bnxt_rx_ring_info *rxr, *clone;
> > struct bnxt *bp = netdev_priv(dev);
> > @@ -15992,7 +15994,9 @@ static void bnxt_copy_rx_ring(struct bnxt *bp,
> > dst->rx_agg_bmap = src->rx_agg_bmap;
> > }
> >
> > -static int bnxt_queue_start(struct net_device *dev, void *qmem, int idx)
> > +static int bnxt_queue_start(struct net_device *dev,
> > + struct netdev_queue_config *qcfg,
> > + void *qmem, int idx)
> > {
> > struct bnxt *bp = netdev_priv(dev);
> > struct bnxt_rx_ring_info *rxr, *clone;
> > diff --git a/drivers/net/ethernet/google/gve/gve_main.c b/drivers/net/ethernet/google/gve/gve_main.c
> > index 1f411d7c4373..f40edab616d8 100644
> > --- a/drivers/net/ethernet/google/gve/gve_main.c
> > +++ b/drivers/net/ethernet/google/gve/gve_main.c
> > @@ -2580,8 +2580,9 @@ static void gve_rx_queue_mem_free(struct net_device *dev, void *per_q_mem)
> > gve_rx_free_ring_dqo(priv, gve_per_q_mem, &cfg);
> > }
> >
> > -static int gve_rx_queue_mem_alloc(struct net_device *dev, void *per_q_mem,
> > - int idx)
> > +static int gve_rx_queue_mem_alloc(struct net_device *dev,
> > + struct netdev_queue_config *qcfg,
> > + void *per_q_mem, int idx)
> > {
> > struct gve_priv *priv = netdev_priv(dev);
> > struct gve_rx_alloc_rings_cfg cfg = {0};
> > @@ -2602,7 +2603,9 @@ static int gve_rx_queue_mem_alloc(struct net_device *dev, void *per_q_mem,
> > return err;
> > }
> >
> > -static int gve_rx_queue_start(struct net_device *dev, void *per_q_mem, int idx)
> > +static int gve_rx_queue_start(struct net_device *dev,
> > + struct netdev_queue_config *qcfg,
> > + void *per_q_mem, int idx)
> > {
> > struct gve_priv *priv = netdev_priv(dev);
> > struct gve_rx_ring *gve_per_q_mem;
> > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> > index 21bb88c5d3dc..83264c17a4f7 100644
> > --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> > @@ -5541,8 +5541,9 @@ struct mlx5_qmgmt_data {
> > struct mlx5e_channel_param cparam;
> > };
> >
> > -static int mlx5e_queue_mem_alloc(struct net_device *dev, void *newq,
> > - int queue_index)
> > +static int mlx5e_queue_mem_alloc(struct net_device *dev,
> > + struct netdev_queue_config *qcfg,
> > + void *newq, int queue_index)
> > {
> > struct mlx5_qmgmt_data *new = (struct mlx5_qmgmt_data *)newq;
> > struct mlx5e_priv *priv = netdev_priv(dev);
> > @@ -5603,8 +5604,8 @@ static int mlx5e_queue_stop(struct net_device *dev, void *oldq, int queue_index)
> > return 0;
> > }
> >
> > -static int mlx5e_queue_start(struct net_device *dev, void *newq,
> > - int queue_index)
> > +static int mlx5e_queue_start(struct net_device *dev, struct netdev_queue_config *qcfg,
> > + void *newq, int queue_index)
> > {
> > struct mlx5_qmgmt_data *new = (struct mlx5_qmgmt_data *)newq;
> > struct mlx5e_priv *priv = netdev_priv(dev);
> > diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c
> > index 0178219f0db5..985c3403ec57 100644
> > --- a/drivers/net/netdevsim/netdev.c
> > +++ b/drivers/net/netdevsim/netdev.c
> > @@ -733,7 +733,8 @@ struct nsim_queue_mem {
> > };
> >
> > static int
> > -nsim_queue_mem_alloc(struct net_device *dev, void *per_queue_mem, int idx)
> > +nsim_queue_mem_alloc(struct net_device *dev, struct netdev_queue_config *qcfg,
> > + void *per_queue_mem, int idx)
> > {
> > struct nsim_queue_mem *qmem = per_queue_mem;
> > struct netdevsim *ns = netdev_priv(dev);
> > @@ -782,7 +783,8 @@ static void nsim_queue_mem_free(struct net_device *dev, void *per_queue_mem)
> > }
> >
> > static int
> > -nsim_queue_start(struct net_device *dev, void *per_queue_mem, int idx)
> > +nsim_queue_start(struct net_device *dev, struct netdev_queue_config *qcfg,
> > + void *per_queue_mem, int idx)
> > {
> > struct nsim_queue_mem *qmem = per_queue_mem;
> > struct netdevsim *ns = netdev_priv(dev);
> > diff --git a/include/net/netdev_queues.h b/include/net/netdev_queues.h
> > index d73f9023c96f..b850cff71d12 100644
> > --- a/include/net/netdev_queues.h
> > +++ b/include/net/netdev_queues.h
> > @@ -32,6 +32,13 @@ struct netdev_config {
> > /** @hds_config: HDS enabled (ETHTOOL_A_RINGS_TCP_DATA_SPLIT).
> > */
> > u8 hds_config;
> > +
> > + /** @qcfg: per-queue configuration */
> > + struct netdev_queue_config *qcfg;
> > +};
> > +
> > +/* Same semantics as fields in struct netdev_config */
> > +struct netdev_queue_config {
> > };
>
> I was very confused why this is empty until I looked at patch 18 :-D
>
> >
> > /* See the netdev.yaml spec for definition of each statistic */
> > @@ -136,6 +143,10 @@ void netdev_stat_queue_sum(struct net_device *netdev,
> > *
> > * @ndo_queue_mem_size: Size of the struct that describes a queue's memory.
> > *
> > + * @ndo_queue_cfg_defaults: (Optional) Populate queue config struct with
> > + * defaults. Queue config structs are passed to this
> > + * helper before the user-requested settings are applied.
> > + *
> > * @ndo_queue_mem_alloc: Allocate memory for an RX queue at the specified index.
> > * The new memory is written at the specified address.
> > *
> > @@ -153,12 +164,17 @@ void netdev_stat_queue_sum(struct net_device *netdev,
> > */
> > struct netdev_queue_mgmt_ops {
> > size_t ndo_queue_mem_size;
> > + void (*ndo_queue_cfg_defaults)(struct net_device *dev,
> > + int idx,
> > + struct netdev_queue_config *qcfg);
> > int (*ndo_queue_mem_alloc)(struct net_device *dev,
> > + struct netdev_queue_config *qcfg,
> > void *per_queue_mem,
> > int idx);
> > void (*ndo_queue_mem_free)(struct net_device *dev,
> > void *per_queue_mem);
> > int (*ndo_queue_start)(struct net_device *dev,
> > + struct netdev_queue_config *qcfg,
> > void *per_queue_mem,
> > int idx);
> > int (*ndo_queue_stop)(struct net_device *dev,
> > @@ -166,6 +182,9 @@ struct netdev_queue_mgmt_ops {
> > int idx);
> > };
> >
> > +void netdev_queue_config(struct net_device *dev, int rxq,
> > + struct netdev_queue_config *qcfg);
> > +
> > /**
> > * DOC: Lockless queue stopping / waking helpers.
> > *
> > diff --git a/net/core/dev.h b/net/core/dev.h
> > index 7041c8bd2a0f..a553a0f1f846 100644
> > --- a/net/core/dev.h
> > +++ b/net/core/dev.h
> > @@ -9,6 +9,7 @@
> > #include <net/netdev_lock.h>
> >
> > struct net;
> > +struct netdev_queue_config;
> > struct netlink_ext_ack;
> > struct cpumask;
> >
> > @@ -96,6 +97,8 @@ int netdev_alloc_config(struct net_device *dev);
> > void __netdev_free_config(struct netdev_config *cfg);
> > void netdev_free_config(struct net_device *dev);
> > int netdev_reconfig_start(struct net_device *dev);
> > +void __netdev_queue_config(struct net_device *dev, int rxq,
> > + struct netdev_queue_config *qcfg, bool pending);
> >
> > /* netdev management, shared between various uAPI entry points */
> > struct netdev_name_node {
> > diff --git a/net/core/netdev_config.c b/net/core/netdev_config.c
> > index 270b7f10a192..bad2d53522f0 100644
> > --- a/net/core/netdev_config.c
> > +++ b/net/core/netdev_config.c
> > @@ -8,18 +8,29 @@
> > int netdev_alloc_config(struct net_device *dev)
> > {
> > struct netdev_config *cfg;
> > + unsigned int maxqs;
> >
> > cfg = kzalloc(sizeof(*dev->cfg), GFP_KERNEL_ACCOUNT);
> > if (!cfg)
> > return -ENOMEM;
> >
> > + maxqs = max(dev->num_rx_queues, dev->num_tx_queues);
>
> I honestly did not think about tx queues at all for the queue api thus
> far. The ndos do specify that api applies to rx queues, and maybe the
> driver only implemented them to assume indeed the calls are to rx
> queues. Are you intentionally extending the queue api support for tx
> queues? Or maybe you're allocing configs for the tx queues to be used
> in some future?
>
> Other places in this patch series uses num_rx_queues directly. Feels
> like this should do the same.
>
> > + cfg->qcfg = kcalloc(maxqs, sizeof(*cfg->qcfg), GFP_KERNEL_ACCOUNT);
> > + if (!cfg->qcfg)
> > + goto err_free_cfg;
> > +
> > dev->cfg = cfg;
> > dev->cfg_pending = cfg;
> > return 0;
> > +
> > +err_free_cfg:
> > + kfree(cfg);
> > + return -ENOMEM;
> > }
> >
> > void __netdev_free_config(struct netdev_config *cfg)
> > {
> > + kfree(cfg->qcfg);
> > kfree(cfg);
> > }
> >
> > @@ -32,12 +43,59 @@ void netdev_free_config(struct net_device *dev)
> > int netdev_reconfig_start(struct net_device *dev)
> > {
> > struct netdev_config *cfg;
> > + unsigned int maxqs;
> >
> > WARN_ON(dev->cfg != dev->cfg_pending);
> > cfg = kmemdup(dev->cfg, sizeof(*dev->cfg), GFP_KERNEL_ACCOUNT);
> > if (!cfg)
> > return -ENOMEM;
> >
> > + maxqs = max(dev->num_rx_queues, dev->num_tx_queues);
> > + cfg->qcfg = kmemdup_array(dev->cfg->qcfg, maxqs, sizeof(*cfg->qcfg),
> > + GFP_KERNEL_ACCOUNT);
> > + if (!cfg->qcfg)
> > + goto err_free_cfg;
> > +
> > dev->cfg_pending = cfg;
> > return 0;
> > +
> > +err_free_cfg:
> > + kfree(cfg);
> > + return -ENOMEM;
> > +}
> > +
> > +void __netdev_queue_config(struct net_device *dev, int rxq,
> > + struct netdev_queue_config *qcfg, bool pending)
> > +{
> > + memset(qcfg, 0, sizeof(*qcfg));
> > +
>
> This memset 0 is wrong for queue configs like hds_thresh where 0 is a
> value, not 'restore default'.
>
> Either netdev_queue_config needs to have a comment that says 'only
> values where 0 is restore default is allowed in this struct', or this
> function needs to handle 0-as-value configs correctly.
>
> But I wonder if the memset(0) is wrong in general. Isn't this helper
> trying to grab the _current_ configuration? So qcfg should be seeded
> with appropriate value from dev->qcfgs[rxq]? This function reads like
> it's trying to get the default configuration, but in a way that
> doesn't handle hds_thresh style semantics correctly?
>
Nevermind this comment, a close review of patch 18 answered this
question actually. You are indeed grabbing the configuration from
dev->qcfgs[rxq], you're just not doing it here because
netdev_queue_config is empty.
--
Thanks,
Mina
next prev parent reply other threads:[~2025-08-20 1:33 UTC|newest]
Thread overview: 58+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-18 13:57 [PATCH net-next v3 00/23][pull request] Queue configs and large buffer providers Pavel Begunkov
2025-08-18 13:57 ` [PATCH net-next v3 01/23] net: page_pool: sanitise allocation order Pavel Begunkov
2025-08-18 23:33 ` Mina Almasry
2025-08-19 15:53 ` Pavel Begunkov
2025-08-18 13:57 ` [PATCH net-next v3 02/23] docs: ethtool: document that rx_buf_len must control payload lengths Pavel Begunkov
2025-08-18 23:50 ` Mina Almasry
2025-08-18 13:57 ` [PATCH net-next v3 03/23] net: ethtool: report max value for rx-buf-len Pavel Begunkov
2025-08-19 0:00 ` Mina Almasry
2025-08-18 13:57 ` [PATCH net-next v3 04/23] net: use zero value to restore rx_buf_len to default Pavel Begunkov
2025-08-19 0:07 ` Mina Almasry
2025-08-19 15:52 ` Pavel Begunkov
2025-08-19 19:27 ` Mina Almasry
2025-08-20 11:53 ` Pavel Begunkov
2025-08-18 13:57 ` [PATCH net-next v3 05/23] net: clarify the meaning of netdev_config members Pavel Begunkov
2025-08-19 1:46 ` Mina Almasry
2025-08-20 12:04 ` Pavel Begunkov
2025-08-18 13:57 ` [PATCH net-next v3 06/23] net: add rx_buf_len to netdev config Pavel Begunkov
2025-08-19 19:32 ` Mina Almasry
2025-08-18 13:57 ` [PATCH net-next v3 07/23] eth: bnxt: read the page size from the adapter struct Pavel Begunkov
2025-08-19 19:37 ` Mina Almasry
2025-08-20 13:43 ` Pavel Begunkov
2025-08-18 13:57 ` [PATCH net-next v3 08/23] eth: bnxt: set page pool page order based on rx_page_size Pavel Begunkov
2025-08-19 19:43 ` Mina Almasry
2025-08-20 13:51 ` Pavel Begunkov
2025-08-25 6:09 ` Somnath Kotur
2025-08-18 13:57 ` [PATCH net-next v3 09/23] eth: bnxt: support setting size of agg buffers via ethtool Pavel Begunkov
2025-08-19 20:10 ` Mina Almasry
2025-08-18 13:57 ` [PATCH net-next v3 10/23] net: move netdev_config manipulation to dedicated helpers Pavel Begunkov
2025-08-19 20:15 ` Mina Almasry
2025-08-18 13:57 ` [PATCH net-next v3 11/23] net: reduce indent of struct netdev_queue_mgmt_ops members Pavel Begunkov
2025-08-18 13:57 ` [PATCH net-next v3 12/23] net: allocate per-queue config structs and pass them thru the queue API Pavel Begunkov
2025-08-19 21:29 ` Mina Almasry
2025-08-20 1:32 ` Mina Almasry [this message]
2025-08-18 13:57 ` [PATCH net-next v3 13/23] net: pass extack to netdev_rx_queue_restart() Pavel Begunkov
2025-08-19 21:30 ` Mina Almasry
2025-08-18 13:57 ` [PATCH net-next v3 14/23] net: add queue config validation callback Pavel Begunkov
2025-08-19 21:54 ` Mina Almasry
2025-08-20 1:31 ` Mina Almasry
2025-08-18 13:57 ` [PATCH net-next v3 15/23] eth: bnxt: always set the queue mgmt ops Pavel Begunkov
2025-08-19 21:57 ` Mina Almasry
2025-08-18 13:57 ` [PATCH net-next v3 16/23] eth: bnxt: store the rx buf size per queue Pavel Begunkov
2025-08-25 6:24 ` Somnath Kotur
2025-08-18 13:57 ` [PATCH net-next v3 17/23] eth: bnxt: adjust the fill level of agg queues with larger buffers Pavel Begunkov
2025-08-18 13:57 ` [PATCH net-next v3 18/23] netdev: add support for setting rx-buf-len per queue Pavel Begunkov
2025-08-19 22:36 ` Mina Almasry
2025-08-18 13:57 ` [PATCH net-next v3 19/23] net: wipe the setting of deactived queues Pavel Begunkov
2025-08-19 22:49 ` Mina Almasry
2025-08-18 13:57 ` [PATCH net-next v3 20/23] eth: bnxt: use queue op config validate Pavel Begunkov
2025-08-18 13:57 ` [PATCH net-next v3 21/23] eth: bnxt: support per queue configuration of rx-buf-len Pavel Begunkov
2025-08-18 13:57 ` [PATCH net-next v3 22/23] net: let pp memory provider to specify rx buf len Pavel Begunkov
2025-08-18 13:57 ` [PATCH net-next v3 23/23] net: validate driver supports passed qcfg params Pavel Begunkov
2025-08-18 13:59 ` [PATCH net-next v3 00/23][pull request] Queue configs and large buffer providers Pavel Begunkov
2025-08-20 2:31 ` Jakub Kicinski
2025-08-20 13:39 ` Pavel Begunkov
2025-08-20 13:59 ` Mina Almasry
2025-08-21 1:26 ` Jakub Kicinski
2025-08-21 1:37 ` Jakub Kicinski
2025-08-21 15:04 ` Pavel Begunkov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAHS8izMN+_9iWiZnc_FdkMZVDsRRG9FM8JYsz84V8gqDJU_GAA@mail.gmail.com \
--to=almasrymina@google.com \
--cc=andrew+netdev@lunn.ch \
--cc=ap420073@gmail.com \
--cc=asml.silence@gmail.com \
--cc=davem@davemloft.net \
--cc=dtatulea@nvidia.com \
--cc=dw@davidwei.uk \
--cc=edumazet@google.com \
--cc=horms@kernel.org \
--cc=io-uring@vger.kernel.org \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=michael.chan@broadcom.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=sdf@fomichev.me \
--cc=willemb@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox