public inbox for io-uring@vger.kernel.org
 help / color / mirror / Atom feed
From: Mina Almasry <almasrymina@google.com>
To: Pavel Begunkov <asml.silence@gmail.com>
Cc: Jakub Kicinski <kuba@kernel.org>,
	netdev@vger.kernel.org,  Eric Dumazet <edumazet@google.com>,
	Willem de Bruijn <willemb@google.com>,
	 Paolo Abeni <pabeni@redhat.com>,
	andrew+netdev@lunn.ch, horms@kernel.org,  davem@davemloft.net,
	sdf@fomichev.me, dw@davidwei.uk,  michael.chan@broadcom.com,
	dtatulea@nvidia.com, ap420073@gmail.com,
	 linux-kernel@vger.kernel.org, io-uring@vger.kernel.org
Subject: Re: [PATCH net-next v3 12/23] net: allocate per-queue config structs and pass them thru the queue API
Date: Tue, 19 Aug 2025 18:32:52 -0700	[thread overview]
Message-ID: <CAHS8izMN+_9iWiZnc_FdkMZVDsRRG9FM8JYsz84V8gqDJU_GAA@mail.gmail.com> (raw)
In-Reply-To: <CAHS8izPgcjR-L=xfLvJDEVVt3uUOfavoEAWuCZ9F4DgpoAmHtQ@mail.gmail.com>

On Tue, Aug 19, 2025 at 2:29 PM Mina Almasry <almasrymina@google.com> wrote:
>
> On Mon, Aug 18, 2025 at 6:56 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
> >
> > From: Jakub Kicinski <kuba@kernel.org>
> >
> > Create an array of config structs to store per-queue config.
> > Pass these structs in the queue API. Drivers can also retrieve
> > the config for a single queue calling netdev_queue_config()
> > directly.
> >
> > Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> > [pavel: patch up mlx callbacks with unused qcfg]
> > Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> > ---
> >  drivers/net/ethernet/broadcom/bnxt/bnxt.c     |  8 ++-
> >  drivers/net/ethernet/google/gve/gve_main.c    |  9 ++-
> >  .../net/ethernet/mellanox/mlx5/core/en_main.c |  9 +--
> >  drivers/net/netdevsim/netdev.c                |  6 +-
> >  include/net/netdev_queues.h                   | 19 ++++++
> >  net/core/dev.h                                |  3 +
> >  net/core/netdev_config.c                      | 58 +++++++++++++++++++
> >  net/core/netdev_rx_queue.c                    | 11 +++-
> >  8 files changed, 109 insertions(+), 14 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> > index d3d9b72ef313..48ff6f024e07 100644
> > --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> > +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> > @@ -15824,7 +15824,9 @@ static const struct netdev_stat_ops bnxt_stat_ops = {
> >         .get_base_stats         = bnxt_get_base_stats,
> >  };
> >
> > -static int bnxt_queue_mem_alloc(struct net_device *dev, void *qmem, int idx)
> > +static int bnxt_queue_mem_alloc(struct net_device *dev,
> > +                               struct netdev_queue_config *qcfg,
> > +                               void *qmem, int idx)
> >  {
> >         struct bnxt_rx_ring_info *rxr, *clone;
> >         struct bnxt *bp = netdev_priv(dev);
> > @@ -15992,7 +15994,9 @@ static void bnxt_copy_rx_ring(struct bnxt *bp,
> >         dst->rx_agg_bmap = src->rx_agg_bmap;
> >  }
> >
> > -static int bnxt_queue_start(struct net_device *dev, void *qmem, int idx)
> > +static int bnxt_queue_start(struct net_device *dev,
> > +                           struct netdev_queue_config *qcfg,
> > +                           void *qmem, int idx)
> >  {
> >         struct bnxt *bp = netdev_priv(dev);
> >         struct bnxt_rx_ring_info *rxr, *clone;
> > diff --git a/drivers/net/ethernet/google/gve/gve_main.c b/drivers/net/ethernet/google/gve/gve_main.c
> > index 1f411d7c4373..f40edab616d8 100644
> > --- a/drivers/net/ethernet/google/gve/gve_main.c
> > +++ b/drivers/net/ethernet/google/gve/gve_main.c
> > @@ -2580,8 +2580,9 @@ static void gve_rx_queue_mem_free(struct net_device *dev, void *per_q_mem)
> >                 gve_rx_free_ring_dqo(priv, gve_per_q_mem, &cfg);
> >  }
> >
> > -static int gve_rx_queue_mem_alloc(struct net_device *dev, void *per_q_mem,
> > -                                 int idx)
> > +static int gve_rx_queue_mem_alloc(struct net_device *dev,
> > +                                 struct netdev_queue_config *qcfg,
> > +                                 void *per_q_mem, int idx)
> >  {
> >         struct gve_priv *priv = netdev_priv(dev);
> >         struct gve_rx_alloc_rings_cfg cfg = {0};
> > @@ -2602,7 +2603,9 @@ static int gve_rx_queue_mem_alloc(struct net_device *dev, void *per_q_mem,
> >         return err;
> >  }
> >
> > -static int gve_rx_queue_start(struct net_device *dev, void *per_q_mem, int idx)
> > +static int gve_rx_queue_start(struct net_device *dev,
> > +                             struct netdev_queue_config *qcfg,
> > +                             void *per_q_mem, int idx)
> >  {
> >         struct gve_priv *priv = netdev_priv(dev);
> >         struct gve_rx_ring *gve_per_q_mem;
> > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> > index 21bb88c5d3dc..83264c17a4f7 100644
> > --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> > @@ -5541,8 +5541,9 @@ struct mlx5_qmgmt_data {
> >         struct mlx5e_channel_param cparam;
> >  };
> >
> > -static int mlx5e_queue_mem_alloc(struct net_device *dev, void *newq,
> > -                                int queue_index)
> > +static int mlx5e_queue_mem_alloc(struct net_device *dev,
> > +                                struct netdev_queue_config *qcfg,
> > +                                void *newq, int queue_index)
> >  {
> >         struct mlx5_qmgmt_data *new = (struct mlx5_qmgmt_data *)newq;
> >         struct mlx5e_priv *priv = netdev_priv(dev);
> > @@ -5603,8 +5604,8 @@ static int mlx5e_queue_stop(struct net_device *dev, void *oldq, int queue_index)
> >         return 0;
> >  }
> >
> > -static int mlx5e_queue_start(struct net_device *dev, void *newq,
> > -                            int queue_index)
> > +static int mlx5e_queue_start(struct net_device *dev, struct netdev_queue_config *qcfg,
> > +                            void *newq, int queue_index)
> >  {
> >         struct mlx5_qmgmt_data *new = (struct mlx5_qmgmt_data *)newq;
> >         struct mlx5e_priv *priv = netdev_priv(dev);
> > diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c
> > index 0178219f0db5..985c3403ec57 100644
> > --- a/drivers/net/netdevsim/netdev.c
> > +++ b/drivers/net/netdevsim/netdev.c
> > @@ -733,7 +733,8 @@ struct nsim_queue_mem {
> >  };
> >
> >  static int
> > -nsim_queue_mem_alloc(struct net_device *dev, void *per_queue_mem, int idx)
> > +nsim_queue_mem_alloc(struct net_device *dev, struct netdev_queue_config *qcfg,
> > +                    void *per_queue_mem, int idx)
> >  {
> >         struct nsim_queue_mem *qmem = per_queue_mem;
> >         struct netdevsim *ns = netdev_priv(dev);
> > @@ -782,7 +783,8 @@ static void nsim_queue_mem_free(struct net_device *dev, void *per_queue_mem)
> >  }
> >
> >  static int
> > -nsim_queue_start(struct net_device *dev, void *per_queue_mem, int idx)
> > +nsim_queue_start(struct net_device *dev, struct netdev_queue_config *qcfg,
> > +                void *per_queue_mem, int idx)
> >  {
> >         struct nsim_queue_mem *qmem = per_queue_mem;
> >         struct netdevsim *ns = netdev_priv(dev);
> > diff --git a/include/net/netdev_queues.h b/include/net/netdev_queues.h
> > index d73f9023c96f..b850cff71d12 100644
> > --- a/include/net/netdev_queues.h
> > +++ b/include/net/netdev_queues.h
> > @@ -32,6 +32,13 @@ struct netdev_config {
> >         /** @hds_config: HDS enabled (ETHTOOL_A_RINGS_TCP_DATA_SPLIT).
> >          */
> >         u8      hds_config;
> > +
> > +       /** @qcfg: per-queue configuration */
> > +       struct netdev_queue_config *qcfg;
> > +};
> > +
> > +/* Same semantics as fields in struct netdev_config */
> > +struct netdev_queue_config {
> >  };
>
> I was very confused why this is empty until I looked at patch 18 :-D
>
> >
> >  /* See the netdev.yaml spec for definition of each statistic */
> > @@ -136,6 +143,10 @@ void netdev_stat_queue_sum(struct net_device *netdev,
> >   *
> >   * @ndo_queue_mem_size: Size of the struct that describes a queue's memory.
> >   *
> > + * @ndo_queue_cfg_defaults: (Optional) Populate queue config struct with
> > + *                     defaults. Queue config structs are passed to this
> > + *                     helper before the user-requested settings are applied.
> > + *
> >   * @ndo_queue_mem_alloc: Allocate memory for an RX queue at the specified index.
> >   *                      The new memory is written at the specified address.
> >   *
> > @@ -153,12 +164,17 @@ void netdev_stat_queue_sum(struct net_device *netdev,
> >   */
> >  struct netdev_queue_mgmt_ops {
> >         size_t  ndo_queue_mem_size;
> > +       void    (*ndo_queue_cfg_defaults)(struct net_device *dev,
> > +                                         int idx,
> > +                                         struct netdev_queue_config *qcfg);
> >         int     (*ndo_queue_mem_alloc)(struct net_device *dev,
> > +                                      struct netdev_queue_config *qcfg,
> >                                        void *per_queue_mem,
> >                                        int idx);
> >         void    (*ndo_queue_mem_free)(struct net_device *dev,
> >                                       void *per_queue_mem);
> >         int     (*ndo_queue_start)(struct net_device *dev,
> > +                                  struct netdev_queue_config *qcfg,
> >                                    void *per_queue_mem,
> >                                    int idx);
> >         int     (*ndo_queue_stop)(struct net_device *dev,
> > @@ -166,6 +182,9 @@ struct netdev_queue_mgmt_ops {
> >                                   int idx);
> >  };
> >
> > +void netdev_queue_config(struct net_device *dev, int rxq,
> > +                        struct netdev_queue_config *qcfg);
> > +
> >  /**
> >   * DOC: Lockless queue stopping / waking helpers.
> >   *
> > diff --git a/net/core/dev.h b/net/core/dev.h
> > index 7041c8bd2a0f..a553a0f1f846 100644
> > --- a/net/core/dev.h
> > +++ b/net/core/dev.h
> > @@ -9,6 +9,7 @@
> >  #include <net/netdev_lock.h>
> >
> >  struct net;
> > +struct netdev_queue_config;
> >  struct netlink_ext_ack;
> >  struct cpumask;
> >
> > @@ -96,6 +97,8 @@ int netdev_alloc_config(struct net_device *dev);
> >  void __netdev_free_config(struct netdev_config *cfg);
> >  void netdev_free_config(struct net_device *dev);
> >  int netdev_reconfig_start(struct net_device *dev);
> > +void __netdev_queue_config(struct net_device *dev, int rxq,
> > +                          struct netdev_queue_config *qcfg, bool pending);
> >
> >  /* netdev management, shared between various uAPI entry points */
> >  struct netdev_name_node {
> > diff --git a/net/core/netdev_config.c b/net/core/netdev_config.c
> > index 270b7f10a192..bad2d53522f0 100644
> > --- a/net/core/netdev_config.c
> > +++ b/net/core/netdev_config.c
> > @@ -8,18 +8,29 @@
> >  int netdev_alloc_config(struct net_device *dev)
> >  {
> >         struct netdev_config *cfg;
> > +       unsigned int maxqs;
> >
> >         cfg = kzalloc(sizeof(*dev->cfg), GFP_KERNEL_ACCOUNT);
> >         if (!cfg)
> >                 return -ENOMEM;
> >
> > +       maxqs = max(dev->num_rx_queues, dev->num_tx_queues);
>
> I honestly did not think about tx queues at all for the queue api thus
> far. The ndos do specify that api applies to rx queues, and maybe the
> driver only implemented them to assume indeed the calls are to rx
> queues. Are you intentionally extending the queue api support for tx
> queues? Or maybe you're allocing configs for the tx queues to be used
> in some future?
>
> Other places in this patch series uses num_rx_queues directly. Feels
> like this should do the same.
>
> > +       cfg->qcfg = kcalloc(maxqs, sizeof(*cfg->qcfg), GFP_KERNEL_ACCOUNT);
> > +       if (!cfg->qcfg)
> > +               goto err_free_cfg;
> > +
> >         dev->cfg = cfg;
> >         dev->cfg_pending = cfg;
> >         return 0;
> > +
> > +err_free_cfg:
> > +       kfree(cfg);
> > +       return -ENOMEM;
> >  }
> >
> >  void __netdev_free_config(struct netdev_config *cfg)
> >  {
> > +       kfree(cfg->qcfg);
> >         kfree(cfg);
> >  }
> >
> > @@ -32,12 +43,59 @@ void netdev_free_config(struct net_device *dev)
> >  int netdev_reconfig_start(struct net_device *dev)
> >  {
> >         struct netdev_config *cfg;
> > +       unsigned int maxqs;
> >
> >         WARN_ON(dev->cfg != dev->cfg_pending);
> >         cfg = kmemdup(dev->cfg, sizeof(*dev->cfg), GFP_KERNEL_ACCOUNT);
> >         if (!cfg)
> >                 return -ENOMEM;
> >
> > +       maxqs = max(dev->num_rx_queues, dev->num_tx_queues);
> > +       cfg->qcfg = kmemdup_array(dev->cfg->qcfg, maxqs, sizeof(*cfg->qcfg),
> > +                                 GFP_KERNEL_ACCOUNT);
> > +       if (!cfg->qcfg)
> > +               goto err_free_cfg;
> > +
> >         dev->cfg_pending = cfg;
> >         return 0;
> > +
> > +err_free_cfg:
> > +       kfree(cfg);
> > +       return -ENOMEM;
> > +}
> > +
> > +void __netdev_queue_config(struct net_device *dev, int rxq,
> > +                          struct netdev_queue_config *qcfg, bool pending)
> > +{
> > +       memset(qcfg, 0, sizeof(*qcfg));
> > +
>
> This memset 0 is wrong for queue configs like hds_thresh where 0 is a
> value, not 'restore default'.
>
> Either netdev_queue_config needs to have a comment that says 'only
> values where 0 is restore default is allowed in this struct', or this
> function needs to handle 0-as-value configs correctly.
>
> But I wonder if the memset(0) is wrong in general. Isn't this helper
> trying to grab the _current_ configuration? So qcfg should be seeded
> with appropriate value from dev->qcfgs[rxq]? This function reads like
> it's trying to get the default configuration, but in a way that
> doesn't handle hds_thresh style semantics correctly?
>

Nevermind this comment, a close review of patch 18 answered this
question actually. You are indeed grabbing the configuration from
dev->qcfgs[rxq], you're just not doing it here because
netdev_queue_config is empty.

-- 
Thanks,
Mina

  reply	other threads:[~2025-08-20  1:33 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-18 13:57 [PATCH net-next v3 00/23][pull request] Queue configs and large buffer providers Pavel Begunkov
2025-08-18 13:57 ` [PATCH net-next v3 01/23] net: page_pool: sanitise allocation order Pavel Begunkov
2025-08-18 23:33   ` Mina Almasry
2025-08-19 15:53     ` Pavel Begunkov
2025-08-18 13:57 ` [PATCH net-next v3 02/23] docs: ethtool: document that rx_buf_len must control payload lengths Pavel Begunkov
2025-08-18 23:50   ` Mina Almasry
2025-08-18 13:57 ` [PATCH net-next v3 03/23] net: ethtool: report max value for rx-buf-len Pavel Begunkov
2025-08-19  0:00   ` Mina Almasry
2025-08-18 13:57 ` [PATCH net-next v3 04/23] net: use zero value to restore rx_buf_len to default Pavel Begunkov
2025-08-19  0:07   ` Mina Almasry
2025-08-19 15:52     ` Pavel Begunkov
2025-08-19 19:27       ` Mina Almasry
2025-08-20 11:53         ` Pavel Begunkov
2025-08-18 13:57 ` [PATCH net-next v3 05/23] net: clarify the meaning of netdev_config members Pavel Begunkov
2025-08-19  1:46   ` Mina Almasry
2025-08-20 12:04     ` Pavel Begunkov
2025-08-18 13:57 ` [PATCH net-next v3 06/23] net: add rx_buf_len to netdev config Pavel Begunkov
2025-08-19 19:32   ` Mina Almasry
2025-08-18 13:57 ` [PATCH net-next v3 07/23] eth: bnxt: read the page size from the adapter struct Pavel Begunkov
2025-08-19 19:37   ` Mina Almasry
2025-08-20 13:43     ` Pavel Begunkov
2025-08-18 13:57 ` [PATCH net-next v3 08/23] eth: bnxt: set page pool page order based on rx_page_size Pavel Begunkov
2025-08-19 19:43   ` Mina Almasry
2025-08-20 13:51     ` Pavel Begunkov
2025-08-25  6:09       ` Somnath Kotur
2025-08-18 13:57 ` [PATCH net-next v3 09/23] eth: bnxt: support setting size of agg buffers via ethtool Pavel Begunkov
2025-08-19 20:10   ` Mina Almasry
2025-08-18 13:57 ` [PATCH net-next v3 10/23] net: move netdev_config manipulation to dedicated helpers Pavel Begunkov
2025-08-19 20:15   ` Mina Almasry
2025-08-18 13:57 ` [PATCH net-next v3 11/23] net: reduce indent of struct netdev_queue_mgmt_ops members Pavel Begunkov
2025-08-18 13:57 ` [PATCH net-next v3 12/23] net: allocate per-queue config structs and pass them thru the queue API Pavel Begunkov
2025-08-19 21:29   ` Mina Almasry
2025-08-20  1:32     ` Mina Almasry [this message]
2025-08-18 13:57 ` [PATCH net-next v3 13/23] net: pass extack to netdev_rx_queue_restart() Pavel Begunkov
2025-08-19 21:30   ` Mina Almasry
2025-08-18 13:57 ` [PATCH net-next v3 14/23] net: add queue config validation callback Pavel Begunkov
2025-08-19 21:54   ` Mina Almasry
2025-08-20  1:31     ` Mina Almasry
2025-08-18 13:57 ` [PATCH net-next v3 15/23] eth: bnxt: always set the queue mgmt ops Pavel Begunkov
2025-08-19 21:57   ` Mina Almasry
2025-08-18 13:57 ` [PATCH net-next v3 16/23] eth: bnxt: store the rx buf size per queue Pavel Begunkov
2025-08-25  6:24   ` Somnath Kotur
2025-08-18 13:57 ` [PATCH net-next v3 17/23] eth: bnxt: adjust the fill level of agg queues with larger buffers Pavel Begunkov
2025-08-18 13:57 ` [PATCH net-next v3 18/23] netdev: add support for setting rx-buf-len per queue Pavel Begunkov
2025-08-19 22:36   ` Mina Almasry
2025-08-18 13:57 ` [PATCH net-next v3 19/23] net: wipe the setting of deactived queues Pavel Begunkov
2025-08-19 22:49   ` Mina Almasry
2025-08-18 13:57 ` [PATCH net-next v3 20/23] eth: bnxt: use queue op config validate Pavel Begunkov
2025-08-18 13:57 ` [PATCH net-next v3 21/23] eth: bnxt: support per queue configuration of rx-buf-len Pavel Begunkov
2025-08-18 13:57 ` [PATCH net-next v3 22/23] net: let pp memory provider to specify rx buf len Pavel Begunkov
2025-08-18 13:57 ` [PATCH net-next v3 23/23] net: validate driver supports passed qcfg params Pavel Begunkov
2025-08-18 13:59 ` [PATCH net-next v3 00/23][pull request] Queue configs and large buffer providers Pavel Begunkov
2025-08-20  2:31 ` Jakub Kicinski
2025-08-20 13:39   ` Pavel Begunkov
2025-08-20 13:59     ` Mina Almasry
2025-08-21  1:26       ` Jakub Kicinski
2025-08-21  1:37     ` Jakub Kicinski
2025-08-21 15:04       ` Pavel Begunkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAHS8izMN+_9iWiZnc_FdkMZVDsRRG9FM8JYsz84V8gqDJU_GAA@mail.gmail.com \
    --to=almasrymina@google.com \
    --cc=andrew+netdev@lunn.ch \
    --cc=ap420073@gmail.com \
    --cc=asml.silence@gmail.com \
    --cc=davem@davemloft.net \
    --cc=dtatulea@nvidia.com \
    --cc=dw@davidwei.uk \
    --cc=edumazet@google.com \
    --cc=horms@kernel.org \
    --cc=io-uring@vger.kernel.org \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=michael.chan@broadcom.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=sdf@fomichev.me \
    --cc=willemb@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox