public inbox for io-uring@vger.kernel.org
* [PATCH net-next 0/3] selftests: drv-net: iou-zcrx: improve stability and make the large chunk test work
@ 2026-02-27 17:13 Jakub Kicinski
  2026-02-27 17:13 ` [PATCH net-next 1/3] selftests: drv-net: iou-zcrx: wait for memory provider cleanup Jakub Kicinski
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Jakub Kicinski @ 2026-02-27 17:13 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, dw, jdamato,
	asml.silence, io-uring, Jakub Kicinski

The iou-zcrx test hasn't been passing in NIPA. I assumed it was because
we were missing io_uring changes, but it was still failing after the
merge window. It turns out there was a bug in the implementation, which
was fixed separately via the io_uring tree. With that out of the way,
the tests are passing but flaky. Patch 1 deals with the flakiness.

While looking at this I also noticed that the large chunk test isn't
running at all. So fix and enable it (patches 2 and 3).

Jakub Kicinski (3):
  selftests: drv-net: iou-zcrx: wait for memory provider cleanup
  selftests: drv-net: iou-zcrx: rework large chunks test to use common
    setup
  selftests: drv-net: iou-zcrx: allocate hugepages for large chunks test

 .../selftests/drivers/net/hw/iou-zcrx.py      | 57 ++++++++++---------
 1 file changed, 31 insertions(+), 26 deletions(-)

-- 
2.53.0



* [PATCH net-next 1/3] selftests: drv-net: iou-zcrx: wait for memory provider cleanup
  2026-02-27 17:13 [PATCH net-next 0/3] selftests: drv-net: iou-zcrx: improve stability and make the large chunk test work Jakub Kicinski
@ 2026-02-27 17:13 ` Jakub Kicinski
  2026-03-02 15:32   ` Dragos Tatulea
  2026-03-02 15:49   ` David Wei
  2026-02-27 17:13 ` [PATCH net-next 2/3] selftests: drv-net: iou-zcrx: rework large chunks test to use common setup Jakub Kicinski
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 15+ messages in thread
From: Jakub Kicinski @ 2026-02-27 17:13 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, dw, jdamato,
	asml.silence, io-uring, Jakub Kicinski, shuah, linux-kselftest

io_uring defers zcrx context teardown to the iou_exit workqueue.

  # ps aux | grep iou
  ...    07:58   0:00 [kworker/u19:0-iou_exit]
  ... 07:58   0:00 [kworker/u18:2-iou_exit]

When the test's receiver process exits, bkg() returns but the memory
provider may still be attached to the rx queue. The subsequent defer()
that restores tcp-data-split then fails:

  # Exception while handling defer / cleanup (callback 3 of 3)!
  # Defer Exception| net.ynl.pyynl.lib.ynl.NlError:
      Netlink error: can't disable tcp-data-split while device has
                     memory provider enabled: Invalid argument
  not ok 1 iou-zcrx.test_zcrx.single

Add a helper that polls netdev queue-get until no rx queue reports
the io-uring memory provider attribute. Register it as a defer()
just before tcp-data-split is restored as a "barrier".

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
CC: shuah@kernel.org
CC: dw@davidwei.uk
CC: jdamato@fastly.com
CC: linux-kselftest@vger.kernel.org
---
 .../selftests/drivers/net/hw/iou-zcrx.py       | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/drivers/net/hw/iou-zcrx.py b/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
index c63d6d6450d2..c27c2064701d 100755
--- a/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
+++ b/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
@@ -2,14 +2,27 @@
 # SPDX-License-Identifier: GPL-2.0
 
 import re
+import time
 from os import path
 from lib.py import ksft_run, ksft_exit, KsftSkipEx, ksft_variants, KsftNamedVariant
 from lib.py import NetDrvEpEnv
 from lib.py import bkg, cmd, defer, ethtool, rand_port, wait_port_listen
-from lib.py import EthtoolFamily
+from lib.py import EthtoolFamily, NetdevFamily
 
 SKIP_CODE = 42
 
+
+def mp_clear_wait(cfg):
+    """Wait for io_uring memory providers to clear from all device queues."""
+    deadline = time.time() + 5
+    while time.time() < deadline:
+        queues = cfg.netnl.queue_get({'ifindex': cfg.ifindex}, dump=True)
+        if not any('io-uring' in q for q in queues):
+            return
+        time.sleep(0.1)
+    raise TimeoutError("Timed out waiting for memory provider to clear")
+
+
 def create_rss_ctx(cfg):
     output = ethtool(f"-X {cfg.ifname} context new start {cfg.target} equal 1").stdout
     values = re.search(r'New RSS context is (\d+)', output).group(1)
@@ -46,6 +59,7 @@ SKIP_CODE = 42
                                 'tcp-data-split': 'unknown',
                                 'hds-thresh': hds_thresh,
                                 'rx': rx_rings})
+    defer(mp_clear_wait, cfg)
 
     cfg.target = channels - 1
     ethtool(f"-X {cfg.ifname} equal {cfg.target}")
@@ -73,6 +87,7 @@ SKIP_CODE = 42
                                 'tcp-data-split': 'unknown',
                                 'hds-thresh': hds_thresh,
                                 'rx': rx_rings})
+    defer(mp_clear_wait, cfg)
 
     cfg.target = channels - 1
     ethtool(f"-X {cfg.ifname} equal {cfg.target}")
@@ -159,6 +174,7 @@ SKIP_CODE = 42
         cfg.bin_remote = cfg.remote.deploy(cfg.bin_local)
 
         cfg.ethnl = EthtoolFamily()
+        cfg.netnl = NetdevFamily()
         cfg.port = rand_port()
         ksft_run(globs=globals(), cases=[test_zcrx, test_zcrx_oneshot], args=(cfg, ))
     ksft_exit()
-- 
2.53.0
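
The "barrier" ordering described in the commit message can be sketched
as follows. This is only an illustration: it assumes the selftest
defer() runs callbacks in reverse registration order, which is what the
placement of defer(mp_clear_wait, cfg) after the rings restore implies.
contextlib.ExitStack stands in for defer() here, and the step names are
hypothetical labels, not real calls.

```python
import contextlib


def run_with_cleanups():
    """Return the order in which the test body and cleanups execute."""
    order = []
    with contextlib.ExitStack() as stack:
        # Registered first, so it runs last: restore the ring settings.
        stack.callback(order.append, "restore tcp-data-split")
        # Registered second, so it runs first: wait for the provider
        # to detach before the restore above is attempted.
        stack.callback(order.append, "wait for memory provider to clear")
        order.append("test body")
    return order


if __name__ == "__main__":
    for step in run_with_cleanups():
        print(step)
```

Because cleanups unwind last-in first-out, registering the wait right
after the restore is enough to guarantee that the wait completes before
the restore is attempted.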



* [PATCH net-next 2/3] selftests: drv-net: iou-zcrx: rework large chunks test to use common setup
  2026-02-27 17:13 [PATCH net-next 0/3] selftests: drv-net: iou-zcrx: improve stability and make the large chunk test work Jakub Kicinski
  2026-02-27 17:13 ` [PATCH net-next 1/3] selftests: drv-net: iou-zcrx: wait for memory provider cleanup Jakub Kicinski
@ 2026-02-27 17:13 ` Jakub Kicinski
  2026-03-02 15:54   ` David Wei
  2026-02-27 17:13 ` [PATCH net-next 3/3] selftests: drv-net: iou-zcrx: allocate hugepages for large chunks test Jakub Kicinski
  2026-03-03  4:47 ` [PATCH net-next 0/3] selftests: drv-net: iou-zcrx: improve stability and make the large chunk test work patchwork-bot+netdevbpf
  3 siblings, 1 reply; 15+ messages in thread
From: Jakub Kicinski @ 2026-02-27 17:13 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, dw, jdamato,
	asml.silence, io-uring, Jakub Kicinski, shuah, linux-kselftest

Commit a32bb32d0193 ("selftests: iou-zcrx: test large chunk sizes")
and commit de7c600e2d5b ("selftests/net: parametrise iou-zcrx.py with
ksft_variants") landed at a similar time. The large chunks test was
never included in the list of tests, so it never ran. As a result,
nobody noticed that it still uses the old-style helpers
(_get_combined_channels, _get_current_settings, _set_flow_rule)
that were removed by the other commit.

Rework test_zcrx_large_chunks to reuse the single() setup function
and add it to the ksft_run cases list so it actually gets executed.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
CC: shuah@kernel.org
CC: dw@davidwei.uk
CC: jdamato@fastly.com
CC: linux-kselftest@vger.kernel.org
---
 .../selftests/drivers/net/hw/iou-zcrx.py      | 31 ++++---------------
 1 file changed, 6 insertions(+), 25 deletions(-)

diff --git a/tools/testing/selftests/drivers/net/hw/iou-zcrx.py b/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
index c27c2064701d..1649c23e05e2 100755
--- a/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
+++ b/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
@@ -135,36 +135,16 @@ SKIP_CODE = 42
 
     cfg.require_ipver('6')
 
-    combined_chans = _get_combined_channels(cfg)
-    if combined_chans < 2:
-        raise KsftSkipEx('at least 2 combined channels required')
-    (rx_ring, hds_thresh) = _get_current_settings(cfg)
-    port = rand_port()
-
-    ethtool(f"-G {cfg.ifname} tcp-data-split on")
-    defer(ethtool, f"-G {cfg.ifname} tcp-data-split auto")
-
-    ethtool(f"-G {cfg.ifname} hds-thresh 0")
-    defer(ethtool, f"-G {cfg.ifname} hds-thresh {hds_thresh}")
-
-    ethtool(f"-G {cfg.ifname} rx 64")
-    defer(ethtool, f"-G {cfg.ifname} rx {rx_ring}")
-
-    ethtool(f"-X {cfg.ifname} equal {combined_chans - 1}")
-    defer(ethtool, f"-X {cfg.ifname} default")
-
-    flow_rule_id = _set_flow_rule(cfg, port, combined_chans - 1)
-    defer(ethtool, f"-N {cfg.ifname} delete {flow_rule_id}")
-
-    rx_cmd = f"{cfg.bin_local} -s -p {port} -i {cfg.ifname} -q {combined_chans - 1} -x 2"
-    tx_cmd = f"{cfg.bin_remote} -c -h {cfg.addr_v['6']} -p {port} -l 12840"
+    single(cfg)
+    rx_cmd = f"{cfg.bin_local} -s -p {cfg.port} -i {cfg.ifname} -q {cfg.target} -x 2"
+    tx_cmd = f"{cfg.bin_remote} -c -h {cfg.addr_v['6']} -p {cfg.port} -l 12840"
 
     probe = cmd(rx_cmd + " -d", fail=False)
     if probe.ret == SKIP_CODE:
         raise KsftSkipEx(probe.stdout)
 
     with bkg(rx_cmd, exit_wait=True):
-        wait_port_listen(port, proto="tcp")
+        wait_port_listen(cfg.port, proto="tcp")
         cmd(tx_cmd, host=cfg.remote)
 
 
@@ -176,7 +156,8 @@ SKIP_CODE = 42
         cfg.ethnl = EthtoolFamily()
         cfg.netnl = NetdevFamily()
         cfg.port = rand_port()
-        ksft_run(globs=globals(), cases=[test_zcrx, test_zcrx_oneshot], args=(cfg, ))
+        ksft_run(globs=globals(), cases=[test_zcrx, test_zcrx_oneshot,
+                                        test_zcrx_large_chunks], args=(cfg, ))
     ksft_exit()
 
 
-- 
2.53.0



* [PATCH net-next 3/3] selftests: drv-net: iou-zcrx: allocate hugepages for large chunks test
  2026-02-27 17:13 [PATCH net-next 0/3] selftests: drv-net: iou-zcrx: improve stability and make the large chunk test work Jakub Kicinski
  2026-02-27 17:13 ` [PATCH net-next 1/3] selftests: drv-net: iou-zcrx: wait for memory provider cleanup Jakub Kicinski
  2026-02-27 17:13 ` [PATCH net-next 2/3] selftests: drv-net: iou-zcrx: rework large chunks test to use common setup Jakub Kicinski
@ 2026-02-27 17:13 ` Jakub Kicinski
  2026-03-02 15:16   ` Dragos Tatulea
  2026-03-03  4:47 ` [PATCH net-next 0/3] selftests: drv-net: iou-zcrx: improve stability and make the large chunk test work patchwork-bot+netdevbpf
  3 siblings, 1 reply; 15+ messages in thread
From: Jakub Kicinski @ 2026-02-27 17:13 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, dw, jdamato,
	asml.silence, io-uring, Jakub Kicinski, shuah, linux-kselftest

The large chunks test needs 2MB hugepages for its mmap allocation,
but the test system may not have any pre-allocated. Ensure at least
64 hugepages are available before running the test, and restore the
original value on cleanup.

While at it, strip the stdout; it has a trailing newline.

Before:
  ok 5 iou-zcrx.test_zcrx_large_chunks # SKIP Can't allocate huge pages

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
CC: shuah@kernel.org
CC: dw@davidwei.uk
CC: jdamato@fastly.com
CC: linux-kselftest@vger.kernel.org
---
 tools/testing/selftests/drivers/net/hw/iou-zcrx.py | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/drivers/net/hw/iou-zcrx.py b/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
index 1649c23e05e2..66dd496ec5cf 100755
--- a/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
+++ b/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
@@ -135,13 +135,21 @@ SKIP_CODE = 42
 
     cfg.require_ipver('6')
 
+    hp_file = "/proc/sys/vm/nr_hugepages"
+    with open(hp_file, 'r+', encoding='utf-8') as f:
+        nr_hugepages = int(f.read().strip())
+        if nr_hugepages < 64:
+            f.seek(0)
+            f.write("64")
+            defer(lambda: open(hp_file, 'w', encoding='utf-8').write(str(nr_hugepages)))
+
     single(cfg)
     rx_cmd = f"{cfg.bin_local} -s -p {cfg.port} -i {cfg.ifname} -q {cfg.target} -x 2"
     tx_cmd = f"{cfg.bin_remote} -c -h {cfg.addr_v['6']} -p {cfg.port} -l 12840"
 
     probe = cmd(rx_cmd + " -d", fail=False)
     if probe.ret == SKIP_CODE:
-        raise KsftSkipEx(probe.stdout)
+        raise KsftSkipEx(probe.stdout.strip())
 
     with bkg(rx_cmd, exit_wait=True):
         wait_port_listen(cfg.port, proto="tcp")
-- 
2.53.0
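
The raise-then-restore pattern used for nr_hugepages can also be
written as a context manager. The sketch below is not the code from the
patch: ensure_min_count() and demo() are hypothetical names, and the
demo runs against a temporary file so that it needs neither root nor
/proc. On a real system the write to /proc/sys/vm/nr_hugepages requires
privileges, and the kernel may allocate fewer pages than requested.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def ensure_min_count(path, minimum):
    """Raise the integer stored in path to at least minimum, then restore it."""
    with open(path, "r", encoding="utf-8") as f:
        original = int(f.read().strip())
    raised = original < minimum
    if raised:
        with open(path, "w", encoding="utf-8") as f:
            f.write(str(minimum))
    try:
        yield original
    finally:
        # Only write the old value back if we changed it.
        if raised:
            with open(path, "w", encoding="utf-8") as f:
                f.write(str(original))


def demo():
    """Exercise the helper on a temporary file standing in for the sysctl."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write("0\n")
        with ensure_min_count(path, 64):
            with open(path, encoding="utf-8") as f:
                during = int(f.read().strip())
        with open(path, encoding="utf-8") as f:
            after = int(f.read().strip())
        return during, after
    finally:
        os.unlink(path)
```

Each open() is closed by its own with block, so no file handle is left
to the garbage collector when the original value is written back.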



* Re: [PATCH net-next 3/3] selftests: drv-net: iou-zcrx: allocate hugepages for large chunks test
  2026-02-27 17:13 ` [PATCH net-next 3/3] selftests: drv-net: iou-zcrx: allocate hugepages for large chunks test Jakub Kicinski
@ 2026-03-02 15:16   ` Dragos Tatulea
  2026-03-03  2:22     ` Jakub Kicinski
  0 siblings, 1 reply; 15+ messages in thread
From: Dragos Tatulea @ 2026-03-02 15:16 UTC (permalink / raw)
  To: Jakub Kicinski, davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, dw, jdamato,
	asml.silence, io-uring, shuah, linux-kselftest



On 27.02.26 18:13, Jakub Kicinski wrote:
> The large chunks test needs 2MB hugepages for its mmap allocation,
> but the test system may not have any pre-allocated. Ensure at least
> 64 hugepages are available before running the test, and restore the
> original value on cleanup.
> 
> While at it, strip the stdout; it has a trailing newline.
> 
> Before:
>   ok 5 iou-zcrx.test_zcrx_large_chunks # SKIP Can't allocate huge pages
> 
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
> CC: shuah@kernel.org
> CC: dw@davidwei.uk
> CC: jdamato@fastly.com
> CC: linux-kselftest@vger.kernel.org
> ---
>  tools/testing/selftests/drivers/net/hw/iou-zcrx.py | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/testing/selftests/drivers/net/hw/iou-zcrx.py b/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
> index 1649c23e05e2..66dd496ec5cf 100755
> --- a/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
> +++ b/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
> @@ -135,13 +135,21 @@ SKIP_CODE = 42
>  
>      cfg.require_ipver('6')
>  
> +    hp_file = "/proc/sys/vm/nr_hugepages"
> +    with open(hp_file, 'r+', encoding='utf-8') as f:
> +        nr_hugepages = int(f.read().strip())
> +        if nr_hugepages < 64:
> +            f.seek(0)
> +            f.write("64")
> +            defer(lambda: open(hp_file, 'w', encoding='utf-8').write(str(nr_hugepages)))
> +
>      single(cfg)
>      rx_cmd = f"{cfg.bin_local} -s -p {cfg.port} -i {cfg.ifname} -q {cfg.target} -x 2"
>      tx_cmd = f"{cfg.bin_remote} -c -h {cfg.addr_v['6']} -p {cfg.port} -l 12840"
>  
>      probe = cmd(rx_cmd + " -d", fail=False)
>      if probe.ret == SKIP_CODE:
> -        raise KsftSkipEx(probe.stdout)
> +        raise KsftSkipEx(probe.stdout.strip())
>  
While working on a similar fix I found that the probe here also requires a barrier.

Thanks,
Dragos


* Re: [PATCH net-next 1/3] selftests: drv-net: iou-zcrx: wait for memory provider cleanup
  2026-02-27 17:13 ` [PATCH net-next 1/3] selftests: drv-net: iou-zcrx: wait for memory provider cleanup Jakub Kicinski
@ 2026-03-02 15:32   ` Dragos Tatulea
  2026-03-02 15:49   ` David Wei
  1 sibling, 0 replies; 15+ messages in thread
From: Dragos Tatulea @ 2026-03-02 15:32 UTC (permalink / raw)
  To: Jakub Kicinski, davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, dw, jdamato,
	asml.silence, io-uring, shuah, linux-kselftest



On 27.02.26 18:13, Jakub Kicinski wrote:
> io_uring defers zcrx context teardown to the iou_exit workqueue.
> 
>   # ps aux | grep iou
>   ...    07:58   0:00 [kworker/u19:0-iou_exit]
>   ... 07:58   0:00 [kworker/u18:2-iou_exit]
> 
> When the test's receiver process exits, bkg() returns but the memory
> provider may still be attached to the rx queue. The subsequent defer()
> that restores tcp-data-split then fails:
> 
>   # Exception while handling defer / cleanup (callback 3 of 3)!
>   # Defer Exception| net.ynl.pyynl.lib.ynl.NlError:
>       Netlink error: can't disable tcp-data-split while device has
>                      memory provider enabled: Invalid argument
>   not ok 1 iou-zcrx.test_zcrx.single
> 
> Add a helper that polls netdev queue-get until no rx queue reports
> the io-uring memory provider attribute. Register it as a defer()
> just before tcp-data-split is restored as a "barrier".
> 
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
> CC: shuah@kernel.org
> CC: dw@davidwei.uk
> CC: jdamato@fastly.com
> CC: linux-kselftest@vger.kernel.org
> ---

Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>

Thanks,
Dragos


* Re: [PATCH net-next 1/3] selftests: drv-net: iou-zcrx: wait for memory provider cleanup
  2026-02-27 17:13 ` [PATCH net-next 1/3] selftests: drv-net: iou-zcrx: wait for memory provider cleanup Jakub Kicinski
  2026-03-02 15:32   ` Dragos Tatulea
@ 2026-03-02 15:49   ` David Wei
  2026-03-03  0:46     ` Jakub Kicinski
  1 sibling, 1 reply; 15+ messages in thread
From: David Wei @ 2026-03-02 15:49 UTC (permalink / raw)
  To: Jakub Kicinski, davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, jdamato,
	asml.silence, io-uring, shuah, linux-kselftest

On 2026-02-27 09:13, Jakub Kicinski wrote:
> io_uring defers zcrx context teardown to the iou_exit workqueue.
> 
>    # ps aux | grep iou
>    ...    07:58   0:00 [kworker/u19:0-iou_exit]
>    ... 07:58   0:00 [kworker/u18:2-iou_exit]
> 
> When the test's receiver process exits, bkg() returns but the memory
> provider may still be attached to the rx queue. The subsequent defer()
> that restores tcp-data-split then fails:
> 
>    # Exception while handling defer / cleanup (callback 3 of 3)!
>    # Defer Exception| net.ynl.pyynl.lib.ynl.NlError:
>        Netlink error: can't disable tcp-data-split while device has
>                       memory provider enabled: Invalid argument
>    not ok 1 iou-zcrx.test_zcrx.single
> 
> Add a helper that polls netdev queue-get until no rx queue reports
> the io-uring memory provider attribute. Register it as a defer()
> just before tcp-data-split is restored as a "barrier".
> 
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
> CC: shuah@kernel.org
> CC: dw@davidwei.uk
> CC: jdamato@fastly.com
> CC: linux-kselftest@vger.kernel.org
> ---
>   .../selftests/drivers/net/hw/iou-zcrx.py       | 18 +++++++++++++++++-
>   1 file changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/testing/selftests/drivers/net/hw/iou-zcrx.py b/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
> index c63d6d6450d2..c27c2064701d 100755
> --- a/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
> +++ b/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
> @@ -2,14 +2,27 @@
>   # SPDX-License-Identifier: GPL-2.0
>   
>   import re
> +import time
>   from os import path
>   from lib.py import ksft_run, ksft_exit, KsftSkipEx, ksft_variants, KsftNamedVariant
>   from lib.py import NetDrvEpEnv
>   from lib.py import bkg, cmd, defer, ethtool, rand_port, wait_port_listen
> -from lib.py import EthtoolFamily
> +from lib.py import EthtoolFamily, NetdevFamily
>   
>   SKIP_CODE = 42
>   
> +
> +def mp_clear_wait(cfg):
> +    """Wait for io_uring memory providers to clear from all device queues."""
> +    deadline = time.time() + 5

This is potentially a very long time to wait if code is buggy, as I
found out when debugging netkit queue lease. How about reducing this to
say 1 second?


* Re: [PATCH net-next 2/3] selftests: drv-net: iou-zcrx: rework large chunks test to use common setup
  2026-02-27 17:13 ` [PATCH net-next 2/3] selftests: drv-net: iou-zcrx: rework large chunks test to use common setup Jakub Kicinski
@ 2026-03-02 15:54   ` David Wei
  2026-03-03  0:48     ` Jakub Kicinski
  0 siblings, 1 reply; 15+ messages in thread
From: David Wei @ 2026-03-02 15:54 UTC (permalink / raw)
  To: Jakub Kicinski, davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, jdamato,
	asml.silence, io-uring, shuah, linux-kselftest

On 2026-02-27 09:13, Jakub Kicinski wrote:
> Commit a32bb32d0193 ("selftests: iou-zcrx: test large chunk sizes")
> and commit de7c600e2d5b ("selftests/net: parametrise iou-zcrx.py with
> ksft_variants") landed at a similar time. The large chunks test was
> never included in the list of tests, so it never ran. As a result,
> nobody noticed that it still uses the old-style helpers
> (_get_combined_channels, _get_current_settings, _set_flow_rule)
> that were removed by the other commit.
> 
> Rework test_zcrx_large_chunks to reuse the single() setup function
> and add it to the ksft_run cases list so it actually gets executed.
> 
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
> CC: shuah@kernel.org
> CC: dw@davidwei.uk
> CC: jdamato@fastly.com
> CC: linux-kselftest@vger.kernel.org
> ---
>   .../selftests/drivers/net/hw/iou-zcrx.py      | 31 ++++---------------
>   1 file changed, 6 insertions(+), 25 deletions(-)
> 
> diff --git a/tools/testing/selftests/drivers/net/hw/iou-zcrx.py b/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
> index c27c2064701d..1649c23e05e2 100755
> --- a/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
> +++ b/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
> @@ -135,36 +135,16 @@ SKIP_CODE = 42
>   
>       cfg.require_ipver('6')
>   
> -    combined_chans = _get_combined_channels(cfg)
> -    if combined_chans < 2:
> -        raise KsftSkipEx('at least 2 combined channels required')
> -    (rx_ring, hds_thresh) = _get_current_settings(cfg)
> -    port = rand_port()
> -
> -    ethtool(f"-G {cfg.ifname} tcp-data-split on")
> -    defer(ethtool, f"-G {cfg.ifname} tcp-data-split auto")
> -
> -    ethtool(f"-G {cfg.ifname} hds-thresh 0")
> -    defer(ethtool, f"-G {cfg.ifname} hds-thresh {hds_thresh}")
> -
> -    ethtool(f"-G {cfg.ifname} rx 64")
> -    defer(ethtool, f"-G {cfg.ifname} rx {rx_ring}")
> -
> -    ethtool(f"-X {cfg.ifname} equal {combined_chans - 1}")
> -    defer(ethtool, f"-X {cfg.ifname} default")
> -
> -    flow_rule_id = _set_flow_rule(cfg, port, combined_chans - 1)
> -    defer(ethtool, f"-N {cfg.ifname} delete {flow_rule_id}")
> -
> -    rx_cmd = f"{cfg.bin_local} -s -p {port} -i {cfg.ifname} -q {combined_chans - 1} -x 2"
> -    tx_cmd = f"{cfg.bin_remote} -c -h {cfg.addr_v['6']} -p {port} -l 12840"
> +    single(cfg)

Let's use ksft_variants() with both single() and rss()?


* Re: [PATCH net-next 1/3] selftests: drv-net: iou-zcrx: wait for memory provider cleanup
  2026-03-02 15:49   ` David Wei
@ 2026-03-03  0:46     ` Jakub Kicinski
  2026-03-03  1:39       ` David Wei
  0 siblings, 1 reply; 15+ messages in thread
From: Jakub Kicinski @ 2026-03-03  0:46 UTC (permalink / raw)
  To: David Wei
  Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, jdamato,
	asml.silence, io-uring, shuah, linux-kselftest

On Mon, 2 Mar 2026 07:49:01 -0800 David Wei wrote:
> > +def mp_clear_wait(cfg):
> > +    """Wait for io_uring memory providers to clear from all device queues."""
> > +    deadline = time.time() + 5  
> 
> This is potentially a very long time to wait if code is buggy, as I
> found out when debugging netkit queue lease. How about reducing this to
> say 1 second?

Just to be clear -- you're saying that 5 seconds is a long time to wait?
Please note that if this wait times out we're going to fail the test;
the timeout does not impact the length of a successful run.

I picked 5 sec because, with all the debug options enabled and under QEMU,
scheduling latency spikes can be pretty brutal. I guess I could make it
3 seconds if it matters a lot?
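
To illustrate why the deadline only matters on failure, here is a
minimal sketch of a deadline-bounded poll. wait_until() and demo() are
hypothetical names, not part of the selftest library, and
time.monotonic() is used here instead of the time.time() call in the
patch so that wall-clock adjustments cannot shorten or lengthen the
wait.

```python
import time


def wait_until(predicate, timeout=5.0, interval=0.1):
    """Poll predicate() until it returns true; raise TimeoutError at the deadline."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return
        time.sleep(interval)
    raise TimeoutError("condition not met before deadline")


def demo():
    """The condition becomes true on the third poll, well before 5 seconds."""
    polls = {"n": 0}

    def cleared():
        polls["n"] += 1
        return polls["n"] >= 3

    start = time.monotonic()
    wait_until(cleared, timeout=5.0, interval=0.01)
    return polls["n"], time.monotonic() - start
```

The helper returns as soon as the predicate holds, so a passing run
pays only for the polls it actually needs; the full timeout is spent
only when the run is going to fail anyway.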


* Re: [PATCH net-next 2/3] selftests: drv-net: iou-zcrx: rework large chunks test to use common setup
  2026-03-02 15:54   ` David Wei
@ 2026-03-03  0:48     ` Jakub Kicinski
  2026-03-03  1:44       ` David Wei
  0 siblings, 1 reply; 15+ messages in thread
From: Jakub Kicinski @ 2026-03-03  0:48 UTC (permalink / raw)
  To: David Wei
  Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, jdamato,
	asml.silence, io-uring, shuah, linux-kselftest

On Mon, 2 Mar 2026 07:54:28 -0800 David Wei wrote:
> Let's use ksft_variants() with both single() and rss()?

Woohai? I intentionally chose to only test one; buffer configuration
and flow steering are quite orthogonal. What extra coverage do you have
in mind by asking for both?


* Re: [PATCH net-next 1/3] selftests: drv-net: iou-zcrx: wait for memory provider cleanup
  2026-03-03  0:46     ` Jakub Kicinski
@ 2026-03-03  1:39       ` David Wei
  0 siblings, 0 replies; 15+ messages in thread
From: David Wei @ 2026-03-03  1:39 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, jdamato,
	asml.silence, io-uring, shuah, linux-kselftest

On 2026-03-02 16:46, Jakub Kicinski wrote:
> On Mon, 2 Mar 2026 07:49:01 -0800 David Wei wrote:
>>> +def mp_clear_wait(cfg):
>>> +    """Wait for io_uring memory providers to clear from all device queues."""
>>> +    deadline = time.time() + 5
>>
>> This is potentially a very long time to wait if code is buggy, as I
>> found out when debugging netkit queue lease. How about reducing this to
>> say 1 second?
> 
> Just to be clear -- you're saying that 5 seconds is a long time to wait?
> Please note that if this wait times out we're going to fail the test;
> the timeout does not impact the length of a successful run.
> 
> I picked 5 sec because, with all the debug options enabled and under QEMU,
> scheduling latency spikes can be pretty brutal. I guess I could make it
> 3 seconds if it matters a lot?

Hmm, yeah let's leave it at 5 then. Should not be optimising for my
buggy code.


* Re: [PATCH net-next 2/3] selftests: drv-net: iou-zcrx: rework large chunks test to use common setup
  2026-03-03  0:48     ` Jakub Kicinski
@ 2026-03-03  1:44       ` David Wei
  0 siblings, 0 replies; 15+ messages in thread
From: David Wei @ 2026-03-03  1:44 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms,
	asml.silence, io-uring, shuah, linux-kselftest

On 2026-03-02 16:48, Jakub Kicinski wrote:
> On Mon, 2 Mar 2026 07:54:28 -0800 David Wei wrote:
>> Let's use ksft_variants() with both single() and rss()?
> 
> Woohai? I intentionally chose to only test one; buffer configuration
> and flow steering are quite orthogonal. What extra coverage do you have
> in mind by asking for both?

Mostly paranoia, in case there are any unexpected differences with RSS
vs single queue. Someone wrote the lovely ksft_variants code, why not
use it? :P

I should send the patch that actually adds pthreads to the iou-zcrx.c
test binary...

(I don't feel strongly either way. Whatever you prefer.)


* Re: [PATCH net-next 3/3] selftests: drv-net: iou-zcrx: allocate hugepages for large chunks test
  2026-03-02 15:16   ` Dragos Tatulea
@ 2026-03-03  2:22     ` Jakub Kicinski
  2026-03-03  8:41       ` Dragos Tatulea
  0 siblings, 1 reply; 15+ messages in thread
From: Jakub Kicinski @ 2026-03-03  2:22 UTC (permalink / raw)
  To: Dragos Tatulea
  Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, dw,
	jdamato, asml.silence, io-uring, shuah, linux-kselftest

On Mon, 2 Mar 2026 16:16:38 +0100 Dragos Tatulea wrote:
> > +    hp_file = "/proc/sys/vm/nr_hugepages"
> > +    with open(hp_file, 'r+', encoding='utf-8') as f:
> > +        nr_hugepages = int(f.read().strip())
> > +        if nr_hugepages < 64:
> > +            f.seek(0)
> > +            f.write("64")
> > +            defer(lambda: open(hp_file, 'w', encoding='utf-8').write(str(nr_hugepages)))
> > +
> >      single(cfg)
> >      rx_cmd = f"{cfg.bin_local} -s -p {cfg.port} -i {cfg.ifname} -q {cfg.target} -x 2"
> >      tx_cmd = f"{cfg.bin_remote} -c -h {cfg.addr_v['6']} -p {cfg.port} -l 12840"
> >  
> >      probe = cmd(rx_cmd + " -d", fail=False)
> >      if probe.ret == SKIP_CODE:
> > -        raise KsftSkipEx(probe.stdout)
> > +        raise KsftSkipEx(probe.stdout.strip())
> >    
> While working on a similar fix I found that the probe here also requires a barrier.

Hm, I'm not hitting this issue. Maybe because I'm testing in QEMU?
If you can still repro after this series could you send a follow up?


* Re: [PATCH net-next 0/3] selftests: drv-net: iou-zcrx: improve stability and make the large chunk test work
  2026-02-27 17:13 [PATCH net-next 0/3] selftests: drv-net: iou-zcrx: improve stability and make the large chunk test work Jakub Kicinski
                   ` (2 preceding siblings ...)
  2026-02-27 17:13 ` [PATCH net-next 3/3] selftests: drv-net: iou-zcrx: allocate hugepages for large chunks test Jakub Kicinski
@ 2026-03-03  4:47 ` patchwork-bot+netdevbpf
  3 siblings, 0 replies; 15+ messages in thread
From: patchwork-bot+netdevbpf @ 2026-03-03  4:47 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, dw,
	jdamato, asml.silence, io-uring

Hello:

This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Fri, 27 Feb 2026 09:13:02 -0800 you wrote:
> The iou-zcrx test hasn't been passing in NIPA. I assumed it was because
> we were missing io_uring changes, but it was still failing after the
> merge window. It turns out there was a bug in the implementation, which
> was fixed separately via the io_uring tree. With that out of the way,
> the tests are passing but flaky. Patch 1 deals with the flakiness.
> 
> While looking at this I also noticed that the large chunk test isn't
> running at all. So fix and enable it (patches 2 and 3).
> 
> [...]

Here is the summary with links:
  - [net-next,1/3] selftests: drv-net: iou-zcrx: wait for memory provider cleanup
    https://git.kernel.org/netdev/net-next/c/27c4ab943882
  - [net-next,2/3] selftests: drv-net: iou-zcrx: rework large chunks test to use common setup
    https://git.kernel.org/netdev/net-next/c/67792dde27a6
  - [net-next,3/3] selftests: drv-net: iou-zcrx: allocate hugepages for large chunks test
    https://git.kernel.org/netdev/net-next/c/c7b228418e8b

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




* Re: [PATCH net-next 3/3] selftests: drv-net: iou-zcrx: allocate hugepages for large chunks test
  2026-03-03  2:22     ` Jakub Kicinski
@ 2026-03-03  8:41       ` Dragos Tatulea
  0 siblings, 0 replies; 15+ messages in thread
From: Dragos Tatulea @ 2026-03-03  8:41 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, dw,
	jdamato, asml.silence, io-uring, shuah, linux-kselftest



On 03.03.26 03:22, Jakub Kicinski wrote:
> On Mon, 2 Mar 2026 16:16:38 +0100 Dragos Tatulea wrote:
>>> +    hp_file = "/proc/sys/vm/nr_hugepages"
>>> +    with open(hp_file, 'r+', encoding='utf-8') as f:
>>> +        nr_hugepages = int(f.read().strip())
>>> +        if nr_hugepages < 64:
>>> +            f.seek(0)
>>> +            f.write("64")
>>> +            defer(lambda: open(hp_file, 'w', encoding='utf-8').write(str(nr_hugepages)))
>>> +
>>>      single(cfg)
>>>      rx_cmd = f"{cfg.bin_local} -s -p {cfg.port} -i {cfg.ifname} -q {cfg.target} -x 2"
>>>      tx_cmd = f"{cfg.bin_remote} -c -h {cfg.addr_v['6']} -p {cfg.port} -l 12840"
>>>  
>>>      probe = cmd(rx_cmd + " -d", fail=False)
>>>      if probe.ret == SKIP_CODE:
>>> -        raise KsftSkipEx(probe.stdout)
>>> +        raise KsftSkipEx(probe.stdout.strip())
>>>    
>> While working on a similar fix I found that the probe here also requires a barrier.
> 
> Hm, I'm not hitting this issue. Maybe because I'm testing in QEMU?
> If you can still repro after this series could you send a follow up?
Will do.

Thanks,
Dragos
