* [PATCH net-next 0/3] selftests: drv-net: iou-zcrx: improve stability and make the large chunk test work
@ 2026-02-27 17:13 Jakub Kicinski
2026-02-27 17:13 ` [PATCH net-next 1/3] selftests: drv-net: iou-zcrx: wait for memory provider cleanup Jakub Kicinski
` (3 more replies)
0 siblings, 4 replies; 15+ messages in thread
From: Jakub Kicinski @ 2026-02-27 17:13 UTC (permalink / raw)
To: davem
Cc: netdev, edumazet, pabeni, andrew+netdev, horms, dw, jdamato,
asml.silence, io-uring, Jakub Kicinski
The iou-zcrx test hasn't been passing in NIPA. I assumed that was because
we were missing io_uring changes, but it was still failing after the merge
window. It turns out there was a bug in the implementation, which was fixed
separately via the io_uring tree. With that out of the way the tests
pass but are flaky. Patch 1 deals with the flakiness.
While looking at this I also noticed that the large chunk test isn't
running at all. So fix and enable it (patches 2 and 3).
Jakub Kicinski (3):
selftests: drv-net: iou-zcrx: wait for memory provider cleanup
selftests: drv-net: iou-zcrx: rework large chunks test to use common
setup
selftests: drv-net: iou-zcrx: allocate hugepages for large chunks test
.../selftests/drivers/net/hw/iou-zcrx.py | 57 ++++++++++---------
1 file changed, 31 insertions(+), 26 deletions(-)
--
2.53.0
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH net-next 1/3] selftests: drv-net: iou-zcrx: wait for memory provider cleanup
2026-02-27 17:13 [PATCH net-next 0/3] selftests: drv-net: iou-zcrx: improve stability and make the large chunk test work Jakub Kicinski
@ 2026-02-27 17:13 ` Jakub Kicinski
2026-03-02 15:32 ` Dragos Tatulea
2026-03-02 15:49 ` David Wei
2026-02-27 17:13 ` [PATCH net-next 2/3] selftests: drv-net: iou-zcrx: rework large chunks test to use common setup Jakub Kicinski
` (2 subsequent siblings)
3 siblings, 2 replies; 15+ messages in thread
From: Jakub Kicinski @ 2026-02-27 17:13 UTC (permalink / raw)
To: davem
Cc: netdev, edumazet, pabeni, andrew+netdev, horms, dw, jdamato,
asml.silence, io-uring, Jakub Kicinski, shuah, linux-kselftest
io_uring defers zcrx context teardown to the iou_exit workqueue.
# ps aux | grep iou
... 07:58 0:00 [kworker/u19:0-iou_exit]
... 07:58 0:00 [kworker/u18:2-iou_exit]
When the test's receiver process exits, bkg() returns but the memory
provider may still be attached to the rx queue. The subsequent defer()
that restores tcp-data-split then fails:
# Exception while handling defer / cleanup (callback 3 of 3)!
# Defer Exception| net.ynl.pyynl.lib.ynl.NlError:
Netlink error: can't disable tcp-data-split while device has
memory provider enabled: Invalid argument
not ok 1 iou-zcrx.test_zcrx.single
Add a helper that polls netdev queue-get until no rx queue reports
the io-uring memory provider attribute. Register it as a defer() so
that it runs just before tcp-data-split is restored, acting as a "barrier".
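To illustrate why the registration order matters, here is a minimal sketch.
It assumes cleanup callbacks run in reverse order of registration (last
registered, first run) and uses Python's contextlib.ExitStack as a stand-in
for the selftest defer() machinery. restore_rings and wait_for_provider are
hypothetical names for illustration, not functions from the test.

```python
from contextlib import ExitStack


def run_with_cleanups():
    """Return the order in which the test body and cleanups execute."""
    order = []

    def restore_rings():
        # Hypothetical stand-in for restoring tcp-data-split and ring sizes.
        order.append("restore_rings")

    def wait_for_provider():
        # Hypothetical stand-in for polling until the provider detaches.
        order.append("wait_for_provider")

    with ExitStack() as stack:
        stack.callback(restore_rings)      # registered first, runs last
        stack.callback(wait_for_provider)  # registered second, runs first
        order.append("test body")
    return order


print(run_with_cleanups())
```

Under that assumption, a wait registered right after the restore callback
runs right before it, which is what makes it behave like a barrier.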
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
CC: shuah@kernel.org
CC: dw@davidwei.uk
CC: jdamato@fastly.com
CC: linux-kselftest@vger.kernel.org
---
.../selftests/drivers/net/hw/iou-zcrx.py | 18 +++++++++++++++++-
1 file changed, 17 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/drivers/net/hw/iou-zcrx.py b/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
index c63d6d6450d2..c27c2064701d 100755
--- a/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
+++ b/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
@@ -2,14 +2,27 @@
# SPDX-License-Identifier: GPL-2.0
import re
+import time
from os import path
from lib.py import ksft_run, ksft_exit, KsftSkipEx, ksft_variants, KsftNamedVariant
from lib.py import NetDrvEpEnv
from lib.py import bkg, cmd, defer, ethtool, rand_port, wait_port_listen
-from lib.py import EthtoolFamily
+from lib.py import EthtoolFamily, NetdevFamily
SKIP_CODE = 42
+
+def mp_clear_wait(cfg):
+ """Wait for io_uring memory providers to clear from all device queues."""
+ deadline = time.time() + 5
+ while time.time() < deadline:
+ queues = cfg.netnl.queue_get({'ifindex': cfg.ifindex}, dump=True)
+ if not any('io-uring' in q for q in queues):
+ return
+ time.sleep(0.1)
+ raise TimeoutError("Timed out waiting for memory provider to clear")
+
+
def create_rss_ctx(cfg):
output = ethtool(f"-X {cfg.ifname} context new start {cfg.target} equal 1").stdout
values = re.search(r'New RSS context is (\d+)', output).group(1)
@@ -46,6 +59,7 @@ SKIP_CODE = 42
'tcp-data-split': 'unknown',
'hds-thresh': hds_thresh,
'rx': rx_rings})
+ defer(mp_clear_wait, cfg)
cfg.target = channels - 1
ethtool(f"-X {cfg.ifname} equal {cfg.target}")
@@ -73,6 +87,7 @@ SKIP_CODE = 42
'tcp-data-split': 'unknown',
'hds-thresh': hds_thresh,
'rx': rx_rings})
+ defer(mp_clear_wait, cfg)
cfg.target = channels - 1
ethtool(f"-X {cfg.ifname} equal {cfg.target}")
@@ -159,6 +174,7 @@ SKIP_CODE = 42
cfg.bin_remote = cfg.remote.deploy(cfg.bin_local)
cfg.ethnl = EthtoolFamily()
+ cfg.netnl = NetdevFamily()
cfg.port = rand_port()
ksft_run(globs=globals(), cases=[test_zcrx, test_zcrx_oneshot], args=(cfg, ))
ksft_exit()
--
2.53.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH net-next 2/3] selftests: drv-net: iou-zcrx: rework large chunks test to use common setup
2026-02-27 17:13 [PATCH net-next 0/3] selftests: drv-net: iou-zcrx: improve stability and make the large chunk test work Jakub Kicinski
2026-02-27 17:13 ` [PATCH net-next 1/3] selftests: drv-net: iou-zcrx: wait for memory provider cleanup Jakub Kicinski
@ 2026-02-27 17:13 ` Jakub Kicinski
2026-03-02 15:54 ` David Wei
2026-02-27 17:13 ` [PATCH net-next 3/3] selftests: drv-net: iou-zcrx: allocate hugepages for large chunks test Jakub Kicinski
2026-03-03 4:47 ` [PATCH net-next 0/3] selftests: drv-net: iou-zcrx: improve stability and make the large chunk test work patchwork-bot+netdevbpf
3 siblings, 1 reply; 15+ messages in thread
From: Jakub Kicinski @ 2026-02-27 17:13 UTC (permalink / raw)
To: davem
Cc: netdev, edumazet, pabeni, andrew+netdev, horms, dw, jdamato,
asml.silence, io-uring, Jakub Kicinski, shuah, linux-kselftest
Commit a32bb32d0193 ("selftests: iou-zcrx: test large chunk sizes")
and commit de7c600e2d5b ("selftests/net: parametrise iou-zcrx.py with
ksft_variants") landed at around the same time. The large chunks test was
never included in the list of tests, so it never ran.
As a result, we didn't notice that it uses the old-style helpers
(_get_combined_channels, _get_current_settings, _set_flow_rule)
that were removed by the other commit.
Rework test_zcrx_large_chunks to reuse the single() setup function
and add it to the ksft_run cases list so it actually gets executed.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
CC: shuah@kernel.org
CC: dw@davidwei.uk
CC: jdamato@fastly.com
CC: linux-kselftest@vger.kernel.org
---
.../selftests/drivers/net/hw/iou-zcrx.py | 31 ++++---------------
1 file changed, 6 insertions(+), 25 deletions(-)
diff --git a/tools/testing/selftests/drivers/net/hw/iou-zcrx.py b/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
index c27c2064701d..1649c23e05e2 100755
--- a/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
+++ b/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
@@ -135,36 +135,16 @@ SKIP_CODE = 42
cfg.require_ipver('6')
- combined_chans = _get_combined_channels(cfg)
- if combined_chans < 2:
- raise KsftSkipEx('at least 2 combined channels required')
- (rx_ring, hds_thresh) = _get_current_settings(cfg)
- port = rand_port()
-
- ethtool(f"-G {cfg.ifname} tcp-data-split on")
- defer(ethtool, f"-G {cfg.ifname} tcp-data-split auto")
-
- ethtool(f"-G {cfg.ifname} hds-thresh 0")
- defer(ethtool, f"-G {cfg.ifname} hds-thresh {hds_thresh}")
-
- ethtool(f"-G {cfg.ifname} rx 64")
- defer(ethtool, f"-G {cfg.ifname} rx {rx_ring}")
-
- ethtool(f"-X {cfg.ifname} equal {combined_chans - 1}")
- defer(ethtool, f"-X {cfg.ifname} default")
-
- flow_rule_id = _set_flow_rule(cfg, port, combined_chans - 1)
- defer(ethtool, f"-N {cfg.ifname} delete {flow_rule_id}")
-
- rx_cmd = f"{cfg.bin_local} -s -p {port} -i {cfg.ifname} -q {combined_chans - 1} -x 2"
- tx_cmd = f"{cfg.bin_remote} -c -h {cfg.addr_v['6']} -p {port} -l 12840"
+ single(cfg)
+ rx_cmd = f"{cfg.bin_local} -s -p {cfg.port} -i {cfg.ifname} -q {cfg.target} -x 2"
+ tx_cmd = f"{cfg.bin_remote} -c -h {cfg.addr_v['6']} -p {cfg.port} -l 12840"
probe = cmd(rx_cmd + " -d", fail=False)
if probe.ret == SKIP_CODE:
raise KsftSkipEx(probe.stdout)
with bkg(rx_cmd, exit_wait=True):
- wait_port_listen(port, proto="tcp")
+ wait_port_listen(cfg.port, proto="tcp")
cmd(tx_cmd, host=cfg.remote)
@@ -176,7 +156,8 @@ SKIP_CODE = 42
cfg.ethnl = EthtoolFamily()
cfg.netnl = NetdevFamily()
cfg.port = rand_port()
- ksft_run(globs=globals(), cases=[test_zcrx, test_zcrx_oneshot], args=(cfg, ))
+ ksft_run(globs=globals(), cases=[test_zcrx, test_zcrx_oneshot,
+ test_zcrx_large_chunks], args=(cfg, ))
ksft_exit()
--
2.53.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH net-next 3/3] selftests: drv-net: iou-zcrx: allocate hugepages for large chunks test
2026-02-27 17:13 [PATCH net-next 0/3] selftests: drv-net: iou-zcrx: improve stability and make the large chunk test work Jakub Kicinski
2026-02-27 17:13 ` [PATCH net-next 1/3] selftests: drv-net: iou-zcrx: wait for memory provider cleanup Jakub Kicinski
2026-02-27 17:13 ` [PATCH net-next 2/3] selftests: drv-net: iou-zcrx: rework large chunks test to use common setup Jakub Kicinski
@ 2026-02-27 17:13 ` Jakub Kicinski
2026-03-02 15:16 ` Dragos Tatulea
2026-03-03 4:47 ` [PATCH net-next 0/3] selftests: drv-net: iou-zcrx: improve stability and make the large chunk test work patchwork-bot+netdevbpf
3 siblings, 1 reply; 15+ messages in thread
From: Jakub Kicinski @ 2026-02-27 17:13 UTC (permalink / raw)
To: davem
Cc: netdev, edumazet, pabeni, andrew+netdev, horms, dw, jdamato,
asml.silence, io-uring, Jakub Kicinski, shuah, linux-kselftest
The large chunks test needs 2MB hugepages for its mmap allocation,
but the test system may not have any pre-allocated. Ensure at least
64 hugepages are available before running the test, and restore the
original value on cleanup.
While at it, strip the stdout; it has a trailing newline.
Before:
ok 5 iou-zcrx.test_zcrx_large_chunks # SKIP Can't allocate huge pages
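The save, raise and restore pattern described above can be sketched as
follows. This is only an illustration under stated assumptions: the helper
names ensure_min_value and restore_value are hypothetical, and the demo runs
against a temporary regular file rather than /proc/sys/vm/nr_hugepages so it
can be tried without root.

```python
import os
import tempfile


def ensure_min_value(path, minimum):
    """Raise the integer stored at path to at least minimum.

    Returns the original value so the caller can restore it later.
    """
    with open(path, "r+", encoding="utf-8") as f:
        original = int(f.read().strip())
        if original < minimum:
            f.seek(0)
            # truncate() is only here because the demo uses a regular file,
            # where leftover bytes from a longer old value would remain.
            f.truncate()
            f.write(str(minimum))
    return original


def restore_value(path, value):
    """Write the saved value back to path."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(str(value))


# Demonstrate on a temporary file holding a small starting value.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("8\n")
demo_path = tmp.name

saved = ensure_min_value(demo_path, 64)
with open(demo_path, encoding="utf-8") as f:
    raised = int(f.read().strip())

restore_value(demo_path, saved)
with open(demo_path, encoding="utf-8") as f:
    restored = int(f.read().strip())

os.unlink(demo_path)
print(saved, raised, restored)
```

Returning the original value keeps the restore step independent of whether
the value was actually raised, so cleanup can always be registered.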
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
CC: shuah@kernel.org
CC: dw@davidwei.uk
CC: jdamato@fastly.com
CC: linux-kselftest@vger.kernel.org
---
tools/testing/selftests/drivers/net/hw/iou-zcrx.py | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/drivers/net/hw/iou-zcrx.py b/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
index 1649c23e05e2..66dd496ec5cf 100755
--- a/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
+++ b/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
@@ -135,13 +135,21 @@ SKIP_CODE = 42
cfg.require_ipver('6')
+ hp_file = "/proc/sys/vm/nr_hugepages"
+ with open(hp_file, 'r+', encoding='utf-8') as f:
+ nr_hugepages = int(f.read().strip())
+ if nr_hugepages < 64:
+ f.seek(0)
+ f.write("64")
+ defer(lambda: open(hp_file, 'w', encoding='utf-8').write(str(nr_hugepages)))
+
single(cfg)
rx_cmd = f"{cfg.bin_local} -s -p {cfg.port} -i {cfg.ifname} -q {cfg.target} -x 2"
tx_cmd = f"{cfg.bin_remote} -c -h {cfg.addr_v['6']} -p {cfg.port} -l 12840"
probe = cmd(rx_cmd + " -d", fail=False)
if probe.ret == SKIP_CODE:
- raise KsftSkipEx(probe.stdout)
+ raise KsftSkipEx(probe.stdout.strip())
with bkg(rx_cmd, exit_wait=True):
wait_port_listen(cfg.port, proto="tcp")
--
2.53.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH net-next 3/3] selftests: drv-net: iou-zcrx: allocate hugepages for large chunks test
2026-02-27 17:13 ` [PATCH net-next 3/3] selftests: drv-net: iou-zcrx: allocate hugepages for large chunks test Jakub Kicinski
@ 2026-03-02 15:16 ` Dragos Tatulea
2026-03-03 2:22 ` Jakub Kicinski
0 siblings, 1 reply; 15+ messages in thread
From: Dragos Tatulea @ 2026-03-02 15:16 UTC (permalink / raw)
To: Jakub Kicinski, davem
Cc: netdev, edumazet, pabeni, andrew+netdev, horms, dw, jdamato,
asml.silence, io-uring, shuah, linux-kselftest
On 27.02.26 18:13, Jakub Kicinski wrote:
> The large chunks test needs 2MB hugepages for its mmap allocation,
> but the test system may not have any pre-allocated. Ensure at least
> 64 hugepages are available before running the test, and restore the
> original value on cleanup.
>
> While at it, strip the stdout; it has a trailing newline.
>
> Before:
> ok 5 iou-zcrx.test_zcrx_large_chunks # SKIP Can't allocate huge pages
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
> CC: shuah@kernel.org
> CC: dw@davidwei.uk
> CC: jdamato@fastly.com
> CC: linux-kselftest@vger.kernel.org
> ---
> tools/testing/selftests/drivers/net/hw/iou-zcrx.py | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/tools/testing/selftests/drivers/net/hw/iou-zcrx.py b/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
> index 1649c23e05e2..66dd496ec5cf 100755
> --- a/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
> +++ b/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
> @@ -135,13 +135,21 @@ SKIP_CODE = 42
>
> cfg.require_ipver('6')
>
> + hp_file = "/proc/sys/vm/nr_hugepages"
> + with open(hp_file, 'r+', encoding='utf-8') as f:
> + nr_hugepages = int(f.read().strip())
> + if nr_hugepages < 64:
> + f.seek(0)
> + f.write("64")
> + defer(lambda: open(hp_file, 'w', encoding='utf-8').write(str(nr_hugepages)))
> +
> single(cfg)
> rx_cmd = f"{cfg.bin_local} -s -p {cfg.port} -i {cfg.ifname} -q {cfg.target} -x 2"
> tx_cmd = f"{cfg.bin_remote} -c -h {cfg.addr_v['6']} -p {cfg.port} -l 12840"
>
> probe = cmd(rx_cmd + " -d", fail=False)
> if probe.ret == SKIP_CODE:
> - raise KsftSkipEx(probe.stdout)
> + raise KsftSkipEx(probe.stdout.strip())
>
While working on a similar fix I found that the probe here also requires a barrier.
Thanks,
Dragos
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next 1/3] selftests: drv-net: iou-zcrx: wait for memory provider cleanup
2026-02-27 17:13 ` [PATCH net-next 1/3] selftests: drv-net: iou-zcrx: wait for memory provider cleanup Jakub Kicinski
@ 2026-03-02 15:32 ` Dragos Tatulea
2026-03-02 15:49 ` David Wei
1 sibling, 0 replies; 15+ messages in thread
From: Dragos Tatulea @ 2026-03-02 15:32 UTC (permalink / raw)
To: Jakub Kicinski, davem
Cc: netdev, edumazet, pabeni, andrew+netdev, horms, dw, jdamato,
asml.silence, io-uring, shuah, linux-kselftest
On 27.02.26 18:13, Jakub Kicinski wrote:
> io_uring defers zcrx context teardown to the iou_exit workqueue.
>
> # ps aux | grep iou
> ... 07:58 0:00 [kworker/u19:0-iou_exit]
> ... 07:58 0:00 [kworker/u18:2-iou_exit]
>
> When the test's receiver process exits, bkg() returns but the memory
> provider may still be attached to the rx queue. The subsequent defer()
> that restores tcp-data-split then fails:
>
> # Exception while handling defer / cleanup (callback 3 of 3)!
> # Defer Exception| net.ynl.pyynl.lib.ynl.NlError:
> Netlink error: can't disable tcp-data-split while device has
> memory provider enabled: Invalid argument
> not ok 1 iou-zcrx.test_zcrx.single
>
> Add a helper that polls netdev queue-get until no rx queue reports
> the io-uring memory provider attribute. Register it as a defer()
> just before tcp-data-split is restored as a "barrier".
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
> CC: shuah@kernel.org
> CC: dw@davidwei.uk
> CC: jdamato@fastly.com
> CC: linux-kselftest@vger.kernel.org
> ---
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Thanks,
Dragos
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next 1/3] selftests: drv-net: iou-zcrx: wait for memory provider cleanup
2026-02-27 17:13 ` [PATCH net-next 1/3] selftests: drv-net: iou-zcrx: wait for memory provider cleanup Jakub Kicinski
2026-03-02 15:32 ` Dragos Tatulea
@ 2026-03-02 15:49 ` David Wei
2026-03-03 0:46 ` Jakub Kicinski
1 sibling, 1 reply; 15+ messages in thread
From: David Wei @ 2026-03-02 15:49 UTC (permalink / raw)
To: Jakub Kicinski, davem
Cc: netdev, edumazet, pabeni, andrew+netdev, horms, jdamato,
asml.silence, io-uring, shuah, linux-kselftest
On 2026-02-27 09:13, Jakub Kicinski wrote:
> io_uring defers zcrx context teardown to the iou_exit workqueue.
>
> # ps aux | grep iou
> ... 07:58 0:00 [kworker/u19:0-iou_exit]
> ... 07:58 0:00 [kworker/u18:2-iou_exit]
>
> When the test's receiver process exits, bkg() returns but the memory
> provider may still be attached to the rx queue. The subsequent defer()
> that restores tcp-data-split then fails:
>
> # Exception while handling defer / cleanup (callback 3 of 3)!
> # Defer Exception| net.ynl.pyynl.lib.ynl.NlError:
> Netlink error: can't disable tcp-data-split while device has
> memory provider enabled: Invalid argument
> not ok 1 iou-zcrx.test_zcrx.single
>
> Add a helper that polls netdev queue-get until no rx queue reports
> the io-uring memory provider attribute. Register it as a defer()
> just before tcp-data-split is restored as a "barrier".
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
> CC: shuah@kernel.org
> CC: dw@davidwei.uk
> CC: jdamato@fastly.com
> CC: linux-kselftest@vger.kernel.org
> ---
> .../selftests/drivers/net/hw/iou-zcrx.py | 18 +++++++++++++++++-
> 1 file changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/tools/testing/selftests/drivers/net/hw/iou-zcrx.py b/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
> index c63d6d6450d2..c27c2064701d 100755
> --- a/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
> +++ b/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
> @@ -2,14 +2,27 @@
> # SPDX-License-Identifier: GPL-2.0
>
> import re
> +import time
> from os import path
> from lib.py import ksft_run, ksft_exit, KsftSkipEx, ksft_variants, KsftNamedVariant
> from lib.py import NetDrvEpEnv
> from lib.py import bkg, cmd, defer, ethtool, rand_port, wait_port_listen
> -from lib.py import EthtoolFamily
> +from lib.py import EthtoolFamily, NetdevFamily
>
> SKIP_CODE = 42
>
> +
> +def mp_clear_wait(cfg):
> + """Wait for io_uring memory providers to clear from all device queues."""
> + deadline = time.time() + 5
This is potentially a very long time to wait if code is buggy, as I
found out when debugging netkit queue lease. How about reducing this to
say 1 second?
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next 2/3] selftests: drv-net: iou-zcrx: rework large chunks test to use common setup
2026-02-27 17:13 ` [PATCH net-next 2/3] selftests: drv-net: iou-zcrx: rework large chunks test to use common setup Jakub Kicinski
@ 2026-03-02 15:54 ` David Wei
2026-03-03 0:48 ` Jakub Kicinski
0 siblings, 1 reply; 15+ messages in thread
From: David Wei @ 2026-03-02 15:54 UTC (permalink / raw)
To: Jakub Kicinski, davem
Cc: netdev, edumazet, pabeni, andrew+netdev, horms, jdamato,
asml.silence, io-uring, shuah, linux-kselftest
On 2026-02-27 09:13, Jakub Kicinski wrote:
> Commit a32bb32d0193 ("selftests: iou-zcrx: test large chunk sizes")
> and commit de7c600e2d5b ("selftests/net: parametrise iou-zcrx.py with
> ksft_variants") landed at around the same time. The large chunks test was
> never included in the list of tests, so it never ran.
> As a result, we didn't notice that it uses the old-style helpers
> (_get_combined_channels, _get_current_settings, _set_flow_rule)
> that were removed by the other commit.
>
> Rework test_zcrx_large_chunks to reuse the single() setup function
> and add it to the ksft_run cases list so it actually gets executed.
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
> CC: shuah@kernel.org
> CC: dw@davidwei.uk
> CC: jdamato@fastly.com
> CC: linux-kselftest@vger.kernel.org
> ---
> .../selftests/drivers/net/hw/iou-zcrx.py | 31 ++++---------------
> 1 file changed, 6 insertions(+), 25 deletions(-)
>
> diff --git a/tools/testing/selftests/drivers/net/hw/iou-zcrx.py b/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
> index c27c2064701d..1649c23e05e2 100755
> --- a/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
> +++ b/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
> @@ -135,36 +135,16 @@ SKIP_CODE = 42
>
> cfg.require_ipver('6')
>
> - combined_chans = _get_combined_channels(cfg)
> - if combined_chans < 2:
> - raise KsftSkipEx('at least 2 combined channels required')
> - (rx_ring, hds_thresh) = _get_current_settings(cfg)
> - port = rand_port()
> -
> - ethtool(f"-G {cfg.ifname} tcp-data-split on")
> - defer(ethtool, f"-G {cfg.ifname} tcp-data-split auto")
> -
> - ethtool(f"-G {cfg.ifname} hds-thresh 0")
> - defer(ethtool, f"-G {cfg.ifname} hds-thresh {hds_thresh}")
> -
> - ethtool(f"-G {cfg.ifname} rx 64")
> - defer(ethtool, f"-G {cfg.ifname} rx {rx_ring}")
> -
> - ethtool(f"-X {cfg.ifname} equal {combined_chans - 1}")
> - defer(ethtool, f"-X {cfg.ifname} default")
> -
> - flow_rule_id = _set_flow_rule(cfg, port, combined_chans - 1)
> - defer(ethtool, f"-N {cfg.ifname} delete {flow_rule_id}")
> -
> - rx_cmd = f"{cfg.bin_local} -s -p {port} -i {cfg.ifname} -q {combined_chans - 1} -x 2"
> - tx_cmd = f"{cfg.bin_remote} -c -h {cfg.addr_v['6']} -p {port} -l 12840"
> + single(cfg)
Let's use ksft_variants() with both single() and rss()?
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next 1/3] selftests: drv-net: iou-zcrx: wait for memory provider cleanup
2026-03-02 15:49 ` David Wei
@ 2026-03-03 0:46 ` Jakub Kicinski
2026-03-03 1:39 ` David Wei
0 siblings, 1 reply; 15+ messages in thread
From: Jakub Kicinski @ 2026-03-03 0:46 UTC (permalink / raw)
To: David Wei
Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, jdamato,
asml.silence, io-uring, shuah, linux-kselftest
On Mon, 2 Mar 2026 07:49:01 -0800 David Wei wrote:
> > +def mp_clear_wait(cfg):
> > + """Wait for io_uring memory providers to clear from all device queues."""
> > + deadline = time.time() + 5
>
> This is potentially a very long time to wait if code is buggy, as I
> found out when debugging netkit queue lease. How about reducing this to
> say 1 second?
Just to be clear -- you're saying that 5 seconds is a long time to wait?
Please note that if this wait times out we're going to fail the test,
the timeout does not impact the length of a successful run.
I picked 5 sec because with all the debugs enabled and under QEMU
scheduling latency spikes can be pretty brutal. I guess I could make it
3 seconds if it matters a lot?
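To make the point about run time concrete, here is a minimal sketch of a
deadline-based poll. wait_until is a hypothetical generic helper, not the
mp_clear_wait from the patch; it only shows that the timeout bounds the
failure case, while a condition that is already true returns on the first
check.

```python
import time


def wait_until(predicate, timeout=5.0, interval=0.1):
    """Poll predicate() until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return
        time.sleep(interval)
    raise TimeoutError("condition not met before the deadline")


# A condition that already holds returns without sleeping at all.
start = time.monotonic()
wait_until(lambda: True)
elapsed = time.monotonic() - start
print(f"returned after {elapsed:.3f}s")
```

The sketch uses time.monotonic() rather than time.time() so that wall-clock
adjustments cannot stretch or shorten the wait; that is a choice made for
this example, not something the patch does.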
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next 2/3] selftests: drv-net: iou-zcrx: rework large chunks test to use common setup
2026-03-02 15:54 ` David Wei
@ 2026-03-03 0:48 ` Jakub Kicinski
2026-03-03 1:44 ` David Wei
0 siblings, 1 reply; 15+ messages in thread
From: Jakub Kicinski @ 2026-03-03 0:48 UTC (permalink / raw)
To: David Wei
Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, jdamato,
asml.silence, io-uring, shuah, linux-kselftest
On Mon, 2 Mar 2026 07:54:28 -0800 David Wei wrote:
> Let's use ksft_variants() with both single() and rss()?
Woohai? I intentionally chose to test only one; buffer configuration
and flow steering are quite orthogonal. What extra coverage do you have
in mind by asking for both?
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next 1/3] selftests: drv-net: iou-zcrx: wait for memory provider cleanup
2026-03-03 0:46 ` Jakub Kicinski
@ 2026-03-03 1:39 ` David Wei
0 siblings, 0 replies; 15+ messages in thread
From: David Wei @ 2026-03-03 1:39 UTC (permalink / raw)
To: Jakub Kicinski
Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, jdamato,
asml.silence, io-uring, shuah, linux-kselftest
On 2026-03-02 16:46, Jakub Kicinski wrote:
> On Mon, 2 Mar 2026 07:49:01 -0800 David Wei wrote:
>>> +def mp_clear_wait(cfg):
>>> + """Wait for io_uring memory providers to clear from all device queues."""
>>> + deadline = time.time() + 5
>>
>> This is potentially a very long time to wait if code is buggy, as I
>> found out when debugging netkit queue lease. How about reducing this to
>> say 1 second?
>
> Just to be clear -- you're saying that 5 seconds is a long time to wait?
> Please note that if this wait times out we're going to fail the test,
> the timeout does not impact the length of a successful run.
>
> I picked 5 sec because with all the debugs enabled and under QEMU
> scheduling latency spikes can be pretty brutal. I guess I could make it
> 3 seconds if it matters a lot?
Hmm, yeah let's leave it at 5 then. Should not be optimising for my
buggy code.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next 2/3] selftests: drv-net: iou-zcrx: rework large chunks test to use common setup
2026-03-03 0:48 ` Jakub Kicinski
@ 2026-03-03 1:44 ` David Wei
0 siblings, 0 replies; 15+ messages in thread
From: David Wei @ 2026-03-03 1:44 UTC (permalink / raw)
To: Jakub Kicinski
Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms,
asml.silence, io-uring, shuah, linux-kselftest
On 2026-03-02 16:48, Jakub Kicinski wrote:
> On Mon, 2 Mar 2026 07:54:28 -0800 David Wei wrote:
>> Let's use ksft_variants() with both single() and rss()?
>
> Woohai? I intentionally chose to only test one, buffer configuration
> and flow steering are quite orthogonal. What extra coverage do you have
> in mind by asking for both?
Mostly paranoia, in case there are any unexpected differences with RSS
vs single queue. Someone wrote the lovely ksft_variants code, why not
use it? :P
I should send the patch that actually adds pthreads to the iou-zcrx.c
test binary...
(I don't feel strongly either way. Whatever you prefer.)
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next 3/3] selftests: drv-net: iou-zcrx: allocate hugepages for large chunks test
2026-03-02 15:16 ` Dragos Tatulea
@ 2026-03-03 2:22 ` Jakub Kicinski
2026-03-03 8:41 ` Dragos Tatulea
0 siblings, 1 reply; 15+ messages in thread
From: Jakub Kicinski @ 2026-03-03 2:22 UTC (permalink / raw)
To: Dragos Tatulea
Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, dw,
jdamato, asml.silence, io-uring, shuah, linux-kselftest
On Mon, 2 Mar 2026 16:16:38 +0100 Dragos Tatulea wrote:
> > + hp_file = "/proc/sys/vm/nr_hugepages"
> > + with open(hp_file, 'r+', encoding='utf-8') as f:
> > + nr_hugepages = int(f.read().strip())
> > + if nr_hugepages < 64:
> > + f.seek(0)
> > + f.write("64")
> > + defer(lambda: open(hp_file, 'w', encoding='utf-8').write(str(nr_hugepages)))
> > +
> > single(cfg)
> > rx_cmd = f"{cfg.bin_local} -s -p {cfg.port} -i {cfg.ifname} -q {cfg.target} -x 2"
> > tx_cmd = f"{cfg.bin_remote} -c -h {cfg.addr_v['6']} -p {cfg.port} -l 12840"
> >
> > probe = cmd(rx_cmd + " -d", fail=False)
> > if probe.ret == SKIP_CODE:
> > - raise KsftSkipEx(probe.stdout)
> > + raise KsftSkipEx(probe.stdout.strip())
> >
> While working on a similar fix I found that the probe here also requires a barrier.
Hm, I'm not hitting this issue. Maybe because I'm testing in QEMU?
If you can still repro after this series could you send a follow up?
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next 0/3] selftests: drv-net: iou-zcrx: improve stability and make the large chunk test work
2026-02-27 17:13 [PATCH net-next 0/3] selftests: drv-net: iou-zcrx: improve stability and make the large chunk test work Jakub Kicinski
` (2 preceding siblings ...)
2026-02-27 17:13 ` [PATCH net-next 3/3] selftests: drv-net: iou-zcrx: allocate hugepages for large chunks test Jakub Kicinski
@ 2026-03-03 4:47 ` patchwork-bot+netdevbpf
3 siblings, 0 replies; 15+ messages in thread
From: patchwork-bot+netdevbpf @ 2026-03-03 4:47 UTC (permalink / raw)
To: Jakub Kicinski
Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, dw,
jdamato, asml.silence, io-uring
Hello:
This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Fri, 27 Feb 2026 09:13:02 -0800 you wrote:
> The iou-zcrx test hasn't been passing in NIPA. I assumed that was because
> we were missing io_uring changes, but it was still failing after the merge
> window. It turns out there was a bug in the implementation, which was fixed
> separately via the io_uring tree. With that out of the way the tests
> pass but are flaky. Patch 1 deals with the flakiness.
>
> While looking at this I also noticed that the large chunk test isn't
> running at all. So fix and enable it (patches 2 and 3).
>
> [...]
Here is the summary with links:
- [net-next,1/3] selftests: drv-net: iou-zcrx: wait for memory provider cleanup
https://git.kernel.org/netdev/net-next/c/27c4ab943882
- [net-next,2/3] selftests: drv-net: iou-zcrx: rework large chunks test to use common setup
https://git.kernel.org/netdev/net-next/c/67792dde27a6
- [net-next,3/3] selftests: drv-net: iou-zcrx: allocate hugepages for large chunks test
https://git.kernel.org/netdev/net-next/c/c7b228418e8b
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next 3/3] selftests: drv-net: iou-zcrx: allocate hugepages for large chunks test
2026-03-03 2:22 ` Jakub Kicinski
@ 2026-03-03 8:41 ` Dragos Tatulea
0 siblings, 0 replies; 15+ messages in thread
From: Dragos Tatulea @ 2026-03-03 8:41 UTC (permalink / raw)
To: Jakub Kicinski
Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, dw,
jdamato, asml.silence, io-uring, shuah, linux-kselftest
On 03.03.26 03:22, Jakub Kicinski wrote:
> On Mon, 2 Mar 2026 16:16:38 +0100 Dragos Tatulea wrote:
>>> + hp_file = "/proc/sys/vm/nr_hugepages"
>>> + with open(hp_file, 'r+', encoding='utf-8') as f:
>>> + nr_hugepages = int(f.read().strip())
>>> + if nr_hugepages < 64:
>>> + f.seek(0)
>>> + f.write("64")
>>> + defer(lambda: open(hp_file, 'w', encoding='utf-8').write(str(nr_hugepages)))
>>> +
>>> single(cfg)
>>> rx_cmd = f"{cfg.bin_local} -s -p {cfg.port} -i {cfg.ifname} -q {cfg.target} -x 2"
>>> tx_cmd = f"{cfg.bin_remote} -c -h {cfg.addr_v['6']} -p {cfg.port} -l 12840"
>>>
>>> probe = cmd(rx_cmd + " -d", fail=False)
>>> if probe.ret == SKIP_CODE:
>>> - raise KsftSkipEx(probe.stdout)
>>> + raise KsftSkipEx(probe.stdout.strip())
>>>
>> While working on a similar fix I found that the probe here also requires a barrier.
>
> Hm, I'm not hitting this issue. Maybe because I'm testing in QEMU?
> If you can still repro after this series could you send a follow up?
Will do.
Thanks,
Dragos
^ permalink raw reply [flat|nested] 15+ messages in thread