public inbox for [email protected]
From: Pavel Begunkov <[email protected]>
To: David Ahern <[email protected]>,
	[email protected], [email protected],
	[email protected]
Cc: Jakub Kicinski <[email protected]>,
	Jonathan Lemon <[email protected]>,
	"David S . Miller" <[email protected]>,
	Willem de Bruijn <[email protected]>,
	Eric Dumazet <[email protected]>,
	Hideaki YOSHIFUJI <[email protected]>,
	David Ahern <[email protected]>, Jens Axboe <[email protected]>
Subject: Re: [RFC 00/12] io_uring zerocopy send
Date: Wed, 1 Dec 2021 15:32:36 +0000
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>

On 12/1/21 03:10, David Ahern wrote:
> On 11/30/21 8:18 AM, Pavel Begunkov wrote:
>> Early proof of concept for zerocopy send via io_uring. This is just
>> an RFC, there are details yet to be figured out, but hope to gather
>> some feedback.
>>
>> Benchmarking udp (65435 bytes) with a dummy net device (mtu=0xffff):
>> The best case io_uring=116079 MB/s vs msg_zerocopy=47421 MB/s,
>> or 2.44 times faster.
>>
>> № | test:                                | BW (MB/s)  | speedup
>> 1 | msg_zerocopy (non-zc)                |  18281     | 0.38
>> 2 | msg_zerocopy -z (baseline)           |  47421     | 1
>> 3 | io_uring (@flush=false, nr_reqs=1)   |  96534     | 2.03
>> 4 | io_uring (@flush=true,  nr_reqs=1)   |  89310     | 1.88
>> 5 | io_uring (@flush=false, nr_reqs=8)   | 116079     | 2.44
>> 6 | io_uring (@flush=true,  nr_reqs=8)   | 109722     | 2.31
>>
>> Based on selftests/.../msg_zerocopy but more limited. You can use
>> msg_zerocopy -r as usual for receive side.
>>
> ...
> 
> Can you state the exact command lines you are running for all of the
> commands? I tried this set (and commands referenced below) and my

Sure. First, for dummy I set the MTU by hand in the driver; I'm not
sure it can be done from userspace, can it? Without it
__ip_append_data() falls back to the non-zerocopy path.

diff --git a/drivers/net/dummy.c b/drivers/net/dummy.c
index f82ad7419508..5c5aeacdabd5 100644
--- a/drivers/net/dummy.c
+++ b/drivers/net/dummy.c
@@ -132,7 +132,8 @@ static void dummy_setup(struct net_device *dev)
 	eth_hw_addr_random(dev);
 
 	dev->min_mtu = 0;
-	dev->max_mtu = 0;
+	dev->mtu = 0xffff;
+	dev->max_mtu = 0xffff;
 }

# dummy configuration

modprobe dummy numdummies=1
ip link set dummy0 up
# force requests to <dummy_ip_addr> go through the dummy device
ip route add <dummy_ip_addr> dev dummy0
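
Before benchmarking it may be worth confirming that the MTU really took
effect, since a too-small MTU sends __ip_append_data() down the copying
path. A minimal sketch, assuming the device is named dummy0 and that
/sys/class/net/<dev>/mtu is readable; mtu_ok is a hypothetical helper,
not part of any of the tools mentioned here:

```shell
# Illustrative values: 65535 is 0xffff from the driver patch above.
DEV=dummy0
WANT_MTU=65535

# mtu_ok FILE WANT: succeed if FILE contains a number >= WANT.
# Returns 2 if FILE cannot be read.
mtu_ok() {
    [ -r "$1" ] || return 2
    read -r have < "$1"
    [ "$have" -ge "$2" ]
}

# Usage (not executed here):
#   mtu_ok "/sys/class/net/$DEV/mtu" "$WANT_MTU" || echo "MTU too small" >&2
```

The helper takes the file as an argument so it can be tried against any
file, not only the sysfs attribute.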


With dummy I was just sinking the traffic into the dummy device, which
was good enough for me. Omitting "taskset" and "nice":

send-zc -4 -D <dummy_ip_addr> -t 10 udp

Similarly with msg_zerocopy:

<kernel>/tools/testing/selftests/net/msg_zerocopy -4 -p 6666 -D <dummy_ip_addr> -t 10 -z udp
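
For reference, a pinned invocation could be assembled roughly as in the
sketch below. The CPU number and nice level are placeholders chosen for
illustration, not the values behind the numbers quoted above, and
pinned is a hypothetical helper that only prints the command line:

```shell
# Placeholder values; pick a CPU and priority that suit the machine.
CPU=2
NICE=-20

# pinned CMD [ARGS...]: print CMD prefixed with taskset and nice.
# This is a dry run; nothing is executed.
pinned() {
    printf 'taskset -c %s nice -n %s' "$CPU" "$NICE"
    printf ' %s' "$@"
    printf '\n'
}

# Usage (prints the command; pipe it to sh as root to actually run it):
#   pinned ./send-zc -4 -D "$DUMMY_IP" -t 10 udp
```

Printing instead of executing makes it easy to double-check the final
command line before running it with elevated priority.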


For loopback testing, since zerocopy is not allowed there, as Willem
explained in the original MSG_ZEROCOPY cover letter, I used a hack to
bypass the restriction:

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index ebb12a7d386d..42df33b175ce 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2854,9 +2854,7 @@ static inline int skb_orphan_frags(struct sk_buff *skb, gfp_t gfp_mask)
 /* Frags must be orphaned, even if refcounted, if skb might loop to rx path */
 static inline int skb_orphan_frags_rx(struct sk_buff *skb, gfp_t gfp_mask)
 {
-	if (likely(!skb_zcopy(skb)))
-		return 0;
-	return skb_copy_ubufs(skb, gfp_mask);
+	return skb_orphan_frags(skb, gfp_mask);
 }
 
 /**

Then I ran the two lines below in parallel and looked at the numbers
the sender reports. It was in favor of io_uring for me, but I don't
remember the exact figures. perf shows that "send-zc" spends a lot of
time receiving, so after some point I stopped testing performance
this way.

msg_zerocopy -r -v -4 -t 20 udp
send-zc -4 -D 127.0.0.1 -t 10 udp
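
The two commands can also be driven from a single script. A rough
sketch, assuming both binaries are in the current directory; run_pair
and RX_WARMUP are made-up names, and the one second head start for the
receiver is a guess rather than a measured requirement:

```shell
RX_WARMUP=1   # seconds to let the receiver bind before sending (assumption)

# run_pair RX_CMD TX_CMD: start RX_CMD in the background, run TX_CMD in
# the foreground, then wait for the receiver to exit.
# Returns the sender's exit status.
run_pair() {
    sh -c "$1" &
    rx_pid=$!
    sleep "$RX_WARMUP"
    sh -c "$2"
    tx_status=$?
    wait "$rx_pid"
    return "$tx_status"
}

# Usage (not executed here):
#   run_pair './msg_zerocopy -r -v -4 -t 20 udp' \
#            './send-zc -4 -D 127.0.0.1 -t 10 udp'
```

With -t 20 on the receiver and -t 10 on the sender, the final wait
keeps the script alive until the receiver's own timeout expires.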


> mileage varies quite a bit.

Interesting, any brief notes on the setup and the results? Dummy
or something real? io_uring doesn't report whether a send was really
zerocopied or not, but I assume you checked it (e.g. with
perf/bpftrace).

I expected that @flush=true might be worse with real devices; there
is one spot to be patched. But apart from that, and cycles spent in a
real LLD offsetting the overhead, I didn't anticipate any problems.
I'll see once I try a real device.


> Also, have you run this proposed change (and with TCP) across nodes
> (ie., not just local process to local process via dummy interface)?

Not yet. I tried dummy and localhost UDP as described above, and TCP
similarly. I just need to grab a server with a proper NIC and will try
it out soon.

>> Benchmark:
>> https://github.com/isilence/liburing.git zc_v1
>>
>> or this file in particular:
>> https://github.com/isilence/liburing/blob/zc_v1/test/send-zc.c
>>
>> To run the benchmark:
>> ```
>> cd <liburing_dir> && make && cd test
>> # ./send-zc -4 [-p <port>] [-s <payload_size>] -D <destination> udp
>> ./send-zc -4 -D 127.0.0.1 udp
>> ```
>>
>> msg_zerocopy can be used for the server side, e.g.
>> ```
>> cd <linux-kernel>/tools/testing/selftests/net && make
>> ./msg_zerocopy -4 -r [-p <port>] [-t <sec>] udp
>> ```

-- 
Pavel Begunkov
