public inbox for [email protected]
* allowing msg_name and msg_control
@ 2020-11-07 14:22 Victor Stewart
  2020-11-07 16:21 ` Pavel Begunkov
  0 siblings, 1 reply; 4+ messages in thread
From: Victor Stewart @ 2020-11-07 14:22 UTC (permalink / raw)
  To: io-uring

RE Jens's proposed patch here
https://lore.kernel.org/io-uring/[email protected]/

and RE what Stefan just mentioned in the "[PATCH 5.11] io_uring: don't
take fs for recvmsg/sendmsg" thread a few minutes ago... "Can't we
better remove these checks and allow msg_control? For me it's a
limitation that I would like to be removed."... which I coincidentally
just read when coming on here to advocate the same.

I also require this for a few vital performance use cases:

1) GSO (UDP_SEGMENT to sendmsg)
2) GRO (UDP_GRO from recvmsg)

GSO and GRO are super important for QUIC servers... essentially
bringing a 3-4x performance improvement that brings them in line with
TCP efficiency.

Would also allow the usage of...

3) MSG_ZEROCOPY (to receive the sock_extended_err from recvmsg)

it's only a single digit % performance gain for large sends (but a
minor crutch until we get registered buffer sendmsg / recvmsg, which I
plan on implementing).
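
For reference, a minimal sketch of what reading a MSG_ZEROCOPY completion involves once msg_control is available. It assumes the 16-byte struct sock_extended_err layout from linux/errqueue.h, with the constants copied in by hand; the helper name parse_zerocopy_completion is hypothetical and only illustrates the decoding step, not any io_uring API.

```python
import struct

# Constants copied from <linux/errqueue.h> (assumed values, defined by hand).
SO_EE_ORIGIN_ZEROCOPY = 5
SO_EE_CODE_ZEROCOPY_COPIED = 1

# struct sock_extended_err: u32 ee_errno, u8 ee_origin, u8 ee_type,
# u8 ee_code, u8 ee_pad, u32 ee_info, u32 ee_data (16 bytes, native order).
SOCK_EXTENDED_ERR = struct.Struct("=IBBBBII")


def parse_zerocopy_completion(cmsg_data):
    """Decode one control message payload taken from the socket error queue.

    Returns (first_id, last_id, kernel_copied) for a zerocopy notification,
    or None if the payload is some other kind of extended error.
    """
    fields = SOCK_EXTENDED_ERR.unpack(cmsg_data[:SOCK_EXTENDED_ERR.size])
    ee_errno, ee_origin, _ee_type, ee_code, _ee_pad, ee_info, ee_data = fields
    if ee_errno != 0 or ee_origin != SO_EE_ORIGIN_ZEROCOPY:
        return None
    # ee_info..ee_data is the inclusive range of completed send calls.
    return ee_info, ee_data, bool(ee_code & SO_EE_CODE_ZEROCOPY_COPIED)
```

With plain sockets the payload would come from the ancillary data of a recvmsg() call made with MSG_ERRQUEUE; the send buffers in the returned range can be reused once the notification arrives.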

So if there's an agreed upon plan of action I can take charge of all
the work and get this done ASAP.

#Victor


* Re: allowing msg_name and msg_control
  2020-11-07 14:22 allowing msg_name and msg_control Victor Stewart
@ 2020-11-07 16:21 ` Pavel Begunkov
  2020-11-07 17:12   ` Victor Stewart
  0 siblings, 1 reply; 4+ messages in thread
From: Pavel Begunkov @ 2020-11-07 16:21 UTC (permalink / raw)
  To: Victor Stewart, io-uring

On 07/11/2020 14:22, Victor Stewart wrote:
> RE Jen's proposed patch here
> https://lore.kernel.org/io-uring/[email protected]/

Hmm, I haven't seen this thread, thanks for bringing it up

> 
> and RE what Stefan just mentioned in the "[PATCH 5.11] io_uring: don't
> take fs for recvmsg/sendmsg" thread a few minutes ago... "Can't we
> better remove these checks and allow msg_control? For me it's a
> limitation that I would like to be removed."... which I coincidentally
> just read when coming on here to advocate the same.
> 
> I also require this for a few vital performance use cases:
> 
> 1) GSO (UDP_SEGMENT to sendmsg)
> 2) GRO (UDP_GRO from recvmsg)

Don't know these you listed, may read about them later, but wouldn't [1]
be enough? I was told it's queued up.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/net/socket.c?id=583bbf0624dfd8fc45f1049be1d4980be59451ff

> 
> GSO and GRO are super important for QUIC servers... essentially
> bringing a 3-4x performance improvement that brings them in line with
> TCP efficiency.
> 
> Would also allow the usage of...
> 
> 3) MSG_ZEROCOPY (to receive the sock_extended_err from recvmsg)
> 
> it's only a single digit % performance gain for large sends (but a
> minor crutch until we get registered buffer sendmsg / recvmsg, which I
> plan on implementing).
> 
> So if there's an agreed upon plan on action I can take charge of all
> the work and get this done ASAP.
> 
> #Victor
> 

-- 
Pavel Begunkov


* Re: allowing msg_name and msg_control
  2020-11-07 16:21 ` Pavel Begunkov
@ 2020-11-07 17:12   ` Victor Stewart
  2020-11-07 20:15     ` Pavel Begunkov
  0 siblings, 1 reply; 4+ messages in thread
From: Victor Stewart @ 2020-11-07 17:12 UTC (permalink / raw)
  To: Pavel Begunkov; +Cc: io-uring

On Sat, Nov 7, 2020 at 4:24 PM Pavel Begunkov <[email protected]> wrote:
>
> On 07/11/2020 14:22, Victor Stewart wrote:
> > RE Jen's proposed patch here
> > https://lore.kernel.org/io-uring/[email protected]/
>
> Hmm, I haven't seen this thread, thanks for bringing it up
>
> >
> > and RE what Stefan just mentioned in the "[PATCH 5.11] io_uring: don't
> > take fs for recvmsg/sendmsg" thread a few minutes ago... "Can't we
> > better remove these checks and allow msg_control? For me it's a
> > limitation that I would like to be removed."... which I coincidentally
> > just read when coming on here to advocate the same.
> >
> > I also require this for a few vital performance use cases:
> >
> > 1) GSO (UDP_SEGMENT to sendmsg)
> > 2) GRO (UDP_GRO from recvmsg)
>
> Don't know these you listed, may read about them later, but wouldn't [1]
> be enough? I was told it's queued up.
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/net/socket.c?id=583bbf0624dfd8fc45f1049be1d4980be59451ff
>

Hadn't seen [1], but yes as long as the same were also implemented for
__sys_sendmsg_sock(). Queued up for.. 5.11?

UDP_SEGMENT allows you to sendmsg a UDP message payload up to ~64K
(max IP packet size - IPv4(6) header size - UDP header size, in order
to obey the existing network stack expectations/limitations). That
payload is actually a sequence of DPLPMTUD sized packets (because MTU
size is restricted by / variable per path to each client). That
DPLPMTUD size is provided by the UDP_SEGMENT value, with the last
packet allowed to be a smaller size.

So you can send ~40 UDP messages but only pay the cost of network
stack traversal once. Then the segmentation occurs in the NIC (or in
the kernel when the NIC has no UDP GSO support, but almost all do).
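
To make the arithmetic and the control message concrete, here is a minimal sketch using a plain (non-io_uring) socket. It assumes IPv4 without IP options and a 1472-byte segment size (1500-byte MTU minus 28 bytes of headers) purely for illustration; the UDP_SEGMENT value is copied from linux/udp.h, and the helper names are hypothetical.

```python
import socket
import struct

SOL_UDP = 17        # same value as IPPROTO_UDP
UDP_SEGMENT = 103   # from <linux/udp.h>, defined by hand as an assumption

MAX_IP_PACKET = 65535
IPV4_HEADER = 20    # assumes no IP options
UDP_HEADER = 8


def max_udp_payload_ipv4():
    """Largest UDP payload that fits in one IPv4 packet."""
    return MAX_IP_PACKET - IPV4_HEADER - UDP_HEADER


def segment_count(payload_len, segment_size):
    """Number of wire packets; only the last one may be shorter."""
    return -(-payload_len // segment_size)  # ceiling division


def gso_ancdata(segment_size):
    """Ancillary data asking the stack to split the payload per segment_size."""
    # The UDP_SEGMENT control message carries a 16-bit segment size.
    return [(SOL_UDP, UDP_SEGMENT, struct.pack("=H", segment_size))]


def send_gso(sock, payload, segment_size, addr):
    """One sendmsg() call, one stack traversal, many packets on the wire."""
    return sock.sendmsg([payload], gso_ancdata(segment_size), 0, addr)
```

For example, max_udp_payload_ipv4() is 65507 bytes, and segment_count(65507, 1472) is 45, which is where the "~40 messages per traversal" figure comes from.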

There's also a pacing patch in the works for UDP GSO sends:
https://lwn.net/Articles/822726/

Then UDP_GRO is the exact inverse, so when you recvmsg() you receive a
giant payload with the individual packet size notified via the UDP_GRO
value, then segment it yourself.
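
A rough sketch of that receive-side split, assuming the ancillary data has already been returned by recvmsg() as (level, type, data) tuples. The UDP_GRO value is copied from linux/udp.h; the width of the size field is treated as an assumption (4-byte int if present, otherwise 16-bit), and the helper names are hypothetical.

```python
import struct

SOL_UDP = 17     # same value as IPPROTO_UDP
UDP_GRO = 104    # from <linux/udp.h>, defined by hand as an assumption


def gro_segment_size(ancdata):
    """Return the per-packet size reported via UDP_GRO, or None if absent."""
    for level, ctype, data in ancdata:
        if level == SOL_UDP and ctype == UDP_GRO:
            # Assumption: accept either an int or a 16-bit size field.
            if len(data) >= 4:
                return struct.unpack("=i", data[:4])[0]
            return struct.unpack("=H", data[:2])[0]
    return None


def split_gro(payload, ancdata):
    """Cut a coalesced receive buffer back into the original datagrams."""
    size = gro_segment_size(ancdata)
    if not size:
        # No UDP_GRO control message: the buffer is a single datagram.
        return [payload]
    return [payload[i:i + size] for i in range(0, len(payload), size)]
```

Only the final slice can be shorter than the reported size, mirroring the rule for the last segment on the send side.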

These mimic the same optimizations available without configuration for
TCP streams.

Willem discusses all in the below paper (and there's a talk on youtube).
http://vger.kernel.org/lpc_net2018_talks/willemdebruijn-lpc2018-udpgso-paper-DRAFT-1.pdf

oh and sorry the title of this should have been sans msg_name.

> >
> > GSO and GRO are super important for QUIC servers... essentially
> > bringing a 3-4x performance improvement that brings them in line with
> > TCP efficiency.
> >
> > Would also allow the usage of...
> >
> > 3) MSG_ZEROCOPY (to receive the sock_extended_err from recvmsg)
> >
> > it's only a single digit % performance gain for large sends (but a
> > minor crutch until we get registered buffer sendmsg / recvmsg, which I
> > plan on implementing).

And I just began work on fixed versions of sendmsg / recvmsg, so I'll
distribute that patch for initial review probably this week. Should be
fairly trivial given the work exists for read/write.

> >
> > So if there's an agreed upon plan on action I can take charge of all
> > the work and get this done ASAP.
> >
> > #Victor
> >
>
> --
> Pavel Begunkov


* Re: allowing msg_name and msg_control
  2020-11-07 17:12   ` Victor Stewart
@ 2020-11-07 20:15     ` Pavel Begunkov
  0 siblings, 0 replies; 4+ messages in thread
From: Pavel Begunkov @ 2020-11-07 20:15 UTC (permalink / raw)
  To: Victor Stewart; +Cc: io-uring

On 07/11/2020 17:12, Victor Stewart wrote:
>> Don't know these you listed, may read about them later, but wouldn't [1]
>> be enough? I was told it's queued up.
>>
>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/net/socket.c?id=583bbf0624dfd8fc45f1049be1d4980be59451ff
>>
> 
> Hadn't seen [1], but yes as long as the same were also implemented for
> __sys_sendmsg_sock(). Queued up for.. 5.11?

Seems for some reason it's only for recv.
It's for 5.10.

> 
> UDP_SEGMENT allows you to sendmsg a UDP message payload up to ~64K
> (Max IP Packet size - IPv4(6) header size - UDP header size).. in
> order to obey the existing network stack expectations/limitations).
> That payload is actually a sequence of DPLPMTUD sized packets (because
> MTU size is restricted by / variable per path to each client). That
> DPLPMTUD size is provided by the UDP_SEGMENT value, with the last
> packet allowed to be a smaller size.
> 
> So you can send ~40 UDP messages but only pay the cost of network
> stack traversal once. Then the segmentation occurs in the NIC (or in
> the kernel with the NIC has no UDP GSO support, but most all do).
> 
> There's also a pacing patch in the works for UDP GSO sends:
> https://lwn.net/Articles/822726/
> 
> Then UDP_GRO is the exact inverse, so when you recvmsg() you receive a
> giant payload with the individual packet size notified via the UDP_GRO
> value, then self segment.
> 
> These mimic the same optimizations available without configuration for
> TCP streams.
> 
> Willem discusses all in the below paper (and there's a talk on youtube).
> http://vger.kernel.org/lpc_net2018_talks/willemdebruijn-lpc2018-udpgso-paper-DRAFT-1.pdf
> 
> oh and sorry the title of this should have been sans msg_name.
> 
>>>
>>> GSO and GRO are super important for QUIC servers... essentially
>>> bringing a 3-4x performance improvement that brings them in line with
>>> TCP efficiency.
>>>
>>> Would also allow the usage of...
>>>
>>> 3) MSG_ZEROCOPY (to receive the sock_extended_err from recvmsg)
>>>
>>> it's only a single digit % performance gain for large sends (but a
>>> minor crutch until we get registered buffer sendmsg / recvmsg, which I
>>> plan on implementing).
> 
> and i just began work on fixed versions of sendmsg / recvmsg. So i'll
> distribute that patch for initial review probably this week. Should be
> fairly trivial given the work exists for read/write.
> 
>>>
>>> So if there's an agreed upon plan on action I can take charge of all
>>> the work and get this done ASAP.
>>>
>>> #Victor
>>>
>>
>> --
>> Pavel Begunkov

-- 
Pavel Begunkov

