From: Pavel Begunkov <asml.silence@gmail.com>
To: Breno Leitao <leitao@debian.org>
Cc: Jens Axboe <axboe@kernel.dk>, Stefan Metzmacher <metze@samba.org>,
io-uring <io-uring@vger.kernel.org>
Subject: Re: SOCKET_URING_OP_GETSOCKOPT SOL_SOCKET restriction
Date: Sat, 29 Mar 2025 10:59:28 +0000
Message-ID: <c97c1d57-ab3c-49a0-8c08-7160ad66ea88@gmail.com>
In-Reply-To: <20250328-monumental-taupe-malamute-d1c54b@leitao>
On 3/28/25 18:22, Breno Leitao wrote:
> Hello Pavel,
>
> On Fri, Mar 28, 2025 at 05:21:06PM +0000, Pavel Begunkov wrote:
>> On 3/28/25 17:18, Pavel Begunkov wrote:
>>> On 3/28/25 16:34, Jens Axboe wrote:
>>>> On 3/28/25 9:02 AM, Pavel Begunkov wrote:
>>>>> On 3/28/25 14:30, Jens Axboe wrote:
>>>>>> On 3/28/25 8:27 AM, Stefan Metzmacher wrote:
>>> I remember Breno looking at several different options.
>>>
>>> Breno, can you remind me, why can't we convert ->getsockopt to
>>> take a normal kernel ptr for length while passing a user ptr
>>> for value as before?
>>
>> Similar to this:
>>
>> getsockopt_syscall(void __user *len_uptr) {
>> 	int klen;
>>
>> 	copy_from_user(&klen, len_uptr);
>> 	->getsockopt(&klen);
>> 	copy_to_user(len_uptr, &klen);
>> }
>
> We have a few limitations, if I remember correctly:
>
> getsockopt() callback expects __user pointers:
>
> int (*getsockopt)(struct socket *sock, int level,
> int optname, char __user *optval, int __user *optlen);
>
>
> So, you cannot copy the memory content and call ->getsockopt() with
> kernel memory.

Right, I'm rather asking about changing the callback to take
a kernel pointer and having the caller do the copy_to_user
if needed.

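A minimal userspace sketch of that idea, not actual kernel code. The
names demo_getsockopt(), demo_getsockopt_syscall(),
demo_getsockopt_uring() and the mock_copy_*() helpers are hypothetical
stand-ins invented for illustration; the real copy_from_user() and
copy_to_user() can fault and have different return conventions. The
option value is still treated as a user pointer, and only the length
becomes a plain in/out kernel pointer:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-ins for the kernel copy helpers (assumption: never fault). */
static int mock_copy_from_user(void *dst, const void *src, size_t n)
{
	memcpy(dst, src, n);
	return 0;
}

static int mock_copy_to_user(void *dst, const void *src, size_t n)
{
	memcpy(dst, src, n);
	return 0;
}

/*
 * Hypothetical callback shape: optval is still a "user" pointer, but
 * optlen is a kernel pointer that the callback reads and updates in place.
 */
static int demo_getsockopt(void *optval_user, int *optlen)
{
	int val = 42;

	assert(optlen != NULL);
	if (*optlen < (int)sizeof(val))
		return -EINVAL;
	if (mock_copy_to_user(optval_user, &val, sizeof(val)))
		return -EFAULT;
	*optlen = sizeof(val);
	return 0;
}

/* Syscall-style caller: it alone copies the length in and back out. */
static int demo_getsockopt_syscall(void *optval_user, int *optlen_user)
{
	int klen, ret;

	if (mock_copy_from_user(&klen, optlen_user, sizeof(klen)))
		return -EFAULT;
	ret = demo_getsockopt(optval_user, &klen);
	if (ret)
		return ret;
	if (mock_copy_to_user(optlen_user, &klen, sizeof(klen)))
		return -EFAULT;
	return 0;
}

/* io_uring-style caller: the length already lives in kernel memory. */
static int demo_getsockopt_uring(void *optval_user, int *klen)
{
	return demo_getsockopt(optval_user, klen);
}
```

In this sketch the callback never touches a user pointer for the
length, so both callers can share it.
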
> A solution was to use sockptr, as done by setsockopt(), but, that was
> discouraged.
>
> Another important thing: some getsockopt() callbacks change the pointer,
> doing copy_to_user() directly in the getsockopt callback, which
> would break your approach above.

Do you mean writing to it? That's why the snippet passes a kernel
_pointer_ to length. Did I misunderstand you?

> I understand that the next steps here are:
>
> 1) Make getsockopt() operate with either userspace or kernel buffer.
> a) This buffer will be written and read on both sides. I.e., you
> pass data in the buffer from userspace to kernel space, and kernel
> will overwrite that buffer in kernelspace.
>
> In other words, this is a read-write buffer (which is not something
> we have in iovec IIRC).
>
> 2) Call the same callbacks from io_uring subsystem using kernel memory
>
> 3) Regular syscalls will continue to use userspace memory.
>
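
For illustration only, steps 1) to 3) could look roughly like the
userspace sketch below. It uses a tagged buffer in the spirit of
sockptr, which, as noted above, was discouraged, so this is not a
proposal. struct demo_optbuf, demo_optbuf_read(), demo_optbuf_write(),
demo_getsockopt_rw() and the demo_user_copies counter are all
hypothetical names; the counter only marks where copy_from_user() and
copy_to_user() would be called in the kernel:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical tagged buffer: records which side owns the memory. */
struct demo_optbuf {
	void *ptr;
	size_t len;
	bool is_kernel;
};

/* Counts how often the "user copy" path was taken (mock bookkeeping only). */
static int demo_user_copies;

/* Read n bytes out of the buffer, whichever side it lives on. */
static int demo_optbuf_read(void *dst, const struct demo_optbuf *buf, size_t n)
{
	assert(buf != NULL && buf->ptr != NULL);
	if (n > buf->len)
		return -EINVAL;
	if (!buf->is_kernel)
		demo_user_copies++;	/* stand-in for copy_from_user() */
	memcpy(dst, buf->ptr, n);
	return 0;
}

/* Overwrite the buffer with n bytes and record the new length. */
static int demo_optbuf_write(struct demo_optbuf *buf, const void *src, size_t n)
{
	assert(buf != NULL && buf->ptr != NULL);
	if (n > buf->len)
		return -EINVAL;
	if (!buf->is_kernel)
		demo_user_copies++;	/* stand-in for copy_to_user() */
	memcpy(buf->ptr, src, n);
	buf->len = n;
	return 0;
}

/*
 * Hypothetical read-write callback: takes an int request from the buffer
 * and overwrites the same buffer with the reply (here simply doubled).
 */
static int demo_getsockopt_rw(struct demo_optbuf *buf)
{
	int req, ret;

	ret = demo_optbuf_read(&req, buf, sizeof(req));
	if (ret)
		return ret;
	req *= 2;
	return demo_optbuf_write(buf, &req, sizeof(req));
}
```

A syscall path would build the buffer with is_kernel = false, while an
io_uring path would set is_kernel = true and skip the user copies.
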
--
Pavel Begunkov