From: Pavel Begunkov <asml.silence@gmail.com>
To: Ming Lei <ming.lei@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>,
io-uring@vger.kernel.org,
Caleb Sander Mateos <csander@purestorage.com>,
Keith Busch <kbusch@kernel.org>
Subject: Re: [PATCH] io_uring: zero remained bytes when reading to fixed kernel buffer
Date: Sat, 22 Mar 2025 18:15:18 +0000
Message-ID: <87b1eeb2-238b-413d-b7f3-6dc4fa63c6ca@gmail.com>
In-Reply-To: <Z97ALTDd-s0-uT7O@fedora>

On 3/22/25 13:50, Ming Lei wrote:
> On Sat, Mar 22, 2025 at 12:02:02PM +0000, Pavel Begunkov wrote:
>> On 3/22/25 07:56, Ming Lei wrote:
>>> So far fixed kernel buffer is only used for FS read/write, in which
>>> the remained bytes need to be zeroed in case of short read, otherwise
>>> kernel data may be leaked to userspace.
>>
>> Can you remind me, how that can happen? Normally, IIUC, you register
>> a request filled with user pages, so no kernel data there. Is it some
>> bounce buffers?
>
> For direct io, it is filled with user pages, but it can be buffered IO,
> and the page can be mapped to userspace.

I see. I don't mind the patch personally, but I think it's a security
concern, it's still a user space app even though privileged. Is there
a precedent maybe for fuse that we trust the user driver enough to
expose kernel memory?

One option is to try to distinguish when it contains user pages,
and conditionally zero it in ublk beforehand.

But if we consider that it's fine, can ublk zero during the struct
request completion? ublk should already know from the userspace driver
if it failed or whether it's a short IO.
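
To make the "zero at completion" option concrete, here is a minimal
userspace sketch. It is only a model under stated assumptions, not ublk
code: zero_unfilled_tail() and its parameters are hypothetical names, and
res is assumed to follow the usual convention of a negative errno on
failure or the number of bytes actually transferred.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>	/* ssize_t */

/*
 * Hypothetical helper, userspace model only: clear the part of a read
 * buffer that the I/O did not fill.
 *
 * buf/len: destination buffer and the number of bytes requested
 * res:     negative errno on failure, otherwise bytes transferred
 *
 * Returns the number of bytes zeroed.
 */
static size_t zero_unfilled_tail(unsigned char *buf, size_t len, ssize_t res)
{
	size_t done = res < 0 ? 0 : (size_t)res;

	assert(buf != NULL);

	if (done >= len)
		return 0;	/* full read, nothing to clear */

	memset(buf + done, 0, len - done);
	return len - done;
}
```

A failed read clears the whole buffer, a short read clears only the tail,
and a full read is left untouched.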
>>> Add two helpers for fixing this issue, meantime replace one check
>>> with io_use_fixed_kbuf().
>>>
>>> Cc: Caleb Sander Mateos <csander@purestorage.com>
>>> Cc: Keith Busch <kbusch@kernel.org>
>>> Fixes: 27cb27b6d5ea ("io_uring: add support for kernel registered bvecs")
>>> Signed-off-by: Ming Lei <ming.lei@redhat.com>
>>> ---
>> ...
>>> +/* zero remained bytes of kernel buffer for avoiding to leak data */
>>> +static inline void io_req_zero_remained(struct io_kiocb *req,
>>> +					struct iov_iter *iter)
>>> +{
>>> +	size_t left = iov_iter_count(iter);
>>> +
>>> +	if (left > 0 && iov_iter_rw(iter) == READ)
>>> +		iov_iter_zero(left, iter);
>>> +}
>>> +
>>> #endif
>>> diff --git a/io_uring/rw.c b/io_uring/rw.c
>>> index 039e063f7091..67dc1a6710c9 100644
>>> --- a/io_uring/rw.c
>>> +++ b/io_uring/rw.c
>>> @@ -541,6 +541,12 @@ static void __io_complete_rw_common(struct io_kiocb *req, long res)
>>>  	} else {
>>>  		req_set_fail(req);
>>>  		req->cqe.res = res;
>>> +
>>> +		if (io_use_fixed_kbuf(req)) {
>>> +			struct io_async_rw *io = req->async_data;
>>> +
>>> +			io_req_zero_remained(req, &io->iter);
>>> +		}
>>
>> I think it can be exploited. It's called from ->ki_complete, i.e.
>> io_complete_rw, so make the request size large enough and you're
>> stuck copying in [soft]irq for too long.
>
> Short read seldom happens, so how can it be exploited? And the request size
> can't be too big in this (ublk) use case.

Denial of service by blocking irq. I'm pretty sure we can construct
quite a large bio / request in the general case, e.g. with huge pages.
Maybe ublk forces splitting, but I wouldn't rely on the ublk
behaviour as it's a generic feature even though currently with
one user. We should move it to the task context, where io_uring
requests end up anyway. I'm pretty sure it can be cleaned up to not
have any overhead later.
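
Roughly what moving it to task context means, as a userspace model. All
names here (struct fake_req, complete_in_irq(), run_task_work()) are made
up for illustration and are not io_uring APIs; the point is only that the
irq-side step stays constant-time while the potentially long zeroing runs
later.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a request, not struct io_kiocb. */
struct fake_req {
	unsigned char *buf;	/* destination of the read */
	size_t len;		/* bytes requested */
	long res;		/* negative errno or bytes transferred */
	bool need_zero;		/* set in "irq", consumed in "task" context */
};

/* Models the irq-side completion: O(1), never touches the buffer. */
static void complete_in_irq(struct fake_req *req, long res)
{
	assert(req != NULL);
	req->res = res;
	req->need_zero = res < 0 || (size_t)res < req->len;
}

/* Models the task-side step: the potentially long memset happens here. */
static size_t run_task_work(struct fake_req *req)
{
	size_t done;

	assert(req != NULL);
	if (!req->need_zero)
		return 0;

	done = req->res < 0 ? 0 : (size_t)req->res;
	memset(req->buf + done, 0, req->len - done);
	req->need_zero = false;
	return req->len - done;
}
```

With this split, the cost of clearing a large buffer is paid by the
submitting task and not by whatever context the completion interrupted.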
--
Pavel Begunkov