From: Caleb Sander Mateos <csander@purestorage.com>
To: Sidong Yang <sidong.yang@furiosa.ai>
Cc: Jens Axboe <axboe@kernel.dk>,
Daniel Almeida <daniel.almeida@collabora.com>,
Benno Lossin <lossin@kernel.org>,
Miguel Ojeda <ojeda@kernel.org>, Arnd Bergmann <arnd@arndb.de>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
rust-for-linux@vger.kernel.org, linux-kernel@vger.kernel.org,
io-uring@vger.kernel.org
Subject: Re: [RFC PATCH v3 2/5] io_uring/cmd: zero-init pdu in io_uring_cmd_prep() to avoid UB
Date: Tue, 9 Sep 2025 09:32:37 -0700
Message-ID: <CADUfDZppdnM2QAeX37OmZsXqd7sO7KvyLnNPUYOgLpWMb+FpoQ@mail.gmail.com>
In-Reply-To: <aMA8_MuU0V-_ja5O@sidongui-MacBookPro.local>

On Tue, Sep 9, 2025 at 7:43 AM Sidong Yang <sidong.yang@furiosa.ai> wrote:
>
> On Mon, Sep 08, 2025 at 12:45:58PM -0700, Caleb Sander Mateos wrote:
> > On Sat, Sep 6, 2025 at 7:28 AM Sidong Yang <sidong.yang@furiosa.ai> wrote:
> > >
> > > On Tue, Sep 02, 2025 at 08:31:00AM -0700, Caleb Sander Mateos wrote:
> > > > On Tue, Sep 2, 2025 at 3:23 AM Sidong Yang <sidong.yang@furiosa.ai> wrote:
> > > > >
> > > > > On Mon, Sep 01, 2025 at 05:34:28PM -0700, Caleb Sander Mateos wrote:
> > > > > > On Fri, Aug 22, 2025 at 5:56 AM Sidong Yang <sidong.yang@furiosa.ai> wrote:
> > > > > > >
> > > > > > > The pdu field in io_uring_cmd may contain stale data when a request
> > > > > > > object is recycled from the slab cache. Accessing uninitialized or
> > > > > > > garbage memory can lead to undefined behavior in users of the pdu.
> > > > > > >
> > > > > > > Ensure the pdu buffer is cleared during io_uring_cmd_prep() so that
> > > > > > > each command starts from a well-defined state. This avoids exposing
> > > > > > > uninitialized memory and prevents potential misinterpretation of data
> > > > > > > from previous requests.
> > > > > > >
> > > > > > > No functional change is intended other than guaranteeing that pdu is
> > > > > > > always zero-initialized before use.
> > > > > > >
> > > > > > > Signed-off-by: Sidong Yang <sidong.yang@furiosa.ai>
> > > > > > > ---
> > > > > > > io_uring/uring_cmd.c | 1 +
> > > > > > > 1 file changed, 1 insertion(+)
> > > > > > >
> > > > > > > diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
> > > > > > > index 053bac89b6c0..2492525d4e43 100644
> > > > > > > --- a/io_uring/uring_cmd.c
> > > > > > > +++ b/io_uring/uring_cmd.c
> > > > > > > @@ -203,6 +203,7 @@ int io_uring_cmd_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
> > > > > > > >  	if (!ac)
> > > > > > > >  		return -ENOMEM;
> > > > > > > >  	ioucmd->sqe = sqe;
> > > > > > > > +	memset(&ioucmd->pdu, 0, sizeof(ioucmd->pdu));
> > > > > >
> > > > > > Adding this overhead to every existing uring_cmd() implementation is
> > > > > > unfortunate. Could we instead track the initialized/uninitialized
> > > > > > state by using different types on the Rust side? The io_uring_cmd
> > > > > > could start as an IoUringCmd, where the PDU field is MaybeUninit,
> > > > > > write_pdu<T>() could return a new IoUringCmdPdu<T> that guarantees the
> > > > > > PDU has been initialized.
> > > > >
> > > > > > I've found a flag, IORING_URING_CMD_REISSUE, that we could use to decide
> > > > > > when to initialize the pdu. In the uring_cmd callback, we can zero-fill it
> > > > > > when the command is not being reissued. But I don't know whether we could
> > > > > > call T::default() in miscdevice. If we make IoUringCmdPdu<T>, MiscDevice
> > > > > > would also have to be MiscDevice<T>.
> > > > >
> > > > > How about assign a byte in pdu for checking initialized? In uring_cmd(),
> > > > > > we could set a byte flag marking it as not initialized. And read_pdu()
> > > > > > could return an error if it's still not initialized.
> > > >
> > > > Could we do the zero-initialization (or T::default()) in
> > > > MiscdeviceVTable::uring_cmd() if the IORING_URING_CMD_REISSUE flag
> > > > isn't set (i.e. on the initial issue)? That way, we avoid any
> > > > performance penalty for the existing C uring_cmd() implementations.
> > > > I'm not quite sure what you mean by "assign a byte in pdu for checking
> > > > initialized".
> > >
> > > Sure, we could zero-fill it the first time uring_cmd is called, by
> > > checking the flag. I will drop this commit in the next version. I also
> > > suggest that we provide read_pdu() and write_pdu() methods.
> > > In read_pdu() I want to check that write_pdu() was called before. So out of
> > > the 20 bytes for pdu, maybe we could use a byte as a flag that the pdu is
> > > initialized?
> >
> > Not sure what you mean about "20 bytes for pdu".
> > It seems like it would be preferable to enforce that write_pdu() has
> > been called before read_pdu() using the Rust type system instead of a
> > runtime check. I was thinking a signature like fn write_pdu(cmd:
> > IoUringCmd, value: T) -> IoUringCmdPdu<T>. Do you feel there's a
> > reason that wouldn't work and a runtime check would be necessary?
>
> I hadn't thought about making write_pdu() return IoUringCmdPdu<T> before.
> I think it's a good way to keep the pdu safe without adding a new generic
> param to MiscDevice. write_pdu() would return IoUringCmdPdu<T>, and the
> caller could then call IoUringCmdPdu<T>::pdu(&mut self) -> &mut T safely, maybe.

Yes, that's what I was thinking.

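To make the shape concrete, here is a rough userspace sketch of the type-state idea. It is not kernel code: RawCmd is a hypothetical stand-in for struct io_uring_cmd, the 32-byte/8-aligned pdu area and the T: Copy bound are assumptions made for the illustration, and the real abstraction would wrap the C struct rather than a Box.

```rust
use std::marker::PhantomData;
use std::mem::{align_of, size_of, MaybeUninit};

// Assumed size of the pdu area; hypothetical stand-in for the C struct.
const PDU_SIZE: usize = 32;

#[repr(C, align(8))]
struct RawCmd {
    pdu: MaybeUninit<[u8; PDU_SIZE]>,
}

/// Command whose pdu has not been written yet. It has no way to read the pdu.
struct IoUringCmd {
    raw: Box<RawCmd>,
}

/// Command whose pdu is known to hold an initialized `T`.
struct IoUringCmdPdu<T> {
    raw: Box<RawCmd>,
    _pdu: PhantomData<T>,
}

impl IoUringCmd {
    fn new() -> Self {
        IoUringCmd { raw: Box::new(RawCmd { pdu: MaybeUninit::uninit() }) }
    }

    /// Consumes the uninitialized command, so the only handle left afterwards
    /// is the one that guarantees the pdu was written.
    fn write_pdu<T: Copy>(mut self, value: T) -> IoUringCmdPdu<T> {
        assert!(size_of::<T>() <= PDU_SIZE);
        assert!(align_of::<T>() <= align_of::<RawCmd>());
        // SAFETY: the pdu area is large enough and suitably aligned for T
        // (both checked above), and pdu sits at offset 0 of RawCmd.
        unsafe { self.raw.pdu.as_mut_ptr().cast::<T>().write(value) };
        IoUringCmdPdu { raw: self.raw, _pdu: PhantomData }
    }
}

impl<T: Copy> IoUringCmdPdu<T> {
    fn pdu(&mut self) -> &mut T {
        // SAFETY: this type can only be obtained from write_pdu::<T>(),
        // which stored a valid T in the pdu area.
        unsafe { &mut *self.raw.pdu.as_mut_ptr().cast::<T>() }
    }
}
```

With this, something like IoUringCmd::new().write_pdu(41u64) yields an IoUringCmdPdu<u64>, and calling pdu() on a plain IoUringCmd is simply a compile error, so no runtime "initialized" byte is needed.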
>
> >
> > >
> > > But maybe I would introduce a new struct that has Pin<&mut IoUringCmd> and
> > > issue_flags. How about an additional field recording whether the pdu is
> > > initialized, like below?
> > >
> > > struct IoUringCmdArgs {
> > >     ioucmd: Pin<&mut IoUringCmd>,
> > >     issue_flags: u32,
> > >     pdu_initialized: bool,
> > > }
> >
> > One other thing I realized is that issue_flags should come from the
> > *current* context rather than the context the uring_cmd() callback was
> > called in. For example, if io_uring_cmd_done() is called from task
> > work context, issue_flags should match the issue_flags passed to the
> > io_uring_cmd_tw_t callback, not the issue_flags originally passed to
> > the uring_cmd() callback. So it probably makes more sense to decouple
> > issue_flags from the (owned) IoUringCmd. I think you could pass it by
> > reference (&IssueFlags) or with a phantom reference lifetime
> > (IssueFlags<'_>) to the Rust uring_cmd() and task work callbacks to
> > ensure it can't be used after those callbacks have returned.
>
> I had no idea about task work context. I agree with you that
> it would be better to separate issue_flags from IoUringCmd. So
> IoUringCmdArgs would have only one field, Pin<&mut IoUringCmd>?

"Task work" is a mechanism io_uring uses to queue work to run on the
thread that submitted an io_uring operation. It's basically a
per-thread atomic queue of callbacks that the thread will process
whenever it returns from the kernel to userspace (after a syscall or
an interrupt). This is the context where asynchronous uring_cmd
completions are generally processed (see
io_uring_cmd_complete_in_task() and io_uring_cmd_do_in_task_lazy()). I
can't speak to the history of why io_uring uses task work, but my
guess would be that it provides a safe context to acquire the
io_ring_ctx uring_lock mutex (e.g. nvme_uring_cmd_end_io() can be
called from an interrupt handler, so it's not allowed to take a
mutex). Processing all the task work at once also provides natural
opportunities for batching.

Yes, we probably don't need to bundle anything else with the
IoUringCmd after all. As I mentioned earlier, I don't think Pin<&mut
IoUringCmd> will work for uring_cmds that complete asynchronously, as
they will need to outlive the uring_cmd() call. So uring_cmd() needs
to transfer ownership of the struct io_uring_cmd.
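
As a rough illustration of both points (owned command, issue_flags tied to the current callback), here is a userspace sketch. Every name and signature in it is hypothetical: the pending Vec stands in for wherever a driver parks in-flight commands, with_issue_flags() stands in for the C-to-Rust trampoline, and the flag values used with it are arbitrary.

```rust
use std::marker::PhantomData;

/// Flags that are only meaningful while the current callback is running.
/// The phantom lifetime keeps them from being stored past that callback.
struct IssueFlags<'a> {
    bits: u32,
    _ctx: PhantomData<&'a ()>,
}

/// Owned handle; a stand-in for owning a struct io_uring_cmd.
struct IoUringCmd {
    id: u64,
}

impl IoUringCmd {
    /// Completing consumes the command and takes the flags of the context
    /// it is completed in. Returns (id, result, flags) for inspection.
    fn done(self, res: i32, flags: &IssueFlags<'_>) -> (u64, i32, u32) {
        (self.id, res, flags.bits)
    }
}

/// Stand-in for the trampoline: builds flags valid for one callback only.
fn with_issue_flags<R>(bits: u32, f: impl for<'a> FnOnce(&IssueFlags<'a>) -> R) -> R {
    let flags = IssueFlags { bits, _ctx: PhantomData };
    f(&flags)
}

/// uring_cmd() takes ownership, so the command can outlive this call.
fn uring_cmd(cmd: IoUringCmd, _flags: &IssueFlags<'_>, pending: &mut Vec<IoUringCmd>) {
    pending.push(cmd);
}

/// Task work callback: completes parked commands with *its own* flags.
fn task_work(pending: &mut Vec<IoUringCmd>, flags: &IssueFlags<'_>) -> Vec<(u64, i32, u32)> {
    pending.drain(..).map(|cmd| cmd.done(0, flags)).collect()
}
```

The point of the sketch is that done() can only ever see the flags of whichever callback is running when it is called; the flags handed to uring_cmd() cannot be smuggled into the later task work callback.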

Best,
Caleb

Thread overview: 33+ messages
2025-08-22 12:55 [RFC PATCH v3 0/5] rust: miscdevice: abstraction for uring_cmd Sidong Yang
2025-08-22 12:55 ` [RFC PATCH v3 1/5] rust: bindings: add io_uring headers in bindings_helper.h Sidong Yang
2025-08-22 12:55 ` [RFC PATCH v3 2/5] io_uring/cmd: zero-init pdu in io_uring_cmd_prep() to avoid UB Sidong Yang
2025-09-02 0:34 ` Caleb Sander Mateos
2025-09-02 10:23 ` Sidong Yang
2025-09-02 15:31 ` Caleb Sander Mateos
2025-09-06 14:28 ` Sidong Yang
2025-09-08 19:45 ` Caleb Sander Mateos
2025-09-09 14:43 ` Sidong Yang
2025-09-09 16:32 ` Caleb Sander Mateos [this message]
2025-09-12 16:41 ` Sidong Yang
2025-09-12 17:56 ` Caleb Sander Mateos
2025-09-13 12:42 ` Sidong Yang
2025-09-15 16:54 ` Caleb Sander Mateos
2025-09-17 14:56 ` Sidong Yang
2025-09-22 18:09 ` Caleb Sander Mateos
2025-08-22 12:55 ` [RFC PATCH v3 3/5] rust: io_uring: introduce rust abstraction for io-uring cmd Sidong Yang
2025-08-27 20:41 ` Daniel Almeida
2025-08-28 7:24 ` Benno Lossin
2025-08-29 15:43 ` Sidong Yang
2025-08-29 16:11 ` Daniel Almeida
2025-08-28 0:36 ` Ming Lei
2025-08-28 7:25 ` Benno Lossin
2025-08-28 10:05 ` Ming Lei
2025-09-02 1:11 ` Caleb Sander Mateos
2025-09-02 11:11 ` Sidong Yang
2025-09-02 15:41 ` Caleb Sander Mateos
2025-08-22 12:55 ` [RFC PATCH v3 4/5] rust: miscdevice: Add `uring_cmd` support Sidong Yang
2025-09-02 1:12 ` Caleb Sander Mateos
2025-09-02 11:18 ` Sidong Yang
2025-09-02 15:53 ` Caleb Sander Mateos
2025-08-22 12:55 ` [RFC PATCH v3 5/5] samples: rust: Add `uring_cmd` example to `rust_misc_device` Sidong Yang
2025-08-28 0:48 ` Ming Lei