From: "Walker, Benjamin" <[email protected]>
To: Jens Axboe <[email protected]>, Avi Kivity <[email protected]>,
<[email protected]>
Subject: Re: memory access op ideas
Date: Mon, 25 Apr 2022 11:05:11 -0700 [thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
On 4/24/2022 5:45 PM, Jens Axboe wrote:
> On 4/24/22 8:56 AM, Avi Kivity wrote:
>>
>> On 24/04/2022 16.30, Jens Axboe wrote:
>>> On 4/24/22 7:04 AM, Avi Kivity wrote:
>>>> On 23/04/2022 20.30, Jens Axboe wrote:
>>>>> On 4/23/22 10:23 AM, Avi Kivity wrote:
>>>>>> Perhaps the interface should be kept separate from io_uring. e.g. use
>>>>>> a pidfd to represent the address space, and then issue
>>>>>> IORING_OP_PREADV/IORING_OP_PWRITEV to initiate dma. Then one can copy
>>>>>> across process boundaries.
>>>>> Then you just made it a ton less efficient, particularly if you used the
>>>>> vectored read/write. For this to make sense, I think it has to be a
>>>>> separate op. At least that's the only implementation I'd be willing to
>>>>> entertain for the immediate copy.
>>>>
>>>> Sorry, I caused a lot of confusion by bundling immediate copy and a
>>>> DMA engine interface. For sure the immediate copy should be a direct
>>>> implementation like you posted!
>>>>
>>>> User-to-user copies are another matter. I feel like that should be a
>>>> stand-alone driver, and that io_uring should be an io_uring-y way to
>>>> access it. Just like io_uring isn't an NVMe driver.
>>> Not sure I understand your logic here or the io_uring vs nvme driver
>>> reference, to be honest. io_uring _is_ a standalone way to access it,
>>> you can use it sync or async through that.
>>>
>>> If you're talking about a standalone op vs being useful from a command
>>> itself, I do think both have merit and I can see good use cases for
>>> both.
I'm actually not so certain the model where io_uring has special
operations for driving DMA engines works out. I think in all cases you
can accomplish what you want by reading or writing to existing file
constructs, and just having those transparently offload to a DMA engine
if one is available on your behalf.
As a concrete example, let's take an inter-process copy. The main
challenges with this one are the security model (who's allowed to copy
where?) and synchronization between the two applications (when did the
data change?).
Rather, I'd consider implementing the inter-process copy using an
existing mechanism like a Unix domain socket. The sender maybe does a
MSG_ZEROCOPY send via io_uring, and the receiver does an async recv, and
the kernel can use a DMA engine to move the data directly between the
two buffers if it has one avaiable. Then you get the existing security
model and coordination, and software works whether there's a DMA engine
available or not.
It's a similar story for copying to memory on a PCI device. You'd need
some security model to decide you're allowed to copy there, which is
probably best expressed by opening a file that represents that BAR and
then doing reads/writes to it.
This is at least the direction I've been pursuing. The DMA engine
channel is associated with the io_uring and the kernel just
intelligently offloads whatever it can.
>>
>>
>> I'm saying that if dma is exposed to userspace, it should have a
>> regular synchronous interface (maybe open("/dev/dma"), maybe something
>> else). io_uring adds asynchrony to everything, but it's not
>> everything's driver.
>
> Sure, my point is that if/when someone wants to add that, they should be
> free to do so. It's not a fair requirement to put on someone doing the
> initial work on wiring this up. It may not be something they would want
> to use to begin with, and it's perfectly easy to run io_uring in sync
> mode should you wish to do so. The hard part is making the
> issue+complete separate actions, rolling a sync API on top of that would
> be trivial.
Just FYI but the Intel idxd driver already has a userspace interface
that's async/poll-mode. Commands are submitted to a mmap'd portal using
the movdir64/enqcmd instructions directly. It does not expose an fd you
can read/write to in order to trigger copies, so it is not compatible
with io_uring, but it doesn't really need to be since it is already async.
What isn't currently exposed to userspace is access to the "dmaengine"
framework. Prior to the patchset I have pending that I linked earlier in
the thread, the "dmaengine" framework couldn't really operate in
async/poll mode and handle out-of-order processing, etc. But after that
series maybe.
>
>> Anyway maybe we drifted off somewhere and this should be decided by
>> pragmatic concerns (like whatever the author of the driver prefers).
>
> Indeed!
>
prev parent reply other threads:[~2022-04-25 18:05 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-04-13 10:33 memory access op ideas Avi Kivity
2022-04-22 12:52 ` Hao Xu
2022-04-22 13:24 ` Hao Xu
2022-04-22 13:38 ` Jens Axboe
2022-04-23 7:19 ` Hao Xu
2022-04-23 16:14 ` Avi Kivity
2022-04-22 14:50 ` Jens Axboe
2022-04-22 15:03 ` Jens Axboe
2022-04-23 16:30 ` Avi Kivity
2022-04-23 17:32 ` Jens Axboe
2022-04-23 18:02 ` Jens Axboe
2022-04-23 18:11 ` Jens Axboe
2022-04-22 20:03 ` Walker, Benjamin
2022-04-23 10:19 ` Pavel Begunkov
2022-04-23 13:20 ` Jens Axboe
2022-04-23 16:23 ` Avi Kivity
2022-04-23 17:30 ` Jens Axboe
2022-04-24 13:04 ` Avi Kivity
2022-04-24 13:30 ` Jens Axboe
2022-04-24 14:56 ` Avi Kivity
2022-04-25 0:45 ` Jens Axboe
2022-04-25 18:05 ` Walker, Benjamin [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox