public inbox for [email protected]
From: "Walker, Benjamin" <[email protected]>
To: Jens Axboe <[email protected]>, Avi Kivity <[email protected]>,
	<[email protected]>
Subject: Re: memory access op ideas
Date: Mon, 25 Apr 2022 11:05:11 -0700	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>

On 4/24/2022 5:45 PM, Jens Axboe wrote:
> On 4/24/22 8:56 AM, Avi Kivity wrote:
>>
>> On 24/04/2022 16.30, Jens Axboe wrote:
>>> On 4/24/22 7:04 AM, Avi Kivity wrote:
>>>> On 23/04/2022 20.30, Jens Axboe wrote:
>>>>> On 4/23/22 10:23 AM, Avi Kivity wrote:
>>>>>> Perhaps the interface should be kept separate from io_uring. e.g. use
>>>>>> a pidfd to represent the address space, and then issue
>>>>>> IORING_OP_PREADV/IORING_OP_PWRITEV to initiate dma. Then one can copy
>>>>>> across process boundaries.
>>>>> Then you just made it a ton less efficient, particularly if you used the
>>>>> vectored read/write. For this to make sense, I think it has to be a
>>>>> separate op. At least that's the only implementation I'd be willing to
>>>>> entertain for the immediate copy.
>>>>
>>>> Sorry, I caused a lot of confusion by bundling immediate copy and a
>>>> DMA engine interface. For sure the immediate copy should be a direct
>>>> implementation like you posted!
>>>>
>>>> User-to-user copies are another matter. I feel like that should be a
>>>> stand-alone driver, and that io_uring should be an io_uring-y way to
>>>> access it. Just like io_uring isn't an NVMe driver.
>>> Not sure I understand your logic here or the io_uring vs nvme driver
>>> reference, to be honest. io_uring _is_ a standalone way to access it,
>>> you can use it sync or async through that.
>>>
>>> If you're talking about a standalone op vs being useful from a command
>>> itself, I do think both have merit and I can see good use cases for
>>> both.

I'm actually not so certain the model where io_uring has special 
operations for driving DMA engines works out. I think in all cases you 
can accomplish what you want by reading or writing to existing file 
constructs, and just having those transparently offload to a DMA engine 
on your behalf if one is available.

As a concrete example, let's take an inter-process copy. The main 
challenges with this one are the security model (who's allowed to copy 
where?) and synchronization between the two applications (when did the 
data change?).

Rather, I'd consider implementing the inter-process copy using an 
existing mechanism like a Unix domain socket. The sender might do a 
MSG_ZEROCOPY send via io_uring, the receiver an async recv, and the 
kernel can use a DMA engine to move the data directly between the two 
buffers if it has one available. Then you get the existing security 
model and coordination, and the software works whether there's a DMA 
engine available or not.

It's a similar story for copying to memory on a PCI device. You'd need 
some security model to decide you're allowed to copy there, which is 
probably best expressed by opening a file that represents that BAR and 
then doing reads/writes to it.

This is at least the direction I've been pursuing. The DMA engine 
channel is associated with the io_uring and the kernel just 
intelligently offloads whatever it can.

>>
>>
>> I'm saying that if dma is exposed to userspace, it should have a
>> regular synchronous interface (maybe open("/dev/dma"), maybe something
>> else). io_uring adds asynchrony to everything, but it's not
>> everything's driver.
> 
> Sure, my point is that if/when someone wants to add that, they should be
> free to do so. It's not a fair requirement to put on someone doing the
> initial work on wiring this up. It may not be something they would want
> to use to begin with, and it's perfectly easy to run io_uring in sync
> mode should you wish to do so. The hard part is making the
> issue+complete separate actions, rolling a sync API on top of that would
> be trivial.

Just FYI, but the Intel idxd driver already has a userspace interface 
that's async/poll-mode. Commands are submitted to an mmap'd portal 
directly using the movdir64b/enqcmd instructions. It does not expose an 
fd you can read/write to in order to trigger copies, so it is not 
compatible with io_uring, but it doesn't really need to be since it is 
already async.

What isn't currently exposed to userspace is access to the "dmaengine" 
framework. Prior to the patchset I have pending, which I linked earlier 
in the thread, the "dmaengine" framework couldn't really operate in 
async/poll mode, handle out-of-order processing, etc. But after that 
series it may become feasible.

> 
>> Anyway maybe we drifted off somewhere and this should be decided by
>> pragmatic concerns (like whatever the author of the driver prefers).
> 
> Indeed!
> 



Thread overview: 22+ messages
2022-04-13 10:33 memory access op ideas Avi Kivity
2022-04-22 12:52 ` Hao Xu
2022-04-22 13:24   ` Hao Xu
2022-04-22 13:38   ` Jens Axboe
2022-04-23  7:19     ` Hao Xu
2022-04-23 16:14   ` Avi Kivity
2022-04-22 14:50 ` Jens Axboe
2022-04-22 15:03   ` Jens Axboe
2022-04-23 16:30     ` Avi Kivity
2022-04-23 17:32       ` Jens Axboe
2022-04-23 18:02         ` Jens Axboe
2022-04-23 18:11           ` Jens Axboe
2022-04-22 20:03   ` Walker, Benjamin
2022-04-23 10:19     ` Pavel Begunkov
2022-04-23 13:20     ` Jens Axboe
2022-04-23 16:23   ` Avi Kivity
2022-04-23 17:30     ` Jens Axboe
2022-04-24 13:04       ` Avi Kivity
2022-04-24 13:30         ` Jens Axboe
2022-04-24 14:56           ` Avi Kivity
2022-04-25  0:45             ` Jens Axboe
2022-04-25 18:05               ` Walker, Benjamin [this message]
