From: "Fang, Wilson" <[email protected]>
To: Pavel Begunkov <[email protected]>,
"[email protected]" <[email protected]>
Cc: Jens Axboe <[email protected]>
Subject: RE: dma_buf support with io_uring
Date: Wed, 13 Jul 2022 05:41:03 +0000 [thread overview]
Message-ID: <BY5PR11MB39904DD49256EEBA8E04C4E6EF899@BY5PR11MB3990.namprd11.prod.outlook.com> (raw)
In-Reply-To: <[email protected]>
Thanks, Pavel, for the recommendation!
We are very interested in collaborating on this - we are working on a prototype of your recommendation, but progress has been a little slow due to vacations and resource constraints.
Thanks,
Wilson
-----Original Message-----
From: Pavel Begunkov <[email protected]>
Sent: Thursday, June 23, 2022 3:35 AM
To: Fang, Wilson <[email protected]>; [email protected]
Cc: Jens Axboe <[email protected]>
Subject: Re: dma_buf support with io_uring
On 6/23/22 07:17, Fang, Wilson wrote:
> Hi Jens,
>
> We are exploring a kernel-native mechanism to support peer-to-peer data transfer between an NVMe SSD and another dma_buf-capable device connected to the same PCIe root complex.
> The NVMe SSD's DMA engine requires physical memory addresses, and there is no easy way to pass non-system-memory addresses through the VFS down to the block device driver.
> One idea is to use io_uring together with the dma_buf mechanism, which is supported by the SSD's peer device.
Interesting, that aligns quite well with what we're doing, which is a more generic way of doing p2p with some non-p2p optimisations along the way.
The approach we tried before is to let userspace register a dma-buf fd inside io_uring as a registered buffer, prepare everything in advance (e.g. the dmabuf attach), and then rw/send/etc. can use it.
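Roughly, the userspace side of that could look like the sketch below. The registration helper and its dma-buf slot semantics are hypothetical (io_uring_register_dmabuf() is an illustrative name, not an upstream liburing API); only the queue setup and io_uring_prep_read_fixed() are real liburing calls:

```c
/* Hypothetical sketch: register a dma-buf fd as an io_uring registered
 * buffer once, then issue ordinary fixed-buffer I/O against it.
 * io_uring_register_dmabuf() is an illustrative name, not upstream API. */
struct io_uring ring;
struct io_uring_sqe *sqe;

io_uring_queue_init(8, &ring, 0);

/* One-time setup: attach the dma-buf and stash it in buffer slot 0
 * (hypothetical call). */
io_uring_register_dmabuf(&ring, dmabuf_fd, /*buf_index=*/0);

/* Ordinary fixed-buffer read: the kernel would resolve buf_index 0 to the
 * pre-attached dma-buf instead of pinned user pages. */
sqe = io_uring_get_sqe(&ring);
io_uring_prep_read_fixed(sqe, nvme_fd, /*addr=*/NULL, len, /*offset=*/0,
                         /*buf_index=*/0);
io_uring_submit(&ring);
```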
> The flow is as below:
> 1. The application passes the dma_buf fd to the kernel through liburing.
> 2. io_uring adds two new opcodes, IORING_OP_READ_DMA and IORING_OP_WRITE_DMA, to support read/write operations that DMA to/from the peer device's memory.
> 3. If the dma_buf fd is valid, io_uring attaches the dma_buf and gets an sgl containing the physical memory addresses to be passed down to the block device driver.
> 4. The NVMe SSD's DMA engine moves the data to/from those physical addresses.
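Step 3 above corresponds to the standard kernel-side dma-buf import sequence; a minimal sketch (kernel code, error unwinding abbreviated), assuming the importing struct device *dev is already known:

```c
/* Kernel-side import of a dma-buf fd (sketch). 'dev' must be the device
 * that will actually perform the DMA -- here, the NVMe controller. */
struct dma_buf *dmabuf;
struct dma_buf_attachment *attach;
struct sg_table *sgt;

dmabuf = dma_buf_get(dmabuf_fd);            /* take a ref on the fd's dma-buf */
if (IS_ERR(dmabuf))
        return PTR_ERR(dmabuf);

attach = dma_buf_attach(dmabuf, dev);       /* attach the DMA-ing device */
if (IS_ERR(attach)) {
        dma_buf_put(dmabuf);
        return PTR_ERR(attach);
}

/* Map into the device's DMA address space; the resulting sg_table holds
 * the addresses handed down to the block driver. */
sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
if (IS_ERR(sgt)) {
        dma_buf_detach(dmabuf, attach);
        dma_buf_put(dmabuf);
        return PTR_ERR(sgt);
}
```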
>
> The roadblock we are facing is that the dma_buf_attach() and dma_buf_map_attachment() APIs expect the caller to provide a struct device *dev parameter pointing to the device that performs the DMA (in this case the block/NVMe device that holds the source data).
> But since io_uring operates at the VFS layer, there is no straightforward way to find the block/NVMe device object (struct device *) from the source file descriptor.
>
> Do you have any recommendations? Much appreciated!
For finding a device pointer, we added an optional file operation callback. I think that's much better than deriving it on the io_uring side, especially since we need a guarantee that the device is the only one that will be targeted and won't change (e.g. the network stack may choose a device dynamically based on the target address).
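As a rough illustration of such a callback (the hook name and the block-device implementation below are hypothetical, not the actual patch):

```c
/* Hypothetical file_operations hook: ask the file which struct device
 * will perform DMA, rather than guessing it on the io_uring side. */
struct device *(*get_dma_device)(struct file *file);

/* A block-device implementation might return the controller's device,
 * e.g. the parent of the gendisk's device (details are illustrative): */
static struct device *blk_file_get_dma_device(struct file *file)
{
        struct block_device *bdev = I_BDEV(file->f_mapping->host);

        return disk_to_dev(bdev->bd_disk)->parent;
}
```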
I think we have space to cooperate here :)
--
Pavel Begunkov
Thread overview: 3+ messages
[not found] <BY5PR11MB399005DAD1BB172B7A42586AEFB59@BY5PR11MB3990.namprd11.prod.outlook.com>
2022-06-23 6:17 ` dma_buf support with io_uring Fang, Wilson
2022-06-23 10:35 ` Pavel Begunkov
2022-07-13 5:41 ` Fang, Wilson [this message]