From: Pavel Begunkov <asml.silence@gmail.com>
To: Nitesh Shetty <nj.shetty@samsung.com>
Cc: linux-block@vger.kernel.org, io-uring <io-uring@vger.kernel.org>,
"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
"Christian König" <christian.koenig@amd.com>,
"Christoph Hellwig" <hch@lst.de>,
"Kanchan Joshi" <joshi.k@samsung.com>,
"Anuj Gupta" <anuj20.g@samsung.com>,
"lsf-pc@lists.linux-foundation.org"
<lsf-pc@lists.linux-foundation.org>
Subject: Re: [LSF/MM/BPF TOPIC] dmabuf backed read/write
Date: Mon, 9 Feb 2026 11:15:22 +0000
Message-ID: <76299bc2-4871-4571-bef6-9886ee0d2c5f@gmail.com>
In-Reply-To: <20260204152634.gyybb2axszwpewrk@green245.gost>
On 2/4/26 15:26, Nitesh Shetty wrote:
> On 03/02/26 02:29PM, Pavel Begunkov wrote:
>> Good day everyone,
>>
>> dma-buf is a powerful abstraction for managing buffers and DMA mappings,
>> and there is growing interest in extending it to the read/write path to
>> enable device-to-device transfers without bouncing data through system
>> memory. I was encouraged to submit it to LSF/MM/BPF as that might be
>> useful to mull over details and what capabilities and features people
>> may need.
>>
>> The proposal consists of two parts. The first is a small in-kernel
>> framework that allows a dma-buf to be registered against a given file
>> and returns an object representing a DMA mapping. The actual mapping
>> creation is delegated to the target subsystem (e.g. NVMe). This
>> abstraction centralises request accounting, mapping management, dynamic
>> recreation, etc. The resulting mapping object is passed through the I/O
>> stack via a new iov_iter type.
>>
>> As for the user API, a dma-buf is installed as an io_uring registered
>> buffer for a specific file. Once registered, the buffer can be used by
>> read / write io_uring requests as normal. io_uring will enforce that the
>> buffer is only used with "compatible files", which is for now restricted
>> to the target registration file, but will be expanded in the future.
>> Notably, io_uring is a consumer of the framework rather than a
>> dependency, and the infrastructure can be reused.
>>
> We have been following the series; it's interesting from a couple of angles:
> - IOPS-wise, we see a major improvement, especially with IOMMU
> - The series provides a way to do p2pdma to accelerator memory
>
> Here are a few topics I am looking into specifically:
> - Right now the series uses a PRP list. We need a good way to keep the
> sg_table info around and decide on-the-fly whether to expose the buffer
> as a PRP list or an SG list, depending on the I/O size.
> - Possible further optimization of the new iov_iter type to reduce
> per-I/O cost
There are a bunch of improvements we can make on the NVMe driver side.
Just take a look at what Keith was doing in his series ([2] in the first
email in the thread); it looked very exciting (I dropped it for
simplicity). I was planning to take a closer look at optimising the driver
part afterwards, but if someone wants to take it off my hands, that would
definitely be welcome!
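
To make the PRP vs SG list point above a bit more concrete, here is a
rough userspace sketch of the kind of per-I/O decision being discussed.
It is not code from the series or from the NVMe driver: struct dma_seg,
pick_desc(), prp_compatible(), the 4 KiB device page size and the 32 KiB
cut-off are all hypothetical names and values picked for illustration.
It assumes the premapped segment list is kept around and that the choice
depends on segment alignment (a PRP list needs page-aligned interior
boundaries) and on the I/O size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one premapped DMA range, standing in for an sg_table entry. */
struct dma_seg {
	uint64_t addr;
	uint32_t len;
};

enum desc_kind { DESC_PRP, DESC_SGL };

/* Assumed values, for illustration only. */
#define DEV_PAGE_SIZE	4096u
#define SGL_THRESHOLD	(32u * 1024u)

/*
 * A PRP list can only describe the buffer if every segment except the
 * first starts on a device page boundary and every segment except the
 * last ends on one.
 */
static bool prp_compatible(const struct dma_seg *segs, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		if (i > 0 && segs[i].addr % DEV_PAGE_SIZE)
			return false;
		if (i + 1 < nr && (segs[i].addr + segs[i].len) % DEV_PAGE_SIZE)
			return false;
	}
	return true;
}

/* Pick a descriptor format for one I/O from the retained segment list. */
static enum desc_kind pick_desc(const struct dma_seg *segs, size_t nr,
				uint32_t io_len)
{
	assert(nr > 0);

	/* Layouts that PRP cannot express have to use an SG list. */
	if (!prp_compatible(segs, nr))
		return DESC_SGL;

	/* Otherwise small I/O stays on PRP, large I/O goes to an SG list. */
	return io_len >= SGL_THRESHOLD ? DESC_SGL : DESC_PRP;
}
```

A real implementation would presumably make the cut-off tunable and might
look at segment sizes rather than only the total length; the point is just
that keeping the segment information around makes this a cheap per-request
check.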
--
Pavel Begunkov