public inbox for io-uring@vger.kernel.org
* [PATCH 0/2] Introduce IORING_OP_MMAP
@ 2026-01-29 22:11 Gabriel Krisman Bertazi
  2026-01-29 22:11 ` [PATCH 1/2] io_uring: Support commands with optional file descriptors Gabriel Krisman Bertazi
  2026-01-29 22:11 ` [PATCH 2/2] io_uring: introduce IORING_OP_MMAP Gabriel Krisman Bertazi
  0 siblings, 2 replies; 6+ messages in thread
From: Gabriel Krisman Bertazi @ 2026-01-29 22:11 UTC (permalink / raw)
  To: axboe
  Cc: io-uring, Gabriel Krisman Bertazi, Andrew Morton,
	David Hildenbrand, Lorenzo Stoakes, Vlastimil Babka,
	Liam R. Howlett, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	linux-mm

Hi,

There have been a few requests over time for supporting mmap(2) over
io_uring. The reasoning is twofold: 1) serving as a base for batching
multiple mappings in a single operation, and 2) supporting mmap of fixed
files.

Since mmap can operate on either anonymous memory or file descriptors,
patch 1 adds support for optional fds in io_uring commands.  Patch 2
implements the mmap operation itself.

Note this patchset doesn't do any kind of smarter batching in MM.  While
we can potentially do some interesting optimizations already, like
holding the MM write lock instead of reacquiring it for each mapping, I
wanted to focus on the API discussion first.  This is left as future
work.

liburing support, including testcases, will be sent shortly to the list,
but can also be found at:

 https://github.com/krisman/liburing -b mmap

Thanks,

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: linux-mm@kvack.org
Cc: io-uring@vger.kernel.org

Gabriel Krisman Bertazi (2):
  io_uring: Support commands with optional file descriptors
  io_uring: introduce IORING_OP_MMAP

 include/uapi/linux/io_uring.h |  10 +++
 io_uring/Makefile             |   2 +-
 io_uring/io_uring.c           |  15 ++--
 io_uring/mmap.c               | 147 ++++++++++++++++++++++++++++++++++
 io_uring/mmap.h               |   4 +
 io_uring/opdef.c              |   9 +++
 io_uring/opdef.h              |   2 +
 7 files changed, 183 insertions(+), 6 deletions(-)
 create mode 100644 io_uring/mmap.c
 create mode 100644 io_uring/mmap.h

-- 
2.52.0


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/2] io_uring: Support commands with optional file descriptors
  2026-01-29 22:11 [PATCH 0/2] Introduce IORING_OP_MMAP Gabriel Krisman Bertazi
@ 2026-01-29 22:11 ` Gabriel Krisman Bertazi
  2026-01-29 22:11 ` [PATCH 2/2] io_uring: introduce IORING_OP_MMAP Gabriel Krisman Bertazi
  1 sibling, 0 replies; 6+ messages in thread
From: Gabriel Krisman Bertazi @ 2026-01-29 22:11 UTC (permalink / raw)
  To: axboe
  Cc: io-uring, Gabriel Krisman Bertazi, Andrew Morton,
	David Hildenbrand, Lorenzo Stoakes, Vlastimil Babka,
	Liam R. Howlett, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	linux-mm

mmap can be called either for file-backed memory or for anonymous memory,
in which case no fd is provided. This patch allows an io_uring command
to declare its file as optional.  If an fd is provided, io_uring loads
it through the regular paths. Otherwise, req->file stays NULL.

At the SQE level, use the ancient mmap semantics of fd == -1 to indicate
no fd.  This feels more useful than a flag or using 0, the latter because
I'd expect 0 to be commonly used for direct FDs.  We can abstract these
details in liburing.  It is a bit ugly and it could be a flag elsewhere,
but we don't need to waste a flag on that.
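
To make the rule concrete, here is a minimal userspace model of the
decision, not the kernel code itself. The names toy_issue_def and
toy_op_wants_file are hypothetical stand-ins for struct io_issue_def and
op_wants_file() in the diff below; the fd is passed as a plain value
instead of being read from req->cqe.fd.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two relevant bits of struct io_issue_def. */
struct toy_issue_def {
	unsigned needs_file : 1;	/* opcode always requires a file */
	unsigned opt_file : 1;		/* opcode takes a file only if fd != -1 */
};

/* Same predicate as op_wants_file() in the patch, on plain values. */
static inline bool toy_op_wants_file(const struct toy_issue_def *def,
				     int32_t fd)
{
	return def->needs_file || (def->opt_file && fd != -1);
}
```

With opt_file set and needs_file clear, fd == -1 skips file assignment
while fd == 0 still resolves a file, which is why 0 cannot double as the
"no fd" marker.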

Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
---
 io_uring/io_uring.c | 15 ++++++++++-----
 io_uring/opdef.h    |  2 ++
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 87a87396e940..158e9823a72a 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1756,10 +1756,17 @@ static __cold void io_drain_req(struct io_kiocb *req)
 		ctx->drain_active = false;
 }
 
+static inline bool op_wants_file(const struct io_issue_def *def,
+				    struct io_kiocb *req)
+{
+	return (def->needs_file ||
+		(def->opt_file && req->cqe.fd != -1));
+}
+
 static bool io_assign_file(struct io_kiocb *req, const struct io_issue_def *def,
 			   unsigned int issue_flags)
 {
-	if (req->file || !def->needs_file)
+	if (req->file || !op_wants_file(def, req))
 		return true;
 
 	if (req->flags & REQ_F_FIXED_FILE)
@@ -2200,11 +2207,9 @@ static int io_init_req(struct io_ring_ctx *ctx, struct io_kiocb *req,
 	if (!def->iopoll && (ctx->flags & IORING_SETUP_IOPOLL))
 		return io_init_fail_req(req, -EINVAL);
 
-	if (def->needs_file) {
+	req->cqe.fd = READ_ONCE(sqe->fd);
+	if (op_wants_file(def, req)) {
 		struct io_submit_state *state = &ctx->submit_state;
-
-		req->cqe.fd = READ_ONCE(sqe->fd);
-
 		/*
 		 * Plug now if we have more than 2 IO left after this, and the
 		 * target is potentially a read/write to block based storage.
diff --git a/io_uring/opdef.h b/io_uring/opdef.h
index aa37846880ff..5b81f82c2359 100644
--- a/io_uring/opdef.h
+++ b/io_uring/opdef.h
@@ -5,6 +5,8 @@
 struct io_issue_def {
 	/* needs req->file assigned */
 	unsigned		needs_file : 1;
+	/* Optional req->file assigned, if available. */
+	unsigned		opt_file : 1;
 	/* should block plug */
 	unsigned		plug : 1;
 	/* supports ioprio */
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/2] io_uring: introduce IORING_OP_MMAP
  2026-01-29 22:11 [PATCH 0/2] Introduce IORING_OP_MMAP Gabriel Krisman Bertazi
  2026-01-29 22:11 ` [PATCH 1/2] io_uring: Support commands with optional file descriptors Gabriel Krisman Bertazi
@ 2026-01-29 22:11 ` Gabriel Krisman Bertazi
  2026-01-30  6:03   ` kernel test robot
  2026-01-30 15:55   ` Jens Axboe
  1 sibling, 2 replies; 6+ messages in thread
From: Gabriel Krisman Bertazi @ 2026-01-29 22:11 UTC (permalink / raw)
  To: axboe
  Cc: io-uring, Gabriel Krisman Bertazi, Andrew Morton,
	David Hildenbrand, Lorenzo Stoakes, Vlastimil Babka,
	Liam R. Howlett, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	linux-mm

This enables mmap(2) over io_uring.  The interesting part is allowing
the mapping of multiple regions with different parameters in a single
operation. This is not explored in this patch, but coalescing multiple
operations can enable batching deeper in the MM layer.

The SQE points to an array of memory descriptors, each of which is mapped
against fd, or as anonymous memory if fd == -1. All descriptors are mapped
against the same file, but protections and flags can vary.

The API also tries to be very clear about what failed in case of an
error. The number of maps that succeeded is returned on the CQE, and the
error code of the first failed map is passed back via the descriptor
structure (which must live until completion).
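
As a rough illustration of how a caller might locate the failing entry
after completion, the sketch below scans the descriptor array for an
address in the error range. It is an assumption-laden model: the
toy_mmap_desc layout uses fixed-width fields rather than the types in
this patch, the helper names are hypothetical, and it assumes the error
reaches userspace in the kernel's ERR_PTR encoding (the top 4095 values
of the address space hold negative errnos).

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095	/* same bound the kernel uses for ERR_PTR */

/* Hypothetical descriptor layout; only addr matters for this sketch. */
struct toy_mmap_desc {
	uint64_t addr;
	uint64_t len;
	uint64_t pgoff;
	uint32_t prot;
	uint32_t flags;
};

/* Return the negative errno encoded in addr, or 0 for a real address. */
static int toy_addr_to_err(uint64_t addr)
{
	if (addr >= (uint64_t)-TOY_MAX_ERRNO)
		return (int)(int64_t)addr;
	return 0;
}

/* Index of the first descriptor whose addr holds an error, or -1. */
static ptrdiff_t toy_first_failed(const struct toy_mmap_desc *d, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		if (toy_addr_to_err(d[i].addr))
			return (ptrdiff_t)i;
	return -1;
}
```

Scanning the array avoids depending on whether the CQE count includes
the failed entry. Descriptors after the failing one are not written
back, so they keep whatever address hint the caller supplied.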

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
---
 include/uapi/linux/io_uring.h |  10 +++
 io_uring/Makefile             |   2 +-
 io_uring/mmap.c               | 147 ++++++++++++++++++++++++++++++++++
 io_uring/mmap.h               |   4 +
 io_uring/opdef.c              |   9 +++
 5 files changed, 171 insertions(+), 1 deletion(-)
 create mode 100644 io_uring/mmap.c
 create mode 100644 io_uring/mmap.h

diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index b5b23c0d5283..e24fe3b00059 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -74,6 +74,7 @@ struct io_uring_sqe {
 		__u32		install_fd_flags;
 		__u32		nop_flags;
 		__u32		pipe_flags;
+		__u32		mmap_flags;
 	};
 	__u64	user_data;	/* data to be passed back at completion time */
 	/* pack this to avoid bogus arm OABI complaints */
@@ -303,6 +304,7 @@ enum io_uring_op {
 	IORING_OP_PIPE,
 	IORING_OP_NOP128,
 	IORING_OP_URING_CMD128,
+	IORING_OP_MMAP,
 
 	/* this goes last, obviously */
 	IORING_OP_LAST,
@@ -1113,6 +1115,14 @@ struct zcrx_ctrl {
 	};
 };
 
+struct io_uring_mmap_desc {
+	void __user *addr;
+	unsigned long len;
+	unsigned long pgoff;
+	unsigned int prot;
+	unsigned int flags;
+};
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/io_uring/Makefile b/io_uring/Makefile
index bc4e4a3fa0a5..be0fa605f87d 100644
--- a/io_uring/Makefile
+++ b/io_uring/Makefile
@@ -13,7 +13,7 @@ obj-$(CONFIG_IO_URING)		+= io_uring.o opdef.o kbuf.o rsrc.o notif.o \
 					sync.o msg_ring.o advise.o openclose.o \
 					statx.o timeout.o cancel.o \
 					waitid.o register.o truncate.o \
-					memmap.o alloc_cache.o query.o
+					memmap.o mmap.o alloc_cache.o query.o
 obj-$(CONFIG_IO_URING_ZCRX)	+= zcrx.o
 obj-$(CONFIG_IO_WQ)		+= io-wq.o
 obj-$(CONFIG_FUTEX)		+= futex.o
diff --git a/io_uring/mmap.c b/io_uring/mmap.c
new file mode 100644
index 000000000000..14b960707bb2
--- /dev/null
+++ b/io_uring/mmap.c
@@ -0,0 +1,147 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <linux/file.h>
+#include <linux/io_uring.h>
+#include <linux/hugetlb.h>
+#include <linux/mm.h>
+#include <linux/mm_inline.h>
+#include <linux/shm.h>
+#include <linux/mman.h>
+#include <linux/audit.h>
+#include "../mm/internal.h"
+#include <uapi/linux/io_uring.h>
+
+#include "io_uring.h"
+#include "mmap.h"
+#include "rsrc.h"
+
+struct io_mmap_data {
+	struct file *file;
+	unsigned long flags;
+	struct io_uring_mmap_desc __user *uaddr;
+};
+struct io_mmap_async {
+	int nr_maps;
+	struct io_uring_mmap_desc maps[] __counted_by(nr_maps);
+};
+
+#define MMAP_MAX_BATCH 1024
+
+int io_mmap_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+{
+	struct io_mmap_data *mmap = io_kiocb_to_cmd(req, struct io_mmap_data);
+	struct io_mmap_async *maps;
+	int nr_maps;
+
+	mmap->uaddr = u64_to_user_ptr(READ_ONCE(sqe->addr));
+	mmap->flags = READ_ONCE(sqe->mmap_flags);
+	nr_maps = READ_ONCE(sqe->len);
+
+	if (mmap->flags & MAP_ANONYMOUS && req->cqe.fd != -1)
+		return -EINVAL;
+	if (nr_maps < 0 || nr_maps > MMAP_MAX_BATCH)
+		return -EINVAL;
+	if (!access_ok(mmap->uaddr, nr_maps*sizeof(struct io_uring_mmap_desc)))
+		return -EFAULT;
+
+	maps = kzalloc(struct_size_t(struct io_mmap_async, maps, nr_maps),
+		       GFP_KERNEL);
+	if (!maps)
+		return -ENOMEM;
+	maps->nr_maps = nr_maps;
+
+	req->flags |= REQ_F_ASYNC_DATA;
+	req->async_data = maps;
+	return 0;
+}
+
+static int io_prep_mmap_hugetlb(struct file **filp, unsigned long *len,
+				int flags)
+{
+	if (*filp) {
+		*len = ALIGN(*len, huge_page_size(hstate_file(*filp)));
+	} else {
+		struct hstate *hs;
+		unsigned long nlen = *len;
+
+		hs = hstate_sizelog((flags >> MAP_HUGE_SHIFT) & MAP_HUGE_MASK);
+		if (!hs)
+			return -EINVAL;
+		nlen = ALIGN(nlen, huge_page_size(hs));
+		*filp = hugetlb_file_setup(HUGETLB_ANON_FILE, nlen,
+					   VM_NORESERVE,
+					   HUGETLB_ANONHUGE_INODE,
+				   (flags >> MAP_HUGE_SHIFT) & MAP_HUGE_MASK);
+
+		if (IS_ERR(*filp))
+			return PTR_ERR(*filp);
+		*len = nlen;
+	}
+	return 0;
+}
+
+int io_mmap(struct io_kiocb *req, unsigned int issue_flags)
+{
+	struct io_mmap_data *mmap = io_kiocb_to_cmd(req, struct io_mmap_data);
+	struct io_mmap_async *data = (struct io_mmap_async *) req->async_data;
+	int i, mapped, ret;
+
+	if (unlikely(mmap->flags & MAP_HUGETLB && req->file &&
+		     !is_file_hugepages(req->file))) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	for (i = 0; i < data->nr_maps; i++) {
+		struct io_uring_mmap_desc *desc = &data->maps[i];
+
+		if (copy_from_user(desc, &mmap->uaddr[i], sizeof(*desc))) {
+			ret = -EFAULT;
+			goto out;
+		}
+	}
+
+	mapped = 0;
+	while (mapped < data->nr_maps) {
+		struct io_uring_mmap_desc *desc = &data->maps[mapped++];
+		unsigned long flags = (mmap->flags | desc->flags);
+		unsigned long len = desc->len;
+		struct file *file = req->file;
+
+		/* These cannot be mixed and matched.  need to be passed
+		 * on the SQE.
+		 */
+		if (unlikely(desc->flags & (MAP_ANONYMOUS|MAP_HUGETLB))) {
+			desc->addr = ERR_PTR(-EINVAL);
+			break;
+		}
+		if (!(flags & MAP_ANONYMOUS))
+			audit_mmap_fd(req->cqe.fd, flags);
+
+		if (unlikely(flags & MAP_HUGETLB)) {
+			ret = io_prep_mmap_hugetlb(&file, &len, flags);
+			if (ret) {
+				desc->addr = ERR_PTR(ret);
+				break;
+			}
+		}
+
+		desc->addr = (void *) vm_mmap_pgoff(file,
+					   (unsigned long) desc->addr,
+					   len, desc->prot, flags, desc->pgoff);
+		if (IS_ERR_OR_NULL(desc->addr))
+			break;
+	}
+
+	ret = mapped;
+	if (copy_to_user(mmap->uaddr, data->maps,
+			 sizeof(struct io_uring_mmap_desc)*mapped))
+		ret = -EFAULT;
+
+out:
+	if (ret < 0)
+		req_set_fail(req);
+	io_req_set_res(req, ret, 0);
+	return IOU_COMPLETE;
+}
diff --git a/io_uring/mmap.h b/io_uring/mmap.h
new file mode 100644
index 000000000000..acddf6db76e7
--- /dev/null
+++ b/io_uring/mmap.h
@@ -0,0 +1,4 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+int io_mmap_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe);
+int io_mmap(struct io_kiocb *req, unsigned int issue_flags);
diff --git a/io_uring/opdef.c b/io_uring/opdef.c
index df52d760240e..679e413d2395 100644
--- a/io_uring/opdef.c
+++ b/io_uring/opdef.c
@@ -29,6 +29,7 @@
 #include "epoll.h"
 #include "statx.h"
 #include "net.h"
+#include "mmap.h"
 #include "msg_ring.h"
 #include "timeout.h"
 #include "poll.h"
@@ -593,6 +594,11 @@ const struct io_issue_def io_issue_defs[] = {
 		.prep			= io_uring_cmd_prep,
 		.issue			= io_uring_cmd,
 	},
+	[IORING_OP_MMAP] = {
+		.prep			= io_mmap_prep,
+		.issue			= io_mmap,
+		.opt_file		= 1,
+	}
 };
 
 const struct io_cold_def io_cold_defs[] = {
@@ -851,6 +857,9 @@ const struct io_cold_def io_cold_defs[] = {
 		.sqe_copy		= io_uring_cmd_sqe_copy,
 		.cleanup		= io_uring_cmd_cleanup,
 	},
+	[IORING_OP_MMAP] = {
+		.name			= "MMAP",
+	},
 };
 
 const char *io_uring_get_opcode(u8 opcode)
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2] io_uring: introduce IORING_OP_MMAP
  2026-01-29 22:11 ` [PATCH 2/2] io_uring: introduce IORING_OP_MMAP Gabriel Krisman Bertazi
@ 2026-01-30  6:03   ` kernel test robot
  2026-01-30 15:47     ` Gabriel Krisman Bertazi
  2026-01-30 15:55   ` Jens Axboe
  1 sibling, 1 reply; 6+ messages in thread
From: kernel test robot @ 2026-01-30  6:03 UTC (permalink / raw)
  To: Gabriel Krisman Bertazi, axboe
  Cc: oe-kbuild-all, io-uring, Gabriel Krisman Bertazi, Andrew Morton,
	Linux Memory Management List, David Hildenbrand, Lorenzo Stoakes,
	Vlastimil Babka, Liam R. Howlett, Mike Rapoport,
	Suren Baghdasaryan, Michal Hocko

Hi Gabriel,

kernel test robot noticed the following build warnings:

[auto build test WARNING on v6.19-rc7]
[also build test WARNING on linus/master]
[cannot apply to axboe/for-next next-20260129]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Gabriel-Krisman-Bertazi/io_uring-Support-commands-with-optional-file-descriptors/20260130-061445
base:   v6.19-rc7
patch link:    https://lore.kernel.org/r/20260129221138.897715-3-krisman%40suse.de
patch subject: [PATCH 2/2] io_uring: introduce IORING_OP_MMAP
config: m68k-randconfig-r122-20260130 (https://download.01.org/0day-ci/archive/20260130/202601301341.PTetVieu-lkp@intel.com/config)
compiler: m68k-linux-gcc (GCC) 8.5.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260130/202601301341.PTetVieu-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202601301341.PTetVieu-lkp@intel.com/

sparse warnings: (new ones prefixed by >>)
>> io_uring/mmap.c:116:36: sparse: sparse: incorrect type in assignment (different address spaces) @@     expected void [noderef] __user *addr @@     got void * @@
   io_uring/mmap.c:116:36: sparse:     expected void [noderef] __user *addr
   io_uring/mmap.c:116:36: sparse:     got void *
   io_uring/mmap.c:125:44: sparse: sparse: incorrect type in assignment (different address spaces) @@     expected void [noderef] __user *addr @@     got void * @@
   io_uring/mmap.c:125:44: sparse:     expected void [noderef] __user *addr
   io_uring/mmap.c:125:44: sparse:     got void *
   io_uring/mmap.c:130:28: sparse: sparse: incorrect type in assignment (different address spaces) @@     expected void [noderef] __user *addr @@     got void * @@
   io_uring/mmap.c:130:28: sparse:     expected void [noderef] __user *addr
   io_uring/mmap.c:130:28: sparse:     got void *

vim +116 io_uring/mmap.c

    83	
    84	int io_mmap(struct io_kiocb *req, unsigned int issue_flags)
    85	{
    86		struct io_mmap_data *mmap = io_kiocb_to_cmd(req, struct io_mmap_data);
    87		struct io_mmap_async *data = (struct io_mmap_async *) req->async_data;
    88		int i, mapped, ret;
    89	
    90		if (unlikely(mmap->flags & MAP_HUGETLB && req->file &&
    91			     !is_file_hugepages(req->file))) {
    92			ret = -EINVAL;
    93			goto out;
    94		}
    95	
    96		for (i = 0; i < data->nr_maps; i++) {
    97			struct io_uring_mmap_desc *desc = &data->maps[i];
    98	
    99			if (copy_from_user(desc, &mmap->uaddr[i], sizeof(*desc))) {
   100				ret = -EFAULT;
   101				goto out;
   102			}
   103		}
   104	
   105		mapped = 0;
   106		while (mapped < data->nr_maps) {
   107			struct io_uring_mmap_desc *desc = &data->maps[mapped++];
   108			unsigned long flags = (mmap->flags | desc->flags);
   109			unsigned long len = desc->len;
   110			struct file *file = req->file;
   111	
   112			/* These cannot be mixed and matched.  need to be passed
   113			 * on the SQE.
   114			 */
   115			if (unlikely(desc->flags & (MAP_ANONYMOUS|MAP_HUGETLB))) {
 > 116				desc->addr = ERR_PTR(-EINVAL);

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2] io_uring: introduce IORING_OP_MMAP
  2026-01-30  6:03   ` kernel test robot
@ 2026-01-30 15:47     ` Gabriel Krisman Bertazi
  0 siblings, 0 replies; 6+ messages in thread
From: Gabriel Krisman Bertazi @ 2026-01-30 15:47 UTC (permalink / raw)
  To: kernel test robot
  Cc: axboe, oe-kbuild-all, io-uring, Andrew Morton,
	Linux Memory Management List, David Hildenbrand, Lorenzo Stoakes,
	Vlastimil Babka, Liam R. Howlett, Mike Rapoport,
	Suren Baghdasaryan, Michal Hocko

kernel test robot <lkp@intel.com> writes:

> Hi Gabriel,
>
> kernel test robot noticed the following build warnings:
>
> [auto build test WARNING on v6.19-rc7]
> [also build test WARNING on linus/master]
> [cannot apply to axboe/for-next next-20260129]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
>
> url:    https://github.com/intel-lab-lkp/linux/commits/Gabriel-Krisman-Bertazi/io_uring-Support-commands-with-optional-file-descriptors/20260130-061445
> base:   v6.19-rc7
> patch link:    https://lore.kernel.org/r/20260129221138.897715-3-krisman%40suse.de
> patch subject: [PATCH 2/2] io_uring: introduce IORING_OP_MMAP
> config: m68k-randconfig-r122-20260130 (https://download.01.org/0day-ci/archive/20260130/202601301341.PTetVieu-lkp@intel.com/config)
> compiler: m68k-linux-gcc (GCC) 8.5.0
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260130/202601301341.PTetVieu-lkp@intel.com/reproduce)
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202601301341.PTetVieu-lkp@intel.com/
>
> sparse warnings: (new ones prefixed by >>)
>>> io_uring/mmap.c:116:36: sparse: sparse: incorrect type in assignment (different address spaces) @@     expected void [noderef] __user *addr @@     got void * @@
>    io_uring/mmap.c:116:36: sparse:     expected void [noderef] __user *addr
>    io_uring/mmap.c:116:36: sparse:     got void *
>    io_uring/mmap.c:125:44: sparse: sparse: incorrect type in assignment (different address spaces) @@     expected void [noderef] __user *addr @@     got void * @@
>    io_uring/mmap.c:125:44: sparse:     expected void [noderef] __user *addr
>    io_uring/mmap.c:125:44: sparse:     got void *
>    io_uring/mmap.c:130:28: sparse: sparse: incorrect type in assignment (different address spaces) @@     expected void [noderef] __user *addr @@     got void * @@
>    io_uring/mmap.c:130:28: sparse:     expected void [noderef] __user *addr
>    io_uring/mmap.c:130:28: sparse:     got void *

FWIW, for reviewers, these are false positives.  The issue is that I'm
using "void __user *addr" to return either a pointer or the error code to
userspace.  It is properly copied back through copy_to_user, but sparse
still complains.  I'll look into silencing it.

-- 
Gabriel Krisman Bertazi

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2] io_uring: introduce IORING_OP_MMAP
  2026-01-29 22:11 ` [PATCH 2/2] io_uring: introduce IORING_OP_MMAP Gabriel Krisman Bertazi
  2026-01-30  6:03   ` kernel test robot
@ 2026-01-30 15:55   ` Jens Axboe
  1 sibling, 0 replies; 6+ messages in thread
From: Jens Axboe @ 2026-01-30 15:55 UTC (permalink / raw)
  To: Gabriel Krisman Bertazi
  Cc: io-uring, Andrew Morton, David Hildenbrand, Lorenzo Stoakes,
	Vlastimil Babka, Liam R. Howlett, Mike Rapoport,
	Suren Baghdasaryan, Michal Hocko, linux-mm

On 1/29/26 3:11 PM, Gabriel Krisman Bertazi wrote:
> diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
> index b5b23c0d5283..e24fe3b00059 100644
> --- a/include/uapi/linux/io_uring.h
> +++ b/include/uapi/linux/io_uring.h
> @@ -74,6 +74,7 @@ struct io_uring_sqe {
>  		__u32		install_fd_flags;
>  		__u32		nop_flags;
>  		__u32		pipe_flags;
> +		__u32		mmap_flags;
>  	};
>  	__u64	user_data;	/* data to be passed back at completion time */
>  	/* pack this to avoid bogus arm OABI complaints */
> @@ -303,6 +304,7 @@ enum io_uring_op {
>  	IORING_OP_PIPE,
>  	IORING_OP_NOP128,
>  	IORING_OP_URING_CMD128,
> +	IORING_OP_MMAP,
>  
>  	/* this goes last, obviously */
>  	IORING_OP_LAST,
> @@ -1113,6 +1115,14 @@ struct zcrx_ctrl {
>  	};
>  };
>  
> +struct io_uring_mmap_desc {
> +	void __user *addr;
> +	unsigned long len;
> +	unsigned long pgoff;
> +	unsigned int prot;
> +	unsigned int flags;
> +};

You can't use pointers or unsigned long in a uapi, as they'd be
different sizes on 32-bit and 64-bit, and then you need compat handling.
Fixed-width types are also preferable to a plain unsigned int. It's much
better to make this:

struct io_uring_mmap_desc {
	__u64 addr;
	__u64 len;
	__u64 pgoff;
	__u32 prot;
	__u32 flags;
};

and then generally also a good idea to have a bit of expansion space
there, so you don't need a new desc down the line.
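
A minimal sketch of what such a layout could look like, with
compile-time checks that it has the same size and offsets on 32-bit and
64-bit ABIs. The struct name, the two-word resv[] array, and the
toy_desc_resv_ok() helper are assumptions made for illustration; the
thread does not settle on an amount of expansion space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical fixed-width layout. The first five fields follow the
 * suggestion above; resv[] is an assumed amount of expansion space.
 */
struct toy_mmap_desc_resv {
	uint64_t addr;
	uint64_t len;
	uint64_t pgoff;
	uint32_t prot;
	uint32_t flags;
	uint64_t resv[2];	/* assumed: must be zero until defined */
};

/* These hold regardless of pointer width or long size. */
static_assert(sizeof(struct toy_mmap_desc_resv) == 48, "ABI-independent");
static_assert(offsetof(struct toy_mmap_desc_resv, prot) == 24, "no padding");
static_assert(offsetof(struct toy_mmap_desc_resv, resv) == 32, "no padding");

/* Reject descriptors that set reserved space, so it can be reused later. */
static bool toy_desc_resv_ok(const struct toy_mmap_desc_resv *d)
{
	return d->resv[0] == 0 && d->resv[1] == 0;
}
```

Checking that reserved fields are zero from day one is what makes them
usable later: old binaries are guaranteed not to have passed garbage
there.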

> +int io_mmap_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
> +{
> +	struct io_mmap_data *mmap = io_kiocb_to_cmd(req, struct io_mmap_data);
> +	struct io_mmap_async *maps;
> +	int nr_maps;
> +
> +	mmap->uaddr = u64_to_user_ptr(READ_ONCE(sqe->addr));
> +	mmap->flags = READ_ONCE(sqe->mmap_flags);
> +	nr_maps = READ_ONCE(sqe->len);
> +
> +	if (mmap->flags & MAP_ANONYMOUS && req->cqe.fd != -1)
> +		return -EINVAL;
> +	if (nr_maps < 0 || nr_maps > MMAP_MAX_BATCH)
> +		return -EINVAL;
> +	if (!access_ok(mmap->uaddr, nr_maps*sizeof(struct io_uring_mmap_desc)))
> +		return -EFAULT;

Does this access_ok actually provide anything? We're copying it in later
anyway, no?

> +static int io_prep_mmap_hugetlb(struct file **filp, unsigned long *len,
> +				int flags)
> +{
> +	if (*filp) {
> +		*len = ALIGN(*len, huge_page_size(hstate_file(*filp)));
> +	} else {
> +		struct hstate *hs;
> +		unsigned long nlen = *len;
> +
> +		hs = hstate_sizelog((flags >> MAP_HUGE_SHIFT) & MAP_HUGE_MASK);
> +		if (!hs)
> +			return -EINVAL;
> +		nlen = ALIGN(nlen, huge_page_size(hs));
> +		*filp = hugetlb_file_setup(HUGETLB_ANON_FILE, nlen,
> +					   VM_NORESERVE,
> +					   HUGETLB_ANONHUGE_INODE,
> +				   (flags >> MAP_HUGE_SHIFT) & MAP_HUGE_MASK);

This looks like it dips into vm_mmap_pgoff(). More on that below.

> +		desc->addr = (void *) vm_mmap_pgoff(file,
> +					   (unsigned long) desc->addr,
> +					   len, desc->prot, flags, desc->pgoff);

One concern here is that vm_mmap_pgoff() ends up doing:

mmap_write_lock_killable(mm)
	grabs mm lock, can block, for a long time?

which could potentially stall the io_uring pipeline for a long time.
Ideally you'd be able to do something where you try to grab the mm lock
from io_mmap(), and if it fails, then either fail the request (if it's a
killable thing) or punt it with -EAGAIN to let an io-wq thread handle
it.
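
The shape of that idea, modeled in userspace under heavy assumptions:
toy_lock is a single atomic flag standing in for the mm lock, and
toy_issue() is a hypothetical inline issue path. None of these names
exist in the kernel; the sketch only shows the try-then-punt control
flow, not how it would hook into vm_mmap_pgoff().

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for the mm lock: a single flag, set while "held". */
struct toy_lock {
	atomic_flag held;
};

/* Take the lock only if it is free; never wait. */
static bool toy_trylock(struct toy_lock *l)
{
	return !atomic_flag_test_and_set_explicit(&l->held,
						  memory_order_acquire);
}

static void toy_unlock(struct toy_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/*
 * Inline issue path. On contention, return -EAGAIN so the caller can
 * retry from a context that is allowed to block.
 */
static int toy_issue(struct toy_lock *l, int *work_done)
{
	if (!toy_trylock(l))
		return -EAGAIN;
	(*work_done)++;		/* the mapping work would happen here */
	toy_unlock(l);
	return 0;
}
```

The point of the pattern is that the submitting task never sleeps on
the lock; the blocking wait, if any, happens in a worker instead.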

I'm not so sure simply wrapping vm_mmap_pgoff() either directly or
indirectly via the hugetlb stuff is going to be super useful, if we can
end up blocking for a long time on these operations.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2026-01-30 15:55 UTC | newest]
