public inbox for [email protected]
* [PATCH v3 0/1] Add a sysctl to disable io_uring system-wide
@ 2023-06-30 15:10 Matteo Rizzo
  2023-06-30 15:10 ` [PATCH v3 1/1] io_uring: add " Matteo Rizzo
  2023-07-11 20:51 ` [PATCH v3 0/1] Add " Jens Axboe
  0 siblings, 2 replies; 9+ messages in thread
From: Matteo Rizzo @ 2023-06-30 15:10 UTC (permalink / raw)
  To: linux-doc, linux-kernel, io-uring, axboe, asml.silence
  Cc: matteorizzo, corbet, akpm, keescook, ribalda, rostedt, jannh,
	chenhuacai, gpiccoli, ldufour, evn, poprdi, jordyzomer, jmoyer,
	krisman

Over the last few years we've seen many critical vulnerabilities in
io_uring[1] which could be exploited by an unprivileged process to gain
control over the kernel. This patch introduces a new sysctl which disables
the creation of new io_uring instances system-wide.

The goal of this patch is to give distros, system admins, and cloud
providers a way to reduce the risk of privilege escalation through io_uring
where disabling it with seccomp or at compile time is not practical. For
example, a distro or cloud provider might want to disable io_uring by
default and have users enable it again if they need to run a program that
requires it. The new sysctl is designed to let a user with root on the
machine enable and disable io_uring system-wide at runtime without requiring
a kernel recompilation or a reboot.

[1] Link: https://goo.gle/limit-iouring

---
v3:
	* Fix the commit message
	* Use READ_ONCE in io_uring_allowed to avoid races
	* Add reviews
v2:
	* Documentation style fixes
	* Add a third level that only disables io_uring for unprivileged
	  processes


Matteo Rizzo (1):
  io_uring: add a sysctl to disable io_uring system-wide

 Documentation/admin-guide/sysctl/kernel.rst | 19 +++++++++++++
 io_uring/io_uring.c                         | 31 +++++++++++++++++++++
 2 files changed, 50 insertions(+)


base-commit: 1601fb26b26758668533bdb211fdfbb5234367ee
-- 
2.41.0.255.g8b1d071c50-goog



* [PATCH v3 1/1] io_uring: add a sysctl to disable io_uring system-wide
  2023-06-30 15:10 [PATCH v3 0/1] Add a sysctl to disable io_uring system-wide Matteo Rizzo
@ 2023-06-30 15:10 ` Matteo Rizzo
  2023-06-30 15:15   ` Jann Horn
  2023-07-26 17:45   ` Andres Freund
  2023-07-11 20:51 ` [PATCH v3 0/1] Add " Jens Axboe
  1 sibling, 2 replies; 9+ messages in thread
From: Matteo Rizzo @ 2023-06-30 15:10 UTC (permalink / raw)
  To: linux-doc, linux-kernel, io-uring, axboe, asml.silence
  Cc: matteorizzo, corbet, akpm, keescook, ribalda, rostedt, jannh,
	chenhuacai, gpiccoli, ldufour, evn, poprdi, jordyzomer, jmoyer,
	krisman

Introduce a new sysctl (io_uring_disabled) which can be either 0, 1,
or 2. When 0 (the default), all processes are allowed to create io_uring
instances, which is the current behavior. When 1, all calls to
io_uring_setup fail with -EPERM unless the calling process has
CAP_SYS_ADMIN. When 2, calls to io_uring_setup fail with -EPERM
regardless of privilege.
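
As a rough userspace illustration of the three levels described above (a
sketch, not part of the patch): the path below follows from the table being
registered under "kernel" with procname "io_uring_disabled". The helper
names read_io_uring_disabled and io_uring_setup_permitted are hypothetical,
and the caller is assumed to know whether it holds CAP_SYS_ADMIN.

```c
#include <stdbool.h>
#include <stdio.h>

/* Location of the knob as implied by the patch ("kernel" + procname). */
#define IO_URING_DISABLED_PATH "/proc/sys/kernel/io_uring_disabled"

/*
 * Read the sysctl level from `path`.
 * Returns 0, 1 or 2 on success, or -1 if the file cannot be opened
 * (for example on a kernel without this patch) or does not hold a
 * value in the documented range.
 */
int read_io_uring_disabled(const char *path)
{
	FILE *f = fopen(path, "r");
	int level;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &level) != 1)
		level = -1;
	fclose(f);

	return (level >= 0 && level <= 2) ? level : -1;
}

/*
 * Userspace model of the documented semantics:
 *   0 -> everyone may call io_uring_setup
 *   1 -> only callers with CAP_SYS_ADMIN
 *   2 -> nobody
 * `has_cap_sys_admin` is supplied by the caller (an assumption of this
 * sketch; the kernel checks the capability itself). Any other level,
 * including the -1 error value, is treated as "not permitted" here.
 */
bool io_uring_setup_permitted(int level, bool has_cap_sys_admin)
{
	return level == 0 || (level == 1 && has_cap_sys_admin);
}
```

A program could call read_io_uring_disabled(IO_URING_DISABLED_PATH) at
startup and fall back to another I/O path when the result, combined with its
own privileges, says io_uring_setup would fail with -EPERM. A return of -1
only means the knob could not be read, so the caller has to decide how to
treat that case.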

Signed-off-by: Matteo Rizzo <[email protected]>
Reviewed-by: Jeff Moyer <[email protected]>
Reviewed-by: Gabriel Krisman Bertazi <[email protected]>
---
 Documentation/admin-guide/sysctl/kernel.rst | 19 +++++++++++++
 io_uring/io_uring.c                         | 31 +++++++++++++++++++++
 2 files changed, 50 insertions(+)

diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
index 3800fab1619b..ee65f7aeb0cf 100644
--- a/Documentation/admin-guide/sysctl/kernel.rst
+++ b/Documentation/admin-guide/sysctl/kernel.rst
@@ -450,6 +450,25 @@ this allows system administrators to override the
 ``IA64_THREAD_UAC_NOPRINT`` ``prctl`` and avoid logs being flooded.
 
 
+io_uring_disabled
+=================
+
+Prevents all processes from creating new io_uring instances. Enabling this
+shrinks the kernel's attack surface.
+
+= ==================================================================
+0 All processes can create io_uring instances as normal. This is the
+  default setting.
+1 io_uring creation is disabled for unprivileged processes.
+  io_uring_setup fails with -EPERM unless the calling process is
+  privileged (CAP_SYS_ADMIN). Existing io_uring instances can
+  still be used.
+2 io_uring creation is disabled for all processes. io_uring_setup
+  always fails with -EPERM. Existing io_uring instances can still be
+  used.
+= ==================================================================
+
+
 kexec_load_disabled
 ===================
 
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index e8096d502a7c..5410f5576980 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -152,6 +152,22 @@ static void __io_submit_flush_completions(struct io_ring_ctx *ctx);
 
 struct kmem_cache *req_cachep;
 
+static int __read_mostly sysctl_io_uring_disabled;
+#ifdef CONFIG_SYSCTL
+static struct ctl_table kernel_io_uring_disabled_table[] = {
+	{
+		.procname	= "io_uring_disabled",
+		.data		= &sysctl_io_uring_disabled,
+		.maxlen		= sizeof(sysctl_io_uring_disabled),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= SYSCTL_ZERO,
+		.extra2		= SYSCTL_TWO,
+	},
+	{},
+};
+#endif
+
 struct sock *io_uring_get_socket(struct file *file)
 {
 #if defined(CONFIG_UNIX)
@@ -4015,9 +4031,19 @@ static long io_uring_setup(u32 entries, struct io_uring_params __user *params)
 	return io_uring_create(entries, &p, params);
 }
 
+static inline bool io_uring_allowed(void)
+{
+	int disabled = READ_ONCE(sysctl_io_uring_disabled);
+
+	return disabled == 0 || (disabled == 1 && capable(CAP_SYS_ADMIN));
+}
+
 SYSCALL_DEFINE2(io_uring_setup, u32, entries,
 		struct io_uring_params __user *, params)
 {
+	if (!io_uring_allowed())
+		return -EPERM;
+
 	return io_uring_setup(entries, params);
 }
 
@@ -4592,6 +4618,11 @@ static int __init io_uring_init(void)
 
 	req_cachep = KMEM_CACHE(io_kiocb, SLAB_HWCACHE_ALIGN | SLAB_PANIC |
 				SLAB_ACCOUNT | SLAB_TYPESAFE_BY_RCU);
+
+#ifdef CONFIG_SYSCTL
+	register_sysctl_init("kernel", kernel_io_uring_disabled_table);
+#endif
+
 	return 0;
 };
 __initcall(io_uring_init);
-- 
2.41.0.255.g8b1d071c50-goog



* Re: [PATCH v3 1/1] io_uring: add a sysctl to disable io_uring system-wide
  2023-06-30 15:10 ` [PATCH v3 1/1] io_uring: add " Matteo Rizzo
@ 2023-06-30 15:15   ` Jann Horn
  2023-07-26 17:45   ` Andres Freund
  1 sibling, 0 replies; 9+ messages in thread
From: Jann Horn @ 2023-06-30 15:15 UTC (permalink / raw)
  To: Matteo Rizzo
  Cc: linux-doc, linux-kernel, io-uring, axboe, asml.silence, corbet,
	akpm, keescook, ribalda, rostedt, chenhuacai, gpiccoli, ldufour,
	evn, poprdi, jordyzomer, jmoyer, krisman

On Fri, Jun 30, 2023 at 5:10 PM Matteo Rizzo <[email protected]> wrote:
> Introduce a new sysctl (io_uring_disabled) which can be either 0, 1,
> or 2. When 0 (the default), all processes are allowed to create io_uring
> instances, which is the current behavior. When 1, all calls to
> io_uring_setup fail with -EPERM unless the calling process has
> CAP_SYS_ADMIN. When 2, calls to io_uring_setup fail with -EPERM
> regardless of privilege.
>
> Signed-off-by: Matteo Rizzo <[email protected]>
> Reviewed-by: Jeff Moyer <[email protected]>
> Reviewed-by: Gabriel Krisman Bertazi <[email protected]>

Reviewed-by: Jann Horn <[email protected]>


* Re: [PATCH v3 0/1] Add a sysctl to disable io_uring system-wide
  2023-06-30 15:10 [PATCH v3 0/1] Add a sysctl to disable io_uring system-wide Matteo Rizzo
  2023-06-30 15:10 ` [PATCH v3 1/1] io_uring: add " Matteo Rizzo
@ 2023-07-11 20:51 ` Jens Axboe
  1 sibling, 0 replies; 9+ messages in thread
From: Jens Axboe @ 2023-07-11 20:51 UTC (permalink / raw)
  To: linux-doc, linux-kernel, io-uring, asml.silence, Matteo Rizzo
  Cc: corbet, akpm, keescook, ribalda, rostedt, jannh, chenhuacai,
	gpiccoli, ldufour, evn, poprdi, jordyzomer, jmoyer, krisman


On Fri, 30 Jun 2023 15:10:02 +0000, Matteo Rizzo wrote:
> Over the last few years we've seen many critical vulnerabilities in
> io_uring[1] which could be exploited by an unprivileged process to gain
> control over the kernel. This patch introduces a new sysctl which disables
> the creation of new io_uring instances system-wide.
> 
> The goal of this patch is to give distros, system admins, and cloud
> providers a way to reduce the risk of privilege escalation through io_uring
> where disabling it with seccomp or at compile time is not practical. For
> example, a distro or cloud provider might want to disable io_uring by
> default and have users enable it again if they need to run a program that
> requires it. The new sysctl is designed to let a user with root on the
> machine enable and disable io_uring system-wide at runtime without requiring
> a kernel recompilation or a reboot.
> 
> [...]

Applied, thanks!

[1/1] io_uring: add a sysctl to disable io_uring system-wide
      commit: d55f54dac19a0cee1818353ab5aa3edac9034db4

Best regards,
-- 
Jens Axboe





* Re: [PATCH v3 1/1] io_uring: add a sysctl to disable io_uring system-wide
  2023-06-30 15:10 ` [PATCH v3 1/1] io_uring: add " Matteo Rizzo
  2023-06-30 15:15   ` Jann Horn
@ 2023-07-26 17:45   ` Andres Freund
  2023-07-26 20:02     ` Jeff Moyer
  1 sibling, 1 reply; 9+ messages in thread
From: Andres Freund @ 2023-07-26 17:45 UTC (permalink / raw)
  To: Matteo Rizzo
  Cc: linux-doc, linux-kernel, io-uring, axboe, asml.silence, corbet,
	akpm, keescook, ribalda, rostedt, jannh, chenhuacai, gpiccoli,
	ldufour, evn, poprdi, jordyzomer, jmoyer, krisman

Hi,

On 2023-06-30 15:10:03 +0000, Matteo Rizzo wrote:
> Introduce a new sysctl (io_uring_disabled) which can be either 0, 1,
> or 2. When 0 (the default), all processes are allowed to create io_uring
> instances, which is the current behavior. When 1, all calls to
> io_uring_setup fail with -EPERM unless the calling process has
> CAP_SYS_ADMIN. When 2, calls to io_uring_setup fail with -EPERM
> regardless of privilege.

Hm, is there a chance that instead of requiring CAP_SYS_ADMIN, a certain group
could be required (similar to hugetlb_shm_group)? Requiring CAP_SYS_ADMIN
could have the unintended consequence of io_uring requiring tasks being run
with more privileges than needed... Or some other more granular way of
granting the right to use io_uring?

ISTM that it'd be nice if e.g. a systemd service specification could allow
some services to use io_uring, without allowing it for everyone, or requiring
to run services effectively as root.

Greetings,

Andres Freund


* Re: [PATCH v3 1/1] io_uring: add a sysctl to disable io_uring system-wide
  2023-07-26 17:45   ` Andres Freund
@ 2023-07-26 20:02     ` Jeff Moyer
  2023-08-09 15:09       ` Andres Freund
  0 siblings, 1 reply; 9+ messages in thread
From: Jeff Moyer @ 2023-07-26 20:02 UTC (permalink / raw)
  To: Andres Freund
  Cc: Matteo Rizzo, linux-doc, linux-kernel, io-uring, axboe,
	asml.silence, corbet, akpm, keescook, ribalda, rostedt, jannh,
	chenhuacai, gpiccoli, ldufour, evn, poprdi, jordyzomer, krisman

Hi, Andres,

Andres Freund <[email protected]> writes:

> Hi,
>
> On 2023-06-30 15:10:03 +0000, Matteo Rizzo wrote:
>> Introduce a new sysctl (io_uring_disabled) which can be either 0, 1,
>> or 2. When 0 (the default), all processes are allowed to create io_uring
>> instances, which is the current behavior. When 1, all calls to
>> io_uring_setup fail with -EPERM unless the calling process has
>> CAP_SYS_ADMIN. When 2, calls to io_uring_setup fail with -EPERM
>> regardless of privilege.
>
> Hm, is there a chance that instead of requiring CAP_SYS_ADMIN, a certain group
> could be required (similar to hugetlb_shm_group)? Requiring CAP_SYS_ADMIN
> could have the unintended consequence of io_uring requiring tasks being run
> with more privileges than needed... Or some other more granular way of
> granting the right to use io_uring?

That's fine with me, so long as there is still an option to completely
disable io_uring.

> ISTM that it'd be nice if e.g. a systemd service specification could allow
> some services to use io_uring, without allowing it for everyone, or requiring
> to run services effectively as root.

Do you have a proposal for how that would work?  Why is this preferable
to using a group?

Cheers,
Jeff



* Re: [PATCH v3 1/1] io_uring: add a sysctl to disable io_uring system-wide
  2023-07-26 20:02     ` Jeff Moyer
@ 2023-08-09 15:09       ` Andres Freund
  2023-08-09 16:45         ` Jens Axboe
  2023-08-09 18:38         ` Gabriel Krisman Bertazi
  0 siblings, 2 replies; 9+ messages in thread
From: Andres Freund @ 2023-08-09 15:09 UTC (permalink / raw)
  To: Jeff Moyer
  Cc: Matteo Rizzo, linux-doc, linux-kernel, io-uring, axboe,
	asml.silence, corbet, akpm, keescook, ribalda, rostedt, jannh,
	chenhuacai, gpiccoli, ldufour, evn, poprdi, jordyzomer, krisman

Hi,

Sorry for the delayed response, EINBOXOVERFLOW.

On 2023-07-26 16:02:26 -0400, Jeff Moyer wrote:
> Andres Freund <[email protected]> writes:
> 
> > Hi,
> >
> > On 2023-06-30 15:10:03 +0000, Matteo Rizzo wrote:
> >> Introduce a new sysctl (io_uring_disabled) which can be either 0, 1,
> >> or 2. When 0 (the default), all processes are allowed to create io_uring
> >> instances, which is the current behavior. When 1, all calls to
> >> io_uring_setup fail with -EPERM unless the calling process has
> >> CAP_SYS_ADMIN. When 2, calls to io_uring_setup fail with -EPERM
> >> regardless of privilege.
> >
> > Hm, is there a chance that instead of requiring CAP_SYS_ADMIN, a certain group
> > could be required (similar to hugetlb_shm_group)? Requiring CAP_SYS_ADMIN
> > could have the unintended consequence of io_uring requiring tasks being run
> > with more privileges than needed... Or some other more granular way of
> > granting the right to use io_uring?
> 
> That's fine with me, so long as there is still an option to completely
> disable io_uring.

Makes sense.


> > ISTM that it'd be nice if e.g. a systemd service specification could allow
> > some services to use io_uring, without allowing it for everyone, or requiring
> > to run services effectively as root.
> 
> Do you have a proposal for how that would work?

I think group based permissions would allow for it, even if perhaps not in the
most beautiful manner. Systemd can configure additional groups for a service
with SupplementaryGroups, so adding a "io_uring" group or such should work.

Greetings,

Andres Freund


* Re: [PATCH v3 1/1] io_uring: add a sysctl to disable io_uring system-wide
  2023-08-09 15:09       ` Andres Freund
@ 2023-08-09 16:45         ` Jens Axboe
  2023-08-09 18:38         ` Gabriel Krisman Bertazi
  1 sibling, 0 replies; 9+ messages in thread
From: Jens Axboe @ 2023-08-09 16:45 UTC (permalink / raw)
  To: Andres Freund, Jeff Moyer
  Cc: Matteo Rizzo, linux-doc, linux-kernel, io-uring, asml.silence,
	corbet, akpm, keescook, ribalda, rostedt, jannh, chenhuacai,
	gpiccoli, ldufour, evn, poprdi, jordyzomer, krisman

On 8/9/23 9:09 AM, Andres Freund wrote:
> Hi,
> 
> Sorry for the delayed response, EINBOXOVERFLOW.
> 
> On 2023-07-26 16:02:26 -0400, Jeff Moyer wrote:
>> Andres Freund <[email protected]> writes:
>>
>>> Hi,
>>>
>>> On 2023-06-30 15:10:03 +0000, Matteo Rizzo wrote:
>>>> Introduce a new sysctl (io_uring_disabled) which can be either 0, 1,
>>>> or 2. When 0 (the default), all processes are allowed to create io_uring
>>>> instances, which is the current behavior. When 1, all calls to
>>>> io_uring_setup fail with -EPERM unless the calling process has
>>>> CAP_SYS_ADMIN. When 2, calls to io_uring_setup fail with -EPERM
>>>> regardless of privilege.
>>>
>>> Hm, is there a chance that instead of requiring CAP_SYS_ADMIN, a certain group
>>> could be required (similar to hugetlb_shm_group)? Requiring CAP_SYS_ADMIN
>>> could have the unintended consequence of io_uring requiring tasks being run
>>> with more privileges than needed... Or some other more granular way of
>>> granting the right to use io_uring?
>>
>> That's fine with me, so long as there is still an option to completely
>> disable io_uring.
> 
> Makes sense.
> 
> 
>>> ISTM that it'd be nice if e.g. a systemd service specification could allow
>>> some services to use io_uring, without allowing it for everyone, or requiring
>>> to run services effectively as root.
>>
>> Do you have a proposal for how that would work?
> 
> I think group based permissions would allow for it, even if perhaps not in the
> most beautiful manner. Systemd can configure additional groups for a service
> with SupplementaryGroups, so adding a "io_uring" group or such should work.

I'm going to drop the original patch until we work out a scheme that
everybody is happy with, and that is flexible enough.

-- 
Jens Axboe



* Re: [PATCH v3 1/1] io_uring: add a sysctl to disable io_uring system-wide
  2023-08-09 15:09       ` Andres Freund
  2023-08-09 16:45         ` Jens Axboe
@ 2023-08-09 18:38         ` Gabriel Krisman Bertazi
  1 sibling, 0 replies; 9+ messages in thread
From: Gabriel Krisman Bertazi @ 2023-08-09 18:38 UTC (permalink / raw)
  To: Andres Freund
  Cc: Jeff Moyer, Matteo Rizzo, linux-doc, linux-kernel, io-uring,
	axboe, asml.silence, corbet, akpm, keescook, ribalda, rostedt,
	jannh, chenhuacai, gpiccoli, ldufour, evn, poprdi, jordyzomer

Andres Freund <[email protected]> writes:

> Hi,
>
> Sorry for the delayed response, EINBOXOVERFLOW.
>
> On 2023-07-26 16:02:26 -0400, Jeff Moyer wrote:
>> Andres Freund <[email protected]> writes:
>> 
>> > Hi,
>> >
>> > On 2023-06-30 15:10:03 +0000, Matteo Rizzo wrote:
>> >> Introduce a new sysctl (io_uring_disabled) which can be either 0, 1,
>> >> or 2. When 0 (the default), all processes are allowed to create io_uring
>> >> instances, which is the current behavior. When 1, all calls to
>> >> io_uring_setup fail with -EPERM unless the calling process has
>> >> CAP_SYS_ADMIN. When 2, calls to io_uring_setup fail with -EPERM
>> >> regardless of privilege.
>> >
>> > Hm, is there a chance that instead of requiring CAP_SYS_ADMIN, a certain group
>> > could be required (similar to hugetlb_shm_group)? Requiring CAP_SYS_ADMIN
>> > could have the unintended consequence of io_uring requiring tasks being run
>> > with more privileges than needed... Or some other more granular way of
>> > granting the right to use io_uring?
>> 
>> That's fine with me, so long as there is still an option to completely
>> disable io_uring.
>
> Makes sense.
>
>
>> > ISTM that it'd be nice if e.g. a systemd service specification could allow
>> > some services to use io_uring, without allowing it for everyone, or requiring
>> > to run services effectively as root.
>> 
>> Do you have a proposal for how that would work?
>
> I think group based permissions would allow for it, even if perhaps not in the
> most beautiful manner. Systemd can configure additional groups for a service
> with SupplementaryGroups, so adding a "io_uring" group or such should
> work.

This is more complex/requires more configuration than just blocking
root/non-root. Also, might not be practical for non-systemd systems, I
suspect. Can we keep the other options in the sysctl io_uring_disabled
as well:

0 -> all allowed (default)
1 -> group based permission
2 -> root only
3 -> all blocked
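
To make the four levels listed above concrete, here is a minimal sketch of
how such a check might look. It is only an illustration of the proposal, not
code from any posted patch. The function name io_uring_allowed_sketch and
both boolean inputs are hypothetical, and it assumes that at level 1 a
caller with CAP_SYS_ADMIN is allowed even without group membership, which
the thread does not settle.

```c
#include <stdbool.h>

/* Levels as listed in the proposal; the enum names are made up here. */
enum io_uring_level_sketch {
	IOU_ALL_ALLOWED = 0,	/* all allowed (default) */
	IOU_GROUP_ONLY  = 1,	/* group based permission */
	IOU_ROOT_ONLY   = 2,	/* root only */
	IOU_ALL_BLOCKED = 3,	/* all blocked */
};

/*
 * Decide whether a caller may create an io_uring instance.
 * `in_group` and `has_cap_sys_admin` are supplied by the caller; in a
 * real kernel implementation they would come from the task's
 * credentials. Unknown levels are treated as blocked.
 */
bool io_uring_allowed_sketch(int level, bool in_group, bool has_cap_sys_admin)
{
	switch (level) {
	case IOU_ALL_ALLOWED:
		return true;
	case IOU_GROUP_ONLY:
		/* Assumption: privileged callers also pass at this level. */
		return in_group || has_cap_sys_admin;
	case IOU_ROOT_ONLY:
		return has_cap_sys_admin;
	default:
		return false;
	}
}
```

With this ordering each higher level is at least as restrictive as the one
before it, so levels 2 and 3 keep the meaning that 1 and 2 have in the
original patch.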

-- 
Gabriel Krisman Bertazi


Thread overview: 9+ messages
2023-06-30 15:10 [PATCH v3 0/1] Add a sysctl to disable io_uring system-wide Matteo Rizzo
2023-06-30 15:10 ` [PATCH v3 1/1] io_uring: add " Matteo Rizzo
2023-06-30 15:15   ` Jann Horn
2023-07-26 17:45   ` Andres Freund
2023-07-26 20:02     ` Jeff Moyer
2023-08-09 15:09       ` Andres Freund
2023-08-09 16:45         ` Jens Axboe
2023-08-09 18:38         ` Gabriel Krisman Bertazi
2023-07-11 20:51 ` [PATCH v3 0/1] Add " Jens Axboe
