* [PATCH 0/1] Add a sysctl to disable io_uring system-wide
@ 2023-06-27 12:00 Matteo Rizzo
2023-06-27 12:00 ` [PATCH 1/1] Add a new " Matteo Rizzo
0 siblings, 1 reply; 11+ messages in thread
From: Matteo Rizzo @ 2023-06-27 12:00 UTC (permalink / raw)
To: linux-doc, linux-kernel, io-uring
Cc: matteorizzo, jordyzomer, evn, poprdi, corbet, axboe, asml.silence,
akpm, keescook, rostedt, dave.hansen, ribalda, chenhuacai, steve,
gpiccoli, ldufour
Over the last few years we've seen many critical vulnerabilities in
io_uring (https://goo.gle/limit-iouring) which could be exploited by
an unprivileged process. There is currently no way to disable io_uring
system-wide except by compiling it out of the kernel entirely. The only
way to prevent a process from accessing io_uring is to use a seccomp
filter, but seccomp cannot be applied system-wide. This patch introduces a
new sysctl which disables the creation of new io_uring instances
system-wide. This gives system admins a way to reduce the kernel's attack
surface on systems where io_uring is not used.
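For illustration, a user-space check of the proposed knob might look like the sketch below. It assumes the sysctl ends up exposed as /proc/sys/kernel/io_uring_disabled, which is what the "kernel" table and the procname in the patch suggest; the helper name read_sysctl_int is hypothetical.

```c
#include <stdio.h>

/*
 * Hypothetical helper: return the integer stored in the sysctl file at
 * `path`, or -1 if the file cannot be opened or does not hold an integer.
 * The proposed knob only takes the values 0 and 1, so -1 is unambiguous.
 */
static int read_sysctl_int(const char *path)
{
	FILE *f = fopen(path, "r");
	int value;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &value) != 1)
		value = -1;
	fclose(f);
	return value;
}
```

A caller would pass "/proc/sys/kernel/io_uring_disabled" and treat a result of 1 as "new io_uring instances cannot be created"; a result of -1 most likely means the running kernel does not have the sysctl.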
Matteo Rizzo (1):
Add a new sysctl to disable io_uring system-wide
Documentation/admin-guide/sysctl/kernel.rst | 14 ++++++++++++
io_uring/io_uring.c | 24 +++++++++++++++++++++
2 files changed, 38 insertions(+)
--
2.41.0.162.gfafddb0af9-goog
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/1] Add a new sysctl to disable io_uring system-wide
2023-06-27 12:00 [PATCH 0/1] Add a sysctl to disable io_uring system-wide Matteo Rizzo
@ 2023-06-27 12:00 ` Matteo Rizzo
2023-06-27 16:23 ` Randy Dunlap
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Matteo Rizzo @ 2023-06-27 12:00 UTC (permalink / raw)
To: linux-doc, linux-kernel, io-uring
Cc: matteorizzo, jordyzomer, evn, poprdi, corbet, axboe, asml.silence,
akpm, keescook, rostedt, dave.hansen, ribalda, chenhuacai, steve,
gpiccoli, ldufour
Introduce a new sysctl (io_uring_disabled) which can be either 0 or 1.
When 0 (the default), all processes are allowed to create io_uring
instances, which is the current behavior. When 1, all calls to
io_uring_setup fail with -EPERM.
Signed-off-by: Matteo Rizzo <[email protected]>
---
Documentation/admin-guide/sysctl/kernel.rst | 14 ++++++++++++
io_uring/io_uring.c | 24 +++++++++++++++++++++
2 files changed, 38 insertions(+)
diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
index d85d90f5d000..3c53a238332a 100644
--- a/Documentation/admin-guide/sysctl/kernel.rst
+++ b/Documentation/admin-guide/sysctl/kernel.rst
@@ -450,6 +450,20 @@ this allows system administrators to override the
``IA64_THREAD_UAC_NOPRINT`` ``prctl`` and avoid logs being flooded.
+io_uring_disabled
+=========================
+
+Prevents all processes from creating new io_uring instances. Enabling this
+shrinks the kernel's attack surface.
+
+= =============================================================
+0 All processes can create io_uring instances as normal. This is the default
+ setting.
+1 io_uring is disabled. io_uring_setup always fails with -EPERM. Existing
+ io_uring instances can still be used.
+= =============================================================
+
+
kexec_load_disabled
===================
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 1b53a2ab0a27..0496ae7017f7 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -153,6 +153,22 @@ static __cold void io_fallback_tw(struct io_uring_task *tctx);
struct kmem_cache *req_cachep;
+static int __read_mostly sysctl_io_uring_disabled;
+#ifdef CONFIG_SYSCTL
+static struct ctl_table kernel_io_uring_disabled_table[] = {
+ {
+ .procname = "io_uring_disabled",
+ .data = &sysctl_io_uring_disabled,
+ .maxlen = sizeof(sysctl_io_uring_disabled),
+ .mode = 0644,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_ONE,
+ },
+ {},
+};
+#endif
+
struct sock *io_uring_get_socket(struct file *file)
{
#if defined(CONFIG_UNIX)
@@ -4003,6 +4019,9 @@ static long io_uring_setup(u32 entries, struct io_uring_params __user *params)
SYSCALL_DEFINE2(io_uring_setup, u32, entries,
struct io_uring_params __user *, params)
{
+ if (sysctl_io_uring_disabled)
+ return -EPERM;
+
return io_uring_setup(entries, params);
}
@@ -4577,6 +4596,11 @@ static int __init io_uring_init(void)
req_cachep = KMEM_CACHE(io_kiocb, SLAB_HWCACHE_ALIGN | SLAB_PANIC |
SLAB_ACCOUNT | SLAB_TYPESAFE_BY_RCU);
+
+#ifdef CONFIG_SYSCTL
+ register_sysctl_init("kernel", kernel_io_uring_disabled_table);
+#endif
+
return 0;
};
__initcall(io_uring_init);
--
2.41.0.162.gfafddb0af9-goog
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 1/1] Add a new sysctl to disable io_uring system-wide
2023-06-27 12:00 ` [PATCH 1/1] Add a new " Matteo Rizzo
@ 2023-06-27 16:23 ` Randy Dunlap
2023-06-27 17:10 ` Bart Van Assche
2023-06-28 13:50 ` Gabriel Krisman Bertazi
2 siblings, 0 replies; 11+ messages in thread
From: Randy Dunlap @ 2023-06-27 16:23 UTC (permalink / raw)
To: Matteo Rizzo, linux-doc, linux-kernel, io-uring
Cc: jordyzomer, evn, poprdi, corbet, axboe, asml.silence, akpm,
keescook, rostedt, dave.hansen, ribalda, chenhuacai, steve,
gpiccoli, ldufour
Hi--
On 6/27/23 05:00, Matteo Rizzo wrote:
> diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
> index d85d90f5d000..3c53a238332a 100644
> --- a/Documentation/admin-guide/sysctl/kernel.rst
> +++ b/Documentation/admin-guide/sysctl/kernel.rst
> @@ -450,6 +450,20 @@ this allows system administrators to override the
> ``IA64_THREAD_UAC_NOPRINT`` ``prctl`` and avoid logs being flooded.
>
>
> +io_uring_disabled
> +=========================
> +
> +Prevents all processes from creating new io_uring instances. Enabling this
> +shrinks the kernel's attack surface.
> +
> += =============================================================
> +0 All processes can create io_uring instances as normal. This is the default
> + setting.
> +1 io_uring is disabled. io_uring_setup always fails with -EPERM. Existing
> + io_uring instances can still be used.
> += =============================================================
These table lines should be extended at least as far as the text that they
enclose. I.e., the top and bottom lines should be like:
> += ==========================================================================
thanks.
--
~Randy
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/1] Add a new sysctl to disable io_uring system-wide
2023-06-27 12:00 ` [PATCH 1/1] Add a new " Matteo Rizzo
2023-06-27 16:23 ` Randy Dunlap
@ 2023-06-27 17:10 ` Bart Van Assche
2023-06-27 18:15 ` Matteo Rizzo
2023-06-28 13:50 ` Gabriel Krisman Bertazi
2 siblings, 1 reply; 11+ messages in thread
From: Bart Van Assche @ 2023-06-27 17:10 UTC (permalink / raw)
To: Matteo Rizzo, linux-doc, linux-kernel, io-uring
Cc: jordyzomer, evn, poprdi, corbet, axboe, asml.silence, akpm,
keescook, rostedt, dave.hansen, ribalda, chenhuacai, steve,
gpiccoli, ldufour
On 6/27/23 05:00, Matteo Rizzo wrote:
> +Prevents all processes from creating new io_uring instances. Enabling this
> +shrinks the kernel's attack surface.
> +
> += =============================================================
> +0 All processes can create io_uring instances as normal. This is the default
> + setting.
> +1 io_uring is disabled. io_uring_setup always fails with -EPERM. Existing
> + io_uring instances can still be used.
> += =============================================================
I'm using fio + io_uring all the time on Android devices. I think we need a
better solution than disabling io_uring system-wide, e.g. a mechanism based
on SELinux that disables io_uring for apps and that keeps io_uring enabled
for processes started via 'adb root && adb shell ...'
Bart.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/1] Add a new sysctl to disable io_uring system-wide
2023-06-27 17:10 ` Bart Van Assche
@ 2023-06-27 18:15 ` Matteo Rizzo
2023-06-28 11:36 ` Ricardo Ribalda
0 siblings, 1 reply; 11+ messages in thread
From: Matteo Rizzo @ 2023-06-27 18:15 UTC (permalink / raw)
To: Bart Van Assche
Cc: linux-doc, linux-kernel, io-uring, jordyzomer, evn, poprdi,
corbet, axboe, asml.silence, akpm, keescook, rostedt, dave.hansen,
ribalda, chenhuacai, steve, gpiccoli, ldufour
On Tue, 27 Jun 2023 at 19:10, Bart Van Assche <[email protected]> wrote:
> I'm using fio + io_uring all the time on Android devices. I think we need a
> better solution than disabling io_uring system-wide, e.g. a mechanism based
> on SELinux that disables io_uring for apps and that keeps io_uring enabled
> for processes started via 'adb root && adb shell ...'
Android already uses seccomp to prevent untrusted applications from using
io_uring. This patch is aimed at server/desktop environments where there is
no easy way to set a system-wide seccomp policy and right now the only way
to disable io_uring system-wide is to compile it out of the kernel entirely
(not really feasible for e.g. a general-purpose distro).
I thought about adding a capability check that lets privileged processes
bypass this sysctl, but it wasn't clear to me which capability I should use.
For userfaultfd the kernel uses CAP_SYS_PTRACE, but I wasn't sure that's
the best choice here since io_uring has nothing to do with ptrace.
If anyone has any suggestions please let me know. A LSM hook also sounds
like an option but it would be more complicated to implement and use.
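As a rough sketch of the per-process seccomp approach mentioned above (this is not Android's actual policy), a classic-BPF filter that makes io_uring_setup fail with EPERM could look like the following. The names deny_io_uring_setup and install_io_uring_filter are hypothetical, the fallback syscall number is only an assumption for old headers, and a production filter would also validate seccomp_data.arch before trusting the syscall number.

```c
#include <errno.h>
#include <stddef.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>

#ifndef __NR_io_uring_setup
#define __NR_io_uring_setup 425	/* assumed: the number used on most architectures */
#endif

/* Fail io_uring_setup with EPERM, allow every other syscall. */
static struct sock_filter deny_io_uring_setup[] = {
	/* Load the syscall number. */
	BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
	/* If it is io_uring_setup fall through, otherwise skip one instruction. */
	BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_io_uring_setup, 0, 1),
	BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ERRNO | (EPERM & SECCOMP_RET_DATA)),
	BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
};

/* Install the filter for the calling thread; returns 0 on success, -1 on error. */
int install_io_uring_filter(void)
{
	struct sock_fprog prog = {
		.len = sizeof(deny_io_uring_setup) / sizeof(deny_io_uring_setup[0]),
		.filter = deny_io_uring_setup,
	};

	/* Needed so that an unprivileged process may install a filter. */
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
		return -1;
	return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}
```

The filter is inherited across fork and exec, but it only covers the process tree that installed it, which is why it does not help when the goal is a system-wide switch.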
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/1] Add a new sysctl to disable io_uring system-wide
2023-06-27 18:15 ` Matteo Rizzo
@ 2023-06-28 11:36 ` Ricardo Ribalda
2023-06-28 15:12 ` Matteo Rizzo
0 siblings, 1 reply; 11+ messages in thread
From: Ricardo Ribalda @ 2023-06-28 11:36 UTC (permalink / raw)
To: Matteo Rizzo
Cc: Bart Van Assche, linux-doc, linux-kernel, io-uring, jordyzomer,
evn, poprdi, corbet, axboe, asml.silence, akpm, keescook, rostedt,
dave.hansen, chenhuacai, steve, gpiccoli, ldufour
Hi Matteo
On Tue, 27 Jun 2023 at 20:15, Matteo Rizzo <[email protected]> wrote:
>
> On Tue, 27 Jun 2023 at 19:10, Bart Van Assche <[email protected]> wrote:
> > I'm using fio + io_uring all the time on Android devices. I think we need a
> > better solution than disabling io_uring system-wide, e.g. a mechanism based
> > on SELinux that disables io_uring for apps and that keeps io_uring enabled
> > for processes started via 'adb root && adb shell ...'
>
> Android already uses seccomp to prevent untrusted applications from using
> io_uring. This patch is aimed at server/desktop environments where there is
> no easy way to set a system-wide seccomp policy and right now the only way
> to disable io_uring system-wide is to compile it out of the kernel entirely
> (not really feasible for e.g. a general-purpose distro).
>
> I thought about adding a capability check that lets privileged processes
> bypass this sysctl, but it wasn't clear to me which capability I should use.
> For userfaultfd the kernel uses CAP_SYS_PTRACE, but I wasn't sure that's
> the best choice here since io_uring has nothing to do with ptrace.
> If anyone has any suggestions please let me know. A LSM hook also sounds
> like an option but it would be more complicated to implement and use.
Have you considered making the new sysctl "sticky", like kexec_load_disabled?
When the user disables io_uring, there is no way to turn it back on until the
system is rebooted.
Best regards!
--
Ricardo Ribalda
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/1] Add a new sysctl to disable io_uring system-wide
2023-06-27 12:00 ` [PATCH 1/1] Add a new " Matteo Rizzo
2023-06-27 16:23 ` Randy Dunlap
2023-06-27 17:10 ` Bart Van Assche
@ 2023-06-28 13:50 ` Gabriel Krisman Bertazi
2023-06-28 15:59 ` Jeff Moyer
2 siblings, 1 reply; 11+ messages in thread
From: Gabriel Krisman Bertazi @ 2023-06-28 13:50 UTC (permalink / raw)
To: Matteo Rizzo
Cc: linux-doc, linux-kernel, io-uring, jordyzomer, evn, poprdi,
corbet, axboe, asml.silence, akpm, keescook, rostedt, dave.hansen,
ribalda, chenhuacai, steve, gpiccoli, ldufour
Matteo Rizzo <[email protected]> writes:
> diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
> index d85d90f5d000..3c53a238332a 100644
> --- a/Documentation/admin-guide/sysctl/kernel.rst
> +++ b/Documentation/admin-guide/sysctl/kernel.rst
> @@ -450,6 +450,20 @@ this allows system administrators to override the
> ``IA64_THREAD_UAC_NOPRINT`` ``prctl`` and avoid logs being flooded.
>
>
> +io_uring_disabled
> +=========================
> +
> +Prevents all processes from creating new io_uring instances. Enabling this
> +shrinks the kernel's attack surface.
> +
> += =============================================================
> +0 All processes can create io_uring instances as normal. This is the default
> + setting.
> +1 io_uring is disabled. io_uring_setup always fails with -EPERM. Existing
> + io_uring instances can still be used.
> += =============================================================
I had an internal request for something like this recently. If we go
this route, we could use an intermediary option that limits io_uring
to root processes only.
--
Gabriel Krisman Bertazi
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/1] Add a new sysctl to disable io_uring system-wide
2023-06-28 11:36 ` Ricardo Ribalda
@ 2023-06-28 15:12 ` Matteo Rizzo
2023-06-28 15:59 ` Jeff Moyer
2023-06-28 15:59 ` Ricardo Ribalda
0 siblings, 2 replies; 11+ messages in thread
From: Matteo Rizzo @ 2023-06-28 15:12 UTC (permalink / raw)
To: Ricardo Ribalda
Cc: Bart Van Assche, linux-doc, linux-kernel, io-uring, jordyzomer,
evn, poprdi, corbet, axboe, asml.silence, akpm, keescook, rostedt,
dave.hansen, chenhuacai, steve, gpiccoli, ldufour
On Wed, 28 Jun 2023 at 13:44, Ricardo Ribalda <[email protected]> wrote:
>
> Have you considered making the new sysctl "sticky", like kexec_load_disabled?
> When the user disables it there is no way to turn it back on until the
> system is rebooted.
Are you suggesting making this sysctl sticky? Are there any examples of how to
implement a sticky sysctl that can take more than 2 values in case we want to
add an intermediate level that still allows privileged processes to use
io_uring? Also, what would be the use case? Preventing privileged processes
from re-enabling io_uring?
Thanks!
--
Matteo
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/1] Add a new sysctl to disable io_uring system-wide
2023-06-28 13:50 ` Gabriel Krisman Bertazi
@ 2023-06-28 15:59 ` Jeff Moyer
0 siblings, 0 replies; 11+ messages in thread
From: Jeff Moyer @ 2023-06-28 15:59 UTC (permalink / raw)
To: Gabriel Krisman Bertazi
Cc: Matteo Rizzo, linux-doc, linux-kernel, io-uring, jordyzomer, evn,
poprdi, corbet, axboe, asml.silence, akpm, keescook, rostedt,
dave.hansen, ribalda, chenhuacai, steve, gpiccoli, ldufour
Gabriel Krisman Bertazi <[email protected]> writes:
> Matteo Rizzo <[email protected]> writes:
>
>> diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
>> index d85d90f5d000..3c53a238332a 100644
>> --- a/Documentation/admin-guide/sysctl/kernel.rst
>> +++ b/Documentation/admin-guide/sysctl/kernel.rst
>> @@ -450,6 +450,20 @@ this allows system administrators to override the
>> ``IA64_THREAD_UAC_NOPRINT`` ``prctl`` and avoid logs being flooded.
>>
>>
>> +io_uring_disabled
>> +=========================
>> +
>> +Prevents all processes from creating new io_uring instances. Enabling this
>> +shrinks the kernel's attack surface.
>> +
>> += =============================================================
>> +0 All processes can create io_uring instances as normal. This is the default
>> + setting.
>> +1 io_uring is disabled. io_uring_setup always fails with -EPERM. Existing
>> + io_uring instances can still be used.
>> += =============================================================
>
> I had an internal request for something like this recently. If we go
> this route, we could use an intermediary option that limits io_uring
> to root processes only.
This is all regrettable, but this option makes the most sense to me.
Testing for CAP_SYS_ADMIN or CAP_SYS_RAWIO would work for that third
option, I think.
-Jeff
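To make the three-level idea concrete, here is a small user-space model of the decision the syscall entry point would have to make. It is only a sketch: the level numbering, the enum names, and the function io_uring_setup_allowed are hypothetical, and the boolean `privileged` stands in for whatever capability test (for example CAP_SYS_ADMIN) is eventually chosen.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical numbering for a three-level knob. */
enum io_uring_policy {
	IO_URING_ALLOW_ALL = 0,		/* current behavior */
	IO_URING_PRIVILEGED_ONLY = 1,	/* intermediate level */
	IO_URING_DISABLED = 2,		/* nobody may create new instances */
};

/*
 * Return 0 if a new io_uring instance may be created, -EPERM otherwise.
 * Unknown levels are treated as the most restrictive one.
 */
static int io_uring_setup_allowed(int policy, bool privileged)
{
	switch (policy) {
	case IO_URING_ALLOW_ALL:
		return 0;
	case IO_URING_PRIVILEGED_ONLY:
		return privileged ? 0 : -EPERM;
	default:
		return -EPERM;
	}
}
```

In the kernel the same switch would sit in front of the call to io_uring_setup, with the capability check replacing the boolean argument.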
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/1] Add a new sysctl to disable io_uring system-wide
2023-06-28 15:12 ` Matteo Rizzo
@ 2023-06-28 15:59 ` Jeff Moyer
2023-06-28 15:59 ` Ricardo Ribalda
1 sibling, 0 replies; 11+ messages in thread
From: Jeff Moyer @ 2023-06-28 15:59 UTC (permalink / raw)
To: Matteo Rizzo
Cc: Ricardo Ribalda, Bart Van Assche, linux-doc, linux-kernel,
io-uring, jordyzomer, evn, poprdi, corbet, axboe, asml.silence,
akpm, keescook, rostedt, dave.hansen, chenhuacai, steve, gpiccoli,
ldufour
Matteo Rizzo <[email protected]> writes:
> On Wed, 28 Jun 2023 at 13:44, Ricardo Ribalda <[email protected]> wrote:
>>
>> Have you considered making the new sysctl "sticky", like kexec_load_disabled?
>> When the user disables it there is no way to turn it back on until the
>> system is rebooted.
>
> Are you suggesting making this sysctl sticky? Are there any examples of how to
> implement a sticky sysctl that can take more than 2 values in case we want to
> add an intermediate level that still allows privileged processes to use
> io_uring? Also, what would be the use case? Preventing privileged processes
> from re-enabling io_uring?
See unprivileged_bpf_disabled for an example. I can't speak to the use
case for a sticky value.
-Jeff
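One possible reading of "sticky" for a sysctl with more than two values is that the level can be tightened but never relaxed until reboot. The user-space model below sketches that rule only; it is not taken from unprivileged_bpf_disabled or the kexec sysctls, and the function name sticky_sysctl_write and its parameters are hypothetical.

```c
#include <errno.h>

/*
 * Model of a write to a sticky, multi-valued setting.
 * `level` is the current value, `requested` the value being written and
 * `max_level` the highest valid value. A write may keep or raise the
 * level but never lower it.
 * Returns 0 on success, -EINVAL for out-of-range values and -EPERM for
 * an attempt to relax the setting.
 */
static int sticky_sysctl_write(int *level, int requested, int max_level)
{
	if (requested < 0 || requested > max_level)
		return -EINVAL;
	if (requested < *level)
		return -EPERM;
	*level = requested;
	return 0;
}
```

With this rule a privileged process that can write the sysctl still cannot re-enable io_uring once it has been restricted, which is the use case being asked about.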
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/1] Add a new sysctl to disable io_uring system-wide
2023-06-28 15:12 ` Matteo Rizzo
2023-06-28 15:59 ` Jeff Moyer
@ 2023-06-28 15:59 ` Ricardo Ribalda
1 sibling, 0 replies; 11+ messages in thread
From: Ricardo Ribalda @ 2023-06-28 15:59 UTC (permalink / raw)
To: Matteo Rizzo
Cc: Bart Van Assche, linux-doc, linux-kernel, io-uring, jordyzomer,
evn, poprdi, corbet, axboe, asml.silence, akpm, keescook, rostedt,
dave.hansen, chenhuacai, steve, gpiccoli, ldufour
Hi Matteo
On Wed, 28 Jun 2023 at 17:12, Matteo Rizzo <[email protected]> wrote:
>
> On Wed, 28 Jun 2023 at 13:44, Ricardo Ribalda <[email protected]> wrote:
> >
> > Have you considered making the new sysctl "sticky", like kexec_load_disabled?
> > When the user disables it there is no way to turn it back on until the
> > system is rebooted.
>
> Are you suggesting making this sysctl sticky? Are there any examples of how to
> implement a sticky sysctl that can take more than 2 values in case we want to
> add an intermediate level that still allows privileged processes to use
> io_uring? Also, what would be the use case? Preventing privileged processes
> from re-enabling io_uring?
Yes, if this sysctl is accepted, I think it would make sense to make it sticky.
For more than one value, take a look at kexec_load_limit_reboot and
kexec_load_limit_panic
Thanks!
>
> Thanks!
> --
> Matteo
--
Ricardo Ribalda
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2023-06-28 16:00 UTC | newest]
Thread overview: 11+ messages
2023-06-27 12:00 [PATCH 0/1] Add a sysctl to disable io_uring system-wide Matteo Rizzo
2023-06-27 12:00 ` [PATCH 1/1] Add a new " Matteo Rizzo
2023-06-27 16:23 ` Randy Dunlap
2023-06-27 17:10 ` Bart Van Assche
2023-06-27 18:15 ` Matteo Rizzo
2023-06-28 11:36 ` Ricardo Ribalda
2023-06-28 15:12 ` Matteo Rizzo
2023-06-28 15:59 ` Jeff Moyer
2023-06-28 15:59 ` Ricardo Ribalda
2023-06-28 13:50 ` Gabriel Krisman Bertazi
2023-06-28 15:59 ` Jeff Moyer