* [PATCH 0/1] Add a sysctl to disable io_uring system-wide
From: Matteo Rizzo @ 2023-06-27 12:00 UTC
  To: linux-doc, linux-kernel, io-uring
  Cc: matteorizzo, jordyzomer, evn, poprdi, corbet, axboe, asml.silence,
      akpm, keescook, rostedt, dave.hansen, ribalda, chenhuacai, steve,
      gpiccoli, ldufour

Over the last few years we've seen many critical vulnerabilities in
io_uring (https://goo.gle/limit-iouring) which could be exploited by an
unprivileged process. There is currently no way to disable io_uring
system-wide except by compiling it out of the kernel entirely. The only
way to prevent a process from accessing io_uring is to use a seccomp
filter, but seccomp cannot be applied system-wide.

This patch introduces a new sysctl which disables the creation of new
io_uring instances system-wide. This gives system admins a way to reduce
the kernel's attack surface on systems where io_uring is not used.

Matteo Rizzo (1):
  Add a new sysctl to disable io_uring system-wide

 Documentation/admin-guide/sysctl/kernel.rst | 14 ++++++++++++
 io_uring/io_uring.c                         | 24 +++++++++++++++++++++
 2 files changed, 38 insertions(+)

--
2.41.0.162.gfafddb0af9-goog

* [PATCH 1/1] Add a new sysctl to disable io_uring system-wide
From: Matteo Rizzo @ 2023-06-27 12:00 UTC
  To: linux-doc, linux-kernel, io-uring
  Cc: matteorizzo, jordyzomer, evn, poprdi, corbet, axboe, asml.silence,
      akpm, keescook, rostedt, dave.hansen, ribalda, chenhuacai, steve,
      gpiccoli, ldufour

Introduce a new sysctl (io_uring_disabled) which can be either 0 or 1.
When 0 (the default), all processes are allowed to create io_uring
instances, which is the current behavior. When 1, all calls to
io_uring_setup fail with -EPERM.

Signed-off-by: Matteo Rizzo <[email protected]>
---
 Documentation/admin-guide/sysctl/kernel.rst | 14 ++++++++++++
 io_uring/io_uring.c                         | 24 +++++++++++++++++++++
 2 files changed, 38 insertions(+)

diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
index d85d90f5d000..3c53a238332a 100644
--- a/Documentation/admin-guide/sysctl/kernel.rst
+++ b/Documentation/admin-guide/sysctl/kernel.rst
@@ -450,6 +450,20 @@ this allows system administrators to override the
 ``IA64_THREAD_UAC_NOPRINT`` ``prctl`` and avoid logs being flooded.
 
 
+io_uring_disabled
+=========================
+
+Prevents all processes from creating new io_uring instances. Enabling this
+shrinks the kernel's attack surface.
+
+= =============================================================
+0 All processes can create io_uring instances as normal. This is the default
+  setting.
+1 io_uring is disabled. io_uring_setup always fails with -EPERM. Existing
+  io_uring instances can still be used.
+= =============================================================
+
+
 kexec_load_disabled
 ===================
 
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 1b53a2ab0a27..0496ae7017f7 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -153,6 +153,22 @@ static __cold void io_fallback_tw(struct io_uring_task *tctx);
 
 struct kmem_cache *req_cachep;
 
+static int __read_mostly sysctl_io_uring_disabled;
+#ifdef CONFIG_SYSCTL
+static struct ctl_table kernel_io_uring_disabled_table[] = {
+	{
+		.procname = "io_uring_disabled",
+		.data = &sysctl_io_uring_disabled,
+		.maxlen = sizeof(sysctl_io_uring_disabled),
+		.mode = 0644,
+		.proc_handler = proc_dointvec_minmax,
+		.extra1 = SYSCTL_ZERO,
+		.extra2 = SYSCTL_ONE,
+	},
+	{},
+};
+#endif
+
 struct sock *io_uring_get_socket(struct file *file)
 {
 #if defined(CONFIG_UNIX)
@@ -4003,6 +4019,9 @@ static long io_uring_setup(u32 entries, struct io_uring_params __user *params)
 SYSCALL_DEFINE2(io_uring_setup, u32, entries,
 		struct io_uring_params __user *, params)
 {
+	if (sysctl_io_uring_disabled)
+		return -EPERM;
+
 	return io_uring_setup(entries, params);
 }
 
@@ -4577,6 +4596,11 @@ static int __init io_uring_init(void)
 
 	req_cachep = KMEM_CACHE(io_kiocb, SLAB_HWCACHE_ALIGN | SLAB_PANIC |
 				SLAB_ACCOUNT | SLAB_TYPESAFE_BY_RCU);
+
+#ifdef CONFIG_SYSCTL
+	register_sysctl_init("kernel", kernel_io_uring_disabled_table);
+#endif
+
 	return 0;
 };
 __initcall(io_uring_init);
--
2.41.0.162.gfafddb0af9-goog

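For illustration only, userspace could get a hint about whether the sysctl
from the patch above is set by issuing io_uring_setup with arguments that can
never succeed. This is a minimal sketch and not part of the patch: the names
probe_result, classify_setup_errno and probe_io_uring are made up here. It
assumes the sysctl check runs before any argument validation, as in the
SYSCALL_DEFINE2 hunk, so a disabled kernel answers EPERM while an enabled one
rejects the NULL params pointer with EFAULT. A seccomp filter or another
policy can also produce EPERM, so the result is a hint, not proof.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Hypothetical result type for this sketch; not part of the patch. */
enum probe_result {
	PROBE_REACHABLE,	/* the call got past the sysctl check */
	PROBE_DENIED,		/* EPERM: sysctl set, or some other policy */
	PROBE_NOT_BUILT,	/* ENOSYS: io_uring is not available */
	PROBE_UNKNOWN,		/* anything else */
};

/* Map the errno of a deliberately invalid io_uring_setup call. */
enum probe_result classify_setup_errno(int err)
{
	switch (err) {
	case EPERM:
		return PROBE_DENIED;
	case ENOSYS:
		return PROBE_NOT_BUILT;
	case EFAULT:	/* NULL params rejected, so the check let us through */
	case EINVAL:
		return PROBE_REACHABLE;
	default:
		return PROBE_UNKNOWN;
	}
}

enum probe_result probe_io_uring(void)
{
#ifdef __NR_io_uring_setup
	/* NULL params can never succeed, so no ring is created. */
	long ret = syscall(__NR_io_uring_setup, 1, NULL);

	if (ret >= 0) {		/* not expected; clean up defensively */
		close((int)ret);
		return PROBE_REACHABLE;
	}
	assert(ret == -1);	/* syscall(2) reports failure as -1 plus errno */
	return classify_setup_errno(errno);
#else
	return PROBE_UNKNOWN;	/* installed headers predate io_uring */
#endif
}
```

Reading /proc/sys/kernel/io_uring_disabled would be the more direct check on
a kernel that has the sysctl; the probe above also works where procfs is not
mounted.
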
* Re: [PATCH 1/1] Add a new sysctl to disable io_uring system-wide
From: Randy Dunlap @ 2023-06-27 16:23 UTC
  To: Matteo Rizzo, linux-doc, linux-kernel, io-uring
  Cc: jordyzomer, evn, poprdi, corbet, axboe, asml.silence, akpm,
      keescook, rostedt, dave.hansen, ribalda, chenhuacai, steve,
      gpiccoli, ldufour

Hi--

On 6/27/23 05:00, Matteo Rizzo wrote:
> diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
> index d85d90f5d000..3c53a238332a 100644
> --- a/Documentation/admin-guide/sysctl/kernel.rst
> +++ b/Documentation/admin-guide/sysctl/kernel.rst
> @@ -450,6 +450,20 @@ this allows system administrators to override the
>  ``IA64_THREAD_UAC_NOPRINT`` ``prctl`` and avoid logs being flooded.
>
>
> +io_uring_disabled
> +=========================
> +
> +Prevents all processes from creating new io_uring instances. Enabling this
> +shrinks the kernel's attack surface.
> +
> += =============================================================
> +0 All processes can create io_uring instances as normal. This is the default
> +  setting.
> +1 io_uring is disabled. io_uring_setup always fails with -EPERM. Existing
> +  io_uring instances can still be used.
> += =============================================================

These table lines should be extended at least as far as the text that they
enclose. I.e., the top and bottom lines should be like:

> += ==========================================================================

thanks.
--
~Randy

* Re: [PATCH 1/1] Add a new sysctl to disable io_uring system-wide
From: Bart Van Assche @ 2023-06-27 17:10 UTC
  To: Matteo Rizzo, linux-doc, linux-kernel, io-uring
  Cc: jordyzomer, evn, poprdi, corbet, axboe, asml.silence, akpm,
      keescook, rostedt, dave.hansen, ribalda, chenhuacai, steve,
      gpiccoli, ldufour

On 6/27/23 05:00, Matteo Rizzo wrote:
> +Prevents all processes from creating new io_uring instances. Enabling this
> +shrinks the kernel's attack surface.
> +
> += =============================================================
> +0 All processes can create io_uring instances as normal. This is the default
> +  setting.
> +1 io_uring is disabled. io_uring_setup always fails with -EPERM. Existing
> +  io_uring instances can still be used.
> += =============================================================

I'm using fio + io_uring all the time on Android devices. I think we need a
better solution than disabling io_uring system-wide, e.g. a mechanism based
on SELinux that disables io_uring for apps and that keeps io_uring enabled
for processes started via 'adb root && adb shell ...'

Bart.

* Re: [PATCH 1/1] Add a new sysctl to disable io_uring system-wide
From: Matteo Rizzo @ 2023-06-27 18:15 UTC
  To: Bart Van Assche
  Cc: linux-doc, linux-kernel, io-uring, jordyzomer, evn, poprdi, corbet,
      axboe, asml.silence, akpm, keescook, rostedt, dave.hansen, ribalda,
      chenhuacai, steve, gpiccoli, ldufour

On Tue, 27 Jun 2023 at 19:10, Bart Van Assche <[email protected]> wrote:
> I'm using fio + io_uring all the time on Android devices. I think we need a
> better solution than disabling io_uring system-wide, e.g. a mechanism based
> on SELinux that disables io_uring for apps and that keeps io_uring enabled
> for processes started via 'adb root && adb shell ...'

Android already uses seccomp to prevent untrusted applications from using
io_uring. This patch is aimed at server/desktop environments where there is
no easy way to set a system-wide seccomp policy and right now the only way
to disable io_uring system-wide is to compile it out of the kernel entirely
(not really feasible for e.g. a general-purpose distro).

I thought about adding a capability check that lets privileged processes
bypass this sysctl, but it wasn't clear to me which capability I should use.
For userfaultfd the kernel uses CAP_SYS_PTRACE, but I wasn't sure that's
the best choice here since io_uring has nothing to do with ptrace.
If anyone has any suggestions please let me know. An LSM hook also sounds
like an option but it would be more complicated to implement and use.

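To make the seccomp alternative mentioned above concrete, here is a rough
sketch of a per-process filter that fails the three io_uring syscalls with
EPERM. It is an illustration under assumptions, not what Android ships: the
names build_io_uring_filter and deny_io_uring_for_this_process are invented
for this example, and the filter skips the seccomp_data.arch check that a
production filter should perform. The filter only covers the calling process
and whatever it later forks or execs, which is why it does not give a
system-wide switch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>

#define IOU_FILTER_LEN 6

/* Copy the filter program into prog[]; returns its length, or 0 if cap is too small. */
size_t build_io_uring_filter(struct sock_filter *prog, size_t cap)
{
	const struct sock_filter filter[IOU_FILTER_LEN] = {
		/* [0] load the syscall number */
		BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
		/* [1]-[3] any io_uring syscall jumps to the EPERM return at [4] */
		BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_io_uring_setup, 2, 0),
		BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_io_uring_enter, 1, 0),
		BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_io_uring_register, 0, 1),
		/* [4] fail the call with EPERM */
		BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ERRNO | (EPERM & SECCOMP_RET_DATA)),
		/* [5] everything else is allowed */
		BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
	};

	assert(prog != NULL);
	if (cap < IOU_FILTER_LEN)
		return 0;
	memcpy(prog, filter, sizeof(filter));
	return IOU_FILTER_LEN;
}

/* Install the filter for the calling process; returns 0 or a negative errno. */
int deny_io_uring_for_this_process(void)
{
	struct sock_filter insns[IOU_FILTER_LEN];
	struct sock_fprog prog = {
		.len = (unsigned short)build_io_uring_filter(insns, IOU_FILTER_LEN),
		.filter = insns,
	};

	/* Needed so an unprivileged process may install a filter. */
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
		return -errno;
	if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog) != 0)
		return -errno;
	return 0;
}
```

A launcher would call deny_io_uring_for_this_process() before exec'ing the
untrusted program. Once installed, the filter cannot be removed.
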
* Re: [PATCH 1/1] Add a new sysctl to disable io_uring system-wide
From: Ricardo Ribalda @ 2023-06-28 11:36 UTC
  To: Matteo Rizzo
  Cc: Bart Van Assche, linux-doc, linux-kernel, io-uring, jordyzomer, evn,
      poprdi, corbet, axboe, asml.silence, akpm, keescook, rostedt,
      dave.hansen, chenhuacai, steve, gpiccoli, ldufour

Hi Matteo

On Tue, 27 Jun 2023 at 20:15, Matteo Rizzo <[email protected]> wrote:
>
> On Tue, 27 Jun 2023 at 19:10, Bart Van Assche <[email protected]> wrote:
> > I'm using fio + io_uring all the time on Android devices. I think we need a
> > better solution than disabling io_uring system-wide, e.g. a mechanism based
> > on SELinux that disables io_uring for apps and that keeps io_uring enabled
> > for processes started via 'adb root && adb shell ...'
>
> Android already uses seccomp to prevent untrusted applications from using
> io_uring. This patch is aimed at server/desktop environments where there is
> no easy way to set a system-wide seccomp policy and right now the only way
> to disable io_uring system-wide is to compile it out of the kernel entirely
> (not really feasible for e.g. a general-purpose distro).
>
> I thought about adding a capability check that lets privileged processes
> bypass this sysctl, but it wasn't clear to me which capability I should use.
> For userfaultfd the kernel uses CAP_SYS_PTRACE, but I wasn't sure that's
> the best choice here since io_uring has nothing to do with ptrace.
> If anyone has any suggestions please let me know. An LSM hook also sounds
> like an option but it would be more complicated to implement and use.

Have you considered that the new sysctl is "sticky" like kexec_load_disabled?
When the user disables it there is no way to turn it back on until the
system is rebooted.

Best regards!

--
Ricardo Ribalda

* Re: [PATCH 1/1] Add a new sysctl to disable io_uring system-wide
From: Matteo Rizzo @ 2023-06-28 15:12 UTC
  To: Ricardo Ribalda
  Cc: Bart Van Assche, linux-doc, linux-kernel, io-uring, jordyzomer, evn,
      poprdi, corbet, axboe, asml.silence, akpm, keescook, rostedt,
      dave.hansen, chenhuacai, steve, gpiccoli, ldufour

On Wed, 28 Jun 2023 at 13:44, Ricardo Ribalda <[email protected]> wrote:
>
> Have you considered that the new sysctl is "sticky" like kexec_load_disabled?
> When the user disables it there is no way to turn it back on until the
> system is rebooted.

Are you suggesting making this sysctl sticky? Are there any examples of how to
implement a sticky sysctl that can take more than 2 values in case we want to
add an intermediate level that still allows privileged processes to use
io_uring? Also, what would be the use case? Preventing privileged processes
from re-enabling io_uring?

Thanks!
--
Matteo

* Re: [PATCH 1/1] Add a new sysctl to disable io_uring system-wide
From: Jeff Moyer @ 2023-06-28 15:59 UTC
  To: Matteo Rizzo
  Cc: Ricardo Ribalda, Bart Van Assche, linux-doc, linux-kernel, io-uring,
      jordyzomer, evn, poprdi, corbet, axboe, asml.silence, akpm, keescook,
      rostedt, dave.hansen, chenhuacai, steve, gpiccoli, ldufour

Matteo Rizzo <[email protected]> writes:

> On Wed, 28 Jun 2023 at 13:44, Ricardo Ribalda <[email protected]> wrote:
>>
>> Have you considered that the new sysctl is "sticky" like kexec_load_disabled?
>> When the user disables it there is no way to turn it back on until the
>> system is rebooted.
>
> Are you suggesting making this sysctl sticky? Are there any examples of how to
> implement a sticky sysctl that can take more than 2 values in case we want to
> add an intermediate level that still allows privileged processes to use
> io_uring? Also, what would be the use case? Preventing privileged processes
> from re-enabling io_uring?

See unprivileged_bpf_disabled for an example. I can't speak to the use
case for a sticky value.

-Jeff

* Re: [PATCH 1/1] Add a new sysctl to disable io_uring system-wide
From: Ricardo Ribalda @ 2023-06-28 15:59 UTC
  To: Matteo Rizzo
  Cc: Bart Van Assche, linux-doc, linux-kernel, io-uring, jordyzomer, evn,
      poprdi, corbet, axboe, asml.silence, akpm, keescook, rostedt,
      dave.hansen, chenhuacai, steve, gpiccoli, ldufour

Hi Matteo

On Wed, 28 Jun 2023 at 17:12, Matteo Rizzo <[email protected]> wrote:
>
> On Wed, 28 Jun 2023 at 13:44, Ricardo Ribalda <[email protected]> wrote:
> >
> > Have you considered that the new sysctl is "sticky" like kexec_load_disabled?
> > When the user disables it there is no way to turn it back on until the
> > system is rebooted.
>
> Are you suggesting making this sysctl sticky? Are there any examples of how to
> implement a sticky sysctl that can take more than 2 values in case we want to
> add an intermediate level that still allows privileged processes to use
> io_uring? Also, what would be the use case? Preventing privileged processes
> from re-enabling io_uring?

Yes, if this sysctl is accepted, I think it would make sense to make it sticky.

For more than one value, take a look at kexec_load_limit_reboot and
kexec_load_limit_panic.

Thanks!

>
> Thanks!
> --
> Matteo

--
Ricardo Ribalda

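One way to read "sticky" for a knob with more than two values is that a write
may keep the level or make it more restrictive, but never loosen it until
reboot. The sketch below is a userspace model of that rule under this
assumption. It is not kernel code, sticky_write is a made-up name, and it
makes no claim about how the existing sysctls mentioned in this thread
implement their handlers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical model of a sticky write to a multi-level knob:
 * higher levels are assumed to be more restrictive, and the
 * current level can only stay the same or go up.
 */
int sticky_write(int *level, int requested, int max_level)
{
	assert(level != NULL);

	if (requested < 0 || requested > max_level)
		return -EINVAL;	/* outside the valid range */
	if (requested < *level)
		return -EPERM;	/* would loosen the restriction */

	*level = requested;
	return 0;
}
```

With max_level = 2, a sequence of writes 1, 0, 2 would succeed, fail with
-EPERM and leave the level at 1, then succeed and leave it at 2.
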
* Re: [PATCH 1/1] Add a new sysctl to disable io_uring system-wide
From: Gabriel Krisman Bertazi @ 2023-06-28 13:50 UTC
  To: Matteo Rizzo
  Cc: linux-doc, linux-kernel, io-uring, jordyzomer, evn, poprdi, corbet,
      axboe, asml.silence, akpm, keescook, rostedt, dave.hansen, ribalda,
      chenhuacai, steve, gpiccoli, ldufour

Matteo Rizzo <[email protected]> writes:

> diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
> index d85d90f5d000..3c53a238332a 100644
> --- a/Documentation/admin-guide/sysctl/kernel.rst
> +++ b/Documentation/admin-guide/sysctl/kernel.rst
> @@ -450,6 +450,20 @@ this allows system administrators to override the
>  ``IA64_THREAD_UAC_NOPRINT`` ``prctl`` and avoid logs being flooded.
>
>
> +io_uring_disabled
> +=========================
> +
> +Prevents all processes from creating new io_uring instances. Enabling this
> +shrinks the kernel's attack surface.
> +
> += =============================================================
> +0 All processes can create io_uring instances as normal. This is the default
> +  setting.
> +1 io_uring is disabled. io_uring_setup always fails with -EPERM. Existing
> +  io_uring instances can still be used.
> += =============================================================

I had an internal request for something like this recently. If we go
this route, we could use an intermediary option that limits io_uring
to root processes only.

--
Gabriel Krisman Bertazi

* Re: [PATCH 1/1] Add a new sysctl to disable io_uring system-wide
From: Jeff Moyer @ 2023-06-28 15:59 UTC
  To: Gabriel Krisman Bertazi
  Cc: Matteo Rizzo, linux-doc, linux-kernel, io-uring, jordyzomer, evn,
      poprdi, corbet, axboe, asml.silence, akpm, keescook, rostedt,
      dave.hansen, ribalda, chenhuacai, steve, gpiccoli, ldufour

Gabriel Krisman Bertazi <[email protected]> writes:

> Matteo Rizzo <[email protected]> writes:
>
>> diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
>> index d85d90f5d000..3c53a238332a 100644
>> --- a/Documentation/admin-guide/sysctl/kernel.rst
>> +++ b/Documentation/admin-guide/sysctl/kernel.rst
>> @@ -450,6 +450,20 @@ this allows system administrators to override the
>>  ``IA64_THREAD_UAC_NOPRINT`` ``prctl`` and avoid logs being flooded.
>>
>>
>> +io_uring_disabled
>> +=========================
>> +
>> +Prevents all processes from creating new io_uring instances. Enabling this
>> +shrinks the kernel's attack surface.
>> +
>> += =============================================================
>> +0 All processes can create io_uring instances as normal. This is the default
>> +  setting.
>> +1 io_uring is disabled. io_uring_setup always fails with -EPERM. Existing
>> +  io_uring instances can still be used.
>> += =============================================================
>
> I had an internal request for something like this recently. If we go
> this route, we could use an intermediary option that limits io_uring
> to root processes only.

This is all regrettable, but this option makes the most sense to me.

Testing for CAP_SYS_ADMIN or CAP_SYS_RAW_IO would work for that third
option, I think.

-Jeff

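The three-way policy discussed above (everyone, privileged processes only,
nobody) comes down to a small decision function. The sketch below models it
in plain C so the truth table is easy to check. The enum names and the
numbering are hypothetical, since the thread does not settle on either. In
the kernel the privileged flag would presumably come from a capability test
such as capable(CAP_SYS_ADMIN); which capability to use is still an open
question in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical levels; the thread does not fix a numbering. */
enum io_uring_policy {
	IO_URING_ALLOW_ALL = 0,		/* current behavior */
	IO_URING_PRIVILEGED_ONLY = 1,	/* the proposed intermediate level */
	IO_URING_DENY_ALL = 2,		/* what the posted patch calls 1 */
};

/* Returns 0 if io_uring_setup may proceed, -EPERM otherwise. */
int io_uring_setup_permitted(enum io_uring_policy policy, bool privileged)
{
	switch (policy) {
	case IO_URING_ALLOW_ALL:
		return 0;
	case IO_URING_PRIVILEGED_ONLY:
		return privileged ? 0 : -EPERM;
	case IO_URING_DENY_ALL:
		return -EPERM;
	}
	assert(!"invalid io_uring_policy");
	return -EPERM;	/* fail closed on an unknown value */
}
```

Ordering the levels from least to most restrictive keeps the door open for a
sticky variant in which the value can only be raised.
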