public inbox for io-uring@vger.kernel.org
From: Jens Axboe <axboe@kernel.dk>
To: Gabriel Krisman Bertazi <krisman@suse.de>
Cc: io-uring@vger.kernel.org
Subject: Re: [PATCHSET RFC 0/2] Per-task io_uring opcode restrictions
Date: Thu, 8 Jan 2026 16:54:06 -0700
Message-ID: <9f7e8b5f-e89b-4e7e-a520-ae97127b45f7@kernel.dk>
In-Reply-To: <87pl7j4v8h.fsf@mailhost.krisman.be>

On 1/8/26 3:04 PM, Gabriel Krisman Bertazi wrote:
> Jens Axboe <axboe@kernel.dk> writes:
> 
>> Hi,
>>
>> One common complaint is that io_uring doesn't work with seccomp. Which
>> is true, as seccomp is entirely designed around a classic sync syscall -
>> if you can filter what you need based on a syscall number and the
>> arguments, then it's fine. But for anything else, it doesn't really
>> work. This means that for solutions that rely on syscall filtering, e.g.
>> docker, there's really not much you can do with seccomp outside of
>> entirely disabling io_uring. That's not ideal.
>>
>> As I do think that's a gap we have that needs closing, here's an RFC
>> attempt at that. Suggestions more than welcome! I want to arrive at
>> something that works for the various use cases.
>>
>> io_uring already has a filtering mechanism for opcodes, however it needs
>> to be done after a ring has been created. The ring is created in a
>> disabled state, and then restrictions are applied, and finally the ring
>> is enabled so it can get used. This is cumbersome and doesn't
>> necessarily fit everybody's needs.
>>
>> This patchset adds support for extending that same list of disallowed
>> opcodes and register opcodes to something that can be applied to the
>> task as a whole. Once applied, any ring created under that task will have these
>> restrictions applied. Patch 1 adds the basic support for this, and patch
>> 2 adds support for having the restrictions applied at fork or thread
>> create time too, so any task or thread created under the current task
>> will get the same restrictions.
> 
> Hi Jens,
> 
> Considering this is, like seccomp, a security mechanism, I don't see a
> use case for running without IORING_REG_RESTRICTIONS_INHERIT.  Otherwise
> there is a quick way around it by just execve'ing into itself.  IIRC,
> seccomp also doesn't support disabling filters for the same reason.
> So, unless someone has a use case, I'd suggest dropping the flag
> and just making IORING_REG_RESTRICTIONS_INHERIT the default behavior.

Yes good point, and then I can fold these two patches as well. I do
agree that having it be inherited on fork is probably the only way to
go. Not posted with this series, but I did add support for unregistering
a filter, IFF you were the original creator of it. You can either update
it with a new set of restrictions, or simply pass NULL and get the
current set removed.

> Beyond that, adding more restrictions on an already restricted
> application would be a useful use-case, so returning -EBUSY on
> current->io_uring_restrict might not be doable long term.  But that
> feature can be added later.

We could certainly do something like that, where you can "OR" in more
restrictions, you can't just "AND" them. I'll add that.
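
To make the "OR in more restrictions" idea concrete, here is a minimal
userspace sketch. It is only a model under stated assumptions, not the
code from the series: struct task_restrict, task_restrict_add() and
op_allowed() are hypothetical names, and a single 64-bit deny mask stands
in for whatever representation the kernel side ends up using.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a per-task restriction set (not kernel code). */
struct task_restrict {
	uint64_t denied_ops;	/* bit n set => opcode n is refused */
};

/*
 * Stack another filter on top of the current one. New deny bits are
 * OR'ed in, so a later registration can only narrow what is allowed;
 * there is no path here that clears a bit.
 */
static void task_restrict_add(struct task_restrict *tr, uint64_t deny_mask)
{
	tr->denied_ops |= deny_mask;
}

/* Check whether an opcode (assumed < 64 in this model) may be issued. */
static bool op_allowed(const struct task_restrict *tr, unsigned int op)
{
	assert(op < 64);
	return !(tr->denied_ops & (UINT64_C(1) << op));
}
```

With this shape, registering an empty mask on an already restricted task
is a no-op, and nothing a task registers later can widen what it is
allowed to issue.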

> Finally, I suspect we will come quickly to the need of more complex
> filtering of arguments, like seccomp.  Again, something that can be
> added later but could be considered now for the interface.

Quite possibly. As it's using the same mechanism we already have, it
just supports filtering opcodes, register opcodes, and flags for either
of those. We do have some vacant fields in io_uring_restriction right
now which could cover more cases, at least.
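
For reference, the existing per-ring mechanism (create the ring disabled,
register restrictions, then enable it) can be sketched with raw syscalls
as below. This is a sketch, not code from the series: the helper name
setup_restricted_ring() and the choice of READ and WRITE are illustrative
assumptions. In the current uapi each IORING_RESTRICTION_SQE_OP entry
names an opcode that stays permitted once restrictions are registered.

```c
#include <assert.h>
#include <errno.h>
#include <linux/io_uring.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns a ring fd restricted to READ and WRITE
 * SQEs, or a negative errno value on failure.
 */
static int setup_restricted_ring(unsigned int entries)
{
	struct io_uring_params p;
	struct io_uring_restriction res[2];
	int fd, err;

	assert(entries > 0);

	/* Step 1: create the ring in a disabled state. */
	memset(&p, 0, sizeof(p));
	p.flags = IORING_SETUP_R_DISABLED;
	fd = (int)syscall(__NR_io_uring_setup, entries, &p);
	if (fd < 0)
		return -errno;

	/* Step 2: register the restrictions while the ring is disabled. */
	memset(res, 0, sizeof(res));
	res[0].opcode = IORING_RESTRICTION_SQE_OP;
	res[0].sqe_op = IORING_OP_READ;
	res[1].opcode = IORING_RESTRICTION_SQE_OP;
	res[1].sqe_op = IORING_OP_WRITE;
	if (syscall(__NR_io_uring_register, fd,
		    IORING_REGISTER_RESTRICTIONS, res, 2) < 0)
		goto fail;

	/* Step 3: enable the ring so it can be used. */
	if (syscall(__NR_io_uring_register, fd,
		    IORING_REGISTER_ENABLE_RINGS, NULL, 0) < 0)
		goto fail;

	return fd;
fail:
	err = errno;	/* save before close() can overwrite it */
	close(fd);
	return -err;
}
```

A caller would still need to mmap the SQ/CQ rings before submitting
anything; that part is unchanged by restrictions and is left out here.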

-- 
Jens Axboe


Thread overview: 5+ messages
2026-01-08 20:17 [PATCHSET RFC 0/2] Per-task io_uring opcode restrictions Jens Axboe
2026-01-08 20:17 ` [PATCH 1/2] io_uring: allow registration of per-task restrictions Jens Axboe
2026-01-08 20:17 ` [PATCH 2/2] io_uring/register: add support for inheriting task restrictions Jens Axboe
2026-01-08 22:04 ` [PATCHSET RFC 0/2] Per-task io_uring opcode restrictions Gabriel Krisman Bertazi
2026-01-08 23:54   ` Jens Axboe [this message]
