public inbox for io-uring@vger.kernel.org
 help / color / mirror / Atom feed
From: Gabriel Krisman Bertazi <krisman@suse.de>
To: Jens Axboe <axboe@kernel.dk>
Cc: io-uring@vger.kernel.org
Subject: Re: [PATCH 3/3] io_uring/register: allow original task restrictions owner to unregister
Date: Mon, 12 Jan 2026 19:10:15 -0500	[thread overview]
Message-ID: <877btm4bko.fsf@mailhost.krisman.be> (raw)
In-Reply-To: <20260109185155.88150-4-axboe@kernel.dk> (Jens Axboe's message of "Fri, 9 Jan 2026 11:48:27 -0700")

[-- Attachment #1: Type: text/plain, Size: 3736 bytes --]

Jens Axboe <axboe@kernel.dk> writes:

> Currently any attempt to register a set of task restrictions if an
> existing set exists will fail with -EPERM. But it is feasible to let the
> original creator/owner performance this operation. Either to remove
> restrictions entirely, or to replace them with a new set.
>
> If an existing set exists and NULL is passed for the new set, the
> current set is unregistered. If an existing set exists and a new set is
> supplied, the old set is dropped and replaced with the new one.

Feature-wise, I think this covers what I mentioned in the previous
iteration.  Even though this is an RFC, I think I found two bugs that
allow the child to escape the restrictions:

> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>  include/linux/io_uring_types.h |  1 +
>  io_uring/register.c            | 45 ++++++++++++++++++++++++++++------
>  2 files changed, 38 insertions(+), 8 deletions(-)
>
> diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
> index 196f41ec6d60..1ff7817b3535 100644
> --- a/include/linux/io_uring_types.h
> +++ b/include/linux/io_uring_types.h
> @@ -222,6 +222,7 @@ struct io_rings {
>  struct io_restriction {
>  	DECLARE_BITMAP(register_op, IORING_REGISTER_LAST);
>  	DECLARE_BITMAP(sqe_op, IORING_OP_LAST);
> +	pid_t pid;
>  	refcount_t refs;
>  	u8 sqe_flags_allowed;
>  	u8 sqe_flags_required;
> diff --git a/io_uring/register.c b/io_uring/register.c
> index 552b22f6b2dc..c8b8a9edbc65 100644
> --- a/io_uring/register.c
> +++ b/io_uring/register.c
> @@ -189,12 +189,19 @@ static int io_register_restrictions_task(void __user *arg, unsigned int nr_args)
>  {
>  	struct io_uring_task_restriction __user *ures = arg;
>  	struct io_uring_task_restriction tres;
> -	struct io_restriction *res;
> +	struct io_restriction *old_res, *res;
>  	int ret;
>  
>  	if (nr_args != 1)
>  		return -EINVAL;
>  
> +	res = current->io_uring_restrict;
> +	if (!ures) {
> +		if (!res)
> +			return -EFAULT;
> +		goto drop_set;
> +	}
> +
>  	if (copy_from_user(&tres, arg, sizeof(tres)))
>  		return -EFAULT;
>  
> @@ -207,13 +214,27 @@ static int io_register_restrictions_task(void __user *arg, unsigned int nr_args)
>  	 * Disallow if task already has registered restrictions, and we're
>  	 * not passing in further restrictions to add to an existing set.
>  	 */
> -	if (current->io_uring_restrict &&
> -	    !(tres.flags & IORING_REG_RESTRICTIONS_MASK))
> -		return -EPERM;
> +	old_res = NULL;
> +	if (res && !(tres.flags & IORING_REG_RESTRICTIONS_MASK)) {
> +		/* Not owner, may only append further restrictions */
> +drop_set:
> +		if (res->pid != current->pid)
> +			return -EPERM;

This might be hard to exploit, but if the parent terminates, the pid can
get reused.  Then, if the child forks until it gets the same pid, it can
unregister the filter.  I suppose the fix would require holding a
reference to the task, similar to what pidfd does. but perhaps just
abandon the unregistering semantics?  I'm not sure it is that useful...

> @@ -226,14 +247,22 @@ static int io_register_restrictions_task(void __user *arg, unsigned int nr_args)
>  				    tres.flags & IORING_REG_RESTRICTIONS_MASK);
>  	if (ret) {
>  		kfree(res);
> -		return ret;
> +		goto out;
>  	}
>  	if (current->io_uring_restrict &&
>  	    refcount_dec_and_test(&current->io_uring_restrict->refs))
>  		kfree(current->io_uring_restrict);
> +	res->pid = current->pid;

res->pid must always point to the first task that added a
restriction. So:

if (!current->io_uring_restrict)
       res->pid = current->pid;

Otherwise, the child will become the owner after adding another
restriction, and can then break out with a further unregister.  Based on
your testcase, this escapes the filter:


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: poc.patch --]
[-- Type: text/x-patch, Size: 889 bytes --]

diff --git a/test/task-restrictions.c b/test/task-restrictions.c
index 5a9170b4..4d4b457c 100644
--- a/test/task-restrictions.c
+++ b/test/task-restrictions.c
@@ -92,6 +92,12 @@ static int test_restrictions(int should_work)
 static void *thread_fn(void *unused)
 {
 	int ret;
+	struct io_uring_task_restriction  *res =
+		calloc(1, sizeof(*res) + 1 * sizeof(struct io_uring_restriction));
+	res->restrictions[1].opcode = IORING_RESTRICTION_SQE_OP;
+	res->restrictions[1].sqe_op = IORING_OP_FUTEX_WAIT;
+	res->nr_res = 1;
+	res->flags = IORING_REG_RESTRICTIONS_MASK;
 
 	ret = test_restrictions(0);
 	if (ret) {
@@ -99,6 +105,7 @@ static void *thread_fn(void *unused)
 		return (void *) (uintptr_t) ret;
 	}
 
+	ret = io_uring_register_task_restrictions(res);
 	ret = io_uring_register_task_restrictions(NULL);
 	if (!ret) {
 		fprintf(stderr, "thread restrictions unregister worked?!\n");

[-- Attachment #3: Type: text/plain, Size: 29 bytes --]


-- 
Gabriel Krisman Bertazi

  reply	other threads:[~2026-01-13  0:10 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-09 18:48 [PATCHSET RFC v2 0/3] Per-task io_uring opcode restrictions Jens Axboe
2026-01-09 18:48 ` [PATCH 1/3] io_uring: allow registration of per-task restrictions Jens Axboe
2026-01-09 18:48 ` [PATCH 2/3] io_uring/register: add MASK support for task filter set Jens Axboe
2026-01-09 18:48 ` [PATCH 3/3] io_uring/register: allow original task restrictions owner to unregister Jens Axboe
2026-01-13  0:10   ` Gabriel Krisman Bertazi [this message]
2026-01-13 18:25     ` Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=877btm4bko.fsf@mailhost.krisman.be \
    --to=krisman@suse.de \
    --cc=axboe@kernel.dk \
    --cc=io-uring@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox