* [PATCHv2] io_uring: Support calling io_uring_register with a registered ring fd
@ 2023-02-15 0:42 Josh Triplett
2023-02-15 17:44 ` Jens Axboe
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Josh Triplett @ 2023-02-15 0:42 UTC (permalink / raw)
To: Jens Axboe, Pavel Begunkov; +Cc: io-uring, linux-kernel
Add a new flag IORING_REGISTER_USE_REGISTERED_RING (set via the high bit
of the opcode) to treat the fd as a registered index rather than a file
descriptor.
This makes it possible for a library to open an io_uring, register the
ring fd, close the ring fd, and subsequently use the ring entirely via
registered index.
Signed-off-by: Josh Triplett <[email protected]>
---
v2: Rebase. Change io_uring_register to extract the flag from the opcode first.
include/uapi/linux/io_uring.h | 6 +++++-
io_uring/io_uring.c | 34 +++++++++++++++++++++++++++-------
2 files changed, 32 insertions(+), 8 deletions(-)
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 2780bce62faf..35e6f8046b9b 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -470,6 +470,7 @@ struct io_uring_params {
#define IORING_FEAT_RSRC_TAGS (1U << 10)
#define IORING_FEAT_CQE_SKIP (1U << 11)
#define IORING_FEAT_LINKED_FILE (1U << 12)
+#define IORING_FEAT_REG_REG_RING (1U << 13)
/*
* io_uring_register(2) opcodes and arguments
@@ -517,7 +518,10 @@ enum {
IORING_REGISTER_FILE_ALLOC_RANGE = 25,
/* this goes last */
- IORING_REGISTER_LAST
+ IORING_REGISTER_LAST,
+
+ /* flag added to the opcode to use a registered ring fd */
+ IORING_REGISTER_USE_REGISTERED_RING = 1U << 31
};
/* io-wq worker categories */
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index db623b3185c8..1fb743ecba5a 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -3663,7 +3663,7 @@ static __cold int io_uring_create(unsigned entries, struct io_uring_params *p,
IORING_FEAT_POLL_32BITS | IORING_FEAT_SQPOLL_NONFIXED |
IORING_FEAT_EXT_ARG | IORING_FEAT_NATIVE_WORKERS |
IORING_FEAT_RSRC_TAGS | IORING_FEAT_CQE_SKIP |
- IORING_FEAT_LINKED_FILE;
+ IORING_FEAT_LINKED_FILE | IORING_FEAT_REG_REG_RING;
if (copy_to_user(params, p, sizeof(*p))) {
ret = -EFAULT;
@@ -4177,17 +4177,37 @@ SYSCALL_DEFINE4(io_uring_register, unsigned int, fd, unsigned int, opcode,
 	struct io_ring_ctx *ctx;
 	long ret = -EBADF;
 	struct fd f;
+	bool use_registered_ring;
+
+	use_registered_ring = !!(opcode & IORING_REGISTER_USE_REGISTERED_RING);
+	opcode &= ~IORING_REGISTER_USE_REGISTERED_RING;
 	if (opcode >= IORING_REGISTER_LAST)
 		return -EINVAL;
-	f = fdget(fd);
-	if (!f.file)
-		return -EBADF;
+	if (use_registered_ring) {
+		/*
+		 * Ring fd has been registered via IORING_REGISTER_RING_FDS, we
+		 * need only dereference our task private array to find it.
+		 */
+		struct io_uring_task *tctx = current->io_uring;
-	ret = -EOPNOTSUPP;
-	if (!io_is_uring_fops(f.file))
-		goto out_fput;
+		if (unlikely(!tctx || fd >= IO_RINGFD_REG_MAX))
+			return -EINVAL;
+		fd = array_index_nospec(fd, IO_RINGFD_REG_MAX);
+		f.file = tctx->registered_rings[fd];
+		f.flags = 0;
+		if (unlikely(!f.file))
+			return -EBADF;
+		opcode &= ~IORING_REGISTER_USE_REGISTERED_RING;
+	} else {
+		f = fdget(fd);
+		if (unlikely(!f.file))
+			return -EBADF;
+		ret = -EOPNOTSUPP;
+		if (!io_is_uring_fops(f.file))
+			goto out_fput;
+	}
 	ctx = f.file->private_data;
--
2.39.1
* Re: [PATCHv2] io_uring: Support calling io_uring_register with a registered ring fd
2023-02-15 0:42 [PATCHv2] io_uring: Support calling io_uring_register with a registered ring fd Josh Triplett
@ 2023-02-15 17:44 ` Jens Axboe
2023-02-15 20:33 ` Josh Triplett
2023-02-16 3:24 ` Jens Axboe
2023-02-16 9:35 ` Dylan Yudaken
2 siblings, 1 reply; 8+ messages in thread
From: Jens Axboe @ 2023-02-15 17:44 UTC (permalink / raw)
To: Josh Triplett, Pavel Begunkov; +Cc: io-uring, linux-kernel
On 2/14/23 5:42 PM, Josh Triplett wrote:
> Add a new flag IORING_REGISTER_USE_REGISTERED_RING (set via the high bit
> of the opcode) to treat the fd as a registered index rather than a file
> descriptor.
>
> This makes it possible for a library to open an io_uring, register the
> ring fd, close the ring fd, and subsequently use the ring entirely via
> registered index.
This looks pretty straightforward to me, only real question I had
was whether using the top bit of the register opcode for this is the
best choice. But I can't think of better ways to do it, and the space
is definitely big enough to do that, so looks fine to me.
One more comment below:
> + if (use_registered_ring) {
> + /*
> + * Ring fd has been registered via IORING_REGISTER_RING_FDS, we
> + * need only dereference our task private array to find it.
> + */
> + struct io_uring_task *tctx = current->io_uring;
I need to double check if it's guaranteed we always have current->io_uring
assigned here. If the ring is registered we certainly will have it, but
what if someone calls io_uring_register(2) without having a ring setup
upfront?
IOW, I think we need a NULL check here and failing the request at that
point.
--
Jens Axboe
* Re: [PATCHv2] io_uring: Support calling io_uring_register with a registered ring fd
2023-02-15 17:44 ` Jens Axboe
@ 2023-02-15 20:33 ` Josh Triplett
2023-02-15 21:39 ` Jens Axboe
0 siblings, 1 reply; 8+ messages in thread
From: Josh Triplett @ 2023-02-15 20:33 UTC (permalink / raw)
To: Jens Axboe; +Cc: Pavel Begunkov, io-uring, linux-kernel
On Wed, Feb 15, 2023 at 10:44:38AM -0700, Jens Axboe wrote:
> On 2/14/23 5:42 PM, Josh Triplett wrote:
> > Add a new flag IORING_REGISTER_USE_REGISTERED_RING (set via the high bit
> > of the opcode) to treat the fd as a registered index rather than a file
> > descriptor.
> >
> > This makes it possible for a library to open an io_uring, register the
> > ring fd, close the ring fd, and subsequently use the ring entirely via
> > registered index.
>
> This looks pretty straightforward to me, only real question I had
> was whether using the top bit of the register opcode for this is the
> best choice. But I can't think of better ways to do it, and the space
> is definitely big enough to do that, so looks fine to me.
It seemed like the cleanest way available given the ABI of
io_uring_register, yeah.
> One more comment below:
>
> > + if (use_registered_ring) {
> > + /*
> > + * Ring fd has been registered via IORING_REGISTER_RING_FDS, we
> > + * need only dereference our task private array to find it.
> > + */
> > + struct io_uring_task *tctx = current->io_uring;
>
> I need to double check if it's guaranteed we always have current->io_uring
> assigned here. If the ring is registered we certainly will have it, but
> what if someone calls io_uring_register(2) without having a ring setup
> upfront?
>
> IOW, I think we need a NULL check here and failing the request at that
> point.
The next line is:
+ if (unlikely(!tctx || fd >= IO_RINGFD_REG_MAX))
The first part of that condition is the NULL check you're looking for,
right?
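
As a rough illustration of the checks being discussed, here is a userspace model of the registered-ring lookup. It is a sketch under assumptions, not the kernel implementation: struct sketch_file, struct sketch_task_ctx and sketch_lookup_ring are hypothetical stand-ins, the table size of 16 is assumed to match IO_RINGFD_REG_MAX, and the speculation barrier (array_index_nospec) has no userspace equivalent here and is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_RINGFD_REG_MAX 16U	/* assumed to match IO_RINGFD_REG_MAX */

struct sketch_file {			/* stand-in for struct file */
	int id;
};

struct sketch_task_ctx {		/* stand-in for struct io_uring_task */
	struct sketch_file *registered_rings[SKETCH_RINGFD_REG_MAX];
};

/*
 * Hypothetical lookup mirroring the error handling in the patch:
 * -EINVAL if the task has no io_uring context or the index is out of
 * range, -EBADF if the slot is empty, 0 with *out set otherwise.
 */
static int sketch_lookup_ring(const struct sketch_task_ctx *tctx,
			      unsigned int index, struct sketch_file **out)
{
	assert(out != NULL);
	*out = NULL;

	/* The !tctx test is the NULL check; the bounds check rides along. */
	if (!tctx || index >= SKETCH_RINGFD_REG_MAX)
		return -EINVAL;
	if (!tctx->registered_rings[index])
		return -EBADF;

	*out = tctx->registered_rings[index];
	return 0;
}
```

The NULL test comes first in the condition, so a task that never set up a ring fails with -EINVAL before the array is touched.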
- Josh Triplett
* Re: [PATCHv2] io_uring: Support calling io_uring_register with a registered ring fd
2023-02-15 20:33 ` Josh Triplett
@ 2023-02-15 21:39 ` Jens Axboe
0 siblings, 0 replies; 8+ messages in thread
From: Jens Axboe @ 2023-02-15 21:39 UTC (permalink / raw)
To: Josh Triplett; +Cc: Pavel Begunkov, io-uring, linux-kernel
On 2/15/23 1:33 PM, Josh Triplett wrote:
> On Wed, Feb 15, 2023 at 10:44:38AM -0700, Jens Axboe wrote:
>> On 2/14/23 5:42 PM, Josh Triplett wrote:
>>> Add a new flag IORING_REGISTER_USE_REGISTERED_RING (set via the high bit
>>> of the opcode) to treat the fd as a registered index rather than a file
>>> descriptor.
>>>
>>> This makes it possible for a library to open an io_uring, register the
>>> ring fd, close the ring fd, and subsequently use the ring entirely via
>>> registered index.
>>
>> This looks pretty straightforward to me, only real question I had
>> was whether using the top bit of the register opcode for this is the
>> best choice. But I can't think of better ways to do it, and the space
>> is definitely big enough to do that, so looks fine to me.
>
> It seemed like the cleanest way available given the ABI of
> io_uring_register, yeah.
>
>> One more comment below:
>>
>>> + if (use_registered_ring) {
>>> + /*
>>> + * Ring fd has been registered via IORING_REGISTER_RING_FDS, we
>>> + * need only dereference our task private array to find it.
>>> + */
>>> + struct io_uring_task *tctx = current->io_uring;
>>
>> I need to double check if it's guaranteed we always have current->io_uring
>> assigned here. If the ring is registered we certainly will have it, but
>> what if someone calls io_uring_register(2) without having a ring setup
>> upfront?
>>
>> IOW, I think we need a NULL check here and failing the request at that
>> point.
>
> The next line is:
>
> + if (unlikely(!tctx || fd >= IO_RINGFD_REG_MAX))
>
> The first part of that condition is the NULL check you're looking for,
> right?
Ah yeah, I'm just blind... Looks fine!
--
Jens Axboe
* Re: [PATCHv2] io_uring: Support calling io_uring_register with a registered ring fd
2023-02-15 0:42 [PATCHv2] io_uring: Support calling io_uring_register with a registered ring fd Josh Triplett
2023-02-15 17:44 ` Jens Axboe
@ 2023-02-16 3:24 ` Jens Axboe
2023-02-16 9:35 ` Dylan Yudaken
2 siblings, 0 replies; 8+ messages in thread
From: Jens Axboe @ 2023-02-16 3:24 UTC (permalink / raw)
To: Pavel Begunkov, Josh Triplett; +Cc: io-uring, linux-kernel
On Tue, 14 Feb 2023 16:42:22 -0800, Josh Triplett wrote:
> Add a new flag IORING_REGISTER_USE_REGISTERED_RING (set via the high bit
> of the opcode) to treat the fd as a registered index rather than a file
> descriptor.
>
> This makes it possible for a library to open an io_uring, register the
> ring fd, close the ring fd, and subsequently use the ring entirely via
> registered index.
>
> [...]
Applied, thanks!
[1/1] io_uring: Support calling io_uring_register with a registered ring fd
commit: 04eb372cac91a4f70c9b921c1b86758f5553d311
Best regards,
--
Jens Axboe
* Re: [PATCHv2] io_uring: Support calling io_uring_register with a registered ring fd
2023-02-15 0:42 [PATCHv2] io_uring: Support calling io_uring_register with a registered ring fd Josh Triplett
2023-02-15 17:44 ` Jens Axboe
2023-02-16 3:24 ` Jens Axboe
@ 2023-02-16 9:35 ` Dylan Yudaken
2023-02-16 12:05 ` Josh Triplett
2 siblings, 1 reply; 8+ messages in thread
From: Dylan Yudaken @ 2023-02-16 9:35 UTC (permalink / raw)
To: [email protected], [email protected], [email protected]
Cc: [email protected], [email protected]
On Tue, 2023-02-14 at 16:42 -0800, Josh Triplett wrote:
> @@ -4177,17 +4177,37 @@ SYSCALL_DEFINE4(io_uring_register, unsigned
> int, fd, unsigned int, opcode,
> struct io_ring_ctx *ctx;
> long ret = -EBADF;
> struct fd f;
> + bool use_registered_ring;
> +
> + use_registered_ring = !!(opcode &
> IORING_REGISTER_USE_REGISTERED_RING);
> + opcode &= ~IORING_REGISTER_USE_REGISTERED_RING;
>
> if (opcode >= IORING_REGISTER_LAST)
> return -EINVAL;
>
> - f = fdget(fd);
> - if (!f.file)
> - return -EBADF;
> + if (use_registered_ring) {
> + /*
> + * Ring fd has been registered via
> IORING_REGISTER_RING_FDS, we
> + * need only dereference our task private array to
> find it.
> + */
> + struct io_uring_task *tctx = current->io_uring;
>
> - ret = -EOPNOTSUPP;
> - if (!io_is_uring_fops(f.file))
> - goto out_fput;
> + if (unlikely(!tctx || fd >= IO_RINGFD_REG_MAX))
> + return -EINVAL;
> + fd = array_index_nospec(fd, IO_RINGFD_REG_MAX);
> + f.file = tctx->registered_rings[fd];
> + f.flags = 0;
> + if (unlikely(!f.file))
> + return -EBADF;
> + opcode &= ~IORING_REGISTER_USE_REGISTERED_RING;
^ this line looks duplicated at the top of the function?
Also - is there a liburing regression test for this?
* Re: [PATCHv2] io_uring: Support calling io_uring_register with a registered ring fd
2023-02-16 9:35 ` Dylan Yudaken
@ 2023-02-16 12:05 ` Josh Triplett
2023-02-16 13:10 ` Jens Axboe
0 siblings, 1 reply; 8+ messages in thread
From: Josh Triplett @ 2023-02-16 12:05 UTC (permalink / raw)
To: Dylan Yudaken
Cc: [email protected], [email protected],
[email protected], [email protected]
On Thu, Feb 16, 2023 at 09:35:44AM +0000, Dylan Yudaken wrote:
> On Tue, 2023-02-14 at 16:42 -0800, Josh Triplett wrote:
> > @@ -4177,17 +4177,37 @@ SYSCALL_DEFINE4(io_uring_register, unsigned
> > int, fd, unsigned int, opcode,
> > struct io_ring_ctx *ctx;
> > long ret = -EBADF;
> > struct fd f;
> > + bool use_registered_ring;
> > +
> > + use_registered_ring = !!(opcode &
> > IORING_REGISTER_USE_REGISTERED_RING);
> > + opcode &= ~IORING_REGISTER_USE_REGISTERED_RING;
> >
> > if (opcode >= IORING_REGISTER_LAST)
> > return -EINVAL;
> >
> > - f = fdget(fd);
> > - if (!f.file)
> > - return -EBADF;
> > + if (use_registered_ring) {
> > + /*
> > + * Ring fd has been registered via
> > IORING_REGISTER_RING_FDS, we
> > + * need only dereference our task private array to
> > find it.
> > + */
> > + struct io_uring_task *tctx = current->io_uring;
> >
> > - ret = -EOPNOTSUPP;
> > - if (!io_is_uring_fops(f.file))
> > - goto out_fput;
> > + if (unlikely(!tctx || fd >= IO_RINGFD_REG_MAX))
> > + return -EINVAL;
> > + fd = array_index_nospec(fd, IO_RINGFD_REG_MAX);
> > + f.file = tctx->registered_rings[fd];
> > + f.flags = 0;
> > + if (unlikely(!f.file))
> > + return -EBADF;
> > + opcode &= ~IORING_REGISTER_USE_REGISTERED_RING;
>
> ^ this line looks duplicated at the top of the function?
Good catch!
Jens, since you've already applied this, can you remove this line or
would you like a patch doing so?
> Also - is there a liburing regression test for this?
Userspace, including test: https://github.com/axboe/liburing/pull/664
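
To make the userspace side concrete, the following sketch shows how a caller might choose the first two arguments to io_uring_register(2) once this patch is in place. It is not taken from the liburing pull request and may differ from what liburing does; sketch_pick_target, struct sketch_register_target and the SKETCH_* names are hypothetical, and only the two bit values come from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit values mirrored from the patch; hypothetical names for this sketch. */
#define SKETCH_FEAT_REG_REG_RING	(1U << 13)
#define SKETCH_USE_REGISTERED_RING	(1U << 31)

struct sketch_register_target {
	unsigned int fd;	/* first argument to io_uring_register(2) */
	unsigned int opcode;	/* second argument, possibly with the flag */
};

/*
 * Hypothetical helper: use the registered index (and set the flag) only
 * when the kernel advertised the feature bit and an index is available
 * (registered_index >= 0); otherwise fall back to the real ring fd.
 */
static struct sketch_register_target
sketch_pick_target(unsigned int features, int ring_fd, int registered_index,
		   unsigned int opcode)
{
	struct sketch_register_target t;
	bool can_use_index = (features & SKETCH_FEAT_REG_REG_RING) &&
			     registered_index >= 0;

	/* The fallback path needs a real, still-open ring fd. */
	assert(can_use_index || ring_fd >= 0);

	if (can_use_index) {
		t.fd = (unsigned int)registered_index;
		t.opcode = opcode | SKETCH_USE_REGISTERED_RING;
	} else {
		t.fd = (unsigned int)ring_fd;
		t.opcode = opcode;
	}
	return t;
}
```

Checking the feature bit first matters because a kernel without this patch would reject the flagged opcode as out of range.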
* Re: [PATCHv2] io_uring: Support calling io_uring_register with a registered ring fd
2023-02-16 12:05 ` Josh Triplett
@ 2023-02-16 13:10 ` Jens Axboe
0 siblings, 0 replies; 8+ messages in thread
From: Jens Axboe @ 2023-02-16 13:10 UTC (permalink / raw)
To: Josh Triplett, Dylan Yudaken
Cc: [email protected], [email protected],
[email protected]
On 2/16/23 5:05 AM, Josh Triplett wrote:
> On Thu, Feb 16, 2023 at 09:35:44AM +0000, Dylan Yudaken wrote:
>> On Tue, 2023-02-14 at 16:42 -0800, Josh Triplett wrote:
>>> @@ -4177,17 +4177,37 @@ SYSCALL_DEFINE4(io_uring_register, unsigned
>>> int, fd, unsigned int, opcode,
>>> struct io_ring_ctx *ctx;
>>> long ret = -EBADF;
>>> struct fd f;
>>> + bool use_registered_ring;
>>> +
>>> + use_registered_ring = !!(opcode &
>>> IORING_REGISTER_USE_REGISTERED_RING);
>>> + opcode &= ~IORING_REGISTER_USE_REGISTERED_RING;
>>>
>>> if (opcode >= IORING_REGISTER_LAST)
>>> return -EINVAL;
>>>
>>> - f = fdget(fd);
>>> - if (!f.file)
>>> - return -EBADF;
>>> + if (use_registered_ring) {
>>> + /*
>>> + * Ring fd has been registered via
>>> IORING_REGISTER_RING_FDS, we
>>> + * need only dereference our task private array to
>>> find it.
>>> + */
>>> + struct io_uring_task *tctx = current->io_uring;
>>>
>>> - ret = -EOPNOTSUPP;
>>> - if (!io_is_uring_fops(f.file))
>>> - goto out_fput;
>>> + if (unlikely(!tctx || fd >= IO_RINGFD_REG_MAX))
>>> + return -EINVAL;
>>> + fd = array_index_nospec(fd, IO_RINGFD_REG_MAX);
>>> + f.file = tctx->registered_rings[fd];
>>> + f.flags = 0;
>>> + if (unlikely(!f.file))
>>> + return -EBADF;
>>> + opcode &= ~IORING_REGISTER_USE_REGISTERED_RING;
>>
>> ^ this line looks duplicated at the top of the function?
>
> Good catch!
Indeed!
> Jens, since you've already applied this, can you remove this line or
> would you like a patch doing so?
It's still top-of-tree, I just amended it.
--
Jens Axboe