* [PATCHSET v4 RFC liburing 0/4] Implement the kernel style return value
@ 2021-10-03 10:17 Ammar Faizi
2021-10-03 10:17 ` [PATCH v4 RFC liburing 1/3] src/syscall: Wrap `errno` for `__sys_io_uring_{register,setup,enter{2,}}` Ammar Faizi
` (3 more replies)
0 siblings, 4 replies; 7+ messages in thread
From: Ammar Faizi @ 2021-10-03 10:17 UTC (permalink / raw)
To: Jens Axboe, Pavel Begunkov
Cc: io-uring Mailing List, Bedirhan KURT, Louvian Lyndal
This is the v4 of RFC to implement the kernel style return value for
liburing.
The main purpose of these changes is to make it possible to remove
the dependency of `errno` variable in the liburing C sources. If we
can land this on liburing, we will start working on adding support
build liburing without libc.
Currently, we expose these functions to userland:
1) `__sys_io_uring_register`
2) `__sys_io_uring_setup`
3) `__sys_io_uring_enter2`
4) `__sys_io_uring_enter`
The tests in `test/io_uring_{enter,register,setup}.c` are the examples
of it. Since the userland needs to check the `errno` value to use them
properly, this means those functions always depend on libc. So we
cannot change their behavior. Don't touch them all, this ensures the
changes only affect liburing internal and no visible functionality
changes for the users.
Then we introduce new functions with the same name (with extra
underscore as prefix, 4 underscores):
1) `____sys_io_uring_register`
2) `____sys_io_uring_setup`
3) `____sys_io_uring_enter2`
4) `____sys_io_uring_enter`
These functions do not use `errno` variable *on the caller*, they use
the kernel style return value (return a negative value of error code
when errors).
These functions are defined as `static inline` in `src/syscall.h`.
They are just a wrapper to make sure liburing internal sources do not
touch `errno` variable from C files directly. We need to make C files
not to touch the `errno` variable to support build without libc.
To completely remove the `errno` variable dependency from liburing C
files. We wrap all syscalls in a kernel style return value as well.
Currently we have 5 other syscalls in liburing. We wrapped all of
them as these 5 functions:
1) `uring_mmap`
2) `uring_munmap`
3) `uring_madvise`
4) `uring_getrlimit`
5) `uring_setrlimit`
All of them are `static inline` and will return a negative value of
error code in case error happens.
Extra new helpers:
1) `ERR_PTR()`
2) `PTR_ERR()`
3) `IS_ERR()`
These helpers are used to deal with syscalls that return a pointer.
Currently only `uring_mmap()` that depends on these.
If you want a git repository to test these patches, you can pull from:
git://github.com/ammarfaizi2/liburing.git tags/kernel-style-retval-v4
Please review and comment.
----------------------------------------------------------------
Changes from v1 to v2:
- Make all wrapper functions be `static inline`, so they don't pollute
the global scope.
- Reduce the number of patches. Now we only have 4 patches (it was 6).
Changes from v2 to v3:
- Remove duplicate Signed-off-by.
Changes from v3 to v4 (this thread):
- Fix unintentional logic change [1].
- Don't take the kernel header file to have extra helpers (ERR_PTR,
PTR_ERR, IS_ERR). Write our own helpers with the same principles [2].
- Change the wrapper prefix, it was
`liburing_{mmap,munmap,madvise,{get,set}rlimit}`, now we use
`uring_{mmap,munmap,madvise,{get,set}rlimit}` to make it more
compact and consistent (as we already have `uring_{{un,}likely})`.
- Reduce the number of patches. Now we only have 3 patches (it was 4).
Link: [GH Issue] https://github.com/axboe/liburing/issues/443
Link: [v1] https://lore.kernel.org/io-uring/[email protected]/T/
Link: [v2] https://lore.kernel.org/io-uring/[email protected]/T/
Link: [v3] https://lore.kernel.org/io-uring/[email protected]/T/
#Ref
Link: [1] https://lore.kernel.org/io-uring/[email protected]/
Link: [2] https://github.com/axboe/liburing/issues/446
----------------------------------------------------------------
Ammar Faizi (3):
src/syscall: Wrap `errno` for `__sys_io_uring_{register,setup,enter{2,}}`
src/{queue,register,setup}: Don't use `__sys_io_uring*`
Wrap all syscalls in a kernel style return value
src/queue.c | 28 +++----
src/register.c | 197 +++++++++++++++--------------------------------
src/setup.c | 57 +++++++-------
src/syscall.c | 43 +----------
src/syscall.h | 141 +++++++++++++++++++++++++++++++++
5 files changed, 247 insertions(+), 219 deletions(-)
--
Ammar Faizi
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v4 RFC liburing 1/3] src/syscall: Wrap `errno` for `__sys_io_uring_{register,setup,enter{2,}}`
2021-10-03 10:17 [PATCHSET v4 RFC liburing 0/4] Implement the kernel style return value Ammar Faizi
@ 2021-10-03 10:17 ` Ammar Faizi
2021-10-03 10:17 ` [PATCH v4 RFC liburing 2/3] src/{queue,register,setup}: Don't use `__sys_io_uring*` Ammar Faizi
` (2 subsequent siblings)
3 siblings, 0 replies; 7+ messages in thread
From: Ammar Faizi @ 2021-10-03 10:17 UTC (permalink / raw)
To: Jens Axboe, Pavel Begunkov
Cc: io-uring Mailing List, Bedirhan KURT, Louvian Lyndal, Ammar Faizi
Make it possible to remove the dependency of `errno` variable (which
comes from libc).
Currently, we expose these functions to userland:
1) `__sys_io_uring_register`
2) `__sys_io_uring_setup`
3) `__sys_io_uring_enter2`
4) `__sys_io_uring_enter`
The tests in `test/io_uring_{enter,register,setup}.c` are the examples
of it. Since the userland needs to check the `errno` value to use them
properly, this means those functions always depend on libc. So we
cannot change their behavior. Don't touch them all, this ensures the
changes only affect liburing internal and no visible functionality
changes for the users.
Then we introduce new functions with the same name (with extra
underscore as prefix, 4 underscores):
1) `____sys_io_uring_register`
2) `____sys_io_uring_setup`
3) `____sys_io_uring_enter2`
4) `____sys_io_uring_enter`
These functions do not use `errno` variable *on the caller*, they use
the kernel style return value (return a negative value of error code
when errors).
These functions are defined as `static inline` in `src/syscall.h`.
They are just a wrapper to make sure liburing internal sources do not
touch `errno` variable from C files directly. We need to make C files
not to touch the `errno` variable to support build without libc.
Link: https://github.com/axboe/liburing/issues/443#issuecomment-927873932
Cc: Bedirhan KURT <[email protected]>
Suggested-by: Louvian Lyndal <[email protected]>
Signed-off-by: Ammar Faizi <[email protected]>
---
src/syscall.c | 43 +++------------------------
src/syscall.h | 81 +++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 85 insertions(+), 39 deletions(-)
diff --git a/src/syscall.c b/src/syscall.c
index 69027e5..5923fbb 100644
--- a/src/syscall.c
+++ b/src/syscall.c
@@ -11,41 +11,6 @@
#include "liburing/io_uring.h"
#include "syscall.h"
-#ifdef __alpha__
-/*
- * alpha and mips are exception, other architectures have
- * common numbers for new system calls.
- */
-# ifndef __NR_io_uring_setup
-# define __NR_io_uring_setup 535
-# endif
-# ifndef __NR_io_uring_enter
-# define __NR_io_uring_enter 536
-# endif
-# ifndef __NR_io_uring_register
-# define __NR_io_uring_register 537
-# endif
-#elif defined __mips__
-# ifndef __NR_io_uring_setup
-# define __NR_io_uring_setup (__NR_Linux + 425)
-# endif
-# ifndef __NR_io_uring_enter
-# define __NR_io_uring_enter (__NR_Linux + 426)
-# endif
-# ifndef __NR_io_uring_register
-# define __NR_io_uring_register (__NR_Linux + 427)
-# endif
-#else /* !__alpha__ and !__mips__ */
-# ifndef __NR_io_uring_setup
-# define __NR_io_uring_setup 425
-# endif
-# ifndef __NR_io_uring_enter
-# define __NR_io_uring_enter 426
-# endif
-# ifndef __NR_io_uring_register
-# define __NR_io_uring_register 427
-# endif
-#endif
int __sys_io_uring_register(int fd, unsigned opcode, const void *arg,
unsigned nr_args)
@@ -59,15 +24,15 @@ int __sys_io_uring_setup(unsigned entries, struct io_uring_params *p)
}
int __sys_io_uring_enter2(int fd, unsigned to_submit, unsigned min_complete,
- unsigned flags, sigset_t *sig, int sz)
+ unsigned flags, sigset_t *sig, int sz)
{
- return syscall(__NR_io_uring_enter, fd, to_submit, min_complete,
- flags, sig, sz);
+ return syscall(__NR_io_uring_enter, fd, to_submit, min_complete, flags,
+ sig, sz);
}
int __sys_io_uring_enter(int fd, unsigned to_submit, unsigned min_complete,
unsigned flags, sigset_t *sig)
{
return __sys_io_uring_enter2(fd, to_submit, min_complete, flags, sig,
- _NSIG / 8);
+ _NSIG / 8);
}
diff --git a/src/syscall.h b/src/syscall.h
index 2368f83..f7f63aa 100644
--- a/src/syscall.h
+++ b/src/syscall.h
@@ -2,7 +2,49 @@
#ifndef LIBURING_SYSCALL_H
#define LIBURING_SYSCALL_H
+#include <errno.h>
#include <signal.h>
+#include <unistd.h>
+#include <sys/mman.h>
+#include <sys/syscall.h>
+#include <sys/resource.h>
+
+#ifdef __alpha__
+/*
+ * alpha and mips are exception, other architectures have
+ * common numbers for new system calls.
+ */
+# ifndef __NR_io_uring_setup
+# define __NR_io_uring_setup 535
+# endif
+# ifndef __NR_io_uring_enter
+# define __NR_io_uring_enter 536
+# endif
+# ifndef __NR_io_uring_register
+# define __NR_io_uring_register 537
+# endif
+#elif defined __mips__
+# ifndef __NR_io_uring_setup
+# define __NR_io_uring_setup (__NR_Linux + 425)
+# endif
+# ifndef __NR_io_uring_enter
+# define __NR_io_uring_enter (__NR_Linux + 426)
+# endif
+# ifndef __NR_io_uring_register
+# define __NR_io_uring_register (__NR_Linux + 427)
+# endif
+#else /* !__alpha__ and !__mips__ */
+# ifndef __NR_io_uring_setup
+# define __NR_io_uring_setup 425
+# endif
+# ifndef __NR_io_uring_enter
+# define __NR_io_uring_enter 426
+# endif
+# ifndef __NR_io_uring_register
+# define __NR_io_uring_register 427
+# endif
+#endif
+
struct io_uring_params;
@@ -17,4 +59,43 @@ int __sys_io_uring_enter2(int fd, unsigned to_submit, unsigned min_complete,
int __sys_io_uring_register(int fd, unsigned int opcode, const void *arg,
unsigned int nr_args);
+
+
+static inline int ____sys_io_uring_register(int fd, unsigned opcode,
+ const void *arg, unsigned nr_args)
+{
+ int ret;
+
+ ret = syscall(__NR_io_uring_register, fd, opcode, arg, nr_args);
+ return (ret < 0) ? -errno : ret;
+}
+
+static inline int ____sys_io_uring_setup(unsigned entries,
+ struct io_uring_params *p)
+{
+ int ret;
+
+ ret = syscall(__NR_io_uring_setup, entries, p);
+ return (ret < 0) ? -errno : ret;
+}
+
+static inline int ____sys_io_uring_enter2(int fd, unsigned to_submit,
+ unsigned min_complete, unsigned flags,
+ sigset_t *sig, int sz)
+{
+ int ret;
+
+ ret = syscall(__NR_io_uring_enter, fd, to_submit, min_complete, flags,
+ sig, sz);
+ return (ret < 0) ? -errno : ret;
+}
+
+static inline int ____sys_io_uring_enter(int fd, unsigned to_submit,
+ unsigned min_complete, unsigned flags,
+ sigset_t *sig)
+{
+ return ____sys_io_uring_enter2(fd, to_submit, min_complete, flags, sig,
+ _NSIG / 8);
+}
+
#endif
--
2.30.2
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v4 RFC liburing 2/3] src/{queue,register,setup}: Don't use `__sys_io_uring*`
2021-10-03 10:17 [PATCHSET v4 RFC liburing 0/4] Implement the kernel style return value Ammar Faizi
2021-10-03 10:17 ` [PATCH v4 RFC liburing 1/3] src/syscall: Wrap `errno` for `__sys_io_uring_{register,setup,enter{2,}}` Ammar Faizi
@ 2021-10-03 10:17 ` Ammar Faizi
2021-10-03 10:17 ` [PATCH v4 RFC liburing 3/3] Wrap all syscalls in a kernel style return value Ammar Faizi
2021-10-03 10:43 ` [PATCHSET v4 RFC liburing 0/4] Implement the " Ammar Faizi
3 siblings, 0 replies; 7+ messages in thread
From: Ammar Faizi @ 2021-10-03 10:17 UTC (permalink / raw)
To: Jens Axboe, Pavel Begunkov
Cc: io-uring Mailing List, Bedirhan KURT, Louvian Lyndal, Ammar Faizi
Don't use `__sys_io_uring*` for liburing internal. These functions
are now supposed for user backward compatibility.
For now, we use `____sys_io_uring*` (4 underscores). These are
`static inline` functions that wrap the `errno` variable in a kernel
style return value (directly returns negative error code when errors).
The main purpose of this change is to make it possible to remove the
`errno` variable dependency from liburing C sources, so that later
we will be able to build liburing without libc which doesn't use
`errno` variable at all.
Cc: Bedirhan KURT <[email protected]>
Cc: Louvian Lyndal <[email protected]>
Signed-off-by: Ammar Faizi <[email protected]>
---
src/queue.c | 27 +++----
src/register.c | 187 +++++++++++++++----------------------------------
src/setup.c | 4 +-
3 files changed, 70 insertions(+), 148 deletions(-)
diff --git a/src/queue.c b/src/queue.c
index 10ef31c..c2881e9 100644
--- a/src/queue.c
+++ b/src/queue.c
@@ -117,11 +117,11 @@ static int _io_uring_get_cqe(struct io_uring *ring, struct io_uring_cqe **cqe_pt
if (!need_enter)
break;
- ret = __sys_io_uring_enter2(ring->ring_fd, data->submit,
- data->wait_nr, flags, data->arg,
- data->sz);
+ ret = ____sys_io_uring_enter2(ring->ring_fd, data->submit,
+ data->wait_nr, flags, data->arg,
+ data->sz);
if (ret < 0) {
- err = -errno;
+ err = ret;
break;
}
@@ -178,8 +178,8 @@ again:
goto done;
if (cq_ring_needs_flush(ring)) {
- __sys_io_uring_enter(ring->ring_fd, 0, 0,
- IORING_ENTER_GETEVENTS, NULL);
+ ____sys_io_uring_enter(ring->ring_fd, 0, 0,
+ IORING_ENTER_GETEVENTS, NULL);
overflow_checked = true;
goto again;
}
@@ -333,10 +333,8 @@ static int __io_uring_submit(struct io_uring *ring, unsigned submitted,
if (wait_nr || (ring->flags & IORING_SETUP_IOPOLL))
flags |= IORING_ENTER_GETEVENTS;
- ret = __sys_io_uring_enter(ring->ring_fd, submitted, wait_nr,
- flags, NULL);
- if (ret < 0)
- return -errno;
+ ret = ____sys_io_uring_enter(ring->ring_fd, submitted, wait_nr,
+ flags, NULL);
} else
ret = submitted;
@@ -391,11 +389,6 @@ struct io_uring_sqe *io_uring_get_sqe(struct io_uring *ring)
int __io_uring_sqring_wait(struct io_uring *ring)
{
- int ret;
-
- ret = __sys_io_uring_enter(ring->ring_fd, 0, 0, IORING_ENTER_SQ_WAIT,
- NULL);
- if (ret < 0)
- ret = -errno;
- return ret;
+ return ____sys_io_uring_enter(ring->ring_fd, 0, 0,
+ IORING_ENTER_SQ_WAIT, NULL);
}
diff --git a/src/register.c b/src/register.c
index 5ea4331..cb09dea 100644
--- a/src/register.c
+++ b/src/register.c
@@ -26,12 +26,10 @@ int io_uring_register_buffers_update_tag(struct io_uring *ring, unsigned off,
.tags = (unsigned long)tags,
.nr = nr,
};
- int ret;
- ret = __sys_io_uring_register(ring->ring_fd,
- IORING_REGISTER_BUFFERS_UPDATE,
- &up, sizeof(up));
- return ret < 0 ? -errno : ret;
+ return ____sys_io_uring_register(ring->ring_fd,
+ IORING_REGISTER_BUFFERS_UPDATE, &up,
+ sizeof(up));
}
int io_uring_register_buffers_tags(struct io_uring *ring,
@@ -44,11 +42,9 @@ int io_uring_register_buffers_tags(struct io_uring *ring,
.data = (unsigned long)iovecs,
.tags = (unsigned long)tags,
};
- int ret;
- ret = __sys_io_uring_register(ring->ring_fd, IORING_REGISTER_BUFFERS2,
- ®, sizeof(reg));
- return ret < 0 ? -errno : ret;
+ return ____sys_io_uring_register(ring->ring_fd, IORING_REGISTER_BUFFERS2,
+ ®, sizeof(reg));
}
int io_uring_register_buffers(struct io_uring *ring, const struct iovec *iovecs,
@@ -56,24 +52,18 @@ int io_uring_register_buffers(struct io_uring *ring, const struct iovec *iovecs,
{
int ret;
- ret = __sys_io_uring_register(ring->ring_fd, IORING_REGISTER_BUFFERS,
+ ret = ____sys_io_uring_register(ring->ring_fd, IORING_REGISTER_BUFFERS,
iovecs, nr_iovecs);
- if (ret < 0)
- return -errno;
-
- return 0;
+ return (ret < 0) ? ret : 0;
}
int io_uring_unregister_buffers(struct io_uring *ring)
{
int ret;
- ret = __sys_io_uring_register(ring->ring_fd, IORING_UNREGISTER_BUFFERS,
+ ret = ____sys_io_uring_register(ring->ring_fd, IORING_UNREGISTER_BUFFERS,
NULL, 0);
- if (ret < 0)
- return -errno;
-
- return 0;
+ return (ret < 0) ? ret : 0;
}
int io_uring_register_files_update_tag(struct io_uring *ring, unsigned off,
@@ -86,12 +76,10 @@ int io_uring_register_files_update_tag(struct io_uring *ring, unsigned off,
.tags = (unsigned long)tags,
.nr = nr_files,
};
- int ret;
- ret = __sys_io_uring_register(ring->ring_fd,
- IORING_REGISTER_FILES_UPDATE2,
- &up, sizeof(up));
- return ret < 0 ? -errno : ret;
+ return ____sys_io_uring_register(ring->ring_fd,
+ IORING_REGISTER_FILES_UPDATE2, &up,
+ sizeof(up));
}
/*
@@ -108,15 +96,10 @@ int io_uring_register_files_update(struct io_uring *ring, unsigned off,
.offset = off,
.fds = (unsigned long) files,
};
- int ret;
-
- ret = __sys_io_uring_register(ring->ring_fd,
- IORING_REGISTER_FILES_UPDATE, &up,
- nr_files);
- if (ret < 0)
- return -errno;
- return ret;
+ return ____sys_io_uring_register(ring->ring_fd,
+ IORING_REGISTER_FILES_UPDATE, &up,
+ nr_files);
}
static int increase_rlimit_nofile(unsigned nr)
@@ -133,9 +116,8 @@ static int increase_rlimit_nofile(unsigned nr)
return 0;
}
-int io_uring_register_files_tags(struct io_uring *ring,
- const int *files, const __u64 *tags,
- unsigned nr)
+int io_uring_register_files_tags(struct io_uring *ring, const int *files,
+ const __u64 *tags, unsigned nr)
{
struct io_uring_rsrc_register reg = {
.nr = nr,
@@ -145,12 +127,12 @@ int io_uring_register_files_tags(struct io_uring *ring,
int ret, did_increase = 0;
do {
- ret = __sys_io_uring_register(ring->ring_fd,
- IORING_REGISTER_FILES2, ®,
- sizeof(reg));
+ ret = ____sys_io_uring_register(ring->ring_fd,
+ IORING_REGISTER_FILES2, ®,
+ sizeof(reg));
if (ret >= 0)
break;
- if (errno == EMFILE && !did_increase) {
+ if (ret == -EMFILE && !did_increase) {
did_increase = 1;
increase_rlimit_nofile(nr);
continue;
@@ -158,21 +140,21 @@ int io_uring_register_files_tags(struct io_uring *ring,
break;
} while (1);
- return ret < 0 ? -errno : ret;
+ return ret;
}
int io_uring_register_files(struct io_uring *ring, const int *files,
- unsigned nr_files)
+ unsigned nr_files)
{
int ret, did_increase = 0;
do {
- ret = __sys_io_uring_register(ring->ring_fd,
- IORING_REGISTER_FILES, files,
- nr_files);
+ ret = ____sys_io_uring_register(ring->ring_fd,
+ IORING_REGISTER_FILES, files,
+ nr_files);
if (ret >= 0)
break;
- if (errno == EMFILE && !did_increase) {
+ if (ret == -EMFILE && !did_increase) {
did_increase = 1;
increase_rlimit_nofile(nr_files);
continue;
@@ -180,55 +162,44 @@ int io_uring_register_files(struct io_uring *ring, const int *files,
break;
} while (1);
- return ret < 0 ? -errno : ret;
+ return ret;
}
int io_uring_unregister_files(struct io_uring *ring)
{
int ret;
- ret = __sys_io_uring_register(ring->ring_fd, IORING_UNREGISTER_FILES,
+ ret = ____sys_io_uring_register(ring->ring_fd, IORING_UNREGISTER_FILES,
NULL, 0);
- if (ret < 0)
- return -errno;
-
- return 0;
+ return (ret < 0) ? ret : 0;
}
int io_uring_register_eventfd(struct io_uring *ring, int event_fd)
{
int ret;
- ret = __sys_io_uring_register(ring->ring_fd, IORING_REGISTER_EVENTFD,
+ ret = ____sys_io_uring_register(ring->ring_fd, IORING_REGISTER_EVENTFD,
&event_fd, 1);
- if (ret < 0)
- return -errno;
-
- return 0;
+ return (ret < 0) ? ret : 0;
}
int io_uring_unregister_eventfd(struct io_uring *ring)
{
int ret;
- ret = __sys_io_uring_register(ring->ring_fd, IORING_UNREGISTER_EVENTFD,
+ ret = ____sys_io_uring_register(ring->ring_fd, IORING_UNREGISTER_EVENTFD,
NULL, 0);
- if (ret < 0)
- return -errno;
-
- return 0;
+ return (ret < 0) ? ret : 0;
}
int io_uring_register_eventfd_async(struct io_uring *ring, int event_fd)
{
int ret;
- ret = __sys_io_uring_register(ring->ring_fd, IORING_REGISTER_EVENTFD_ASYNC,
- &event_fd, 1);
- if (ret < 0)
- return -errno;
-
- return 0;
+ ret = ____sys_io_uring_register(ring->ring_fd,
+ IORING_REGISTER_EVENTFD_ASYNC,
+ &event_fd, 1);
+ return (ret < 0) ? ret : 0;
}
int io_uring_register_probe(struct io_uring *ring, struct io_uring_probe *p,
@@ -236,36 +207,21 @@ int io_uring_register_probe(struct io_uring *ring, struct io_uring_probe *p,
{
int ret;
- ret = __sys_io_uring_register(ring->ring_fd, IORING_REGISTER_PROBE,
- p, nr_ops);
- if (ret < 0)
- return -errno;
-
- return 0;
+ ret = ____sys_io_uring_register(ring->ring_fd, IORING_REGISTER_PROBE, p,
+ nr_ops);
+ return (ret < 0) ? ret : 0;
}
int io_uring_register_personality(struct io_uring *ring)
{
- int ret;
-
- ret = __sys_io_uring_register(ring->ring_fd, IORING_REGISTER_PERSONALITY,
- NULL, 0);
- if (ret < 0)
- return -errno;
-
- return ret;
+ return ____sys_io_uring_register(ring->ring_fd,
+ IORING_REGISTER_PERSONALITY, NULL, 0);
}
int io_uring_unregister_personality(struct io_uring *ring, int id)
{
- int ret;
-
- ret = __sys_io_uring_register(ring->ring_fd, IORING_UNREGISTER_PERSONALITY,
- NULL, id);
- if (ret < 0)
- return -errno;
-
- return ret;
+ return ____sys_io_uring_register(ring->ring_fd,
+ IORING_UNREGISTER_PERSONALITY, NULL, id);
}
int io_uring_register_restrictions(struct io_uring *ring,
@@ -274,61 +230,34 @@ int io_uring_register_restrictions(struct io_uring *ring,
{
int ret;
- ret = __sys_io_uring_register(ring->ring_fd, IORING_REGISTER_RESTRICTIONS,
- res, nr_res);
- if (ret < 0)
- return -errno;
-
- return 0;
+ ret = ____sys_io_uring_register(ring->ring_fd,
+ IORING_REGISTER_RESTRICTIONS, res,
+ nr_res);
+ return (ret < 0) ? ret : 0;
}
int io_uring_enable_rings(struct io_uring *ring)
{
- int ret;
-
- ret = __sys_io_uring_register(ring->ring_fd,
- IORING_REGISTER_ENABLE_RINGS, NULL, 0);
- if (ret < 0)
- return -errno;
-
- return ret;
+ return ____sys_io_uring_register(ring->ring_fd,
+ IORING_REGISTER_ENABLE_RINGS, NULL, 0);
}
int io_uring_register_iowq_aff(struct io_uring *ring, size_t cpusz,
const cpu_set_t *mask)
{
- int ret;
-
- ret = __sys_io_uring_register(ring->ring_fd,
- IORING_REGISTER_IOWQ_AFF, mask, cpusz);
- if (ret < 0)
- return -errno;
-
- return ret;
+ return ____sys_io_uring_register(ring->ring_fd, IORING_REGISTER_IOWQ_AFF,
+ mask, cpusz);
}
int io_uring_unregister_iowq_aff(struct io_uring *ring)
{
- int ret;
-
- ret = __sys_io_uring_register(ring->ring_fd,
- IORING_REGISTER_IOWQ_AFF, NULL, 0);
- if (ret < 0)
- return -errno;
-
- return ret;
+ return ____sys_io_uring_register(ring->ring_fd,
+ IORING_REGISTER_IOWQ_AFF, NULL, 0);
}
int io_uring_register_iowq_max_workers(struct io_uring *ring, unsigned int *val)
{
- int ret;
-
- ret = __sys_io_uring_register(ring->ring_fd,
- IORING_REGISTER_IOWQ_MAX_WORKERS,
- val, 2);
- if (ret < 0)
- return -errno;
-
- return ret;
-
+ return ____sys_io_uring_register(ring->ring_fd,
+ IORING_REGISTER_IOWQ_MAX_WORKERS, val,
+ 2);
}
diff --git a/src/setup.c b/src/setup.c
index 54225e8..edfe94e 100644
--- a/src/setup.c
+++ b/src/setup.c
@@ -140,9 +140,9 @@ int io_uring_queue_init_params(unsigned entries, struct io_uring *ring,
{
int fd, ret;
- fd = __sys_io_uring_setup(entries, p);
+ fd = ____sys_io_uring_setup(entries, p);
if (fd < 0)
- return -errno;
+ return fd;
ret = io_uring_queue_mmap(fd, p, ring);
if (ret) {
--
2.30.2
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v4 RFC liburing 3/3] Wrap all syscalls in a kernel style return value
2021-10-03 10:17 [PATCHSET v4 RFC liburing 0/4] Implement the kernel style return value Ammar Faizi
2021-10-03 10:17 ` [PATCH v4 RFC liburing 1/3] src/syscall: Wrap `errno` for `__sys_io_uring_{register,setup,enter{2,}}` Ammar Faizi
2021-10-03 10:17 ` [PATCH v4 RFC liburing 2/3] src/{queue,register,setup}: Don't use `__sys_io_uring*` Ammar Faizi
@ 2021-10-03 10:17 ` Ammar Faizi
2021-10-03 13:17 ` Jens Axboe
2021-10-03 10:43 ` [PATCHSET v4 RFC liburing 0/4] Implement the " Ammar Faizi
3 siblings, 1 reply; 7+ messages in thread
From: Ammar Faizi @ 2021-10-03 10:17 UTC (permalink / raw)
To: Jens Axboe, Pavel Begunkov
Cc: io-uring Mailing List, Bedirhan KURT, Louvian Lyndal, Ammar Faizi
Add new syscall wrapper functions:
1) `uring_mmap`
2) `uring_munmap`
3) `uring_madvise`
4) `uring_getrlimit`
5) `uring_setrlimit`
All of them are `static inline`.
Use them to wrap the syscalls in a kernel style return value. The
main purpose of this change is to make it possible to remove the
dependency of `errno` variable in liburing C sources, so that later
we can support build without libc.
Extra new helpers:
1) `ERR_PTR()`
2) `PTR_ERR()`
3) `IS_ERR()`
These helpers are used to deal with syscalls that return a pointer.
Currently only `uring_mmap()` that depends on these.
Link: https://github.com/axboe/liburing/issues/443#issuecomment-927873932
Cc: Bedirhan KURT <[email protected]>
Suggested-by: Louvian Lyndal <[email protected]>
Signed-off-by: Ammar Faizi <[email protected]>
---
src/queue.c | 1 -
src/register.c | 10 +++++----
src/setup.c | 53 ++++++++++++++++++++++----------------------
src/syscall.h | 60 ++++++++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 92 insertions(+), 32 deletions(-)
diff --git a/src/queue.c b/src/queue.c
index c2881e9..31aa17c 100644
--- a/src/queue.c
+++ b/src/queue.c
@@ -5,7 +5,6 @@
#include <sys/stat.h>
#include <sys/mman.h>
#include <unistd.h>
-#include <errno.h>
#include <string.h>
#include <stdbool.h>
diff --git a/src/register.c b/src/register.c
index cb09dea..fec144d 100644
--- a/src/register.c
+++ b/src/register.c
@@ -6,7 +6,6 @@
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>
-#include <errno.h>
#include <string.h>
#include "liburing/compat.h"
@@ -104,13 +103,16 @@ int io_uring_register_files_update(struct io_uring *ring, unsigned off,
static int increase_rlimit_nofile(unsigned nr)
{
+ int ret;
struct rlimit rlim;
- if (getrlimit(RLIMIT_NOFILE, &rlim) < 0)
- return -errno;
+ ret = uring_getrlimit(RLIMIT_NOFILE, &rlim);
+ if (ret < 0)
+ return ret;
+
if (rlim.rlim_cur < nr) {
rlim.rlim_cur += nr;
- setrlimit(RLIMIT_NOFILE, &rlim);
+ return uring_setrlimit(RLIMIT_NOFILE, &rlim);
}
return 0;
diff --git a/src/setup.c b/src/setup.c
index edfe94e..bdbf97c 100644
--- a/src/setup.c
+++ b/src/setup.c
@@ -3,9 +3,7 @@
#include <sys/types.h>
#include <sys/stat.h>
-#include <sys/mman.h>
#include <unistd.h>
-#include <errno.h>
#include <string.h>
#include <stdlib.h>
#include <signal.h>
@@ -18,9 +16,9 @@
static void io_uring_unmap_rings(struct io_uring_sq *sq, struct io_uring_cq *cq)
{
- munmap(sq->ring_ptr, sq->ring_sz);
+ uring_munmap(sq->ring_ptr, sq->ring_sz);
if (cq->ring_ptr && cq->ring_ptr != sq->ring_ptr)
- munmap(cq->ring_ptr, cq->ring_sz);
+ uring_munmap(cq->ring_ptr, cq->ring_sz);
}
static int io_uring_mmap(int fd, struct io_uring_params *p,
@@ -37,19 +35,21 @@ static int io_uring_mmap(int fd, struct io_uring_params *p,
sq->ring_sz = cq->ring_sz;
cq->ring_sz = sq->ring_sz;
}
- sq->ring_ptr = mmap(0, sq->ring_sz, PROT_READ | PROT_WRITE,
- MAP_SHARED | MAP_POPULATE, fd, IORING_OFF_SQ_RING);
- if (sq->ring_ptr == MAP_FAILED)
- return -errno;
+ sq->ring_ptr = uring_mmap(0, sq->ring_sz, PROT_READ | PROT_WRITE,
+ MAP_SHARED | MAP_POPULATE, fd,
+ IORING_OFF_SQ_RING);
+ if (IS_ERR(sq->ring_ptr))
+ return PTR_ERR(sq->ring_ptr);
if (p->features & IORING_FEAT_SINGLE_MMAP) {
cq->ring_ptr = sq->ring_ptr;
} else {
- cq->ring_ptr = mmap(0, cq->ring_sz, PROT_READ | PROT_WRITE,
- MAP_SHARED | MAP_POPULATE, fd, IORING_OFF_CQ_RING);
- if (cq->ring_ptr == MAP_FAILED) {
+ cq->ring_ptr = uring_mmap(0, cq->ring_sz, PROT_READ | PROT_WRITE,
+ MAP_SHARED | MAP_POPULATE, fd,
+ IORING_OFF_CQ_RING);
+ if (IS_ERR(cq->ring_ptr)) {
+ ret = PTR_ERR(cq->ring_ptr);
cq->ring_ptr = NULL;
- ret = -errno;
goto err;
}
}
@@ -63,11 +63,10 @@ static int io_uring_mmap(int fd, struct io_uring_params *p,
sq->array = sq->ring_ptr + p->sq_off.array;
size = p->sq_entries * sizeof(struct io_uring_sqe);
- sq->sqes = mmap(0, size, PROT_READ | PROT_WRITE,
- MAP_SHARED | MAP_POPULATE, fd,
- IORING_OFF_SQES);
- if (sq->sqes == MAP_FAILED) {
- ret = -errno;
+ sq->sqes = uring_mmap(0, size, PROT_READ | PROT_WRITE,
+ MAP_SHARED | MAP_POPULATE, fd, IORING_OFF_SQES);
+ if (IS_ERR(sq->sqes)) {
+ ret = PTR_ERR(sq->sqes);
err:
io_uring_unmap_rings(sq, cq);
return ret;
@@ -116,20 +115,20 @@ int io_uring_ring_dontfork(struct io_uring *ring)
return -EINVAL;
len = *ring->sq.kring_entries * sizeof(struct io_uring_sqe);
- ret = madvise(ring->sq.sqes, len, MADV_DONTFORK);
- if (ret == -1)
- return -errno;
+ ret = uring_madvise(ring->sq.sqes, len, MADV_DONTFORK);
+ if (ret < 0)
+ return ret;
len = ring->sq.ring_sz;
- ret = madvise(ring->sq.ring_ptr, len, MADV_DONTFORK);
- if (ret == -1)
- return -errno;
+ ret = uring_madvise(ring->sq.ring_ptr, len, MADV_DONTFORK);
+ if (ret < 0)
+ return ret;
if (ring->cq.ring_ptr != ring->sq.ring_ptr) {
len = ring->cq.ring_sz;
- ret = madvise(ring->cq.ring_ptr, len, MADV_DONTFORK);
- if (ret == -1)
- return -errno;
+ ret = uring_madvise(ring->cq.ring_ptr, len, MADV_DONTFORK);
+ if (ret < 0)
+ return ret;
}
return 0;
@@ -173,7 +172,7 @@ void io_uring_queue_exit(struct io_uring *ring)
struct io_uring_sq *sq = &ring->sq;
struct io_uring_cq *cq = &ring->cq;
- munmap(sq->sqes, *sq->kring_entries * sizeof(struct io_uring_sqe));
+ uring_munmap(sq->sqes, *sq->kring_entries * sizeof(struct io_uring_sqe));
io_uring_unmap_rings(sq, cq);
close(ring->ring_fd);
}
diff --git a/src/syscall.h b/src/syscall.h
index f7f63aa..3e964ed 100644
--- a/src/syscall.h
+++ b/src/syscall.h
@@ -4,11 +4,15 @@
#include <errno.h>
#include <signal.h>
+#include <stdint.h>
#include <unistd.h>
+#include <stdbool.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/resource.h>
+#include <liburing.h>
+
#ifdef __alpha__
/*
* alpha and mips are exception, other architectures have
@@ -60,6 +64,21 @@ int __sys_io_uring_register(int fd, unsigned int opcode, const void *arg,
unsigned int nr_args);
+static inline void *ERR_PTR(intptr_t n)
+{
+ return (void *) n;
+}
+
+
+static inline intptr_t PTR_ERR(void *ptr)
+{
+ return (intptr_t) ptr;
+}
+
+static inline bool IS_ERR(void *ptr)
+{
+ return uring_unlikely((uintptr_t) ptr >= (uintptr_t) -4095UL);
+}
static inline int ____sys_io_uring_register(int fd, unsigned opcode,
const void *arg, unsigned nr_args)
@@ -98,4 +117,45 @@ static inline int ____sys_io_uring_enter(int fd, unsigned to_submit,
_NSIG / 8);
}
+static inline void *uring_mmap(void *addr, size_t length, int prot, int flags,
+ int fd, off_t offset)
+{
+ void *ret;
+
+ ret = mmap(addr, length, prot, flags, fd, offset);
+ return (ret == MAP_FAILED) ? ERR_PTR(-errno) : ret;
+}
+
+static inline int uring_munmap(void *addr, size_t length)
+{
+ int ret;
+
+ ret = munmap(addr, length);
+ return (ret < 0) ? -errno : ret;
+}
+
+static inline int uring_madvise(void *addr, size_t length, int advice)
+{
+ int ret;
+
+ ret = madvise(addr, length, advice);
+ return (ret < 0) ? -errno : ret;
+}
+
+static inline int uring_getrlimit(int resource, struct rlimit *rlim)
+{
+ int ret;
+
+ ret = getrlimit(resource, rlim);
+ return (ret < 0) ? -errno : ret;
+}
+
+static inline int uring_setrlimit(int resource, const struct rlimit *rlim)
+{
+ int ret;
+
+ ret = setrlimit(resource, rlim);
+ return (ret < 0) ? -errno : ret;
+}
+
#endif
--
2.30.2
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCHSET v4 RFC liburing 0/4] Implement the kernel style return value
2021-10-03 10:17 [PATCHSET v4 RFC liburing 0/4] Implement the kernel style return value Ammar Faizi
` (2 preceding siblings ...)
2021-10-03 10:17 ` [PATCH v4 RFC liburing 3/3] Wrap all syscalls in a kernel style return value Ammar Faizi
@ 2021-10-03 10:43 ` Ammar Faizi
3 siblings, 0 replies; 7+ messages in thread
From: Ammar Faizi @ 2021-10-03 10:43 UTC (permalink / raw)
To: Jens Axboe, Pavel Begunkov
Cc: io-uring Mailing List, Bedirhan KURT, Louvian Lyndal
Sorry, wrong subject.
It should have been
```
[PATCHSET v4 RFC liburing 0/3]
```
not
```
[PATCHSET v4 RFC liburing 0/4]
```
Apologize for that.
--
Ammar Faizi
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v4 RFC liburing 3/3] Wrap all syscalls in a kernel style return value
2021-10-03 10:17 ` [PATCH v4 RFC liburing 3/3] Wrap all syscalls in a kernel style return value Ammar Faizi
@ 2021-10-03 13:17 ` Jens Axboe
2021-10-03 15:13 ` Ammar Faizi
0 siblings, 1 reply; 7+ messages in thread
From: Jens Axboe @ 2021-10-03 13:17 UTC (permalink / raw)
To: Ammar Faizi, Pavel Begunkov
Cc: io-uring Mailing List, Bedirhan KURT, Louvian Lyndal
> diff --git a/src/register.c b/src/register.c
> index cb09dea..fec144d 100644
> --- a/src/register.c
> +++ b/src/register.c
> @@ -6,7 +6,6 @@
> #include <sys/mman.h>
> #include <sys/resource.h>
> #include <unistd.h>
> -#include <errno.h>
> #include <string.h>
>
> #include "liburing/compat.h"
> @@ -104,13 +103,16 @@ int io_uring_register_files_update(struct io_uring *ring, unsigned off,
>
> static int increase_rlimit_nofile(unsigned nr)
> {
> + int ret;
> struct rlimit rlim;
>
> - if (getrlimit(RLIMIT_NOFILE, &rlim) < 0)
> - return -errno;
> + ret = uring_getrlimit(RLIMIT_NOFILE, &rlim);
> + if (ret < 0)
> + return ret;
> +
> if (rlim.rlim_cur < nr) {
> rlim.rlim_cur += nr;
> - setrlimit(RLIMIT_NOFILE, &rlim);
> + return uring_setrlimit(RLIMIT_NOFILE, &rlim);
> }
This isn't a functionally equivalent transformation, and it's
purposefully not returning failure to increase. It may still succeed if
we fail here, relying on failure later for the actual operation that
needs an increase in files.
> diff --git a/src/syscall.h b/src/syscall.h
> index f7f63aa..3e964ed 100644
> --- a/src/syscall.h
> +++ b/src/syscall.h
> @@ -4,11 +4,15 @@
>
> #include <errno.h>
> #include <signal.h>
> +#include <stdint.h>
> #include <unistd.h>
> +#include <stdbool.h>
> #include <sys/mman.h>
> #include <sys/syscall.h>
> #include <sys/resource.h>
>
> +#include <liburing.h>
> +
> #ifdef __alpha__
> /*
> * alpha and mips are exception, other architectures have
> @@ -60,6 +64,21 @@ int __sys_io_uring_register(int fd, unsigned int opcode, const void *arg,
> unsigned int nr_args);
>
>
> +static inline void *ERR_PTR(intptr_t n)
> +{
> + return (void *) n;
> +}
> +
> +
Extra newline here.
Apart from those two, starting to look pretty reasonable.
--
Jens Axboe
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v4 RFC liburing 3/3] Wrap all syscalls in a kernel style return value
2021-10-03 13:17 ` Jens Axboe
@ 2021-10-03 15:13 ` Ammar Faizi
0 siblings, 0 replies; 7+ messages in thread
From: Ammar Faizi @ 2021-10-03 15:13 UTC (permalink / raw)
To: Jens Axboe, Pavel Begunkov
Cc: io-uring Mailing List, Bedirhan KURT, Louvian Lyndal
On Sun, Oct 3, 2021 at 8:19 PM Jens Axboe <[email protected]> wrote:
>> diff --git a/src/register.c b/src/register.c
>> index cb09dea..fec144d 100644
>> --- a/src/register.c
>> +++ b/src/register.c
>> @@ -6,7 +6,6 @@
>> #include <sys/mman.h>
>> #include <sys/resource.h>
>> #include <unistd.h>
>> -#include <errno.h>
>> #include <string.h>
>>
>> #include "liburing/compat.h"
>> @@ -104,13 +103,16 @@ int io_uring_register_files_update(struct io_uring *ring, unsigned off,
>>
>> static int increase_rlimit_nofile(unsigned nr)
>> {
>> + int ret;
>> struct rlimit rlim;
>>
>> - if (getrlimit(RLIMIT_NOFILE, &rlim) < 0)
>> - return -errno;
>> + ret = uring_getrlimit(RLIMIT_NOFILE, &rlim);
>> + if (ret < 0)
>> + return ret;
>> +
>> if (rlim.rlim_cur < nr) {
>> rlim.rlim_cur += nr;
>> - setrlimit(RLIMIT_NOFILE, &rlim);
>> + return uring_setrlimit(RLIMIT_NOFILE, &rlim);
>> }
>
>This isn't a functionally equivalent transformation, and it's
>purposefully not returning failure to increase. It may still succeed if
>we fail here, relying on failure later for the actual operation that
>needs an increase in files.
>
>> diff --git a/src/syscall.h b/src/syscall.h
>> index f7f63aa..3e964ed 100644
>> --- a/src/syscall.h
>> +++ b/src/syscall.h
>> @@ -4,11 +4,15 @@
>>
>> #include <errno.h>
>> #include <signal.h>
>> +#include <stdint.h>
>> #include <unistd.h>
>> +#include <stdbool.h>
>> #include <sys/mman.h>
>> #include <sys/syscall.h>
>> #include <sys/resource.h>
>>
>> +#include <liburing.h>
>> +
>> #ifdef __alpha__
>> /*
>> * alpha and mips are exception, other architectures have
>> @@ -60,6 +64,21 @@ int __sys_io_uring_register(int fd, unsigned int opcode, const void *arg,
>> unsigned int nr_args);
>>
>>
>> +static inline void *ERR_PTR(intptr_t n)
>> +{
>> + return (void *) n;
>> +}
>> +
>> +
>
>Extra newline here.
>
>Apart from those two, starting to look pretty reasonable.
>
>--
>Jens Axboe
>
Thanks for the review, I will address those two and send the v5 of
this series.
--
Ammar Faizi
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2021-10-03 15:14 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-10-03 10:17 [PATCHSET v4 RFC liburing 0/4] Implement the kernel style return value Ammar Faizi
2021-10-03 10:17 ` [PATCH v4 RFC liburing 1/3] src/syscall: Wrap `errno` for `__sys_io_uring_{register,setup,enter{2,}}` Ammar Faizi
2021-10-03 10:17 ` [PATCH v4 RFC liburing 2/3] src/{queue,register,setup}: Don't use `__sys_io_uring*` Ammar Faizi
2021-10-03 10:17 ` [PATCH v4 RFC liburing 3/3] Wrap all syscalls in a kernel style return value Ammar Faizi
2021-10-03 13:17 ` Jens Axboe
2021-10-03 15:13 ` Ammar Faizi
2021-10-03 10:43 ` [PATCHSET v4 RFC liburing 0/4] Implement the " Ammar Faizi
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox