* io_uring is a regression over 16 year old aio/io_submit, 2+ decades of Microsoft NT, and *BSD circa 1997-2001
@ 2023-05-04 19:20 Reece
2023-05-04 19:27 ` Jens Axboe
0 siblings, 1 reply; 3+ messages in thread
From: Reece @ 2023-05-04 19:20 UTC
To: io-uring; +Cc: axboe
I don't know how you guys have done it, but somehow you've already
messed up by no later than the 4th operation of io_uring_register, namely
IORING_REGISTER_EVENTFD.
Let's look at the aio subsystem for a moment. The iocb structure has an
aio_flags and an aio_resfd member. With IOCB_FLAG_RESFD set in aio_flags,
the latter names an eventfd file descriptor that is signaled when that
one request completes.
struct iocb
{
    __u64 aio_data;
    [...]
    __u16 aio_lio_opcode;
    [...]
    __u64 aio_buf;
    __u64 aio_nbytes;
    __s64 aio_offset;
    [...]
    __u32 aio_flags;
    __u32 aio_resfd;
};
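As a rough illustration of the per-request signaling described above, here is a minimal sketch that submits one read through the raw Linux AIO syscalls and blocks on an eventfd attached to that single iocb. The helper name aio_pread_with_eventfd is made up for this example, and aio_fildes is one of the fields elided by the [...] in the excerpt. It assumes a Linux system with the kernel UAPI headers installed; on a buffered file the kernel may complete the read inside io_submit itself, but the eventfd is still signaled.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <linux/aio_abi.h>
#include <stdint.h>
#include <string.h>
#include <sys/eventfd.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original mail): read up to len bytes
 * from the start of path with Linux AIO, using an eventfd that belongs to
 * this one request. Returns the byte count, or -errno on failure.
 */
long aio_pread_with_eventfd(const char *path, void *buf, size_t len)
{
    aio_context_t ctx = 0;          /* must be zero before io_setup */
    struct iocb cb;
    struct iocb *list[1] = { &cb };
    struct io_event ev;
    uint64_t count = 0;
    long ret;
    int fd, efd;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -errno;
    efd = eventfd(0, EFD_CLOEXEC);
    if (efd < 0) {
        ret = -errno;
        close(fd);
        return ret;
    }
    if (syscall(SYS_io_setup, 1L, &ctx) < 0) {
        ret = -errno;
        goto out_close;
    }

    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;
    cb.aio_lio_opcode = IOCB_CMD_PREAD;
    cb.aio_buf = (uint64_t)(uintptr_t)buf;
    cb.aio_nbytes = len;
    cb.aio_offset = 0;
    cb.aio_flags = IOCB_FLAG_RESFD; /* ask the kernel to signal aio_resfd */
    cb.aio_resfd = (uint32_t)efd;   /* eventfd tied to this iocb only */

    if (syscall(SYS_io_submit, ctx, 1L, list) != 1) {
        ret = -errno;
        goto out_destroy;
    }
    /* Any thread holding efd could block here; it wakes for this request. */
    if (read(efd, &count, sizeof(count)) != (ssize_t)sizeof(count)) {
        ret = -errno;
        goto out_destroy;
    }
    /* Collect the result of the completed request. */
    if (syscall(SYS_io_getevents, ctx, 1L, 1L, &ev, NULL) != 1) {
        ret = -errno;
        goto out_destroy;
    }
    ret = (long)ev.res;

out_destroy:
    syscall(SYS_io_destroy, ctx);
out_close:
    close(efd);
    close(fd);
    return ret;
}
```

A caller would pass a path and a buffer, e.g. aio_pread_with_eventfd("/etc/hostname", buf, sizeof(buf)), and treat a negative return as an errno value.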
For fun, let's look at a competing operating system, Windows 2000.
typedef struct _OVERLAPPED {
    ULONG_PTR Internal;
    ULONG_PTR InternalHigh;
    union {
        struct {
            DWORD Offset;
            DWORD OffsetHigh;
        };
        PVOID Pointer;
    };
    HANDLE hEvent;
} OVERLAPPED, *LPOVERLAPPED;
https://learn.microsoft.com/en-us/windows/win32/api/minwinbase/ns-minwinbase-overlapped
Or what about the BSD family of operating systems?
...Well, they natively support the POSIX AIO APIs of aio_read,
aio_write (circa 1997 - POSIX SID, Issue 5), and the synchronization of
these aio contexts under kevent/kqueues using the EVFILT_AIO event
filter (FreeBSD: Apr 16, 2000).
https://pubs.opengroup.org/onlinepubs/9693999499/toc.pdf
https://github.com/freebsd/freebsd-src/commit/3ee12e4fe3c884db74bc236f6f76dfb7539eb0d1
An interesting pattern is starting to emerge. For each IO transaction,
these IO subsystems allow the developer to register an IO event object
(or equivalent) to become signaled upon the IO transaction's completion.
Now let's take a look at this super duper "we've finally figured out how
to do asynchronous IO transactions in Linux" subsystem.
It’s possible to use eventfd(2) to get notified of completion events
on an io_uring instance. If this is desired, an eventfd file
descriptor can be registered through this operation. /arg/ must
contain a pointer to the eventfd file descriptor, and /nr_args/ must
be 1. Available since 5.2.
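For reference, the registration quoted above can be sketched with the raw syscalls, without liburing. The helper name ring_with_eventfd is hypothetical, and the sketch assumes kernel headers recent enough to provide linux/io_uring.h and the __NR_io_uring_* numbers. Note that the eventfd is attached to the ring as a whole, not to any individual submission.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/io_uring.h>
#include <string.h>
#include <sys/eventfd.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original mail): create an io_uring
 * instance and register one eventfd with it. On success returns 0 and
 * stores both descriptors; on failure returns -errno.
 */
int ring_with_eventfd(unsigned entries, int *ring_fd, int *event_fd)
{
    struct io_uring_params p;
    int rfd, efd, ret;

    memset(&p, 0, sizeof(p));
    rfd = (int)syscall(__NR_io_uring_setup, entries, &p);
    if (rfd < 0)
        return -errno;

    efd = eventfd(0, EFD_CLOEXEC);
    if (efd < 0) {
        ret = -errno;
        close(rfd);
        return ret;
    }

    /* arg points at the eventfd descriptor, nr_args is 1 (per the man page). */
    if (syscall(__NR_io_uring_register, rfd, IORING_REGISTER_EVENTFD,
                &efd, 1) < 0) {
        ret = -errno;
        close(efd);
        close(rfd);
        return ret;
    }

    *ring_fd = rfd;
    *event_fd = efd;
    return 0;
}
```

Some sandboxes and container runtimes block io_uring_setup, so a caller should expect -EPERM or -ENOSYS as possible results.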
Wa, wa, waaaa. Fail.
We went from being able to signal file descriptors/HANDLEs per IO
transaction over a decade ago, to now being able to listen to gnats
fart in a single IO batching context. It's my understanding that the
whole purpose of io_uring is to perform IO on a single IO thread or two,
instead of the annoying synchronous read/write syscalls Linux has been
stuck with since the inception of UNIX. The BSD family of operating
systems had no problem adopting kevent support for POSIX AIO, meanwhile
Linux [glibc] only ever had a dumb polyfill solution of "lol let's just
spam a bunch of threads and hope for the best." I digress. The whole
point of this dumb interface should be to allow for batching of what
would otherwise be blocking IO operations on a single thread. It
therefore makes sense to have some way to signal a unique handle that a
transaction is complete for the sake of synchronizing other threads
against the IO submitter thread, should a different thread need to poll
against the result/completion status of the work submitted (AND ONLY THE
TRANSACTION). Sure, you could use userland semaphores signaled by the IO
thread with some added latency, but that's not the point; by far the
easiest way to batch waits of completable objects while yielding the
current task/thread context is to simply look past user-land scheduling
and switching, and to look towards the kernel's native event/IO
synchronization objects. For whatever reason, this functionality does
not exist in io_uring. What's worse, a less efficient, spammier IO signal
trigger exists in its place.
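The userland workaround mentioned above (the IO thread signaling a per-transaction object itself) might look roughly like the sketch below. Everything here is hypothetical: struct request and the request_* helpers are made up for illustration, and the sketch assumes the IO thread stores a pointer to the request in the 64-bit user_data of each submission and gets it back with the completion. It costs one extra write(2) per completion, which is the added latency complained about above.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical per-transaction record; its address travels in user_data. */
struct request {
    int done_fd;   /* eventfd that waiters on this one request block on */
    int32_t res;   /* completion result, valid once done_fd is signaled */
};

/* Prepare a request before submission. Returns 0 or -errno. */
int request_init(struct request *rq)
{
    rq->res = 0;
    rq->done_fd = eventfd(0, EFD_CLOEXEC);
    return rq->done_fd < 0 ? -errno : 0;
}

/*
 * Called by the IO thread for every completion it reaps: recover the
 * request from user_data, record the result, wake that request's waiters.
 */
int request_complete(uint64_t user_data, int32_t res)
{
    struct request *rq = (struct request *)(uintptr_t)user_data;
    uint64_t one = 1;

    rq->res = res;
    if (write(rq->done_fd, &one, sizeof(one)) != (ssize_t)sizeof(one))
        return -errno;
    return 0;
}

/* Called by any other thread: block until this one request is done. */
int32_t request_wait(struct request *rq)
{
    uint64_t count;

    if (read(rq->done_fd, &count, sizeof(count)) != (ssize_t)sizeof(count))
        return -errno;
    return rq->res;
}
```

Because done_fd is an ordinary file descriptor, a waiting thread could also put several of them into poll(2) or epoll(7) to wait on a batch of transactions at once.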
I'm sure there will be "you're doing it wrong" responses from the Linux
community, as always; however, io_uring was supposed to fix the issues of
AIO, not regress the existing functionality of Linux into being less
useful than a two-decade-old Microsoft operating system and a FreeBSD
build from Apr 16, 2000.
* Re: io_uring is a regression over 16 year old aio/io_submit, 2+ decades of Microsoft NT, and *BSD circa 1997-2001
From: Jens Axboe @ 2023-05-04 19:27 UTC
To: Reece, io-uring
On 5/4/23 1:20 PM, Reece wrote a pile of garbage.
If you have constructive criticism, the list is open to discuss it.
Sending trolly garbage anonymously is, however, neither welcome nor a
productive approach.
--
Jens Axboe
Thread overview: 3+ messages
2023-05-04 19:20 io_uring is a regression over 16 year old aio/io_submit, 2+ decades of Microsoft NT, and *BSD circa 1997-2001 Reece
2023-05-04 19:27 ` Jens Axboe
2023-05-04 19:37   ` Reece