* IORING_OP_FILES_UPDATE fail with ECANCELED when IOSQE_IO_LINK is used
@ 2020-07-13 22:55 Daniele Salvatore Albano
2020-07-14 2:20 ` Jens Axboe
0 siblings, 1 reply; 3+ messages in thread
From: Daniele Salvatore Albano @ 2020-07-13 22:55 UTC (permalink / raw)
To: io-uring
Hi everyone,
I am trying to use IORING_OP_FILES_UPDATE in combination with
IOSQE_IO_LINK and IORING_OP_RECV / IORING_OP_RECV as linked op but I
keep getting ECANCELED (errno 125).
I am using io_uring (kernel 5.8.0-rc4 built 3 days ago) and liburing (tag 0.7).
I went through the test cases and I wasn't able to find any
combination of the OP and the flag and I can't find any related docs
so I am not sure if the combo isn't allowed.
Although I have found
https://github.com/torvalds/linux/blob/v5.8-rc5/fs/io_uring.c#L4926
if (sqe->flags || sqe->ioprio || sqe->rw_flags)
return -EINVAL;
Not sure if this is the reason for which the linked operation is
failing, I don't see in the other *_prep sqe->flags being broadly
checked in general.
I wrote two simple test cases that perform the following sequence of operations:
- open a local file (for the two test cases below /proc/cmdline)
- IORING_OP_FILES_UPDATE + IOSQE_IO_LINK (only in the first test case)
- IORING_OP_READ + IOSQE_FIXED_FILE
Here a small test case to trigger the issue I am facing
int main() {
struct io_uring ring = {0};
uint32_t head, count = 0;
struct io_uring_sqe *sqe = NULL;
struct io_uring_cqe *cqe = NULL;
uint32_t files_map_count = 16;
const int *files_map_registered = malloc(sizeof(int) * files_map_count);
memset((void*)files_map_registered, 0, sizeof(int) * files_map_count);
io_uring_queue_init(16, &ring, 0);
io_uring_register_files(&ring, files_map_registered, files_map_count);
int fd = open("/proc/cmdline", O_RDONLY);
int fd_index = 10;
sqe = io_uring_get_sqe(&ring);
io_uring_prep_files_update(sqe, &fd, 1, fd_index);
io_uring_sqe_set_flags(sqe, IOSQE_IO_LINK);
sqe->user_data = 1;
char buffer[512] = {0};
sqe = io_uring_get_sqe(&ring);
io_uring_prep_read(sqe, fd_index, &buffer, sizeof(buffer), 0);
io_uring_sqe_set_flags(sqe, IOSQE_FIXED_FILE);
sqe->user_data = 2;
io_uring_submit_and_wait(&ring, 2);
io_uring_for_each_cqe(&ring, head, cqe) {
count++;
fprintf(stdout, "count = %d\n", count);
fprintf(stdout, "cqe->res = %d\n", cqe->res);
fprintf(stdout, "cqe->user_data = %llu\n", cqe->user_data);
fprintf(stdout, "cqe->flags = %u\n", cqe->flags);
}
io_uring_cq_advance(&ring, count);
io_uring_unregister_files(&ring);
io_uring_queue_exit(&ring);
return 0;
}
It will report for both the cqes res = -125
Instead if the code doesn't link and wait for the read it works as I
am expecting.
int main() {
struct io_uring ring = {0};
uint32_t head, count = 0;
char buffer[512] = {0};
struct io_uring_sqe *sqe = NULL;
struct io_uring_cqe *cqe = NULL;
uint32_t files_map_count = 16;
const int *files_map_registered = malloc(sizeof(int) * files_map_count);
memset((void*)files_map_registered, 0, sizeof(int) * files_map_count);
io_uring_queue_init(16, &ring, 0);
io_uring_register_files(&ring, files_map_registered, files_map_count);
int fd = open("/proc/cmdline", O_RDONLY);
int fd_index = 10;
sqe = io_uring_get_sqe(&ring);
io_uring_prep_files_update(sqe, &fd, 1, fd_index);
io_uring_sqe_set_flags(sqe, 0);
sqe->user_data = 1;
int exit_loop = 0;
do {
io_uring_submit_and_wait(&ring, 1);
io_uring_for_each_cqe(&ring, head, cqe) {
count++;
fprintf(stdout, "count = %d\n", count);
fprintf(stdout, "cqe->res = %d\n", cqe->res);
fprintf(stdout, "cqe->user_data = %llu\n", cqe->user_data);
fprintf(stdout, "cqe->flags = %u\n", cqe->flags);
if (cqe->user_data == 1) {
sqe = io_uring_get_sqe(&ring);
io_uring_prep_read(sqe, fd_index, &buffer, sizeof(buffer), 0);
io_uring_sqe_set_flags(sqe, IOSQE_FIXED_FILE);
sqe->user_data = 2;
} else {
if (cqe->res >= 0) {
fprintf(stdout, "buffer = <");
fwrite(buffer, cqe->res, 1, stdout);
fprintf(stdout, ">\n");
}
exit_loop = 1;
}
}
io_uring_cq_advance(&ring, count);
} while(exit_loop == 0);
io_uring_unregister_files(&ring);
io_uring_queue_exit(&ring);
return 0;
}
The output here is
count = 1
cqe->res = 1
cqe->user_data = 1
cqe->flags = 0
count = 2
cqe->res = 58
cqe->user_data = 2
cqe->flags = 0
buffer = <initrd=....yada yada yada...>
Is this the expected behaviour? If no, any hint? What am I doing wrong?
If the expected behaviour is that IORING_OP_FILES_UPDATE can't be
linked, how can I use it properly to issue a read or another op that
requires to work with fds in a chain of operations?
Sorry for the silly questions!
Thanks!
Daniele
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: IORING_OP_FILES_UPDATE fail with ECANCELED when IOSQE_IO_LINK is used
2020-07-13 22:55 IORING_OP_FILES_UPDATE fail with ECANCELED when IOSQE_IO_LINK is used Daniele Salvatore Albano
@ 2020-07-14 2:20 ` Jens Axboe
[not found] ` <CAKq9yRhAQ_zyE8k5kDYvdh8qAwZZOANFvwaR_T_n66MkJ-=+3w@mail.gmail.com>
0 siblings, 1 reply; 3+ messages in thread
From: Jens Axboe @ 2020-07-14 2:20 UTC (permalink / raw)
To: Daniele Salvatore Albano, io-uring
On 7/13/20 4:55 PM, Daniele Salvatore Albano wrote:
> Hi everyone,
>
> I am trying to use IORING_OP_FILES_UPDATE in combination with
> IOSQE_IO_LINK and IORING_OP_RECV / IORING_OP_RECV as linked op but I
> keep getting ECANCELED (errno 125).
>
> I am using io_uring (kernel 5.8.0-rc4 built 3 days ago) and liburing (tag 0.7).
>
> I went through the test cases and I wasn't able to find any
> combination of the OP and the flag and I can't find any related docs
> so I am not sure if the combo isn't allowed.
>
> Although I have found
> https://github.com/torvalds/linux/blob/v5.8-rc5/fs/io_uring.c#L4926
>
> if (sqe->flags || sqe->ioprio || sqe->rw_flags)
> return -EINVAL;
>
> Not sure if this is the reason for which the linked operation is
> failing, I don't see in the other *_prep sqe->flags being broadly
> checked in general.
>
> I wrote two simple test cases that perform the following sequence of operations:
> - open a local file (for the two test cases below /proc/cmdline)
> - IORING_OP_FILES_UPDATE + IOSQE_IO_LINK (only in the first test case)
> - IORING_OP_READ + IOSQE_FIXED_FILE
>
> Here a small test case to trigger the issue I am facing
>
> int main() {
> struct io_uring ring = {0};
> uint32_t head, count = 0;
> struct io_uring_sqe *sqe = NULL;
> struct io_uring_cqe *cqe = NULL;
> uint32_t files_map_count = 16;
> const int *files_map_registered = malloc(sizeof(int) * files_map_count);
> memset((void*)files_map_registered, 0, sizeof(int) * files_map_count);
>
> io_uring_queue_init(16, &ring, 0);
> io_uring_register_files(&ring, files_map_registered, files_map_count);
>
> int fd = open("/proc/cmdline", O_RDONLY);
> int fd_index = 10;
>
> sqe = io_uring_get_sqe(&ring);
> io_uring_prep_files_update(sqe, &fd, 1, fd_index);
> io_uring_sqe_set_flags(sqe, IOSQE_IO_LINK);
> sqe->user_data = 1;
>
> char buffer[512] = {0};
> sqe = io_uring_get_sqe(&ring);
> io_uring_prep_read(sqe, fd_index, &buffer, sizeof(buffer), 0);
> io_uring_sqe_set_flags(sqe, IOSQE_FIXED_FILE);
> sqe->user_data = 2;
>
> io_uring_submit_and_wait(&ring, 2);
>
> io_uring_for_each_cqe(&ring, head, cqe) {
> count++;
>
> fprintf(stdout, "count = %d\n", count);
> fprintf(stdout, "cqe->res = %d\n", cqe->res);
> fprintf(stdout, "cqe->user_data = %llu\n", cqe->user_data);
> fprintf(stdout, "cqe->flags = %u\n", cqe->flags);
> }
>
> io_uring_cq_advance(&ring, count);
>
> io_uring_unregister_files(&ring);
> io_uring_queue_exit(&ring);
>
> return 0;
> }
>
> It will report for both the cqes res = -125
>
> Instead if the code doesn't link and wait for the read it works as I
> am expecting.
>
> int main() {
> struct io_uring ring = {0};
> uint32_t head, count = 0;
> char buffer[512] = {0};
> struct io_uring_sqe *sqe = NULL;
> struct io_uring_cqe *cqe = NULL;
> uint32_t files_map_count = 16;
> const int *files_map_registered = malloc(sizeof(int) * files_map_count);
> memset((void*)files_map_registered, 0, sizeof(int) * files_map_count);
>
> io_uring_queue_init(16, &ring, 0);
> io_uring_register_files(&ring, files_map_registered, files_map_count);
>
> int fd = open("/proc/cmdline", O_RDONLY);
> int fd_index = 10;
>
> sqe = io_uring_get_sqe(&ring);
> io_uring_prep_files_update(sqe, &fd, 1, fd_index);
> io_uring_sqe_set_flags(sqe, 0);
> sqe->user_data = 1;
>
> int exit_loop = 0;
> do {
> io_uring_submit_and_wait(&ring, 1);
>
> io_uring_for_each_cqe(&ring, head, cqe) {
> count++;
>
> fprintf(stdout, "count = %d\n", count);
> fprintf(stdout, "cqe->res = %d\n", cqe->res);
> fprintf(stdout, "cqe->user_data = %llu\n", cqe->user_data);
> fprintf(stdout, "cqe->flags = %u\n", cqe->flags);
>
> if (cqe->user_data == 1) {
> sqe = io_uring_get_sqe(&ring);
> io_uring_prep_read(sqe, fd_index, &buffer, sizeof(buffer), 0);
> io_uring_sqe_set_flags(sqe, IOSQE_FIXED_FILE);
> sqe->user_data = 2;
> } else {
> if (cqe->res >= 0) {
> fprintf(stdout, "buffer = <");
> fwrite(buffer, cqe->res, 1, stdout);
> fprintf(stdout, ">\n");
> }
>
> exit_loop = 1;
> }
> }
>
> io_uring_cq_advance(&ring, count);
> } while(exit_loop == 0);
>
> io_uring_unregister_files(&ring);
> io_uring_queue_exit(&ring);
>
> return 0;
> }
>
> The output here is
> count = 1
> cqe->res = 1
> cqe->user_data = 1
> cqe->flags = 0
> count = 2
> cqe->res = 58
> cqe->user_data = 2
> cqe->flags = 0
> buffer = <initrd=....yada yada yada...>
>
> Is this the expected behaviour? If no, any hint? What am I doing wrong?
>
> If the expected behaviour is that IORING_OP_FILES_UPDATE can't be
> linked, how can I use it properly to issue a read or another op that
> requires to work with fds in a chain of operations?
I think the sqe->flags should just be removed, as you're alluding to.
Care to send a patch for that? Then I can queue it up. We can even mark
it stable to ensure it gets back to older kernels.
Probably just want to make it something ala:
unsigned flags = READ_ONCE(sqe->flags);
if (flags & (IOSQE_FIXED_FILE | IOSQE_BUFFER_SELECT))
return -EINVAL;
as those flags don't make sense, but the rest do.
--
Jens Axboe
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: IORING_OP_FILES_UPDATE fail with ECANCELED when IOSQE_IO_LINK is used
[not found] ` <CAKq9yRhAQ_zyE8k5kDYvdh8qAwZZOANFvwaR_T_n66MkJ-=+3w@mail.gmail.com>
@ 2020-07-14 16:50 ` Daniele Salvatore Albano
0 siblings, 0 replies; 3+ messages in thread
From: Daniele Salvatore Albano @ 2020-07-14 16:50 UTC (permalink / raw)
To: Jens Axboe; +Cc: io-uring
On Tue, 14 Jul 2020 at 08:51, Daniele Salvatore Albano
<[email protected]> wrote:
>
> Sure thing, I will send a patch later on.
>
> Thanks!
>
> On Tue, 14 Jul 2020, 03:20 Jens Axboe, <[email protected]> wrote:
>>
>> On 7/13/20 4:55 PM, Daniele Salvatore Albano wrote:
>> > Hi everyone,
>> >
>> > I am trying to use IORING_OP_FILES_UPDATE in combination with
>> > IOSQE_IO_LINK and IORING_OP_RECV / IORING_OP_RECV as linked op but I
>> > keep getting ECANCELED (errno 125).
>> >
>> > I am using io_uring (kernel 5.8.0-rc4 built 3 days ago) and liburing (tag 0.7).
>> >
>> > I went through the test cases and I wasn't able to find any
>> > combination of the OP and the flag and I can't find any related docs
>> > so I am not sure if the combo isn't allowed.
>> >
>> > Although I have found
>> > https://github.com/torvalds/linux/blob/v5.8-rc5/fs/io_uring.c#L4926
>> >
>> > if (sqe->flags || sqe->ioprio || sqe->rw_flags)
>> > return -EINVAL;
>> >
>> > Not sure if this is the reason for which the linked operation is
>> > failing, I don't see in the other *_prep sqe->flags being broadly
>> > checked in general.
>> >
>> > I wrote two simple test cases that perform the following sequence of operations:
>> > - open a local file (for the two test cases below /proc/cmdline)
>> > - IORING_OP_FILES_UPDATE + IOSQE_IO_LINK (only in the first test case)
>> > - IORING_OP_READ + IOSQE_FIXED_FILE
>> >
>> > Here a small test case to trigger the issue I am facing
>> >
>> > int main() {
>> > struct io_uring ring = {0};
>> > uint32_t head, count = 0;
>> > struct io_uring_sqe *sqe = NULL;
>> > struct io_uring_cqe *cqe = NULL;
>> > uint32_t files_map_count = 16;
>> > const int *files_map_registered = malloc(sizeof(int) * files_map_count);
>> > memset((void*)files_map_registered, 0, sizeof(int) * files_map_count);
>> >
>> > io_uring_queue_init(16, &ring, 0);
>> > io_uring_register_files(&ring, files_map_registered, files_map_count);
>> >
>> > int fd = open("/proc/cmdline", O_RDONLY);
>> > int fd_index = 10;
>> >
>> > sqe = io_uring_get_sqe(&ring);
>> > io_uring_prep_files_update(sqe, &fd, 1, fd_index);
>> > io_uring_sqe_set_flags(sqe, IOSQE_IO_LINK);
>> > sqe->user_data = 1;
>> >
>> > char buffer[512] = {0};
>> > sqe = io_uring_get_sqe(&ring);
>> > io_uring_prep_read(sqe, fd_index, &buffer, sizeof(buffer), 0);
>> > io_uring_sqe_set_flags(sqe, IOSQE_FIXED_FILE);
>> > sqe->user_data = 2;
>> >
>> > io_uring_submit_and_wait(&ring, 2);
>> >
>> > io_uring_for_each_cqe(&ring, head, cqe) {
>> > count++;
>> >
>> > fprintf(stdout, "count = %d\n", count);
>> > fprintf(stdout, "cqe->res = %d\n", cqe->res);
>> > fprintf(stdout, "cqe->user_data = %llu\n", cqe->user_data);
>> > fprintf(stdout, "cqe->flags = %u\n", cqe->flags);
>> > }
>> >
>> > io_uring_cq_advance(&ring, count);
>> >
>> > io_uring_unregister_files(&ring);
>> > io_uring_queue_exit(&ring);
>> >
>> > return 0;
>> > }
>> >
>> > It will report for both the cqes res = -125
>> >
>> > Instead if the code doesn't link and wait for the read it works as I
>> > am expecting.
>> >
>> > int main() {
>> > struct io_uring ring = {0};
>> > uint32_t head, count = 0;
>> > char buffer[512] = {0};
>> > struct io_uring_sqe *sqe = NULL;
>> > struct io_uring_cqe *cqe = NULL;
>> > uint32_t files_map_count = 16;
>> > const int *files_map_registered = malloc(sizeof(int) * files_map_count);
>> > memset((void*)files_map_registered, 0, sizeof(int) * files_map_count);
>> >
>> > io_uring_queue_init(16, &ring, 0);
>> > io_uring_register_files(&ring, files_map_registered, files_map_count);
>> >
>> > int fd = open("/proc/cmdline", O_RDONLY);
>> > int fd_index = 10;
>> >
>> > sqe = io_uring_get_sqe(&ring);
>> > io_uring_prep_files_update(sqe, &fd, 1, fd_index);
>> > io_uring_sqe_set_flags(sqe, 0);
>> > sqe->user_data = 1;
>> >
>> > int exit_loop = 0;
>> > do {
>> > io_uring_submit_and_wait(&ring, 1);
>> >
>> > io_uring_for_each_cqe(&ring, head, cqe) {
>> > count++;
>> >
>> > fprintf(stdout, "count = %d\n", count);
>> > fprintf(stdout, "cqe->res = %d\n", cqe->res);
>> > fprintf(stdout, "cqe->user_data = %llu\n", cqe->user_data);
>> > fprintf(stdout, "cqe->flags = %u\n", cqe->flags);
>> >
>> > if (cqe->user_data == 1) {
>> > sqe = io_uring_get_sqe(&ring);
>> > io_uring_prep_read(sqe, fd_index, &buffer, sizeof(buffer), 0);
>> > io_uring_sqe_set_flags(sqe, IOSQE_FIXED_FILE);
>> > sqe->user_data = 2;
>> > } else {
>> > if (cqe->res >= 0) {
>> > fprintf(stdout, "buffer = <");
>> > fwrite(buffer, cqe->res, 1, stdout);
>> > fprintf(stdout, ">\n");
>> > }
>> >
>> > exit_loop = 1;
>> > }
>> > }
>> >
>> > io_uring_cq_advance(&ring, count);
>> > } while(exit_loop == 0);
>> >
>> > io_uring_unregister_files(&ring);
>> > io_uring_queue_exit(&ring);
>> >
>> > return 0;
>> > }
>> >
>> > The output here is
>> > count = 1
>> > cqe->res = 1
>> > cqe->user_data = 1
>> > cqe->flags = 0
>> > count = 2
>> > cqe->res = 58
>> > cqe->user_data = 2
>> > cqe->flags = 0
>> > buffer = <initrd=....yada yada yada...>
>> >
>> > Is this the expected behaviour? If no, any hint? What am I doing wrong?
>> >
>> > If the expected behaviour is that IORING_OP_FILES_UPDATE can't be
>> > linked, how can I use it properly to issue a read or another op that
>> > requires to work with fds in a chain of operations?
>>
>> I think the sqe->flags should just be removed, as you're alluding to.
>> Care to send a patch for that? Then I can queue it up. We can even mark
>> it stable to ensure it gets back to older kernels.
>>
>> Probably just want to make it something ala:
>>
>> unsigned flags = READ_ONCE(sqe->flags);
>>
>> if (flags & (IOSQE_FIXED_FILE | IOSQE_BUFFER_SELECT))
>> return -EINVAL;
>>
>> as those flags don't make sense, but the rest do.
>>
>> --
>> Jens Axboe
>>
TLDR;
Change done but the linked op still need to do not use
IOSQE_FIXED_FILE and use the original fd
I don't think that this is a problem at all because it still allows
the approach "submit & forget" as long as the first subsequent io op
is using the original fd and it's not using the registered index.
I can imagine people may expect to be able to use IOSQE_FIXED_FILE
with the linked op after IORING_OP_FILES_UPDATE but documenting it may
be enough also because the additional work may slow down various
things.
******
I applied the change but it wasn't still working as I was expecting,
after further investigation I discovered that it was trying to read
from the stdin because in my test case fd_index is 0.
When in sequence the userland code submits
- IORING_OP_FILES_UPDATE + IOSQE_IO_LINK
- IORING_OP_READ + IOSQE_FIXED_FILE
On the kernel side I am seeing (almost other calls):
- io_files_update_prep
- io_req_set_file
- io_read_prep
- io_files_update
- io_read
When io_req_set_file is executed, before io_files_update, the
registered files at offset 10 contains 0 that is the stdin.
This happens because, to my understanding, io_req_set_file it is
invoked by io_init_req in io_submit_sqes that is directly invoked by
the io_uring_enter syscal or the sq_thread.
For this to work it would have to re-fetch the new file pointer and
update the proper structs in req.
Having IORING_OP_FILES_UPDATE working with IOSQE_IO_LINK is already
great because it's just possible to invoke it and forget, using the
normal fd for the operations linked directly and without having really
keep tracking of the original fd: most likely if you are submitting
the files update you are doing it to submit an io op right after as
well so you probably have easy access or may have easy access to the
original fd.
With a test program I am now able to:
- submit IORING_OP_FILES_UPDATE with IOSQE_IO_LINK
- submit IORING_OP_READ (without IOSQE_FIXED_FILE)
- on completion
- submit IORING_OP_READ with IOSQE_FIXED_FILE
To be honest, I am not sure it's worth addressing this on the kernel
side because although I can easily imagine the expected behaviour is
"I set IORING_OP_FILES_UPDATE so I can use IOSQE_FIXED_FILE even in
the chain" (or at least it was mine).
This would require a series of changes for a very small number of use
cases, especially because the FD should be reloaded for all the ops in
the chain meanwhile they are processed:
- it's necessary to access the original fd so this would have to be
tracked down somewhere in io_kiocb and because all the io ops that can
use IOSQE_FIXED_FILE would require to potentially reload the fd it
would have to be added to a number of places
- it's of course possible to re-use something else but I don't have
the necessary expertise in io_uring to understand what
- all the ops using IOSQE_FIXED_FILE would have to check for
IOSQE_FIXED_FILE_RELOADFD and in case reload the fd
As consequence:
- if anything else can't be reused to store the fd, potentially grow io_kiocb
- slow down the io ops when have to reload the fd
- having to trigger an additional if because it would make sense
adding unlikely, it's and edge case and should be in the slow path
At the end, having to reload the fd may slightly impact the
performances of all the io ops and at the same time slow down a lot
the few operations for which would be required to reload the fd so
would just easier (and better) to pass the original fd in the
initially chained ops and use the registered fd index after.
******
What do you think?
Thanks!
Daniele
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2020-07-14 16:50 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-07-13 22:55 IORING_OP_FILES_UPDATE fail with ECANCELED when IOSQE_IO_LINK is used Daniele Salvatore Albano
2020-07-14 2:20 ` Jens Axboe
[not found] ` <CAKq9yRhAQ_zyE8k5kDYvdh8qAwZZOANFvwaR_T_n66MkJ-=+3w@mail.gmail.com>
2020-07-14 16:50 ` Daniele Salvatore Albano
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox