* IORING_OP_FILES_UPDATE fail with ECANCELED when IOSQE_IO_LINK is used @ 2020-07-13 22:55 Daniele Salvatore Albano 2020-07-14 2:20 ` Jens Axboe 0 siblings, 1 reply; 3+ messages in thread From: Daniele Salvatore Albano @ 2020-07-13 22:55 UTC (permalink / raw) To: io-uring Hi everyone, I am trying to use IORING_OP_FILES_UPDATE in combination with IOSQE_IO_LINK and IORING_OP_RECV / IORING_OP_RECV as linked op but I keep getting ECANCELED (errno 125). I am using io_uring (kernel 5.8.0-rc4 built 3 days ago) and liburing (tag 0.7). I went through the test cases and I wasn't able to find any combination of the OP and the flag and I can't find any related docs so I am not sure if the combo isn't allowed. Although I have found https://github.com/torvalds/linux/blob/v5.8-rc5/fs/io_uring.c#L4926 if (sqe->flags || sqe->ioprio || sqe->rw_flags) return -EINVAL; Not sure if this is the reason for which the linked operation is failing, I don't see in the other *_prep sqe->flags being broadly checked in general. I wrote two simple test cases that perform the following sequence of operations: - open a local file (for the two test cases below /proc/cmdline) - IORING_OP_FILES_UPDATE + IOSQE_IO_LINK (only in the first test case) - IORING_OP_READ + IOSQE_FIXED_FILE Here a small test case to trigger the issue I am facing int main() { struct io_uring ring = {0}; uint32_t head, count = 0; struct io_uring_sqe *sqe = NULL; struct io_uring_cqe *cqe = NULL; uint32_t files_map_count = 16; const int *files_map_registered = malloc(sizeof(int) * files_map_count); memset((void*)files_map_registered, 0, sizeof(int) * files_map_count); io_uring_queue_init(16, &ring, 0); io_uring_register_files(&ring, files_map_registered, files_map_count); int fd = open("/proc/cmdline", O_RDONLY); int fd_index = 10; sqe = io_uring_get_sqe(&ring); io_uring_prep_files_update(sqe, &fd, 1, fd_index); io_uring_sqe_set_flags(sqe, IOSQE_IO_LINK); sqe->user_data = 1; char buffer[512] = {0}; sqe = io_uring_get_sqe(&ring); io_uring_prep_read(sqe, fd_index, &buffer, sizeof(buffer), 0); io_uring_sqe_set_flags(sqe, IOSQE_FIXED_FILE); sqe->user_data = 2; io_uring_submit_and_wait(&ring, 2); io_uring_for_each_cqe(&ring, head, cqe) { count++; fprintf(stdout, "count = %d\n", count); fprintf(stdout, "cqe->res = %d\n", cqe->res); fprintf(stdout, "cqe->user_data = %llu\n", cqe->user_data); fprintf(stdout, "cqe->flags = %u\n", cqe->flags); } io_uring_cq_advance(&ring, count); io_uring_unregister_files(&ring); io_uring_queue_exit(&ring); return 0; } It will report for both the cqes res = -125 Instead if the code doesn't link and wait for the read it works as I am expecting. int main() { struct io_uring ring = {0}; uint32_t head, count = 0; char buffer[512] = {0}; struct io_uring_sqe *sqe = NULL; struct io_uring_cqe *cqe = NULL; uint32_t files_map_count = 16; const int *files_map_registered = malloc(sizeof(int) * files_map_count); memset((void*)files_map_registered, 0, sizeof(int) * files_map_count); io_uring_queue_init(16, &ring, 0); io_uring_register_files(&ring, files_map_registered, files_map_count); int fd = open("/proc/cmdline", O_RDONLY); int fd_index = 10; sqe = io_uring_get_sqe(&ring); io_uring_prep_files_update(sqe, &fd, 1, fd_index); io_uring_sqe_set_flags(sqe, 0); sqe->user_data = 1; int exit_loop = 0; do { io_uring_submit_and_wait(&ring, 1); io_uring_for_each_cqe(&ring, head, cqe) { count++; fprintf(stdout, "count = %d\n", count); fprintf(stdout, "cqe->res = %d\n", cqe->res); fprintf(stdout, "cqe->user_data = %llu\n", cqe->user_data); fprintf(stdout, "cqe->flags = %u\n", cqe->flags); if (cqe->user_data == 1) { sqe = io_uring_get_sqe(&ring); io_uring_prep_read(sqe, fd_index, &buffer, sizeof(buffer), 0); io_uring_sqe_set_flags(sqe, IOSQE_FIXED_FILE); sqe->user_data = 2; } else { if (cqe->res >= 0) { fprintf(stdout, "buffer = <"); fwrite(buffer, cqe->res, 1, stdout); fprintf(stdout, ">\n"); } exit_loop = 1; } } io_uring_cq_advance(&ring, count); } while(exit_loop == 0); io_uring_unregister_files(&ring); io_uring_queue_exit(&ring); return 0; } The output here is count = 1 cqe->res = 1 cqe->user_data = 1 cqe->flags = 0 count = 2 cqe->res = 58 cqe->user_data = 2 cqe->flags = 0 buffer = <initrd=....yada yada yada...> Is this the expected behaviour? If no, any hint? What am I doing wrong? If the expected behaviour is that IORING_OP_FILES_UPDATE can't be linked, how can I use it properly to issue a read or another op that requires to work with fds in a chain of operations? Sorry for the silly questions! Thanks! Daniele ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: IORING_OP_FILES_UPDATE fail with ECANCELED when IOSQE_IO_LINK is used 2020-07-13 22:55 IORING_OP_FILES_UPDATE fail with ECANCELED when IOSQE_IO_LINK is used Daniele Salvatore Albano @ 2020-07-14 2:20 ` Jens Axboe [not found] ` <CAKq9yRhAQ_zyE8k5kDYvdh8qAwZZOANFvwaR_T_n66MkJ-=+3w@mail.gmail.com> 0 siblings, 1 reply; 3+ messages in thread From: Jens Axboe @ 2020-07-14 2:20 UTC (permalink / raw) To: Daniele Salvatore Albano, io-uring On 7/13/20 4:55 PM, Daniele Salvatore Albano wrote: > Hi everyone, > > I am trying to use IORING_OP_FILES_UPDATE in combination with > IOSQE_IO_LINK and IORING_OP_RECV / IORING_OP_RECV as linked op but I > keep getting ECANCELED (errno 125). > > I am using io_uring (kernel 5.8.0-rc4 built 3 days ago) and liburing (tag 0.7). > > I went through the test cases and I wasn't able to find any > combination of the OP and the flag and I can't find any related docs > so I am not sure if the combo isn't allowed. > > Although I have found > https://github.com/torvalds/linux/blob/v5.8-rc5/fs/io_uring.c#L4926 > > if (sqe->flags || sqe->ioprio || sqe->rw_flags) > return -EINVAL; > > Not sure if this is the reason for which the linked operation is > failing, I don't see in the other *_prep sqe->flags being broadly > checked in general. > > I wrote two simple test cases that perform the following sequence of operations: > - open a local file (for the two test cases below /proc/cmdline) > - IORING_OP_FILES_UPDATE + IOSQE_IO_LINK (only in the first test case) > - IORING_OP_READ + IOSQE_FIXED_FILE > > Here a small test case to trigger the issue I am facing > > int main() { > struct io_uring ring = {0}; > uint32_t head, count = 0; > struct io_uring_sqe *sqe = NULL; > struct io_uring_cqe *cqe = NULL; > uint32_t files_map_count = 16; > const int *files_map_registered = malloc(sizeof(int) * files_map_count); > memset((void*)files_map_registered, 0, sizeof(int) * files_map_count); > > io_uring_queue_init(16, &ring, 0); > io_uring_register_files(&ring, files_map_registered, files_map_count); > > int fd = open("/proc/cmdline", O_RDONLY); > int fd_index = 10; > > sqe = io_uring_get_sqe(&ring); > io_uring_prep_files_update(sqe, &fd, 1, fd_index); > io_uring_sqe_set_flags(sqe, IOSQE_IO_LINK); > sqe->user_data = 1; > > char buffer[512] = {0}; > sqe = io_uring_get_sqe(&ring); > io_uring_prep_read(sqe, fd_index, &buffer, sizeof(buffer), 0); > io_uring_sqe_set_flags(sqe, IOSQE_FIXED_FILE); > sqe->user_data = 2; > > io_uring_submit_and_wait(&ring, 2); > > io_uring_for_each_cqe(&ring, head, cqe) { > count++; > > fprintf(stdout, "count = %d\n", count); > fprintf(stdout, "cqe->res = %d\n", cqe->res); > fprintf(stdout, "cqe->user_data = %llu\n", cqe->user_data); > fprintf(stdout, "cqe->flags = %u\n", cqe->flags); > } > > io_uring_cq_advance(&ring, count); > > io_uring_unregister_files(&ring); > io_uring_queue_exit(&ring); > > return 0; > } > > It will report for both the cqes res = -125 > > Instead if the code doesn't link and wait for the read it works as I > am expecting. > > int main() { > struct io_uring ring = {0}; > uint32_t head, count = 0; > char buffer[512] = {0}; > struct io_uring_sqe *sqe = NULL; > struct io_uring_cqe *cqe = NULL; > uint32_t files_map_count = 16; > const int *files_map_registered = malloc(sizeof(int) * files_map_count); > memset((void*)files_map_registered, 0, sizeof(int) * files_map_count); > > io_uring_queue_init(16, &ring, 0); > io_uring_register_files(&ring, files_map_registered, files_map_count); > > int fd = open("/proc/cmdline", O_RDONLY); > int fd_index = 10; > > sqe = io_uring_get_sqe(&ring); > io_uring_prep_files_update(sqe, &fd, 1, fd_index); > io_uring_sqe_set_flags(sqe, 0); > sqe->user_data = 1; > > int exit_loop = 0; > do { > io_uring_submit_and_wait(&ring, 1); > > io_uring_for_each_cqe(&ring, head, cqe) { > count++; > > fprintf(stdout, "count = %d\n", count); > fprintf(stdout, "cqe->res = %d\n", cqe->res); > fprintf(stdout, "cqe->user_data = %llu\n", cqe->user_data); > fprintf(stdout, "cqe->flags = %u\n", cqe->flags); > > if (cqe->user_data == 1) { > sqe = io_uring_get_sqe(&ring); > io_uring_prep_read(sqe, fd_index, &buffer, sizeof(buffer), 0); > io_uring_sqe_set_flags(sqe, IOSQE_FIXED_FILE); > sqe->user_data = 2; > } else { > if (cqe->res >= 0) { > fprintf(stdout, "buffer = <"); > fwrite(buffer, cqe->res, 1, stdout); > fprintf(stdout, ">\n"); > } > > exit_loop = 1; > } > } > > io_uring_cq_advance(&ring, count); > } while(exit_loop == 0); > > io_uring_unregister_files(&ring); > io_uring_queue_exit(&ring); > > return 0; > } > > The output here is > count = 1 > cqe->res = 1 > cqe->user_data = 1 > cqe->flags = 0 > count = 2 > cqe->res = 58 > cqe->user_data = 2 > cqe->flags = 0 > buffer = <initrd=....yada yada yada...> > > Is this the expected behaviour? If no, any hint? What am I doing wrong? > > If the expected behaviour is that IORING_OP_FILES_UPDATE can't be > linked, how can I use it properly to issue a read or another op that > requires to work with fds in a chain of operations? I think the sqe->flags should just be removed, as you're alluding to. Care to send a patch for that? Then I can queue it up. We can even mark it stable to ensure it gets back to older kernels. Probably just want to make it something ala: unsigned flags = READ_ONCE(sqe->flags); if (flags & (IOSQE_FIXED_FILE | IOSQE_BUFFER_SELECT)) return -EINVAL; as those flags don't make sense, but the rest do. -- Jens Axboe ^ permalink raw reply [flat|nested] 3+ messages in thread
[parent not found: <CAKq9yRhAQ_zyE8k5kDYvdh8qAwZZOANFvwaR_T_n66MkJ-=+3w@mail.gmail.com>]
* Re: IORING_OP_FILES_UPDATE fail with ECANCELED when IOSQE_IO_LINK is used [not found] ` <CAKq9yRhAQ_zyE8k5kDYvdh8qAwZZOANFvwaR_T_n66MkJ-=+3w@mail.gmail.com> @ 2020-07-14 16:50 ` Daniele Salvatore Albano 0 siblings, 0 replies; 3+ messages in thread From: Daniele Salvatore Albano @ 2020-07-14 16:50 UTC (permalink / raw) To: Jens Axboe; +Cc: io-uring On Tue, 14 Jul 2020 at 08:51, Daniele Salvatore Albano <[email protected]> wrote: > > Sure thing, I will send a patch later on. > > Thanks! > > On Tue, 14 Jul 2020, 03:20 Jens Axboe, <[email protected]> wrote: >> >> On 7/13/20 4:55 PM, Daniele Salvatore Albano wrote: >> > Hi everyone, >> > >> > I am trying to use IORING_OP_FILES_UPDATE in combination with >> > IOSQE_IO_LINK and IORING_OP_RECV / IORING_OP_RECV as linked op but I >> > keep getting ECANCELED (errno 125). >> > >> > I am using io_uring (kernel 5.8.0-rc4 built 3 days ago) and liburing (tag 0.7). >> > >> > I went through the test cases and I wasn't able to find any >> > combination of the OP and the flag and I can't find any related docs >> > so I am not sure if the combo isn't allowed. >> > >> > Although I have found >> > https://github.com/torvalds/linux/blob/v5.8-rc5/fs/io_uring.c#L4926 >> > >> > if (sqe->flags || sqe->ioprio || sqe->rw_flags) >> > return -EINVAL; >> > >> > Not sure if this is the reason for which the linked operation is >> > failing, I don't see in the other *_prep sqe->flags being broadly >> > checked in general. >> > >> > I wrote two simple test cases that perform the following sequence of operations: >> > - open a local file (for the two test cases below /proc/cmdline) >> > - IORING_OP_FILES_UPDATE + IOSQE_IO_LINK (only in the first test case) >> > - IORING_OP_READ + IOSQE_FIXED_FILE >> > >> > Here a small test case to trigger the issue I am facing >> > >> > int main() { >> > struct io_uring ring = {0}; >> > uint32_t head, count = 0; >> > struct io_uring_sqe *sqe = NULL; >> > struct io_uring_cqe *cqe = NULL; >> > uint32_t files_map_count = 16; >> > const int *files_map_registered = malloc(sizeof(int) * files_map_count); >> > memset((void*)files_map_registered, 0, sizeof(int) * files_map_count); >> > >> > io_uring_queue_init(16, &ring, 0); >> > io_uring_register_files(&ring, files_map_registered, files_map_count); >> > >> > int fd = open("/proc/cmdline", O_RDONLY); >> > int fd_index = 10; >> > >> > sqe = io_uring_get_sqe(&ring); >> > io_uring_prep_files_update(sqe, &fd, 1, fd_index); >> > io_uring_sqe_set_flags(sqe, IOSQE_IO_LINK); >> > sqe->user_data = 1; >> > >> > char buffer[512] = {0}; >> > sqe = io_uring_get_sqe(&ring); >> > io_uring_prep_read(sqe, fd_index, &buffer, sizeof(buffer), 0); >> > io_uring_sqe_set_flags(sqe, IOSQE_FIXED_FILE); >> > sqe->user_data = 2; >> > >> > io_uring_submit_and_wait(&ring, 2); >> > >> > io_uring_for_each_cqe(&ring, head, cqe) { >> > count++; >> > >> > fprintf(stdout, "count = %d\n", count); >> > fprintf(stdout, "cqe->res = %d\n", cqe->res); >> > fprintf(stdout, "cqe->user_data = %llu\n", cqe->user_data); >> > fprintf(stdout, "cqe->flags = %u\n", cqe->flags); >> > } >> > >> > io_uring_cq_advance(&ring, count); >> > >> > io_uring_unregister_files(&ring); >> > io_uring_queue_exit(&ring); >> > >> > return 0; >> > } >> > >> > It will report for both the cqes res = -125 >> > >> > Instead if the code doesn't link and wait for the read it works as I >> > am expecting. >> > >> > int main() { >> > struct io_uring ring = {0}; >> > uint32_t head, count = 0; >> > char buffer[512] = {0}; >> > struct io_uring_sqe *sqe = NULL; >> > struct io_uring_cqe *cqe = NULL; >> > uint32_t files_map_count = 16; >> > const int *files_map_registered = malloc(sizeof(int) * files_map_count); >> > memset((void*)files_map_registered, 0, sizeof(int) * files_map_count); >> > >> > io_uring_queue_init(16, &ring, 0); >> > io_uring_register_files(&ring, files_map_registered, files_map_count); >> > >> > int fd = open("/proc/cmdline", O_RDONLY); >> > int fd_index = 10; >> > >> > sqe = io_uring_get_sqe(&ring); >> > io_uring_prep_files_update(sqe, &fd, 1, fd_index); >> > io_uring_sqe_set_flags(sqe, 0); >> > sqe->user_data = 1; >> > >> > int exit_loop = 0; >> > do { >> > io_uring_submit_and_wait(&ring, 1); >> > >> > io_uring_for_each_cqe(&ring, head, cqe) { >> > count++; >> > >> > fprintf(stdout, "count = %d\n", count); >> > fprintf(stdout, "cqe->res = %d\n", cqe->res); >> > fprintf(stdout, "cqe->user_data = %llu\n", cqe->user_data); >> > fprintf(stdout, "cqe->flags = %u\n", cqe->flags); >> > >> > if (cqe->user_data == 1) { >> > sqe = io_uring_get_sqe(&ring); >> > io_uring_prep_read(sqe, fd_index, &buffer, sizeof(buffer), 0); >> > io_uring_sqe_set_flags(sqe, IOSQE_FIXED_FILE); >> > sqe->user_data = 2; >> > } else { >> > if (cqe->res >= 0) { >> > fprintf(stdout, "buffer = <"); >> > fwrite(buffer, cqe->res, 1, stdout); >> > fprintf(stdout, ">\n"); >> > } >> > >> > exit_loop = 1; >> > } >> > } >> > >> > io_uring_cq_advance(&ring, count); >> > } while(exit_loop == 0); >> > >> > io_uring_unregister_files(&ring); >> > io_uring_queue_exit(&ring); >> > >> > return 0; >> > } >> > >> > The output here is >> > count = 1 >> > cqe->res = 1 >> > cqe->user_data = 1 >> > cqe->flags = 0 >> > count = 2 >> > cqe->res = 58 >> > cqe->user_data = 2 >> > cqe->flags = 0 >> > buffer = <initrd=....yada yada yada...> >> > >> > Is this the expected behaviour? If no, any hint? What am I doing wrong? >> > >> > If the expected behaviour is that IORING_OP_FILES_UPDATE can't be >> > linked, how can I use it properly to issue a read or another op that >> > requires to work with fds in a chain of operations? >> >> I think the sqe->flags should just be removed, as you're alluding to. >> Care to send a patch for that? Then I can queue it up. We can even mark >> it stable to ensure it gets back to older kernels. >> >> Probably just want to make it something ala: >> >> unsigned flags = READ_ONCE(sqe->flags); >> >> if (flags & (IOSQE_FIXED_FILE | IOSQE_BUFFER_SELECT)) >> return -EINVAL; >> >> as those flags don't make sense, but the rest do. >> >> -- >> Jens Axboe >> TLDR; Change done but the linked op still need to do not use IOSQE_FIXED_FILE and use the original fd I don't think that this is a problem at all because it still allows the approach "submit & forget" as long as the first subsequent io op is using the original fd and it's not using the registered index. I can imagine people may expect to be able to use IOSQE_FIXED_FILE with the linked op after IORING_OP_FILES_UPDATE but documenting it may be enough also because the additional work may slow down various things. ****** I applied the change but it wasn't still working as I was expecting, after further investigation I discovered that it was trying to read from the stdin because in my test case fd_index is 0. When in sequence the userland code submits - IORING_OP_FILES_UPDATE + IOSQE_IO_LINK - IORING_OP_READ + IOSQE_FIXED_FILE On the kernel side I am seeing (almost other calls): - io_files_update_prep - io_req_set_file - io_read_prep - io_files_update - io_read When io_req_set_file is executed, before io_files_update, the registered files at offset 10 contains 0 that is the stdin. This happens because, to my understanding, io_req_set_file it is invoked by io_init_req in io_submit_sqes that is directly invoked by the io_uring_enter syscal or the sq_thread. For this to work it would have to re-fetch the new file pointer and update the proper structs in req. Having IORING_OP_FILES_UPDATE working with IOSQE_IO_LINK is already great because it's just possible to invoke it and forget, using the normal fd for the operations linked directly and without having really keep tracking of the original fd: most likely if you are submitting the files update you are doing it to submit an io op right after as well so you probably have easy access or may have easy access to the original fd. With a test program I am now able to: - submit IORING_OP_FILES_UPDATE with IOSQE_IO_LINK - submit IORING_OP_READ (without IOSQE_FIXED_FILE) - on completion - submit IORING_OP_READ with IOSQE_FIXED_FILE To be honest, I am not sure it's worth addressing this on the kernel side because although I can easily imagine the expected behaviour is "I set IORING_OP_FILES_UPDATE so I can use IOSQE_FIXED_FILE even in the chain" (or at least it was mine). This would require a series of changes for a very small number of use cases, especially because the FD should be reloaded for all the ops in the chain meanwhile they are processed: - it's necessary to access the original fd so this would have to be tracked down somewhere in io_kiocb and because all the io ops that can use IOSQE_FIXED_FILE would require to potentially reload the fd it would have to be added to a number of places - it's of course possible to re-use something else but I don't have the necessary expertise in io_uring to understand what - all the ops using IOSQE_FIXED_FILE would have to check for IOSQE_FIXED_FILE_RELOADFD and in case reload the fd As consequence: - if anything else can't be reused to store the fd, potentially grow io_kiocb - slow down the io ops when have to reload the fd - having to trigger an additional if because it would make sense adding unlikely, it's and edge case and should be in the slow path At the end, having to reload the fd may slightly impact the performances of all the io ops and at the same time slow down a lot the few operations for which would be required to reload the fd so would just easier (and better) to pass the original fd in the initially chained ops and use the registered fd index after. ****** What do you think? Thanks! Daniele ^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2020-07-14 16:50 UTC | newest] Thread overview: 3+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2020-07-13 22:55 IORING_OP_FILES_UPDATE fail with ECANCELED when IOSQE_IO_LINK is used Daniele Salvatore Albano 2020-07-14 2:20 ` Jens Axboe [not found] ` <CAKq9yRhAQ_zyE8k5kDYvdh8qAwZZOANFvwaR_T_n66MkJ-=+3w@mail.gmail.com> 2020-07-14 16:50 ` Daniele Salvatore Albano
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox