On 18.01.2022 19:32, Jens Axboe wrote:
> On 1/18/22 4:36 PM, Jens Axboe wrote:
> > On 1/18/22 1:05 PM, Florian Fischer wrote:
> >>>> After reading the io_uring_enter(2) man page a IORING_OP_ASYNC_CANCEL's return value of -EALREADY apparently
> >>>> may not cause the request to terminate. At least that is our interpretation of "…res field will contain -EALREADY.
> >>>> In this case, the request may or may not terminate."
> >>>
> >>> I took a look at this, and my theory is that the request cancelation
> >>> ends up happening right in between when the work item is moved between
> >>> the work list and to the worker itself. The way the async queue works,
> >>> the work item is sitting in a list until it gets assigned by a worker.
> >>> When that assignment happens, it's removed from the general work list
> >>> and then assigned to the worker itself. There's a small gap there where
> >>> the work cannot be found in the general list, and isn't yet findable in
> >>> the worker itself either.
> >>>
> >>> Do you always see -ENOENT from the cancel when you get the hang
> >>> condition?
> >>
> >> No we also and actually more commonly observe cancel returning
> >> -EALREADY and the canceled read request never gets completed.
> >>
> >> As shown in the log snippet I included below.
> >
> > I think there are a couple of different cases here. Can you try the
> > below patch? It's against current -git.
>
> Cleaned it up and split it into functional bits, end result is here:
>
> https://git.kernel.dk/cgit/linux-block/log/?h=io_uring-5.17

Thanks. I have built and tested your patches.

The most common error we observed (read -> read -> write -> 2x cancel) is no
longer reproducible and our original test case works flawlessly :)
Nor could I reproduce any hangs with cancel returning -ENOENT.

But I can still reliably reproduce stuck threads when the write does not
increment the evfd count, so that the reads are never completed by data
becoming available.
(read -> read -> write (do not increment evfd count) -> 2x cancel)

I further reduced the attached C program to reproduce the above problem.
The code is also available now at our gitlab [1].

The following log output was created with a less 'minimal' version that still
includes the logging functionality:

75 Collect write completion: 8
75 Collect cancel read 1 completion: 0
75 Collect cancel read 2 completion: -114
75 Collect read 1 completion: -125
75 Collect read 2 completion: -4

75 Collect write completion: 8
75 Collect cancel read 1 completion: 0
75 Collect cancel read 2 completion: -114
75 Collect read 1 completion: -125
thread 75 stuck here

The scenario seems extremely artificial, but nonetheless I think it should work
regardless of its usefulness.

Flo Fischer

[1]: https://gitlab.cs.fau.de/aj46ezos/io_uring-cancel/-/tree/minimal-write0
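
P.S. For anyone decoding the log above: assuming the usual Linux errno
numbering (as on x86-64), -114 is -EALREADY, -125 is -ECANCELED and -4 is
-EINTR. So in the stuck iteration the second cancel reports -EALREADY, but the
completion for read 2 never arrives. Below is a minimal sketch of a helper that
labels such res values. The names cqe_res_name() and format_cqe_res() are
hypothetical; they are not part of liburing or of the reproducer at [1].

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper: map a CQE res value to a short label.
 * Uses the symbolic errno constants, so it does not depend on the
 * numeric values of a particular architecture. */
static const char *cqe_res_name(int res)
{
    if (res >= 0)
        return "ok";            /* e.g. byte count of a completed write */

    switch (-res) {
    case EALREADY:  return "-EALREADY";   /* cancel found the request, outcome open */
    case ECANCELED: return "-ECANCELED";  /* request was canceled */
    case EINTR:     return "-EINTR";      /* request was interrupted */
    case ENOENT:    return "-ENOENT";     /* cancel did not find the request */
    default:        return "other error";
    }
}

/* Hypothetical helper: format one log line such as
 * "cancel read 2: -114 (-EALREADY)" into buf. Returns snprintf's result. */
static int format_cqe_res(char *buf, size_t len, const char *what, int res)
{
    return snprintf(buf, len, "%s: %d (%s)", what, res, cqe_res_name(res));
}
```

Calling format_cqe_res(buf, sizeof(buf), "cancel read 2", cqe->res) where a
completion is collected would give log lines that can be read without looking
up errno.h.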