public inbox for [email protected]
 help / color / mirror / Atom feed
From: Pavel Begunkov <[email protected]>
To: Jens Axboe <[email protected]>, Josef <[email protected]>
Cc: Norman Maurer <[email protected]>,
	Dmitry Kadashev <[email protected]>,
	io-uring <[email protected]>
Subject: Re: "Cannot allocate memory" on ring creation (not RLIMIT_MEMLOCK)
Date: Sun, 20 Dec 2020 01:57:28 +0000	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>

On 20/12/2020 00:25, Jens Axboe wrote:
> On 12/19/20 4:42 PM, Pavel Begunkov wrote:
>> On 19/12/2020 23:13, Jens Axboe wrote:
>>> On 12/19/20 2:54 PM, Jens Axboe wrote:
>>>> On 12/19/20 1:51 PM, Josef wrote:
>>>>>> And even more so, it's IOSQE_ASYNC on the IORING_OP_READ on an eventfd
>>>>>> file descriptor. You probably don't want/mean to do that as it's
>>>>>> pollable, I guess it's done because you just set it on all reads for the
>>>>>> test?
>>>>>
>>>>> yes exactly, eventfd fd is blocking, so it actually makes no sense to
>>>>> use IOSQE_ASYNC
>>>>
>>>> Right, and it's pollable too.
>>>>
>>>>> I just tested eventfd without the IOSQE_ASYNC flag, it seems to work
>>>>> in my tests, thanks a lot :)
>>>>>
>>>>>> In any case, it should of course work. This is the leftover trace when
>>>>>> we should be exiting, but an io-wq worker is still trying to get data
>>>>>> from the eventfd:
>>>>>
>>>>> interesting, btw what kind of tool do you use for kernel debugging?
>>>>
>>>> Just poking at it and thinking about it, no hidden magic I'm afraid...
>>>
>>> Josef, can you try with this added? Looks bigger than it is, most of it
>>> is just moving one function below another.
>>
>> Hmm, which kernel revision are you poking? Seems it doesn't match
>> io_uring-5.10, and for 5.11 io_uring_cancel_files() is never called with
>> NULL files.
>>
>> if (!files)
>> 	__io_uring_cancel_task_requests(ctx, task);
>> else
>> 	io_uring_cancel_files(ctx, task, files);
> 
> Yeah, I think I messed up. If files == NULL, then the task is going away.
> So we should cancel all requests that match 'task', not just ones that
> match task && files.
> 
> Not sure I have much more time to look into this before next week, but
> something like that.
> 
> The problem case is the async worker being queued, long before the task
> is killed and the contexts go away. But from exit_files(), we're only
> concerned with canceling if we have inflight. Doesn't look right to me.

Josef, can you test the patch below instead? Following Jens' idea it
cancels more aggressively when a task is killed or exits. It's based
on [1] but would probably apply fine to for-next.

[1] git://git.kernel.dk/linux-block
branch io_uring-5.11, commit dd20166236953c8cd14f4c668bf972af32f0c6be


diff --git a/fs/io_uring.c b/fs/io_uring.c
index f3690dfdd564..3a98e6dd71c0 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -8919,8 +8919,6 @@ void __io_uring_files_cancel(struct files_struct *files)
 		struct io_ring_ctx *ctx = file->private_data;
 
 		io_uring_cancel_task_requests(ctx, files);
-		if (files)
-			io_uring_del_task_file(file);
 	}
 
 	atomic_dec(&tctx->in_idle);
@@ -8960,6 +8958,8 @@ static s64 tctx_inflight(struct io_uring_task *tctx)
 void __io_uring_task_cancel(void)
 {
 	struct io_uring_task *tctx = current->io_uring;
+	struct file *file;
+	unsigned long index;
 	DEFINE_WAIT(wait);
 	s64 inflight;
 
@@ -8986,6 +8986,9 @@ void __io_uring_task_cancel(void)
 
 	finish_wait(&tctx->wait, &wait);
 	atomic_dec(&tctx->in_idle);
+
+	xa_for_each(&tctx->xa, index, file)
+		io_uring_del_task_file(file);
 }
 
 static int io_uring_flush(struct file *file, void *data)
diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h
index 35b2d845704d..54925c74aa88 100644
--- a/include/linux/io_uring.h
+++ b/include/linux/io_uring.h
@@ -48,7 +48,7 @@ static inline void io_uring_task_cancel(void)
 static inline void io_uring_files_cancel(struct files_struct *files)
 {
 	if (current->io_uring && !xa_empty(&current->io_uring->xa))
-		__io_uring_files_cancel(files);
+		__io_uring_task_cancel();
 }
 static inline void io_uring_free(struct task_struct *tsk)
 {

-- 
Pavel Begunkov

  parent reply	other threads:[~2020-12-20  2:01 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-17  8:19 "Cannot allocate memory" on ring creation (not RLIMIT_MEMLOCK) Dmitry Kadashev
2020-12-17  8:26 ` Norman Maurer
2020-12-17  8:36   ` Dmitry Kadashev
2020-12-17  8:40     ` Dmitry Kadashev
2020-12-17 10:38       ` Josef
2020-12-17 11:10         ` Dmitry Kadashev
2020-12-17 13:43           ` Victor Stewart
2020-12-18  9:20             ` Dmitry Kadashev
2020-12-18 17:22               ` Jens Axboe
2020-12-18 15:26 ` Jens Axboe
2020-12-18 17:21   ` Josef
2020-12-18 17:23     ` Jens Axboe
2020-12-19  2:49       ` Josef
2020-12-19 16:13         ` Jens Axboe
2020-12-19 16:29           ` Jens Axboe
2020-12-19 17:11             ` Jens Axboe
2020-12-19 17:34               ` Norman Maurer
2020-12-19 17:38                 ` Jens Axboe
2020-12-19 20:51                   ` Josef
2020-12-19 21:54                     ` Jens Axboe
2020-12-19 23:13                       ` Jens Axboe
2020-12-19 23:42                         ` Josef
2020-12-19 23:42                         ` Pavel Begunkov
2020-12-20  0:25                           ` Jens Axboe
2020-12-20  0:55                             ` Pavel Begunkov
2020-12-21 10:35                               ` Dmitry Kadashev
2020-12-21 10:49                                 ` Dmitry Kadashev
2020-12-21 11:00                                 ` Dmitry Kadashev
2020-12-21 15:36                                   ` Pavel Begunkov
2020-12-22  3:35                                   ` Pavel Begunkov
2020-12-22  4:07                                     ` Pavel Begunkov
2020-12-22 11:04                                       ` Dmitry Kadashev
2020-12-22 11:06                                         ` Dmitry Kadashev
2020-12-22 13:13                                           ` Dmitry Kadashev
2020-12-22 16:33                                         ` Pavel Begunkov
2020-12-23  8:39                                           ` Dmitry Kadashev
2020-12-23  9:38                                             ` Dmitry Kadashev
2020-12-23 11:48                                               ` Dmitry Kadashev
2020-12-23 12:27                                                 ` Pavel Begunkov
2020-12-20  1:57                             ` Pavel Begunkov [this message]
2020-12-20  7:13                               ` Josef
2020-12-20 13:00                                 ` Pavel Begunkov
2020-12-20 14:19                                   ` Pavel Begunkov
2020-12-20 15:56                                     ` Josef
2020-12-20 15:58                                       ` Pavel Begunkov
2020-12-20 16:14                                   ` Jens Axboe
2020-12-20 16:59                                     ` Josef
2020-12-20 18:23                                       ` Josef
2020-12-20 18:41                                         ` Pavel Begunkov
2020-12-21  8:22                                           ` Josef
2020-12-21 15:30                                             ` Pavel Begunkov
2020-12-21 10:31               ` Dmitry Kadashev

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox