public inbox for [email protected]
 help / color / mirror / Atom feed
* [PATCH AUTOSEL 5.11 24/44] io_uring: fix ->flags races by linked timeouts
       [not found] <[email protected]>
@ 2021-03-25 11:24 ` Sasha Levin
  2021-03-25 11:24 ` [PATCH AUTOSEL 5.11 25/44] io_uring: halt SQO submission on ctx exit Sasha Levin
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Sasha Levin @ 2021-03-25 11:24 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Pavel Begunkov, Jens Axboe, Sasha Levin, io-uring, linux-fsdevel

From: Pavel Begunkov <[email protected]>

[ Upstream commit efe814a471e0e58f28f1efaf430c8784a4f36626 ]

It's racy to modify req->flags from a not owning context, e.g. linked
timeout calling req_set_fail_links() for the master request might race
with that request setting/clearing flags while being executed
concurrently. Just remove req_set_fail_links(prev) from
io_link_timeout_fn(), io_async_find_and_cancel() and functions down the
line take care of setting the fail bit.

Signed-off-by: Pavel Begunkov <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
 fs/io_uring.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 262fd4cfd3ad..b82fe753a6d0 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -6493,7 +6493,6 @@ static enum hrtimer_restart io_link_timeout_fn(struct hrtimer *timer)
 	spin_unlock_irqrestore(&ctx->completion_lock, flags);
 
 	if (prev) {
-		req_set_fail_links(prev);
 		io_async_find_and_cancel(ctx, req, prev->user_data, -ETIME);
 		io_put_req_deferred(prev, 1);
 	} else {
-- 
2.30.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH AUTOSEL 5.11 25/44] io_uring: halt SQO submission on ctx exit
       [not found] <[email protected]>
  2021-03-25 11:24 ` [PATCH AUTOSEL 5.11 24/44] io_uring: fix ->flags races by linked timeouts Sasha Levin
@ 2021-03-25 11:24 ` Sasha Levin
  2021-03-25 11:24 ` [PATCH AUTOSEL 5.11 37/44] io_uring: imply MSG_NOSIGNAL for send[msg]()/recv[msg]() calls Sasha Levin
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Sasha Levin @ 2021-03-25 11:24 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Pavel Begunkov, Jens Axboe, Sasha Levin, linux-fsdevel, io-uring

From: Pavel Begunkov <[email protected]>

[ Upstream commit f6d54255f4235448d4bbe442362d4caa62da97d5 ]

io_sq_thread_finish() is called in io_ring_ctx_free(), so SQPOLL task is
potentially running submitting new requests. It's not a disaster because
of using a "try" variant of percpu_ref_get, but is far from nice.

Remove ctx from the sqd ctx list earlier, before cancellation loop, so
SQPOLL can't find it and so won't submit new requests.

Signed-off-by: Pavel Begunkov <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
 fs/io_uring.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index b82fe753a6d0..8e45331d1fed 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -8797,6 +8797,14 @@ static void io_ring_ctx_wait_and_kill(struct io_ring_ctx *ctx)
 		__io_cqring_overflow_flush(ctx, true, NULL, NULL);
 	mutex_unlock(&ctx->uring_lock);
 
+	/* prevent SQPOLL from submitting new requests */
+	if (ctx->sq_data) {
+		io_sq_thread_park(ctx->sq_data);
+		list_del_init(&ctx->sqd_list);
+		io_sqd_update_thread_idle(ctx->sq_data);
+		io_sq_thread_unpark(ctx->sq_data);
+	}
+
 	io_kill_timeouts(ctx, NULL, NULL);
 	io_poll_remove_all(ctx, NULL, NULL);
 
-- 
2.30.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH AUTOSEL 5.11 37/44] io_uring: imply MSG_NOSIGNAL for send[msg]()/recv[msg]() calls
       [not found] <[email protected]>
  2021-03-25 11:24 ` [PATCH AUTOSEL 5.11 24/44] io_uring: fix ->flags races by linked timeouts Sasha Levin
  2021-03-25 11:24 ` [PATCH AUTOSEL 5.11 25/44] io_uring: halt SQO submission on ctx exit Sasha Levin
@ 2021-03-25 11:24 ` Sasha Levin
  2021-03-25 11:24 ` [PATCH AUTOSEL 5.11 44/44] io_uring: call req_set_fail_links() on short send[msg]()/recv[msg]() with MSG_WAITALL Sasha Levin
       [not found] ` <[email protected]>
  4 siblings, 0 replies; 10+ messages in thread
From: Sasha Levin @ 2021-03-25 11:24 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Stefan Metzmacher, Jens Axboe, Sasha Levin, io-uring,
	linux-fsdevel

From: Stefan Metzmacher <[email protected]>

[ Upstream commit 76cd979f4f38a27df22efb5773a0d567181a9392 ]

We never want to generate any SIGPIPE, -EPIPE only is much better.

Signed-off-by: Stefan Metzmacher <[email protected]>
Link: https://lore.kernel.org/r/38961085c3ec49fd21550c7788f214d1ff02d2d4.1615908477.git.metze@samba.org
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
 fs/io_uring.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 8e45331d1fed..4ac24e75ae63 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -4645,7 +4645,7 @@ static int io_sendmsg(struct io_kiocb *req, bool force_nonblock,
 		kmsg = &iomsg;
 	}
 
-	flags = req->sr_msg.msg_flags;
+	flags = req->sr_msg.msg_flags | MSG_NOSIGNAL;
 	if (flags & MSG_DONTWAIT)
 		req->flags |= REQ_F_NOWAIT;
 	else if (force_nonblock)
@@ -4689,7 +4689,7 @@ static int io_send(struct io_kiocb *req, bool force_nonblock,
 	msg.msg_controllen = 0;
 	msg.msg_namelen = 0;
 
-	flags = req->sr_msg.msg_flags;
+	flags = req->sr_msg.msg_flags | MSG_NOSIGNAL;
 	if (flags & MSG_DONTWAIT)
 		req->flags |= REQ_F_NOWAIT;
 	else if (force_nonblock)
@@ -4883,7 +4883,7 @@ static int io_recvmsg(struct io_kiocb *req, bool force_nonblock,
 				1, req->sr_msg.len);
 	}
 
-	flags = req->sr_msg.msg_flags;
+	flags = req->sr_msg.msg_flags | MSG_NOSIGNAL;
 	if (flags & MSG_DONTWAIT)
 		req->flags |= REQ_F_NOWAIT;
 	else if (force_nonblock)
@@ -4941,7 +4941,7 @@ static int io_recv(struct io_kiocb *req, bool force_nonblock,
 	msg.msg_iocb = NULL;
 	msg.msg_flags = 0;
 
-	flags = req->sr_msg.msg_flags;
+	flags = req->sr_msg.msg_flags | MSG_NOSIGNAL;
 	if (flags & MSG_DONTWAIT)
 		req->flags |= REQ_F_NOWAIT;
 	else if (force_nonblock)
-- 
2.30.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH AUTOSEL 5.11 44/44] io_uring: call req_set_fail_links() on short send[msg]()/recv[msg]() with MSG_WAITALL
       [not found] <[email protected]>
                   ` (2 preceding siblings ...)
  2021-03-25 11:24 ` [PATCH AUTOSEL 5.11 37/44] io_uring: imply MSG_NOSIGNAL for send[msg]()/recv[msg]() calls Sasha Levin
@ 2021-03-25 11:24 ` Sasha Levin
       [not found] ` <[email protected]>
  4 siblings, 0 replies; 10+ messages in thread
From: Sasha Levin @ 2021-03-25 11:24 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Stefan Metzmacher, netdev, Jens Axboe, Sasha Levin, io-uring,
	linux-fsdevel

From: Stefan Metzmacher <[email protected]>

[ Upstream commit 0031275d119efe16711cd93519b595e6f9b4b330 ]

Without that it's not safe to use them in a linked combination with
others.

Now combinations like IORING_OP_SENDMSG followed by IORING_OP_SPLICE
should be possible.

We already handle short reads and writes for the following opcodes:

- IORING_OP_READV
- IORING_OP_READ_FIXED
- IORING_OP_READ
- IORING_OP_WRITEV
- IORING_OP_WRITE_FIXED
- IORING_OP_WRITE
- IORING_OP_SPLICE
- IORING_OP_TEE

Now we have it for these as well:

- IORING_OP_SENDMSG
- IORING_OP_SEND
- IORING_OP_RECVMSG
- IORING_OP_RECV

For IORING_OP_RECVMSG we also check for the MSG_TRUNC and MSG_CTRUNC
flags in order to call req_set_fail_links().

There might be applications arround depending on the behavior
that even short send[msg]()/recv[msg]() retuns continue an
IOSQE_IO_LINK chain.

It's very unlikely that such applications pass in MSG_WAITALL,
which is only defined in 'man 2 recvmsg', but not in 'man 2 sendmsg'.

It's expected that the low level sock_sendmsg() call just ignores
MSG_WAITALL, as MSG_ZEROCOPY is also ignored without explicitly set
SO_ZEROCOPY.

We also expect the caller to know about the implicit truncation to
MAX_RW_COUNT, which we don't detect.

cc: [email protected]
Link: https://lore.kernel.org/r/c4e1a4cc0d905314f4d5dc567e65a7b09621aab3.1615908477.git.metze@samba.org
Signed-off-by: Stefan Metzmacher <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
 fs/io_uring.c | 24 ++++++++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 4ac24e75ae63..c06237c87527 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -4625,6 +4625,7 @@ static int io_sendmsg(struct io_kiocb *req, bool force_nonblock,
 	struct io_async_msghdr iomsg, *kmsg;
 	struct socket *sock;
 	unsigned flags;
+	int min_ret = 0;
 	int ret;
 
 	sock = sock_from_file(req->file);
@@ -4651,6 +4652,9 @@ static int io_sendmsg(struct io_kiocb *req, bool force_nonblock,
 	else if (force_nonblock)
 		flags |= MSG_DONTWAIT;
 
+	if (flags & MSG_WAITALL)
+		min_ret = iov_iter_count(&kmsg->msg.msg_iter);
+
 	ret = __sys_sendmsg_sock(sock, &kmsg->msg, flags);
 	if (force_nonblock && ret == -EAGAIN)
 		return io_setup_async_msg(req, kmsg);
@@ -4660,7 +4664,7 @@ static int io_sendmsg(struct io_kiocb *req, bool force_nonblock,
 	if (kmsg->iov != kmsg->fast_iov)
 		kfree(kmsg->iov);
 	req->flags &= ~REQ_F_NEED_CLEANUP;
-	if (ret < 0)
+	if (ret < min_ret)
 		req_set_fail_links(req);
 	__io_req_complete(req, ret, 0, cs);
 	return 0;
@@ -4674,6 +4678,7 @@ static int io_send(struct io_kiocb *req, bool force_nonblock,
 	struct iovec iov;
 	struct socket *sock;
 	unsigned flags;
+	int min_ret = 0;
 	int ret;
 
 	sock = sock_from_file(req->file);
@@ -4695,6 +4700,9 @@ static int io_send(struct io_kiocb *req, bool force_nonblock,
 	else if (force_nonblock)
 		flags |= MSG_DONTWAIT;
 
+	if (flags & MSG_WAITALL)
+		min_ret = iov_iter_count(&msg.msg_iter);
+
 	msg.msg_flags = flags;
 	ret = sock_sendmsg(sock, &msg);
 	if (force_nonblock && ret == -EAGAIN)
@@ -4702,7 +4710,7 @@ static int io_send(struct io_kiocb *req, bool force_nonblock,
 	if (ret == -ERESTARTSYS)
 		ret = -EINTR;
 
-	if (ret < 0)
+	if (ret < min_ret)
 		req_set_fail_links(req);
 	__io_req_complete(req, ret, 0, cs);
 	return 0;
@@ -4854,6 +4862,7 @@ static int io_recvmsg(struct io_kiocb *req, bool force_nonblock,
 	struct socket *sock;
 	struct io_buffer *kbuf;
 	unsigned flags;
+	int min_ret = 0;
 	int ret, cflags = 0;
 
 	sock = sock_from_file(req->file);
@@ -4889,6 +4898,9 @@ static int io_recvmsg(struct io_kiocb *req, bool force_nonblock,
 	else if (force_nonblock)
 		flags |= MSG_DONTWAIT;
 
+	if (flags & MSG_WAITALL)
+		min_ret = iov_iter_count(&kmsg->msg.msg_iter);
+
 	ret = __sys_recvmsg_sock(sock, &kmsg->msg, req->sr_msg.umsg,
 					kmsg->uaddr, flags);
 	if (force_nonblock && ret == -EAGAIN)
@@ -4901,7 +4913,7 @@ static int io_recvmsg(struct io_kiocb *req, bool force_nonblock,
 	if (kmsg->iov != kmsg->fast_iov)
 		kfree(kmsg->iov);
 	req->flags &= ~REQ_F_NEED_CLEANUP;
-	if (ret < 0)
+	if (ret < min_ret || ((flags & MSG_WAITALL) && (kmsg->msg.msg_flags & (MSG_TRUNC | MSG_CTRUNC))))
 		req_set_fail_links(req);
 	__io_req_complete(req, ret, cflags, cs);
 	return 0;
@@ -4917,6 +4929,7 @@ static int io_recv(struct io_kiocb *req, bool force_nonblock,
 	struct socket *sock;
 	struct iovec iov;
 	unsigned flags;
+	int min_ret = 0;
 	int ret, cflags = 0;
 
 	sock = sock_from_file(req->file);
@@ -4947,6 +4960,9 @@ static int io_recv(struct io_kiocb *req, bool force_nonblock,
 	else if (force_nonblock)
 		flags |= MSG_DONTWAIT;
 
+	if (flags & MSG_WAITALL)
+		min_ret = iov_iter_count(&msg.msg_iter);
+
 	ret = sock_recvmsg(sock, &msg, flags);
 	if (force_nonblock && ret == -EAGAIN)
 		return -EAGAIN;
@@ -4955,7 +4971,7 @@ static int io_recv(struct io_kiocb *req, bool force_nonblock,
 out_free:
 	if (req->flags & REQ_F_BUFFER_SELECTED)
 		cflags = io_put_recv_kbuf(req);
-	if (ret < 0)
+	if (ret < min_ret || ((flags & MSG_WAITALL) && (msg.msg_flags & (MSG_TRUNC | MSG_CTRUNC))))
 		req_set_fail_links(req);
 	__io_req_complete(req, ret, cflags, cs);
 	return 0;
-- 
2.30.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH AUTOSEL 5.11 43/44] signal: don't allow STOP on PF_IO_WORKER threads
       [not found]     ` <[email protected]>
@ 2021-03-25 12:11       ` Stefan Metzmacher
  2021-03-25 13:38         ` Jens Axboe
  0 siblings, 1 reply; 10+ messages in thread
From: Stefan Metzmacher @ 2021-03-25 12:11 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Sasha Levin, linux-kernel, stable, Jens Axboe, io-uring,
	Linus Torvalds


Am 25.03.21 um 13:04 schrieb Eric W. Biederman:
> Stefan Metzmacher <[email protected]> writes:
> 
>> Am 25.03.21 um 12:24 schrieb Sasha Levin:
>>> From: "Eric W. Biederman" <[email protected]>
>>>
>>> [ Upstream commit 4db4b1a0d1779dc159f7b87feb97030ec0b12597 ]
>>>
>>> Just like we don't allow normal signals to IO threads, don't deliver a
>>> STOP to a task that has PF_IO_WORKER set. The IO threads don't take
>>> signals in general, and have no means of flushing out a stop either.
>>>
>>> Longer term, we may want to look into allowing stop of these threads,
>>> as it relates to eg process freezing. For now, this prevents a spin
>>> issue if a SIGSTOP is delivered to the parent task.
>>>
>>> Reported-by: Stefan Metzmacher <[email protected]>
>>> Signed-off-by: Jens Axboe <[email protected]>
>>> Signed-off-by: "Eric W. Biederman" <[email protected]>
>>> Signed-off-by: Sasha Levin <[email protected]>
>>> ---
>>>  kernel/signal.c | 3 ++-
>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/kernel/signal.c b/kernel/signal.c
>>> index 55526b941011..00a3840f6037 100644
>>> --- a/kernel/signal.c
>>> +++ b/kernel/signal.c
>>> @@ -288,7 +288,8 @@ bool task_set_jobctl_pending(struct task_struct *task, unsigned long mask)
>>>  			JOBCTL_STOP_SIGMASK | JOBCTL_TRAPPING));
>>>  	BUG_ON((mask & JOBCTL_TRAPPING) && !(mask & JOBCTL_PENDING_MASK));
>>>  
>>> -	if (unlikely(fatal_signal_pending(task) || (task->flags & PF_EXITING)))
>>> +	if (unlikely(fatal_signal_pending(task) ||
>>> +		     (task->flags & (PF_EXITING | PF_IO_WORKER))))
>>>  		return false;
>>>  
>>>  	if (mask & JOBCTL_STOP_SIGMASK)
>>>
>>
>> Again, why is this proposed for 5.11 and 5.10 already?
> 
> Has the bit about the io worker kthreads been backported?
> If so this isn't horrible.  If not this is nonsense.

I don't know, I hope not...

But I just tested v5.12-rc4 and attaching to
an application with iothreads with gdb is still not possible,
it still loops forever trying to attach to the iothreads.

And I tested 'kill -9 $pidofiothread', and it feezed the whole
machine...

So there's still work to do in order to get 5.12 stable.

I'm short on time currently, but I hope to send more details soon.

metze

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH AUTOSEL 5.11 43/44] signal: don't allow STOP on PF_IO_WORKER threads
  2021-03-25 12:11       ` [PATCH AUTOSEL 5.11 43/44] signal: don't allow STOP on PF_IO_WORKER threads Stefan Metzmacher
@ 2021-03-25 13:38         ` Jens Axboe
  2021-03-25 13:56           ` Stefan Metzmacher
  0 siblings, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2021-03-25 13:38 UTC (permalink / raw)
  To: Stefan Metzmacher, Eric W. Biederman
  Cc: Sasha Levin, linux-kernel, stable, io-uring, Linus Torvalds

On 3/25/21 6:11 AM, Stefan Metzmacher wrote:
> 
> Am 25.03.21 um 13:04 schrieb Eric W. Biederman:
>> Stefan Metzmacher <[email protected]> writes:
>>
>>> Am 25.03.21 um 12:24 schrieb Sasha Levin:
>>>> From: "Eric W. Biederman" <[email protected]>
>>>>
>>>> [ Upstream commit 4db4b1a0d1779dc159f7b87feb97030ec0b12597 ]
>>>>
>>>> Just like we don't allow normal signals to IO threads, don't deliver a
>>>> STOP to a task that has PF_IO_WORKER set. The IO threads don't take
>>>> signals in general, and have no means of flushing out a stop either.
>>>>
>>>> Longer term, we may want to look into allowing stop of these threads,
>>>> as it relates to eg process freezing. For now, this prevents a spin
>>>> issue if a SIGSTOP is delivered to the parent task.
>>>>
>>>> Reported-by: Stefan Metzmacher <[email protected]>
>>>> Signed-off-by: Jens Axboe <[email protected]>
>>>> Signed-off-by: "Eric W. Biederman" <[email protected]>
>>>> Signed-off-by: Sasha Levin <[email protected]>
>>>> ---
>>>>  kernel/signal.c | 3 ++-
>>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/kernel/signal.c b/kernel/signal.c
>>>> index 55526b941011..00a3840f6037 100644
>>>> --- a/kernel/signal.c
>>>> +++ b/kernel/signal.c
>>>> @@ -288,7 +288,8 @@ bool task_set_jobctl_pending(struct task_struct *task, unsigned long mask)
>>>>  			JOBCTL_STOP_SIGMASK | JOBCTL_TRAPPING));
>>>>  	BUG_ON((mask & JOBCTL_TRAPPING) && !(mask & JOBCTL_PENDING_MASK));
>>>>  
>>>> -	if (unlikely(fatal_signal_pending(task) || (task->flags & PF_EXITING)))
>>>> +	if (unlikely(fatal_signal_pending(task) ||
>>>> +		     (task->flags & (PF_EXITING | PF_IO_WORKER))))
>>>>  		return false;
>>>>  
>>>>  	if (mask & JOBCTL_STOP_SIGMASK)
>>>>
>>>
>>> Again, why is this proposed for 5.11 and 5.10 already?
>>
>> Has the bit about the io worker kthreads been backported?
>> If so this isn't horrible.  If not this is nonsense.

No not yet - my plan is to do that, but not until we're 100% satisfied
with it.

> I don't know, I hope not...
> 
> But I just tested v5.12-rc4 and attaching to
> an application with iothreads with gdb is still not possible,
> it still loops forever trying to attach to the iothreads.

I do see the looping, gdb apparently doesn't give up when it gets
-EPERM trying to attach to the threads. Which isn't really a kernel
thing, but:

> And I tested 'kill -9 $pidofiothread', and it feezed the whole
> machine...

that sounds very strange, I haven't seen anything like that running
the exact same scenario.

> So there's still work to do in order to get 5.12 stable.
> 
> I'm short on time currently, but I hope to send more details soon.

Thanks! I'll play with it this morning and see if I can provoke
something odd related to STOP/attach.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH AUTOSEL 5.11 43/44] signal: don't allow STOP on PF_IO_WORKER threads
  2021-03-25 13:38         ` Jens Axboe
@ 2021-03-25 13:56           ` Stefan Metzmacher
  2021-03-25 14:02             ` Jens Axboe
  0 siblings, 1 reply; 10+ messages in thread
From: Stefan Metzmacher @ 2021-03-25 13:56 UTC (permalink / raw)
  To: Jens Axboe, Eric W. Biederman
  Cc: Sasha Levin, linux-kernel, stable, io-uring, Linus Torvalds

Am 25.03.21 um 14:38 schrieb Jens Axboe:
> On 3/25/21 6:11 AM, Stefan Metzmacher wrote:
>>
>> Am 25.03.21 um 13:04 schrieb Eric W. Biederman:
>>> Stefan Metzmacher <[email protected]> writes:
>>>
>>>> Am 25.03.21 um 12:24 schrieb Sasha Levin:
>>>>> From: "Eric W. Biederman" <[email protected]>
>>>>>
>>>>> [ Upstream commit 4db4b1a0d1779dc159f7b87feb97030ec0b12597 ]
>>>>>
>>>>> Just like we don't allow normal signals to IO threads, don't deliver a
>>>>> STOP to a task that has PF_IO_WORKER set. The IO threads don't take
>>>>> signals in general, and have no means of flushing out a stop either.
>>>>>
>>>>> Longer term, we may want to look into allowing stop of these threads,
>>>>> as it relates to eg process freezing. For now, this prevents a spin
>>>>> issue if a SIGSTOP is delivered to the parent task.
>>>>>
>>>>> Reported-by: Stefan Metzmacher <[email protected]>
>>>>> Signed-off-by: Jens Axboe <[email protected]>
>>>>> Signed-off-by: "Eric W. Biederman" <[email protected]>
>>>>> Signed-off-by: Sasha Levin <[email protected]>
>>>>> ---
>>>>>  kernel/signal.c | 3 ++-
>>>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/kernel/signal.c b/kernel/signal.c
>>>>> index 55526b941011..00a3840f6037 100644
>>>>> --- a/kernel/signal.c
>>>>> +++ b/kernel/signal.c
>>>>> @@ -288,7 +288,8 @@ bool task_set_jobctl_pending(struct task_struct *task, unsigned long mask)
>>>>>  			JOBCTL_STOP_SIGMASK | JOBCTL_TRAPPING));
>>>>>  	BUG_ON((mask & JOBCTL_TRAPPING) && !(mask & JOBCTL_PENDING_MASK));
>>>>>  
>>>>> -	if (unlikely(fatal_signal_pending(task) || (task->flags & PF_EXITING)))
>>>>> +	if (unlikely(fatal_signal_pending(task) ||
>>>>> +		     (task->flags & (PF_EXITING | PF_IO_WORKER))))
>>>>>  		return false;
>>>>>  
>>>>>  	if (mask & JOBCTL_STOP_SIGMASK)
>>>>>
>>>>
>>>> Again, why is this proposed for 5.11 and 5.10 already?
>>>
>>> Has the bit about the io worker kthreads been backported?
>>> If so this isn't horrible.  If not this is nonsense.
> 
> No not yet - my plan is to do that, but not until we're 100% satisfied
> with it.

Do you understand why the patches where autoselected for 5.11 and 5.10?

>> I don't know, I hope not...
>>
>> But I just tested v5.12-rc4 and attaching to
>> an application with iothreads with gdb is still not possible,
>> it still loops forever trying to attach to the iothreads.
> 
> I do see the looping, gdb apparently doesn't give up when it gets
> -EPERM trying to attach to the threads. Which isn't really a kernel
> thing, but:

Maybe we need to remove the iothreads from /proc/pid/tasks/

>> And I tested 'kill -9 $pidofiothread', and it feezed the whole
>> machine...
> 
> that sounds very strange, I haven't seen anything like that running
> the exact same scenario.
> 
>> So there's still work to do in order to get 5.12 stable.
>>
>> I'm short on time currently, but I hope to send more details soon.
> 
> Thanks! I'll play with it this morning and see if I can provoke
> something odd related to STOP/attach.

Thanks!

Somehow I have the impression that your same_thread_group_account patch
may fix a lot of things...

metze


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH AUTOSEL 5.11 43/44] signal: don't allow STOP on PF_IO_WORKER threads
  2021-03-25 13:56           ` Stefan Metzmacher
@ 2021-03-25 14:02             ` Jens Axboe
  2021-03-25 15:00               ` Sasha Levin
  2021-03-25 15:10               ` Jens Axboe
  0 siblings, 2 replies; 10+ messages in thread
From: Jens Axboe @ 2021-03-25 14:02 UTC (permalink / raw)
  To: Stefan Metzmacher, Eric W. Biederman
  Cc: Sasha Levin, linux-kernel, stable, io-uring, Linus Torvalds

On 3/25/21 7:56 AM, Stefan Metzmacher wrote:
> Am 25.03.21 um 14:38 schrieb Jens Axboe:
>> On 3/25/21 6:11 AM, Stefan Metzmacher wrote:
>>>
>>> Am 25.03.21 um 13:04 schrieb Eric W. Biederman:
>>>> Stefan Metzmacher <[email protected]> writes:
>>>>
>>>>> Am 25.03.21 um 12:24 schrieb Sasha Levin:
>>>>>> From: "Eric W. Biederman" <[email protected]>
>>>>>>
>>>>>> [ Upstream commit 4db4b1a0d1779dc159f7b87feb97030ec0b12597 ]
>>>>>>
>>>>>> Just like we don't allow normal signals to IO threads, don't deliver a
>>>>>> STOP to a task that has PF_IO_WORKER set. The IO threads don't take
>>>>>> signals in general, and have no means of flushing out a stop either.
>>>>>>
>>>>>> Longer term, we may want to look into allowing stop of these threads,
>>>>>> as it relates to eg process freezing. For now, this prevents a spin
>>>>>> issue if a SIGSTOP is delivered to the parent task.
>>>>>>
>>>>>> Reported-by: Stefan Metzmacher <[email protected]>
>>>>>> Signed-off-by: Jens Axboe <[email protected]>
>>>>>> Signed-off-by: "Eric W. Biederman" <[email protected]>
>>>>>> Signed-off-by: Sasha Levin <[email protected]>
>>>>>> ---
>>>>>>  kernel/signal.c | 3 ++-
>>>>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/kernel/signal.c b/kernel/signal.c
>>>>>> index 55526b941011..00a3840f6037 100644
>>>>>> --- a/kernel/signal.c
>>>>>> +++ b/kernel/signal.c
>>>>>> @@ -288,7 +288,8 @@ bool task_set_jobctl_pending(struct task_struct *task, unsigned long mask)
>>>>>>  			JOBCTL_STOP_SIGMASK | JOBCTL_TRAPPING));
>>>>>>  	BUG_ON((mask & JOBCTL_TRAPPING) && !(mask & JOBCTL_PENDING_MASK));
>>>>>>  
>>>>>> -	if (unlikely(fatal_signal_pending(task) || (task->flags & PF_EXITING)))
>>>>>> +	if (unlikely(fatal_signal_pending(task) ||
>>>>>> +		     (task->flags & (PF_EXITING | PF_IO_WORKER))))
>>>>>>  		return false;
>>>>>>  
>>>>>>  	if (mask & JOBCTL_STOP_SIGMASK)
>>>>>>
>>>>>
>>>>> Again, why is this proposed for 5.11 and 5.10 already?
>>>>
>>>> Has the bit about the io worker kthreads been backported?
>>>> If so this isn't horrible.  If not this is nonsense.
>>
>> No not yet - my plan is to do that, but not until we're 100% satisfied
>> with it.
> 
> Do you understand why the patches where autoselected for 5.11 and 5.10?

As far as I know, selections like these (AUTOSEL) are done by some bot
that uses whatever criteria to see if they should be applied for earlier
revisions. I'm sure Sasha can expand on that :-)

Hence it's reasonable to expect that sometimes it'll pick patches that
should not go into stable, at least not just yet. It's important to
understand that this message is just a notice that it's queued up for
stable -rc, not that it's _in_ stable just yet. There's time to object.

>>> I don't know, I hope not...
>>>
>>> But I just tested v5.12-rc4 and attaching to
>>> an application with iothreads with gdb is still not possible,
>>> it still loops forever trying to attach to the iothreads.
>>
>> I do see the looping, gdb apparently doesn't give up when it gets
>> -EPERM trying to attach to the threads. Which isn't really a kernel
>> thing, but:
> 
> Maybe we need to remove the iothreads from /proc/pid/tasks/

Is that how it finds them? It's arguably a bug in gdb that it just
keeps retrying, but it would be nice if we can ensure that it just
ignores them. Because if gdb triggers something like that, probably
others too...

>>> And I tested 'kill -9 $pidofiothread', and it feezed the whole
>>> machine...
>>
>> that sounds very strange, I haven't seen anything like that running
>> the exact same scenario.
>>
>>> So there's still work to do in order to get 5.12 stable.
>>>
>>> I'm short on time currently, but I hope to send more details soon.
>>
>> Thanks! I'll play with it this morning and see if I can provoke
>> something odd related to STOP/attach.
> 
> Thanks!
> 
> Somehow I have the impression that your same_thread_group_account patch
> may fix a lot of things...

Maybe? I'll look closer.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH AUTOSEL 5.11 43/44] signal: don't allow STOP on PF_IO_WORKER threads
  2021-03-25 14:02             ` Jens Axboe
@ 2021-03-25 15:00               ` Sasha Levin
  2021-03-25 15:10               ` Jens Axboe
  1 sibling, 0 replies; 10+ messages in thread
From: Sasha Levin @ 2021-03-25 15:00 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Stefan Metzmacher, Eric W. Biederman, linux-kernel, stable,
	io-uring, Linus Torvalds

On Thu, Mar 25, 2021 at 08:02:11AM -0600, Jens Axboe wrote:
>On 3/25/21 7:56 AM, Stefan Metzmacher wrote:
>> Am 25.03.21 um 14:38 schrieb Jens Axboe:
>>> On 3/25/21 6:11 AM, Stefan Metzmacher wrote:
>>>>
>>>> Am 25.03.21 um 13:04 schrieb Eric W. Biederman:
>>>>> Stefan Metzmacher <[email protected]> writes:
>>>>>
>>>>>> Am 25.03.21 um 12:24 schrieb Sasha Levin:
>>>>>>> From: "Eric W. Biederman" <[email protected]>
>>>>>>>
>>>>>>> [ Upstream commit 4db4b1a0d1779dc159f7b87feb97030ec0b12597 ]
>>>>>>>
>>>>>>> Just like we don't allow normal signals to IO threads, don't deliver a
>>>>>>> STOP to a task that has PF_IO_WORKER set. The IO threads don't take
>>>>>>> signals in general, and have no means of flushing out a stop either.
>>>>>>>
>>>>>>> Longer term, we may want to look into allowing stop of these threads,
>>>>>>> as it relates to eg process freezing. For now, this prevents a spin
>>>>>>> issue if a SIGSTOP is delivered to the parent task.
>>>>>>>
>>>>>>> Reported-by: Stefan Metzmacher <[email protected]>
>>>>>>> Signed-off-by: Jens Axboe <[email protected]>
>>>>>>> Signed-off-by: "Eric W. Biederman" <[email protected]>
>>>>>>> Signed-off-by: Sasha Levin <[email protected]>
>>>>>>> ---
>>>>>>>  kernel/signal.c | 3 ++-
>>>>>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/kernel/signal.c b/kernel/signal.c
>>>>>>> index 55526b941011..00a3840f6037 100644
>>>>>>> --- a/kernel/signal.c
>>>>>>> +++ b/kernel/signal.c
>>>>>>> @@ -288,7 +288,8 @@ bool task_set_jobctl_pending(struct task_struct *task, unsigned long mask)
>>>>>>>  			JOBCTL_STOP_SIGMASK | JOBCTL_TRAPPING));
>>>>>>>  	BUG_ON((mask & JOBCTL_TRAPPING) && !(mask & JOBCTL_PENDING_MASK));
>>>>>>>
>>>>>>> -	if (unlikely(fatal_signal_pending(task) || (task->flags & PF_EXITING)))
>>>>>>> +	if (unlikely(fatal_signal_pending(task) ||
>>>>>>> +		     (task->flags & (PF_EXITING | PF_IO_WORKER))))
>>>>>>>  		return false;
>>>>>>>
>>>>>>>  	if (mask & JOBCTL_STOP_SIGMASK)
>>>>>>>
>>>>>>
>>>>>> Again, why is this proposed for 5.11 and 5.10 already?
>>>>>
>>>>> Has the bit about the io worker kthreads been backported?
>>>>> If so this isn't horrible.  If not this is nonsense.
>>>
>>> No not yet - my plan is to do that, but not until we're 100% satisfied
>>> with it.
>>
>> Do you understand why the patches where autoselected for 5.11 and 5.10?
>
>As far as I know, selections like these (AUTOSEL) are done by some bot
>that uses whatever criteria to see if they should be applied for earlier
>revisions. I'm sure Sasha can expand on that :-)

Right, it's just heuristics that help catch commits that don't have a
stable tag but should have one.

>Hence it's reasonable to expect that sometimes it'll pick patches that
>should not go into stable, at least not just yet. It's important to
>understand that this message is just a notice that it's queued up for
>stable -rc, not that it's _in_ stable just yet. There's time to object.

Right, it's even more than that: this mail (tagged with "AUTOSEL") is a
notification that happens at least a week before the patch will go in
the stable queue.

If you think any AUTOSEL patches don't need to be backported, it's
usually enough to just quickly nack them.

-- 
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH AUTOSEL 5.11 43/44] signal: don't allow STOP on PF_IO_WORKER threads
  2021-03-25 14:02             ` Jens Axboe
  2021-03-25 15:00               ` Sasha Levin
@ 2021-03-25 15:10               ` Jens Axboe
  1 sibling, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2021-03-25 15:10 UTC (permalink / raw)
  To: Stefan Metzmacher, Eric W. Biederman
  Cc: Sasha Levin, linux-kernel, stable, io-uring, Linus Torvalds

On 3/25/21 8:02 AM, Jens Axboe wrote:
> On 3/25/21 7:56 AM, Stefan Metzmacher wrote:
>> Am 25.03.21 um 14:38 schrieb Jens Axboe:
>>> On 3/25/21 6:11 AM, Stefan Metzmacher wrote:
>>>>
>>>> Am 25.03.21 um 13:04 schrieb Eric W. Biederman:
>>>>> Stefan Metzmacher <[email protected]> writes:
>>>>>
>>>>>> Am 25.03.21 um 12:24 schrieb Sasha Levin:
>>>>>>> From: "Eric W. Biederman" <[email protected]>
>>>>>>>
>>>>>>> [ Upstream commit 4db4b1a0d1779dc159f7b87feb97030ec0b12597 ]
>>>>>>>
>>>>>>> Just like we don't allow normal signals to IO threads, don't deliver a
>>>>>>> STOP to a task that has PF_IO_WORKER set. The IO threads don't take
>>>>>>> signals in general, and have no means of flushing out a stop either.
>>>>>>>
>>>>>>> Longer term, we may want to look into allowing stop of these threads,
>>>>>>> as it relates to eg process freezing. For now, this prevents a spin
>>>>>>> issue if a SIGSTOP is delivered to the parent task.
>>>>>>>
>>>>>>> Reported-by: Stefan Metzmacher <[email protected]>
>>>>>>> Signed-off-by: Jens Axboe <[email protected]>
>>>>>>> Signed-off-by: "Eric W. Biederman" <[email protected]>
>>>>>>> Signed-off-by: Sasha Levin <[email protected]>
>>>>>>> ---
>>>>>>>  kernel/signal.c | 3 ++-
>>>>>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/kernel/signal.c b/kernel/signal.c
>>>>>>> index 55526b941011..00a3840f6037 100644
>>>>>>> --- a/kernel/signal.c
>>>>>>> +++ b/kernel/signal.c
>>>>>>> @@ -288,7 +288,8 @@ bool task_set_jobctl_pending(struct task_struct *task, unsigned long mask)
>>>>>>>  			JOBCTL_STOP_SIGMASK | JOBCTL_TRAPPING));
>>>>>>>  	BUG_ON((mask & JOBCTL_TRAPPING) && !(mask & JOBCTL_PENDING_MASK));
>>>>>>>  
>>>>>>> -	if (unlikely(fatal_signal_pending(task) || (task->flags & PF_EXITING)))
>>>>>>> +	if (unlikely(fatal_signal_pending(task) ||
>>>>>>> +		     (task->flags & (PF_EXITING | PF_IO_WORKER))))
>>>>>>>  		return false;
>>>>>>>  
>>>>>>>  	if (mask & JOBCTL_STOP_SIGMASK)
>>>>>>>
>>>>>>
>>>>>> Again, why is this proposed for 5.11 and 5.10 already?
>>>>>
>>>>> Has the bit about the io worker kthreads been backported?
>>>>> If so this isn't horrible.  If not this is nonsense.
>>>
>>> No not yet - my plan is to do that, but not until we're 100% satisfied
>>> with it.
>>
>> Do you understand why the patches where autoselected for 5.11 and 5.10?
> 
> As far as I know, selections like these (AUTOSEL) are done by some bot
> that uses whatever criteria to see if they should be applied for earlier
> revisions. I'm sure Sasha can expand on that :-)
> 
> Hence it's reasonable to expect that sometimes it'll pick patches that
> should not go into stable, at least not just yet. It's important to
> understand that this message is just a notice that it's queued up for
> stable -rc, not that it's _in_ stable just yet. There's time to object.
> 
>>>> I don't know, I hope not...
>>>>
>>>> But I just tested v5.12-rc4 and attaching to
>>>> an application with iothreads with gdb is still not possible,
>>>> it still loops forever trying to attach to the iothreads.
>>>
>>> I do see the looping, gdb apparently doesn't give up when it gets
>>> -EPERM trying to attach to the threads. Which isn't really a kernel
>>> thing, but:
>>
>> Maybe we need to remove the iothreads from /proc/pid/tasks/
> 
> Is that how it finds them? It's arguably a bug in gdb that it just
> keeps retrying, but it would be nice if we can ensure that it just
> ignores them. Because if gdb triggers something like that, probably
> others too...
> 
>>>> And I tested 'kill -9 $pidofiothread', and it feezed the whole
>>>> machine...
>>>
>>> that sounds very strange, I haven't seen anything like that running
>>> the exact same scenario.
>>>
>>>> So there's still work to do in order to get 5.12 stable.
>>>>
>>>> I'm short on time currently, but I hope to send more details soon.
>>>
>>> Thanks! I'll play with it this morning and see if I can provoke
>>> something odd related to STOP/attach.
>>
>> Thanks!
>>
>> Somehow I have the impression that your same_thread_group_account patch
>> may fix a lot of things...
> 
> Maybe? I'll look closer.

It needs a bit more love than that. If you have threads already in your
app, then we just want to skip over the PF_IO_WORKER threads. We can't
just terminate the loop.

Something like the below works for me.


diff --git a/fs/proc/base.c b/fs/proc/base.c
index 3851bfcdba56..abff2fe10bfa 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -3723,7 +3723,7 @@ static struct task_struct *first_tid(struct pid *pid, int tid, loff_t f_pos,
 	 */
 	pos = task = task->group_leader;
 	do {
-		if (!nr--)
+		if (same_thread_group(task, pos) && !nr--)
 			goto found;
 	} while_each_thread(task, pos);
 fail:
@@ -3744,16 +3744,22 @@ static struct task_struct *first_tid(struct pid *pid, int tid, loff_t f_pos,
  */
 static struct task_struct *next_tid(struct task_struct *start)
 {
-	struct task_struct *pos = NULL;
+	struct task_struct *tmp, *pos = NULL;
+
 	rcu_read_lock();
-	if (pid_alive(start)) {
-		pos = next_thread(start);
-		if (thread_group_leader(pos))
-			pos = NULL;
-		else
-			get_task_struct(pos);
+	if (!pid_alive(start))
+		goto no_thread;
+	list_for_each_entry_rcu(tmp, &start->thread_group, thread_group) {
+		if (!thread_group_leader(tmp) && same_thread_group(start, tmp)) {
+			get_task_struct(tmp);
+			pos = tmp;
+			break;
+		}
 	}
+no_thread:
 	rcu_read_unlock();
+	if (!pos)
+		return NULL;
 	put_task_struct(start);
 	return pos;
 }
diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h
index 3f6a0fcaa10c..4f621e386abf 100644
--- a/include/linux/sched/signal.h
+++ b/include/linux/sched/signal.h
@@ -668,11 +668,18 @@ static inline bool thread_group_leader(struct task_struct *p)
 }
 
 static inline
-bool same_thread_group(struct task_struct *p1, struct task_struct *p2)
+bool same_thread_group_account(struct task_struct *p1, struct task_struct *p2)
 {
 	return p1->signal == p2->signal;
 }
 
+static inline
+bool same_thread_group(struct task_struct *p1, struct task_struct *p2)
+{
+	return same_thread_group_account(p1, p2) &&
+			!((p1->flags | p2->flags) & PF_IO_WORKER);
+}
+
 static inline struct task_struct *next_thread(const struct task_struct *p)
 {
 	return list_entry_rcu(p->thread_group.next,
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 5f611658eeab..625110cacc2a 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -307,7 +307,7 @@ void thread_group_cputime(struct task_struct *tsk, struct task_cputime *times)
 	 * those pending times and rely only on values updated on tick or
 	 * other scheduler action.
 	 */
-	if (same_thread_group(current, tsk))
+	if (same_thread_group_account(current, tsk))
 		(void) task_sched_runtime(current);
 
 	rcu_read_lock();

-- 
Jens Axboe


^ permalink raw reply related	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-03-25 15:11 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <[email protected]>
2021-03-25 11:24 ` [PATCH AUTOSEL 5.11 24/44] io_uring: fix ->flags races by linked timeouts Sasha Levin
2021-03-25 11:24 ` [PATCH AUTOSEL 5.11 25/44] io_uring: halt SQO submission on ctx exit Sasha Levin
2021-03-25 11:24 ` [PATCH AUTOSEL 5.11 37/44] io_uring: imply MSG_NOSIGNAL for send[msg]()/recv[msg]() calls Sasha Levin
2021-03-25 11:24 ` [PATCH AUTOSEL 5.11 44/44] io_uring: call req_set_fail_links() on short send[msg]()/recv[msg]() with MSG_WAITALL Sasha Levin
     [not found] ` <[email protected]>
     [not found]   ` <[email protected]>
     [not found]     ` <[email protected]>
2021-03-25 12:11       ` [PATCH AUTOSEL 5.11 43/44] signal: don't allow STOP on PF_IO_WORKER threads Stefan Metzmacher
2021-03-25 13:38         ` Jens Axboe
2021-03-25 13:56           ` Stefan Metzmacher
2021-03-25 14:02             ` Jens Axboe
2021-03-25 15:00               ` Sasha Levin
2021-03-25 15:10               ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox