public inbox for [email protected]
 help / color / mirror / Atom feed
* [GIT PULL] io_uring fix for 5.11 final
@ 2021-02-12 14:08 Jens Axboe
  2021-02-12 19:51 ` Linus Torvalds
  2021-02-12 20:12 ` pr-tracker-bot
  0 siblings, 2 replies; 10+ messages in thread
From: Jens Axboe @ 2021-02-12 14:08 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: io-uring

Hi Linus,

Revert of a patch from this release that caused a regression. Please
pull!


The following changes since commit aec18a57edad562d620f7d19016de1fc0cc2208c:

  io_uring: drop mm/files between task_work_submit (2021-02-04 12:42:58 -0700)

are available in the Git repository at:

  git://git.kernel.dk/linux-block.git tags/io_uring-5.11-2021-02-12

for you to fetch changes up to 92c75f7594d5060a4cb240f0e987a802f8486b11:

  Revert "io_uring: don't take fs for recvmsg/sendmsg" (2021-02-10 12:37:58 -0700)

----------------------------------------------------------------
io_uring-5.11-2021-02-12

----------------------------------------------------------------
Jens Axboe (1):
      Revert "io_uring: don't take fs for recvmsg/sendmsg"

 fs/io_uring.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [GIT PULL] io_uring fix for 5.11 final
  2021-02-12 14:08 [GIT PULL] io_uring fix for 5.11 final Jens Axboe
@ 2021-02-12 19:51 ` Linus Torvalds
  2021-02-12 20:04   ` Jens Axboe
  2021-02-12 20:12 ` pr-tracker-bot
  1 sibling, 1 reply; 10+ messages in thread
From: Linus Torvalds @ 2021-02-12 19:51 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring

On Fri, Feb 12, 2021 at 6:08 AM Jens Axboe <[email protected]> wrote:
>
> Revert of a patch from this release that caused a regression.

I really wish that revert had more of an explanation of _why_ it
needed to retain the ->fs pointer.

I can guess at it, but considering that we got it wrong once, it might
be better to not have to guess, and have it documented.

            Linus

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [GIT PULL] io_uring fix for 5.11 final
  2021-02-12 19:51 ` Linus Torvalds
@ 2021-02-12 20:04   ` Jens Axboe
  2021-02-12 20:07     ` Linus Torvalds
  0 siblings, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2021-02-12 20:04 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: io-uring

On 2/12/21 12:51 PM, Linus Torvalds wrote:
> On Fri, Feb 12, 2021 at 6:08 AM Jens Axboe <[email protected]> wrote:
>>
>> Revert of a patch from this release that caused a regression.
> 
> I really wish that revert had more of an explanation of _why_ it
> needed to retain the ->fs pointer.
> 
> I can guess at it, but considering that we got it wrong once, it might
> be better to not have to guess, and have it documented.

We end up doing path resolution for certain cases, so having the
right fs_struct ends up being paramount. Do you want me to update the
commit with a fuller description?

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [GIT PULL] io_uring fix for 5.11 final
  2021-02-12 20:04   ` Jens Axboe
@ 2021-02-12 20:07     ` Linus Torvalds
  2021-02-12 20:09       ` Jens Axboe
  0 siblings, 1 reply; 10+ messages in thread
From: Linus Torvalds @ 2021-02-12 20:07 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring

On Fri, Feb 12, 2021 at 12:04 PM Jens Axboe <[email protected]> wrote:
>
> We end up doing path resolution for certain cases, so having the
> right fs_struct ends up being paramount. Do you want me to update the
> commit with a fuller description?

I've pulled it, but please keep it in mind.

What are the "certain cases" that need path resolution for sendmsg?
I'm assuming AF_UNIX dgram case, but it really would be interesting to
just document this.

Which was kind of my point.

               Linus

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [GIT PULL] io_uring fix for 5.11 final
  2021-02-12 20:07     ` Linus Torvalds
@ 2021-02-12 20:09       ` Jens Axboe
  2021-02-12 20:15         ` Linus Torvalds
  0 siblings, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2021-02-12 20:09 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: io-uring

On 2/12/21 1:07 PM, Linus Torvalds wrote:
> On Fri, Feb 12, 2021 at 12:04 PM Jens Axboe <[email protected]> wrote:
>>
>> We end up doing path resolution for certain cases, so having the
>> right fs_struct ends up being paramount. Do you want me to update the
>> commit with a fuller description?
> 
> I've pulled it, but please keep it in mind.

OK thanks, will keep it in mind for future changes.

> What are the "certain cases" that need path resolution for sendmsg?
> I'm assuming AF_UNIX dgram case, but it really would be interesting to
> just document this.

Right, it's exactly the AF_UNIX dgram case. Working on adding some checks
that means we'll catch this sort of thing upfront while testing.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [GIT PULL] io_uring fix for 5.11 final
  2021-02-12 14:08 [GIT PULL] io_uring fix for 5.11 final Jens Axboe
  2021-02-12 19:51 ` Linus Torvalds
@ 2021-02-12 20:12 ` pr-tracker-bot
  1 sibling, 0 replies; 10+ messages in thread
From: pr-tracker-bot @ 2021-02-12 20:12 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Linus Torvalds, io-uring

The pull request you sent on Fri, 12 Feb 2021 07:08:31 -0700:

> git://git.kernel.dk/linux-block.git tags/io_uring-5.11-2021-02-12

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/c6d8570e4d642a0c0bfbe7362ffa1b1433c72db1

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [GIT PULL] io_uring fix for 5.11 final
  2021-02-12 20:09       ` Jens Axboe
@ 2021-02-12 20:15         ` Linus Torvalds
  2021-02-12 20:22           ` Jens Axboe
  0 siblings, 1 reply; 10+ messages in thread
From: Linus Torvalds @ 2021-02-12 20:15 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring

On Fri, Feb 12, 2021 at 12:10 PM Jens Axboe <[email protected]> wrote:
>
> Right, it's exactly the AF_UNIX dgram case. Working on adding some checks
> that means we'll catch this sort of thing upfront while testing.

You might also just add a comment to the IORING_OP_{SEND,RECV}MSG
cases to the work-flags.

It doesn't hurt to just mention those kinds of things explicitly.

Because maybe somebody decides that IO_WQ_WORK_FS is very expensive
for their workload. With the comment they might then be able to say
"let's set it only for the AF_UNIX case" or some similar optimization.

Yeah, it probably doesn't matter, but just as a policy, I think "we
got this wrong, so let's clarify" is a good idea.

              Linus

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [GIT PULL] io_uring fix for 5.11 final
  2021-02-12 20:15         ` Linus Torvalds
@ 2021-02-12 20:22           ` Jens Axboe
  2021-02-12 20:29             ` Linus Torvalds
  0 siblings, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2021-02-12 20:22 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: io-uring

On 2/12/21 1:15 PM, Linus Torvalds wrote:
> On Fri, Feb 12, 2021 at 12:10 PM Jens Axboe <[email protected]> wrote:
>>
>> Right, it's exactly the AF_UNIX dgram case. Working on adding some checks
>> that means we'll catch this sort of thing upfront while testing.
> 
> You might also just add a comment to the IORING_OP_{SEND,RECV}MSG
> cases to the work-flags.
> 
> It doesn't hurt to just mention those kinds of things explicitly.
> 
> Because maybe somebody decides that IO_WQ_WORK_FS is very expensive
> for their workload. With the comment they might then be able to say
> "let's set it only for the AF_UNIX case" or some similar optimization.
> 
> Yeah, it probably doesn't matter, but just as a policy, I think "we
> got this wrong, so let's clarify" is a good idea.

I'll start with the comment, that's certainly the easiest part and
will help catch changes like that in the future. My other idea was
to add a check in path resolution that catches it, for future safe
guards outside of send/recvmsg. That's obviously a separate change
from the comment, but would be nice to have.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [GIT PULL] io_uring fix for 5.11 final
  2021-02-12 20:22           ` Jens Axboe
@ 2021-02-12 20:29             ` Linus Torvalds
  2021-02-12 20:42               ` Jens Axboe
  0 siblings, 1 reply; 10+ messages in thread
From: Linus Torvalds @ 2021-02-12 20:29 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring

On Fri, Feb 12, 2021 at 12:22 PM Jens Axboe <[email protected]> wrote:
>
>   My other idea was
> to add a check in path resolution that catches it, for future safe
> guards outside of send/recvmsg. That's obviously a separate change
> from the comment, but would be nice to have.

I wonder how painful it would be to just make sure that kernel threads
have a NULL ->fs and ->files by default.

But maybe the oops in a kernel thread ends up being too painful and
harder to debug and deal with, and a special check is preferred just
because a WARN_ON_ONCE() wouldn't cause any other downstream issues.

Yes, some kernel threads do need to do pathname lookups (and io_uring
isn't the only such thing), but I think they are generally fairly
special cases (ie things like usermode helper, firmware loaders, etc).

           Linus

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [GIT PULL] io_uring fix for 5.11 final
  2021-02-12 20:29             ` Linus Torvalds
@ 2021-02-12 20:42               ` Jens Axboe
  0 siblings, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2021-02-12 20:42 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: io-uring

On 2/12/21 1:29 PM, Linus Torvalds wrote:
> On Fri, Feb 12, 2021 at 12:22 PM Jens Axboe <[email protected]> wrote:
>>
>>   My other idea was
>> to add a check in path resolution that catches it, for future safe
>> guards outside of send/recvmsg. That's obviously a separate change
>> from the comment, but would be nice to have.
> 
> I wonder how painful it would be to just make sure that kernel threads
> have a NULL ->fs and ->files by default.
> 
> But maybe the oops in a kernel thread ends up being too painful and
> harder to debug and deal with, and a special check is preferred just
> because a WARN_ON_ONCE() wouldn't cause any other downstream issues.

That's not a bad idea, and better than checking against init_fs etc
at runtime.

> Yes, some kernel threads do need to do pathname lookups (and io_uring
> isn't the only such thing), but I think they are generally fairly
> special cases (ie things like usermode helper, firmware loaders, etc).

Either that, or just do it for the io_uring workers and call it Good
Enough for now. The downside is that we'd crash if we did have an
issue like this, here's the example from before you pulled that revert
with explicit clearing of fs/files:

BUG: sleeping function called from invalid context at arch/x86/mm/fault.c:1351
in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 29189, name: io_wqe_worker-0
CPU: 0 PID: 29189 Comm: io_wqe_worker-0 Not tainted 5.11.0-rc7+ #9248
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
Call Trace:
 dump_stack+0x57/0x6a
 ___might_sleep.cold+0x87/0x94
 do_user_addr_fault+0x12b/0x470
 exc_page_fault+0x6d/0x150
 asm_exc_page_fault+0x1b/0x20
RIP: 0010:set_root+0x33/0xe0
Code: 41 55 41 54 65 48 8b 04 25 00 6d 01 00 53 4c 8b a0 58 07 00 00 8b 47 38 a9 00 00 18 00 0f 85 9e 00 00 00 a8 40 48 89 fb 74 50 <41> 8b 4c 24 08 f6 c1 01 75 42 49 8b 44 24 18 49 8b 54 24 20 48 89
RSP: 0018:ffffc9000074f8c0 EFLAGS: 00010202
RAX: 0000000000001041 RBX: ffffc9000074f970 RCX: ffffc9000074fa90
RDX: 0000000000000000 RSI: 0000000000000041 RDI: ffffc9000074f970
RBP: ffffc9000074f8d8 R08: 0000000000000000 R09: ffffc9000074fb28
R10: 0000000000000000 R11: ffff888100a0ae00 R12: 0000000000000000
R13: ffff888100bc5020 R14: 0000000000000040 R15: 0000000000000010
 nd_jump_root+0xc5/0x100
 path_init+0x2d7/0x400
 path_lookupat+0x1e/0x1c0
 filename_lookup+0xaa/0x170
 ? __kmalloc_node_track_caller+0x5a/0x2a0
 ? kmem_cache_alloc+0x3a/0x1e0
 ? getname_kernel+0x70/0xf0
 unix_find_other+0x47/0x2a0
 ? skb_copy_datagram_from_iter+0x61/0x1c0
 unix_dgram_sendmsg+0x550/0xa80
 sock_sendmsg+0x33/0x40
 ____sys_sendmsg+0x1f3/0x210
 __sys_sendmsg_sock+0x1d/0x30
 io_sendmsg+0x162/0x1e0
 ? free_unref_page_commit.constprop.0+0x85/0xf0
 ? unfreeze_partials+0x238/0x290
 io_issue_sqe+0x600/0x1110
 ? __io_free_req+0x6c/0xd0
 ? kmem_cache_free+0x26d/0x280
 io_wq_submit_work+0xa1/0x100
 ? io_assign_current_work+0x4e/0x70
 io_worker_handle_work+0x29f/0x4c0
 ? io_wqe_worker+0xad/0x3c0
 io_wqe_worker+0x30f/0x3c0
 ? io_worker_handle_work+0x4c0/0x4c0
 kthread+0x13b/0x160
 ? kthread_create_worker_on_cpu+0x70/0x70
 ret_from_fork+0x1f/0x30
BUG: kernel NULL pointer dereference, address: 0000000000000008

which clearly shows the path to misery. Not an easy call, because one
path is a potential security issue, the other is crashing with a NULL
pointer deref...

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-02-12 20:43 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-02-12 14:08 [GIT PULL] io_uring fix for 5.11 final Jens Axboe
2021-02-12 19:51 ` Linus Torvalds
2021-02-12 20:04   ` Jens Axboe
2021-02-12 20:07     ` Linus Torvalds
2021-02-12 20:09       ` Jens Axboe
2021-02-12 20:15         ` Linus Torvalds
2021-02-12 20:22           ` Jens Axboe
2021-02-12 20:29             ` Linus Torvalds
2021-02-12 20:42               ` Jens Axboe
2021-02-12 20:12 ` pr-tracker-bot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox