public inbox for [email protected]
 help / color / mirror / Atom feed
From: Jens Axboe <[email protected]>
To: Linus Torvalds <[email protected]>
Cc: Olivier Langlois <[email protected]>,
	Jakub Kicinski <[email protected]>,
	io-uring <[email protected]>
Subject: Re: [GIT PULL] io_uring updates for 5.18-rc1
Date: Wed, 1 Jun 2022 13:10:32 -0600	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <CAHk-=wg9jtV5JWxBudYgoL0GkiYPefuRu47d=L+7701kLWoQaA@mail.gmail.com>

On 6/1/22 12:52 PM, Linus Torvalds wrote:
> On Wed, Jun 1, 2022 at 11:34 AM Jens Axboe <[email protected]> wrote:
>>
>> But as a first step, let's just mark it deprecated with a pr_warn() for
>> 5.20 and then plan to kill it off whenever a suitable amount of relases
>> have passed since that addition.
> 
> I'd love to, but it's not actually realistic as things stand now.
> epoll() is used in a *lot* of random libraries. A "pr_warn()" would
> just be senseless noise, I bet.

I mean only for the IORING_OP_EPOLL_CTL opcode, which is the only epoll
connection we have in there. It'd be jumping the gun to do it for the
epoll_ctl syscall for sure... And I really have no personal skin in that
game, other than having a better alternative. But that's obviously a
long pole type of deprecation.

> No, there's a reason that EPOLL is still there, still 'default y',
> even though I dislike it and think it was a mistake, and we've had
> several nasty bugs related to it over the years.
> 
> It really can be a very useful system call, it's just that it really
> doesn't work the way the actual ->poll() interface was designed, and
> it kind of hijacks it in ways that mostly work, but the have subtle
> lifetime issues that you don't see with a regular select/poll because
> those will always tear down the wait queues.
> 
> Realistically, the proper fix to epoll is likely to make it explicit,
> and make files and drivers that want to support it have to actually
> opt in. Because a lot of the problems have been due to epoll() looking
> *exactly* like a regular poll/select to a driver or a filesystem, but
> having those very subtle extended requirements.
> 
> (And no, the extended requirements aren't generally onerous, and
> regular ->poll() works fine for 99% of all cases. It's just that
> occasionally, special users are then fooled about special contexts).

It's not an uncommon approach to make the initial adoption /
implementation more palatable, though commonly then also ends up being a
mistake. I've certainly been guilty of that myself too...

> In other words, it's a bit like our bad old days when "splice()" ended
> up falling back to regular ->read()/->write() implementations with
> set_fs(KERNEL_DS). Yes, that worked fine for 99% of all cases, and we
> did it for years, but it also caused several really nasty issues for
> when the read/write actor did something slightly unusual.

Unfortunately that particular change I just had to deal with, and
noticed that we're up to more than two handfuls of fixes for that and I
bet we're not done. Not saying it wasn't the right choice in terms of
sanity, but it has been more painful than I thought it would be.

> So I may dislike epoll quite intensely, but I don't think we can
> *really* get rid of it. But we might be able to make it a bit more
> controlled.
> 
> But so far every time it has caused issues, we've worked around it by
> fixing it up in the particular driver or whatever that ended up being
> triggered by epoll semantics.

The io_uring side of the epoll management I'm very sure can go in a few
releases, and a pr_warn_once() for 5.20 is the right choice. epoll
itself, probably not even down the line, though I am hoping we can
continue to move people off of it. Maybe in another 20 years :-)

-- 
Jens Axboe



  reply	other threads:[~2022-06-01 20:54 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-18 21:59 [GIT PULL] io_uring updates for 5.18-rc1 Jens Axboe
2022-03-22  0:25 ` pr-tracker-bot
     [not found] ` <20220326122838.19d7193f@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
     [not found]   ` <[email protected]>
     [not found]     ` <20220326130615.2d3c6c85@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
     [not found]       ` <[email protected]>
     [not found]         ` <[email protected]>
     [not found]           ` <[email protected]>
2022-06-01  6:59             ` Olivier Langlois
2022-06-01 16:24               ` Jakub Kicinski
2022-06-01 18:09               ` Linus Torvalds
2022-06-01 18:21                 ` Jens Axboe
2022-06-01 18:28                   ` Linus Torvalds
2022-06-01 18:34                     ` Jens Axboe
2022-06-01 18:52                       ` Linus Torvalds
2022-06-01 19:10                         ` Jens Axboe [this message]
2022-06-01 19:20                           ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox