public inbox for [email protected]
 help / color / mirror / Atom feed
From: Jens Axboe <[email protected]>
To: Linus Torvalds <[email protected]>
Cc: Olivier Langlois <[email protected]>,
	Jakub Kicinski <[email protected]>,
	io-uring <[email protected]>
Subject: Re: [GIT PULL] io_uring updates for 5.18-rc1
Date: Wed, 1 Jun 2022 13:10:32 -0600	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <CAHk-=wg9jtV5JWxBudYgoL0GkiYPefuRu47d=L+7701kLWoQaA@mail.gmail.com>

On 6/1/22 12:52 PM, Linus Torvalds wrote:
> On Wed, Jun 1, 2022 at 11:34 AM Jens Axboe <[email protected]> wrote:
>>
>> But as a first step, let's just mark it deprecated with a pr_warn() for
>> 5.20 and then plan to kill it off whenever a suitable amount of relases
>> have passed since that addition.
> 
> I'd love to, but it's not actually realistic as things stand now.
> epoll() is used in a *lot* of random libraries. A "pr_warn()" would
> just be senseless noise, I bet.

I mean only for the IORING_OP_EPOLL_CTL opcode, which is the only epoll
connection we have in there. It'd be jumping the gun to do it for the
epoll_ctl syscall for sure... And I really have no personal skin in that
game, other than having a better alternative. But that's obviously a
long pole type of deprecation.

> No, there's a reason that EPOLL is still there, still 'default y',
> even though I dislike it and think it was a mistake, and we've had
> several nasty bugs related to it over the years.
> 
> It really can be a very useful system call, it's just that it really
> doesn't work the way the actual ->poll() interface was designed, and
> it kind of hijacks it in ways that mostly work, but the have subtle
> lifetime issues that you don't see with a regular select/poll because
> those will always tear down the wait queues.
> 
> Realistically, the proper fix to epoll is likely to make it explicit,
> and make files and drivers that want to support it have to actually
> opt in. Because a lot of the problems have been due to epoll() looking
> *exactly* like a regular poll/select to a driver or a filesystem, but
> having those very subtle extended requirements.
> 
> (And no, the extended requirements aren't generally onerous, and
> regular ->poll() works fine for 99% of all cases. It's just that
> occasionally, special users are then fooled about special contexts).

It's not an uncommon approach to make the initial adoption /
implementation more palatable, though commonly then also ends up being a
mistake. I've certainly been guilty of that myself too...

> In other words, it's a bit like our bad old days when "splice()" ended
> up falling back to regular ->read()/->write() implementations with
> set_fs(KERNEL_DS). Yes, that worked fine for 99% of all cases, and we
> did it for years, but it also caused several really nasty issues for
> when the read/write actor did something slightly unusual.

Unfortunately that particular change I just had to deal with, and
noticed that we're up to more than two handfuls of fixes for that and I
bet we're not done. Not saying it wasn't the right choice in terms of
sanity, but it has been more painful than I thought it would be.

> So I may dislike epoll quite intensely, but I don't think we can
> *really* get rid of it. But we might be able to make it a bit more
> controlled.
> 
> But so far every time it has caused issues, we've worked around it by
> fixing it up in the particular driver or whatever that ended up being
> triggered by epoll semantics.

The io_uring side of the epoll management I'm very sure can go in a few
releases, and a pr_warn_once() for 5.20 is the right choice. epoll
itself, probably not even down the line, though I am hoping we can
continue to move people off of it. Maybe in another 20 years :-)

-- 
Jens Axboe


  reply	other threads:[~2022-06-01 20:54 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-18 21:59 [GIT PULL] io_uring updates for 5.18-rc1 Jens Axboe
2022-03-22  0:25 ` pr-tracker-bot
2022-03-26 19:28 ` Jakub Kicinski
2022-03-26 19:47   ` Jens Axboe
2022-03-26 20:06     ` Jakub Kicinski
2022-03-26 20:57       ` Jens Axboe
2022-03-26 21:06         ` Jens Axboe
2022-03-26 21:30           ` Jakub Kicinski
2022-03-30 23:30             ` Jakub Kicinski
2022-03-31  0:44               ` Jens Axboe
2022-06-01  6:59             ` Olivier Langlois
2022-06-01 16:24               ` Jakub Kicinski
2022-06-01 18:09               ` Linus Torvalds
2022-06-01 18:21                 ` Jens Axboe
2022-06-01 18:28                   ` Linus Torvalds
2022-06-01 18:34                     ` Jens Axboe
2022-06-01 18:52                       ` Linus Torvalds
2022-06-01 19:10                         ` Jens Axboe [this message]
2022-06-01 19:20                           ` Linus Torvalds
2022-08-16 15:53                           ` Deprecation of IORING_OP_EPOLL_CTL (Re: [GIT PULL] io_uring updates for 5.18-rc1) Stefan Metzmacher
2022-06-01  8:01             ` [GIT PULL] io_uring updates for 5.18-rc1 Olivier Langlois
2022-06-01  6:58       ` Olivier Langlois
2022-06-01  6:58   ` Olivier Langlois
2022-06-01 17:04     ` Jakub Kicinski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox