From: Jens Axboe <[email protected]>
To: Linus Torvalds <[email protected]>
Cc: Olivier Langlois <[email protected]>,
Jakub Kicinski <[email protected]>,
io-uring <[email protected]>
Subject: Re: [GIT PULL] io_uring updates for 5.18-rc1
Date: Wed, 1 Jun 2022 13:10:32 -0600 [thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <CAHk-=wg9jtV5JWxBudYgoL0GkiYPefuRu47d=L+7701kLWoQaA@mail.gmail.com>
On 6/1/22 12:52 PM, Linus Torvalds wrote:
> On Wed, Jun 1, 2022 at 11:34 AM Jens Axboe <[email protected]> wrote:
>>
>> But as a first step, let's just mark it deprecated with a pr_warn() for
>> 5.20 and then plan to kill it off whenever a suitable amount of relases
>> have passed since that addition.
>
> I'd love to, but it's not actually realistic as things stand now.
> epoll() is used in a *lot* of random libraries. A "pr_warn()" would
> just be senseless noise, I bet.
I mean only for the IORING_OP_EPOLL_CTL opcode, which is the only epoll
connection we have in there. It'd be jumping the gun to do it for the
epoll_ctl syscall for sure... And I really have no personal skin in that
game, other than having a better alternative. But that's obviously a
long pole type of deprecation.
> No, there's a reason that EPOLL is still there, still 'default y',
> even though I dislike it and think it was a mistake, and we've had
> several nasty bugs related to it over the years.
>
> It really can be a very useful system call, it's just that it really
> doesn't work the way the actual ->poll() interface was designed, and
> it kind of hijacks it in ways that mostly work, but the have subtle
> lifetime issues that you don't see with a regular select/poll because
> those will always tear down the wait queues.
>
> Realistically, the proper fix to epoll is likely to make it explicit,
> and make files and drivers that want to support it have to actually
> opt in. Because a lot of the problems have been due to epoll() looking
> *exactly* like a regular poll/select to a driver or a filesystem, but
> having those very subtle extended requirements.
>
> (And no, the extended requirements aren't generally onerous, and
> regular ->poll() works fine for 99% of all cases. It's just that
> occasionally, special users are then fooled about special contexts).
It's not an uncommon approach to make the initial adoption /
implementation more palatable, though commonly then also ends up being a
mistake. I've certainly been guilty of that myself too...
> In other words, it's a bit like our bad old days when "splice()" ended
> up falling back to regular ->read()/->write() implementations with
> set_fs(KERNEL_DS). Yes, that worked fine for 99% of all cases, and we
> did it for years, but it also caused several really nasty issues for
> when the read/write actor did something slightly unusual.
Unfortunately that particular change I just had to deal with, and
noticed that we're up to more than two handfuls of fixes for that and I
bet we're not done. Not saying it wasn't the right choice in terms of
sanity, but it has been more painful than I thought it would be.
> So I may dislike epoll quite intensely, but I don't think we can
> *really* get rid of it. But we might be able to make it a bit more
> controlled.
>
> But so far every time it has caused issues, we've worked around it by
> fixing it up in the particular driver or whatever that ended up being
> triggered by epoll semantics.
The io_uring side of the epoll management I'm very sure can go in a few
releases, and a pr_warn_once() for 5.20 is the right choice. epoll
itself, probably not even down the line, though I am hoping we can
continue to move people off of it. Maybe in another 20 years :-)
--
Jens Axboe
next prev parent reply other threads:[~2022-06-01 20:54 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-03-18 21:59 [GIT PULL] io_uring updates for 5.18-rc1 Jens Axboe
2022-03-22 0:25 ` pr-tracker-bot
[not found] ` <20220326122838.19d7193f@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
[not found] ` <[email protected]>
[not found] ` <20220326130615.2d3c6c85@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
[not found] ` <[email protected]>
[not found] ` <[email protected]>
[not found] ` <[email protected]>
2022-06-01 6:59 ` Olivier Langlois
2022-06-01 16:24 ` Jakub Kicinski
2022-06-01 18:09 ` Linus Torvalds
2022-06-01 18:21 ` Jens Axboe
2022-06-01 18:28 ` Linus Torvalds
2022-06-01 18:34 ` Jens Axboe
2022-06-01 18:52 ` Linus Torvalds
2022-06-01 19:10 ` Jens Axboe [this message]
2022-06-01 19:20 ` Linus Torvalds
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox