From: Pavel Begunkov <[email protected]>
To: Jens Axboe <[email protected]>, Andres Freund <[email protected]>
Cc: [email protected]
Subject: Re: Deduplicate io_*_prep calls?
Date: Mon, 24 Feb 2020 18:44:08 +0300 [thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
[-- Attachment #1.1: Type: text/plain, Size: 4939 bytes --]
On 24/02/2020 18:40, Jens Axboe wrote:
> On 2/24/20 12:12 AM, Andres Freund wrote:
>> Hi,
>>
>> On 2020-02-23 20:52:26 -0700, Jens Axboe wrote:
>>> The fast case is not being deferred, that's by far the common (and hot)
>>> case, which means io_issue() is called with sqe != NULL. My worry is
>>> that by moving it into a prep helper, the compiler isn't smart enough to
>>> not make that basically two switches.
>>
>> I'm not sure that benefit of a single switch isn't offset by the lower
>> code density due to the additional per-opcode branches. Not inlining
>> the prepare function results in:
>>
>> $ size fs/io_uring.o fs/io_uring.before.o
>> text data bss dec hex filename
>> 75383 8237 8 83628 146ac fs/io_uring.o
>> 76959 8237 8 85204 14cd4 fs/io_uring.before.o
>>
>> symbol size
>> -io_close_prep 0000000000000066
>> -io_connect_prep 0000000000000051
>> -io_epoll_ctl_prep 0000000000000051
>> -io_issue_sqe 0000000000001101
>> +io_issue_sqe 0000000000000de9
>> -io_openat2_prep 00000000000000ed
>> -io_openat_prep 0000000000000089
>> -io_poll_add_prep 0000000000000056
>> -io_prep_fsync 0000000000000053
>> -io_prep_sfr 000000000000004e
>> -io_read_prep 00000000000000ca
>> -io_recvmsg_prep 0000000000000079
>> -io_req_defer_prep 000000000000058e
>> +io_req_defer_prep 0000000000000160
>> +io_req_prep 0000000000000d26
>> -io_sendmsg_prep 000000000000006b
>> -io_statx_prep 00000000000000ed
>> -io_write_prep 00000000000000cd
>>
>>
>>
>>> Feel free to prove me wrong, I'd love to reduce it ;-)
>>
>> With a bit of handholding the compiler can deduplicate the switches. It
>> can't recognize on its own that req->opcode can't change between the
>> switch for prep and issue. Can be solved by moving the opcode into a
>> temporary variable. Also needs an inline for io_req_prep (not surpring,
>> it's a bit large).
>>
>> That results in a bit bigger code. That's partially because of more
>> inlining:
>> text data bss dec hex filename
>> 78291 8237 8 86536 15208 fs/io_uring.o
>> 76959 8237 8 85204 14cd4 fs/io_uring.before.o
>>
>> symbol size
>> +get_order 0000000000000015
>> -io_close_prep 0000000000000066
>> -io_connect_prep 0000000000000051
>> -io_epoll_ctl_prep 0000000000000051
>> -io_issue_sqe 0000000000001101
>> +io_issue_sqe 00000000000018fa
>> -io_openat2_prep 00000000000000ed
>> -io_openat_prep 0000000000000089
>> -io_poll_add_prep 0000000000000056
>> -io_prep_fsync 0000000000000053
>> -io_prep_sfr 000000000000004e
>> -io_read_prep 00000000000000ca
>> -io_recvmsg_prep 0000000000000079
>> -io_req_defer_prep 000000000000058e
>> +io_req_defer_prep 0000000000000f12
>> -io_sendmsg_prep 000000000000006b
>> -io_statx_prep 00000000000000ed
>> -io_write_prep 00000000000000cd
>>
>>
>> There's still some unnecessary branching on force_nonblocking. The
>> second patch just separates the cases needing force_nonblocking
>> out. Probably not quite the right structure.
>>
>>
>> Oddly enough gcc decides that io_queue_async_work() wouldn't be inlined
>> anymore after that. I'm quite doubtful it's a good candidate anyway?
>> Seems mighty complex, and not likely to win much. That's a noticable
>> win:
>> text data bss dec hex filename
>> 72857 8141 8 81006 13c6e fs/io_uring.o
>> 76959 8237 8 85204 14cd4 fs/io_uring.before.o
>> --- /tmp/before.txt 2020-02-23 21:00:16.316753022 -0800
>> +++ /tmp/after.txt 2020-02-23 23:10:44.979496728 -0800
>> -io_commit_cqring 00000000000003ef
>> +io_commit_cqring 000000000000012c
>> +io_free_req 000000000000005e
>> -io_free_req 00000000000002ed
>> -io_issue_sqe 0000000000001101
>> +io_issue_sqe 0000000000000e86
>> -io_poll_remove_one 0000000000000308
>> +io_poll_remove_one 0000000000000074
>> -io_poll_wake 0000000000000498
>> +io_poll_wake 000000000000021c
>> +io_queue_async_work 00000000000002a0
>> -io_queue_sqe 00000000000008cc
>> +io_queue_sqe 0000000000000391
>
> That's OK, it's slow path, I'd prefer it not to be inlined.
>
>> Not quite sure what the policy is with attaching POC patches? Also send
>> as separate emails?
>
> Fine like this, though easier if you inline the patches so it's easier
> to comment on them.
>
> Agree that the first patch looks fine, though I don't quite see why
> you want to pass in opcode as a separate argument as it's always
> req->opcode. Seeing it separate makes me a bit nervous, thinking that
> someone is reading it again from the sqe, or maybe not passing in
> the right opcode for the given request. So that seems fragile and it
> should go away.
I suppose it's to hint a compiler, that opcode haven't been changed inside the
first switch. And any compiler I used breaks analysis there pretty easy.
Optimising C is such a pain...
--
Pavel Begunkov
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
next prev parent reply other threads:[~2020-02-24 15:44 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-02-24 1:07 Deduplicate io_*_prep calls? Andres Freund
2020-02-24 3:17 ` Jens Axboe
2020-02-24 3:33 ` Andres Freund
2020-02-24 3:52 ` Jens Axboe
2020-02-24 7:12 ` Andres Freund
2020-02-24 9:10 ` Pavel Begunkov
2020-02-24 15:40 ` Jens Axboe
2020-02-24 15:44 ` Pavel Begunkov [this message]
2020-02-24 15:46 ` Jens Axboe
2020-02-24 15:50 ` Pavel Begunkov
2020-02-24 15:53 ` Jens Axboe
2020-02-24 15:56 ` Pavel Begunkov
2020-02-24 16:02 ` Jens Axboe
2020-02-24 16:18 ` Pavel Begunkov
2020-02-24 17:08 ` Andres Freund
2020-02-24 17:16 ` Pavel Begunkov
2020-02-25 9:26 ` Pavel Begunkov
2020-02-27 21:06 ` Andres Freund
2020-02-24 16:53 ` Andres Freund
2020-02-24 17:19 ` Jens Axboe
2020-02-24 17:30 ` Jens Axboe
2020-02-24 17:37 ` Pavel Begunkov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
[email protected] \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox