public inbox for [email protected]
 help / color / mirror / Atom feed
From: Pavel Begunkov <[email protected]>
To: Jens Axboe <[email protected]>, Andres Freund <[email protected]>
Cc: [email protected]
Subject: Re: Deduplicate io_*_prep calls?
Date: Mon, 24 Feb 2020 18:44:08 +0300	[thread overview]
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>


[-- Attachment #1.1: Type: text/plain, Size: 4939 bytes --]

On 24/02/2020 18:40, Jens Axboe wrote:
> On 2/24/20 12:12 AM, Andres Freund wrote:
>> Hi,
>>
>> On 2020-02-23 20:52:26 -0700, Jens Axboe wrote:
>>> The fast case is not being deferred, that's by far the common (and hot)
>>> case, which means io_issue() is called with sqe != NULL. My worry is
>>> that by moving it into a prep helper, the compiler isn't smart enough to
>>> not make that basically two switches.
>>
>> I'm not sure that benefit of a single switch isn't offset by the lower
>> code density due to the additional per-opcode branches.  Not inlining
>> the prepare function results in:
>>
>> $ size fs/io_uring.o fs/io_uring.before.o
>>    text	   data	    bss	    dec	    hex	filename
>>   75383	   8237	      8	  83628	  146ac	fs/io_uring.o
>>   76959	   8237	      8	  85204	  14cd4	fs/io_uring.before.o
>>
>> symbol size
>> -io_close_prep 0000000000000066
>> -io_connect_prep 0000000000000051
>> -io_epoll_ctl_prep 0000000000000051
>> -io_issue_sqe 0000000000001101
>> +io_issue_sqe 0000000000000de9
>> -io_openat2_prep 00000000000000ed
>> -io_openat_prep 0000000000000089
>> -io_poll_add_prep 0000000000000056
>> -io_prep_fsync 0000000000000053
>> -io_prep_sfr 000000000000004e
>> -io_read_prep 00000000000000ca
>> -io_recvmsg_prep 0000000000000079
>> -io_req_defer_prep 000000000000058e
>> +io_req_defer_prep 0000000000000160
>> +io_req_prep 0000000000000d26
>> -io_sendmsg_prep 000000000000006b
>> -io_statx_prep 00000000000000ed
>> -io_write_prep 00000000000000cd
>>
>>
>>
>>> Feel free to prove me wrong, I'd love to reduce it ;-)
>>
>> With a bit of handholding the compiler can deduplicate the switches. It
>> can't recognize on its own that req->opcode can't change between the
>> switch for prep and issue. Can be solved by moving the opcode into a
>> temporary variable. Also needs an inline for io_req_prep (not surpring,
>> it's a bit large).
>>
>> That results in a bit bigger code. That's partially because of more
>> inlining:
>>    text	   data	    bss	    dec	    hex	filename
>>   78291	   8237	      8	  86536	  15208	fs/io_uring.o
>>   76959	   8237	      8	  85204	  14cd4	fs/io_uring.before.o
>>
>> symbol size
>> +get_order 0000000000000015
>> -io_close_prep 0000000000000066
>> -io_connect_prep 0000000000000051
>> -io_epoll_ctl_prep 0000000000000051
>> -io_issue_sqe 0000000000001101
>> +io_issue_sqe 00000000000018fa
>> -io_openat2_prep 00000000000000ed
>> -io_openat_prep 0000000000000089
>> -io_poll_add_prep 0000000000000056
>> -io_prep_fsync 0000000000000053
>> -io_prep_sfr 000000000000004e
>> -io_read_prep 00000000000000ca
>> -io_recvmsg_prep 0000000000000079
>> -io_req_defer_prep 000000000000058e
>> +io_req_defer_prep 0000000000000f12
>> -io_sendmsg_prep 000000000000006b
>> -io_statx_prep 00000000000000ed
>> -io_write_prep 00000000000000cd
>>
>>
>> There's still some unnecessary branching on force_nonblocking. The
>> second patch just separates the cases needing force_nonblocking
>> out. Probably not quite the right structure.
>>
>>
>> Oddly enough gcc decides that io_queue_async_work() wouldn't be inlined
>> anymore after that. I'm quite doubtful it's a good candidate anyway?
>> Seems mighty complex, and not likely to win much. That's a noticable
>> win:
>>    text	   data	    bss	    dec	    hex	filename
>>   72857	   8141	      8	  81006	  13c6e	fs/io_uring.o
>>   76959	   8237	      8	  85204	  14cd4	fs/io_uring.before.o
>> --- /tmp/before.txt	2020-02-23 21:00:16.316753022 -0800
>> +++ /tmp/after.txt	2020-02-23 23:10:44.979496728 -0800
>> -io_commit_cqring 00000000000003ef
>> +io_commit_cqring 000000000000012c
>> +io_free_req 000000000000005e
>> -io_free_req 00000000000002ed
>> -io_issue_sqe 0000000000001101
>> +io_issue_sqe 0000000000000e86
>> -io_poll_remove_one 0000000000000308
>> +io_poll_remove_one 0000000000000074
>> -io_poll_wake 0000000000000498
>> +io_poll_wake 000000000000021c
>> +io_queue_async_work 00000000000002a0
>> -io_queue_sqe 00000000000008cc
>> +io_queue_sqe 0000000000000391
> 
> That's OK, it's slow path, I'd prefer it not to be inlined.
> 
>> Not quite sure what the policy is with attaching POC patches? Also send
>> as separate emails?
> 
> Fine like this, though easier if you inline the patches so it's easier
> to comment on them.
> 
> Agree that the first patch looks fine, though I don't quite see why
> you want to pass in opcode as a separate argument as it's always
> req->opcode. Seeing it separate makes me a bit nervous, thinking that
> someone is reading it again from the sqe, or maybe not passing in
> the right opcode for the given request. So that seems fragile and it
> should go away.

I suppose it's to hint a compiler, that opcode haven't been changed inside the
first switch. And any compiler I used breaks analysis there pretty easy.
Optimising C is such a pain...

-- 
Pavel Begunkov


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2020-02-24 15:44 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-24  1:07 Deduplicate io_*_prep calls? Andres Freund
2020-02-24  3:17 ` Jens Axboe
2020-02-24  3:33   ` Andres Freund
2020-02-24  3:52     ` Jens Axboe
2020-02-24  7:12       ` Andres Freund
2020-02-24  9:10         ` Pavel Begunkov
2020-02-24 15:40         ` Jens Axboe
2020-02-24 15:44           ` Pavel Begunkov [this message]
2020-02-24 15:46             ` Jens Axboe
2020-02-24 15:50               ` Pavel Begunkov
2020-02-24 15:53                 ` Jens Axboe
2020-02-24 15:56                   ` Pavel Begunkov
2020-02-24 16:02                     ` Jens Axboe
2020-02-24 16:18                       ` Pavel Begunkov
2020-02-24 17:08                         ` Andres Freund
2020-02-24 17:16                           ` Pavel Begunkov
2020-02-25  9:26                 ` Pavel Begunkov
2020-02-27 21:06                   ` Andres Freund
2020-02-24 16:53           ` Andres Freund
2020-02-24 17:19             ` Jens Axboe
2020-02-24 17:30               ` Jens Axboe
2020-02-24 17:37               ` Pavel Begunkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox