* [PATCH] block: reexpand iov_iter after read/write
@ 2021-04-01 7:18 yangerkun
2021-04-06 1:28 ` yangerkun
2021-04-09 14:49 ` Pavel Begunkov
0 siblings, 2 replies; 18+ messages in thread
From: yangerkun @ 2021-04-01 7:18 UTC (permalink / raw)
To: viro, axboe, asml.silence; +Cc: linux-fsdevel, linux-block, io-uring, yangerkun
We get a bug:
BUG: KASAN: slab-out-of-bounds in iov_iter_revert+0x11c/0x404
lib/iov_iter.c:1139
Read of size 8 at addr ffff0000d3fb11f8 by task
CPU: 0 PID: 12582 Comm: syz-executor.2 Not tainted
5.10.0-00843-g352c8610ccd2 #2
Hardware name: linux,dummy-virt (DT)
Call trace:
dump_backtrace+0x0/0x2d0 arch/arm64/kernel/stacktrace.c:132
show_stack+0x28/0x34 arch/arm64/kernel/stacktrace.c:196
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x110/0x164 lib/dump_stack.c:118
print_address_description+0x78/0x5c8 mm/kasan/report.c:385
__kasan_report mm/kasan/report.c:545 [inline]
kasan_report+0x148/0x1e4 mm/kasan/report.c:562
check_memory_region_inline mm/kasan/generic.c:183 [inline]
__asan_load8+0xb4/0xbc mm/kasan/generic.c:252
iov_iter_revert+0x11c/0x404 lib/iov_iter.c:1139
io_read fs/io_uring.c:3421 [inline]
io_issue_sqe+0x2344/0x2d64 fs/io_uring.c:5943
__io_queue_sqe+0x19c/0x520 fs/io_uring.c:6260
io_queue_sqe+0x2a4/0x590 fs/io_uring.c:6326
io_submit_sqe fs/io_uring.c:6395 [inline]
io_submit_sqes+0x4c0/0xa04 fs/io_uring.c:6624
__do_sys_io_uring_enter fs/io_uring.c:9013 [inline]
__se_sys_io_uring_enter fs/io_uring.c:8960 [inline]
__arm64_sys_io_uring_enter+0x190/0x708 fs/io_uring.c:8960
__invoke_syscall arch/arm64/kernel/syscall.c:36 [inline]
invoke_syscall arch/arm64/kernel/syscall.c:48 [inline]
el0_svc_common arch/arm64/kernel/syscall.c:158 [inline]
do_el0_svc+0x120/0x290 arch/arm64/kernel/syscall.c:227
el0_svc+0x1c/0x28 arch/arm64/kernel/entry-common.c:367
el0_sync_handler+0x98/0x170 arch/arm64/kernel/entry-common.c:383
el0_sync+0x140/0x180 arch/arm64/kernel/entry.S:670
Allocated by task 12570:
stack_trace_save+0x80/0xb8 kernel/stacktrace.c:121
kasan_save_stack mm/kasan/common.c:48 [inline]
kasan_set_track mm/kasan/common.c:56 [inline]
__kasan_kmalloc+0xdc/0x120 mm/kasan/common.c:461
kasan_kmalloc+0xc/0x14 mm/kasan/common.c:475
__kmalloc+0x23c/0x334 mm/slub.c:3970
kmalloc include/linux/slab.h:557 [inline]
__io_alloc_async_data+0x68/0x9c fs/io_uring.c:3210
io_setup_async_rw fs/io_uring.c:3229 [inline]
io_read fs/io_uring.c:3436 [inline]
io_issue_sqe+0x2954/0x2d64 fs/io_uring.c:5943
__io_queue_sqe+0x19c/0x520 fs/io_uring.c:6260
io_queue_sqe+0x2a4/0x590 fs/io_uring.c:6326
io_submit_sqe fs/io_uring.c:6395 [inline]
io_submit_sqes+0x4c0/0xa04 fs/io_uring.c:6624
__do_sys_io_uring_enter fs/io_uring.c:9013 [inline]
__se_sys_io_uring_enter fs/io_uring.c:8960 [inline]
__arm64_sys_io_uring_enter+0x190/0x708 fs/io_uring.c:8960
__invoke_syscall arch/arm64/kernel/syscall.c:36 [inline]
invoke_syscall arch/arm64/kernel/syscall.c:48 [inline]
el0_svc_common arch/arm64/kernel/syscall.c:158 [inline]
do_el0_svc+0x120/0x290 arch/arm64/kernel/syscall.c:227
el0_svc+0x1c/0x28 arch/arm64/kernel/entry-common.c:367
el0_sync_handler+0x98/0x170 arch/arm64/kernel/entry-common.c:383
el0_sync+0x140/0x180 arch/arm64/kernel/entry.S:670
Freed by task 12570:
stack_trace_save+0x80/0xb8 kernel/stacktrace.c:121
kasan_save_stack mm/kasan/common.c:48 [inline]
kasan_set_track+0x38/0x6c mm/kasan/common.c:56
kasan_set_free_info+0x20/0x40 mm/kasan/generic.c:355
__kasan_slab_free+0x124/0x150 mm/kasan/common.c:422
kasan_slab_free+0x10/0x1c mm/kasan/common.c:431
slab_free_hook mm/slub.c:1544 [inline]
slab_free_freelist_hook mm/slub.c:1577 [inline]
slab_free mm/slub.c:3142 [inline]
kfree+0x104/0x38c mm/slub.c:4124
io_dismantle_req fs/io_uring.c:1855 [inline]
__io_free_req+0x70/0x254 fs/io_uring.c:1867
io_put_req_find_next fs/io_uring.c:2173 [inline]
__io_queue_sqe+0x1fc/0x520 fs/io_uring.c:6279
__io_req_task_submit+0x154/0x21c fs/io_uring.c:2051
io_req_task_submit+0x2c/0x44 fs/io_uring.c:2063
task_work_run+0xdc/0x128 kernel/task_work.c:151
get_signal+0x6f8/0x980 kernel/signal.c:2562
do_signal+0x108/0x3a4 arch/arm64/kernel/signal.c:658
do_notify_resume+0xbc/0x25c arch/arm64/kernel/signal.c:722
work_pending+0xc/0x180
blkdev_read_iter can truncate the iov_iter's count, since count + pos
may exceed the size of the blkdev. This misleads io_read into thinking
the iovec has already been consumed, and once io_read then does the
iov_iter_revert, we trigger the slab-out-of-bounds. Fix it by
reexpanding the count by the amount that was truncated.
blkdev_write_iter can trigger the problem too.
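A toy userspace model of the broken accounting, with invented sizes
(the comments name the kernel code that performs each step; it only
mirrors the arithmetic, not a real iov_iter):
#include <stdio.h>
#include <stddef.h>

int main(void)
{
        size_t io_size  = 8192; /* io_read() snapshots iov_iter_count() up front */
        size_t count    = 8192; /* iter->count as submitted */
        size_t dev_left = 4096; /* bytes left on the blkdev past iocb->ki_pos */

        /* blkdev_read_iter(): iov_iter_truncate() shrinks count to what fits */
        if (count > dev_left)
                count = dev_left;

        /* the read returns -EAGAIN, so nothing was actually consumed */

        /* io_read() then reverts io_size - iov_iter_count(), i.e. 4096 bytes
         * it believes were consumed, and iov_iter_revert() walks backwards
         * past the start of the iovec array */
        size_t revert = io_size - count;

        printf("reverting %zu bytes although 0 were consumed\n", revert);
        return 0;
}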
Signed-off-by: yangerkun <[email protected]>
---
fs/block_dev.c | 20 +++++++++++++++++---
1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 92ed7d5df677..788e1014576f 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1680,6 +1680,7 @@ ssize_t blkdev_write_iter(struct kiocb *iocb, struct iov_iter *from)
struct inode *bd_inode = bdev_file_inode(file);
loff_t size = i_size_read(bd_inode);
struct blk_plug plug;
+ size_t shorted = 0;
ssize_t ret;
if (bdev_read_only(I_BDEV(bd_inode)))
@@ -1697,12 +1698,17 @@ ssize_t blkdev_write_iter(struct kiocb *iocb, struct iov_iter *from)
if ((iocb->ki_flags & (IOCB_NOWAIT | IOCB_DIRECT)) == IOCB_NOWAIT)
return -EOPNOTSUPP;
- iov_iter_truncate(from, size - iocb->ki_pos);
+ size -= iocb->ki_pos;
+ if (iov_iter_count(from) > size) {
+ shorted = iov_iter_count(from) - size;
+ iov_iter_truncate(from, size);
+ }
blk_start_plug(&plug);
ret = __generic_file_write_iter(iocb, from);
if (ret > 0)
ret = generic_write_sync(iocb, ret);
+ iov_iter_reexpand(from, iov_iter_count(from) + shorted);
blk_finish_plug(&plug);
return ret;
}
@@ -1714,13 +1720,21 @@ ssize_t blkdev_read_iter(struct kiocb *iocb, struct iov_iter *to)
struct inode *bd_inode = bdev_file_inode(file);
loff_t size = i_size_read(bd_inode);
loff_t pos = iocb->ki_pos;
+ size_t shorted = 0;
+ ssize_t ret;
if (pos >= size)
return 0;
size -= pos;
- iov_iter_truncate(to, size);
- return generic_file_read_iter(iocb, to);
+ if (iov_iter_count(to) > size) {
+ shorted = iov_iter_count(to) - size;
+ iov_iter_truncate(to, size);
+ }
+
+ ret = generic_file_read_iter(iocb, to);
+ iov_iter_reexpand(to, iov_iter_count(to) + shorted);
+ return ret;
}
EXPORT_SYMBOL_GPL(blkdev_read_iter);
--
2.25.4
* Re: [PATCH] block: reexpand iov_iter after read/write
2021-04-01 7:18 [PATCH] block: reexpand iov_iter after read/write yangerkun
@ 2021-04-06 1:28 ` yangerkun
2021-04-06 11:04 ` Pavel Begunkov
2021-04-09 14:49 ` Pavel Begunkov
1 sibling, 1 reply; 18+ messages in thread
From: yangerkun @ 2021-04-06 1:28 UTC (permalink / raw)
To: viro, axboe, asml.silence; +Cc: linux-fsdevel, linux-block, io-uring
Ping...
On 2021/4/1 15:18, yangerkun wrote:
> We get a bug:
>
> BUG: KASAN: slab-out-of-bounds in iov_iter_revert+0x11c/0x404
> lib/iov_iter.c:1139
> Read of size 8 at addr ffff0000d3fb11f8 by task
>
> CPU: 0 PID: 12582 Comm: syz-executor.2 Not tainted
> 5.10.0-00843-g352c8610ccd2 #2
> Hardware name: linux,dummy-virt (DT)
> Call trace:
> dump_backtrace+0x0/0x2d0 arch/arm64/kernel/stacktrace.c:132
> show_stack+0x28/0x34 arch/arm64/kernel/stacktrace.c:196
> __dump_stack lib/dump_stack.c:77 [inline]
> dump_stack+0x110/0x164 lib/dump_stack.c:118
> print_address_description+0x78/0x5c8 mm/kasan/report.c:385
> __kasan_report mm/kasan/report.c:545 [inline]
> kasan_report+0x148/0x1e4 mm/kasan/report.c:562
> check_memory_region_inline mm/kasan/generic.c:183 [inline]
> __asan_load8+0xb4/0xbc mm/kasan/generic.c:252
> iov_iter_revert+0x11c/0x404 lib/iov_iter.c:1139
> io_read fs/io_uring.c:3421 [inline]
> io_issue_sqe+0x2344/0x2d64 fs/io_uring.c:5943
> __io_queue_sqe+0x19c/0x520 fs/io_uring.c:6260
> io_queue_sqe+0x2a4/0x590 fs/io_uring.c:6326
> io_submit_sqe fs/io_uring.c:6395 [inline]
> io_submit_sqes+0x4c0/0xa04 fs/io_uring.c:6624
> __do_sys_io_uring_enter fs/io_uring.c:9013 [inline]
> __se_sys_io_uring_enter fs/io_uring.c:8960 [inline]
> __arm64_sys_io_uring_enter+0x190/0x708 fs/io_uring.c:8960
> __invoke_syscall arch/arm64/kernel/syscall.c:36 [inline]
> invoke_syscall arch/arm64/kernel/syscall.c:48 [inline]
> el0_svc_common arch/arm64/kernel/syscall.c:158 [inline]
> do_el0_svc+0x120/0x290 arch/arm64/kernel/syscall.c:227
> el0_svc+0x1c/0x28 arch/arm64/kernel/entry-common.c:367
> el0_sync_handler+0x98/0x170 arch/arm64/kernel/entry-common.c:383
> el0_sync+0x140/0x180 arch/arm64/kernel/entry.S:670
>
> Allocated by task 12570:
> stack_trace_save+0x80/0xb8 kernel/stacktrace.c:121
> kasan_save_stack mm/kasan/common.c:48 [inline]
> kasan_set_track mm/kasan/common.c:56 [inline]
> __kasan_kmalloc+0xdc/0x120 mm/kasan/common.c:461
> kasan_kmalloc+0xc/0x14 mm/kasan/common.c:475
> __kmalloc+0x23c/0x334 mm/slub.c:3970
> kmalloc include/linux/slab.h:557 [inline]
> __io_alloc_async_data+0x68/0x9c fs/io_uring.c:3210
> io_setup_async_rw fs/io_uring.c:3229 [inline]
> io_read fs/io_uring.c:3436 [inline]
> io_issue_sqe+0x2954/0x2d64 fs/io_uring.c:5943
> __io_queue_sqe+0x19c/0x520 fs/io_uring.c:6260
> io_queue_sqe+0x2a4/0x590 fs/io_uring.c:6326
> io_submit_sqe fs/io_uring.c:6395 [inline]
> io_submit_sqes+0x4c0/0xa04 fs/io_uring.c:6624
> __do_sys_io_uring_enter fs/io_uring.c:9013 [inline]
> __se_sys_io_uring_enter fs/io_uring.c:8960 [inline]
> __arm64_sys_io_uring_enter+0x190/0x708 fs/io_uring.c:8960
> __invoke_syscall arch/arm64/kernel/syscall.c:36 [inline]
> invoke_syscall arch/arm64/kernel/syscall.c:48 [inline]
> el0_svc_common arch/arm64/kernel/syscall.c:158 [inline]
> do_el0_svc+0x120/0x290 arch/arm64/kernel/syscall.c:227
> el0_svc+0x1c/0x28 arch/arm64/kernel/entry-common.c:367
> el0_sync_handler+0x98/0x170 arch/arm64/kernel/entry-common.c:383
> el0_sync+0x140/0x180 arch/arm64/kernel/entry.S:670
>
> Freed by task 12570:
> stack_trace_save+0x80/0xb8 kernel/stacktrace.c:121
> kasan_save_stack mm/kasan/common.c:48 [inline]
> kasan_set_track+0x38/0x6c mm/kasan/common.c:56
> kasan_set_free_info+0x20/0x40 mm/kasan/generic.c:355
> __kasan_slab_free+0x124/0x150 mm/kasan/common.c:422
> kasan_slab_free+0x10/0x1c mm/kasan/common.c:431
> slab_free_hook mm/slub.c:1544 [inline]
> slab_free_freelist_hook mm/slub.c:1577 [inline]
> slab_free mm/slub.c:3142 [inline]
> kfree+0x104/0x38c mm/slub.c:4124
> io_dismantle_req fs/io_uring.c:1855 [inline]
> __io_free_req+0x70/0x254 fs/io_uring.c:1867
> io_put_req_find_next fs/io_uring.c:2173 [inline]
> __io_queue_sqe+0x1fc/0x520 fs/io_uring.c:6279
> __io_req_task_submit+0x154/0x21c fs/io_uring.c:2051
> io_req_task_submit+0x2c/0x44 fs/io_uring.c:2063
> task_work_run+0xdc/0x128 kernel/task_work.c:151
> get_signal+0x6f8/0x980 kernel/signal.c:2562
> do_signal+0x108/0x3a4 arch/arm64/kernel/signal.c:658
> do_notify_resume+0xbc/0x25c arch/arm64/kernel/signal.c:722
> work_pending+0xc/0x180
>
> blkdev_read_iter can truncate the iov_iter's count, since count + pos
> may exceed the size of the blkdev. This misleads io_read into thinking
> the iovec has already been consumed, and once io_read then does the
> iov_iter_revert, we trigger the slab-out-of-bounds. Fix it by
> reexpanding the count by the amount that was truncated.
>
> blkdev_write_iter can trigger the problem too.
>
> Signed-off-by: yangerkun <[email protected]>
> ---
> fs/block_dev.c | 20 +++++++++++++++++---
> 1 file changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/fs/block_dev.c b/fs/block_dev.c
> index 92ed7d5df677..788e1014576f 100644
> --- a/fs/block_dev.c
> +++ b/fs/block_dev.c
> @@ -1680,6 +1680,7 @@ ssize_t blkdev_write_iter(struct kiocb *iocb, struct iov_iter *from)
> struct inode *bd_inode = bdev_file_inode(file);
> loff_t size = i_size_read(bd_inode);
> struct blk_plug plug;
> + size_t shorted = 0;
> ssize_t ret;
>
> if (bdev_read_only(I_BDEV(bd_inode)))
> @@ -1697,12 +1698,17 @@ ssize_t blkdev_write_iter(struct kiocb *iocb, struct iov_iter *from)
> if ((iocb->ki_flags & (IOCB_NOWAIT | IOCB_DIRECT)) == IOCB_NOWAIT)
> return -EOPNOTSUPP;
>
> - iov_iter_truncate(from, size - iocb->ki_pos);
> + size -= iocb->ki_pos;
> + if (iov_iter_count(from) > size) {
> + shorted = iov_iter_count(from) - size;
> + iov_iter_truncate(from, size);
> + }
>
> blk_start_plug(&plug);
> ret = __generic_file_write_iter(iocb, from);
> if (ret > 0)
> ret = generic_write_sync(iocb, ret);
> + iov_iter_reexpand(from, iov_iter_count(from) + shorted);
> blk_finish_plug(&plug);
> return ret;
> }
> @@ -1714,13 +1720,21 @@ ssize_t blkdev_read_iter(struct kiocb *iocb, struct iov_iter *to)
> struct inode *bd_inode = bdev_file_inode(file);
> loff_t size = i_size_read(bd_inode);
> loff_t pos = iocb->ki_pos;
> + size_t shorted = 0;
> + ssize_t ret;
>
> if (pos >= size)
> return 0;
>
> size -= pos;
> - iov_iter_truncate(to, size);
> - return generic_file_read_iter(iocb, to);
> + if (iov_iter_count(to) > size) {
> + shorted = iov_iter_count(to) - size;
> + iov_iter_truncate(to, size);
> + }
> +
> + ret = generic_file_read_iter(iocb, to);
> + iov_iter_reexpand(to, iov_iter_count(to) + shorted);
> + return ret;
> }
> EXPORT_SYMBOL_GPL(blkdev_read_iter);
>
>
* Re: [PATCH] block: reexpand iov_iter after read/write
2021-04-06 1:28 ` yangerkun
@ 2021-04-06 11:04 ` Pavel Begunkov
2021-04-07 14:16 ` yangerkun
0 siblings, 1 reply; 18+ messages in thread
From: Pavel Begunkov @ 2021-04-06 11:04 UTC (permalink / raw)
To: yangerkun, viro, axboe; +Cc: linux-fsdevel, linux-block, io-uring
On 06/04/2021 02:28, yangerkun wrote:
> Ping...
It wasn't forgotten, but it wouldn't have worked, for other
reasons. With these two already queued, that's a
different story.
https://git.kernel.dk/cgit/linux-block/commit/?h=io_uring-5.12&id=07204f21577a1d882f0259590c3553fe6a476381
https://git.kernel.dk/cgit/linux-block/commit/?h=io_uring-5.12&id=230d50d448acb6639991440913299e50cacf1daf
Can you re-confirm that the bug is still there (it should be)
and that your patch fixes it?
>
> On 2021/4/1 15:18, yangerkun wrote:
>> We get a bug:
>>
>> BUG: KASAN: slab-out-of-bounds in iov_iter_revert+0x11c/0x404
>> lib/iov_iter.c:1139
>> Read of size 8 at addr ffff0000d3fb11f8 by task
>>
>> CPU: 0 PID: 12582 Comm: syz-executor.2 Not tainted
>> 5.10.0-00843-g352c8610ccd2 #2
>> Hardware name: linux,dummy-virt (DT)
>> Call trace:
>> dump_backtrace+0x0/0x2d0 arch/arm64/kernel/stacktrace.c:132
>> show_stack+0x28/0x34 arch/arm64/kernel/stacktrace.c:196
>> __dump_stack lib/dump_stack.c:77 [inline]
>> dump_stack+0x110/0x164 lib/dump_stack.c:118
>> print_address_description+0x78/0x5c8 mm/kasan/report.c:385
>> __kasan_report mm/kasan/report.c:545 [inline]
>> kasan_report+0x148/0x1e4 mm/kasan/report.c:562
>> check_memory_region_inline mm/kasan/generic.c:183 [inline]
>> __asan_load8+0xb4/0xbc mm/kasan/generic.c:252
>> iov_iter_revert+0x11c/0x404 lib/iov_iter.c:1139
>> io_read fs/io_uring.c:3421 [inline]
>> io_issue_sqe+0x2344/0x2d64 fs/io_uring.c:5943
>> __io_queue_sqe+0x19c/0x520 fs/io_uring.c:6260
>> io_queue_sqe+0x2a4/0x590 fs/io_uring.c:6326
>> io_submit_sqe fs/io_uring.c:6395 [inline]
>> io_submit_sqes+0x4c0/0xa04 fs/io_uring.c:6624
>> __do_sys_io_uring_enter fs/io_uring.c:9013 [inline]
>> __se_sys_io_uring_enter fs/io_uring.c:8960 [inline]
>> __arm64_sys_io_uring_enter+0x190/0x708 fs/io_uring.c:8960
>> __invoke_syscall arch/arm64/kernel/syscall.c:36 [inline]
>> invoke_syscall arch/arm64/kernel/syscall.c:48 [inline]
>> el0_svc_common arch/arm64/kernel/syscall.c:158 [inline]
>> do_el0_svc+0x120/0x290 arch/arm64/kernel/syscall.c:227
>> el0_svc+0x1c/0x28 arch/arm64/kernel/entry-common.c:367
>> el0_sync_handler+0x98/0x170 arch/arm64/kernel/entry-common.c:383
>> el0_sync+0x140/0x180 arch/arm64/kernel/entry.S:670
>>
>> Allocated by task 12570:
>> stack_trace_save+0x80/0xb8 kernel/stacktrace.c:121
>> kasan_save_stack mm/kasan/common.c:48 [inline]
>> kasan_set_track mm/kasan/common.c:56 [inline]
>> __kasan_kmalloc+0xdc/0x120 mm/kasan/common.c:461
>> kasan_kmalloc+0xc/0x14 mm/kasan/common.c:475
>> __kmalloc+0x23c/0x334 mm/slub.c:3970
>> kmalloc include/linux/slab.h:557 [inline]
>> __io_alloc_async_data+0x68/0x9c fs/io_uring.c:3210
>> io_setup_async_rw fs/io_uring.c:3229 [inline]
>> io_read fs/io_uring.c:3436 [inline]
>> io_issue_sqe+0x2954/0x2d64 fs/io_uring.c:5943
>> __io_queue_sqe+0x19c/0x520 fs/io_uring.c:6260
>> io_queue_sqe+0x2a4/0x590 fs/io_uring.c:6326
>> io_submit_sqe fs/io_uring.c:6395 [inline]
>> io_submit_sqes+0x4c0/0xa04 fs/io_uring.c:6624
>> __do_sys_io_uring_enter fs/io_uring.c:9013 [inline]
>> __se_sys_io_uring_enter fs/io_uring.c:8960 [inline]
>> __arm64_sys_io_uring_enter+0x190/0x708 fs/io_uring.c:8960
>> __invoke_syscall arch/arm64/kernel/syscall.c:36 [inline]
>> invoke_syscall arch/arm64/kernel/syscall.c:48 [inline]
>> el0_svc_common arch/arm64/kernel/syscall.c:158 [inline]
>> do_el0_svc+0x120/0x290 arch/arm64/kernel/syscall.c:227
>> el0_svc+0x1c/0x28 arch/arm64/kernel/entry-common.c:367
>> el0_sync_handler+0x98/0x170 arch/arm64/kernel/entry-common.c:383
>> el0_sync+0x140/0x180 arch/arm64/kernel/entry.S:670
>>
>> Freed by task 12570:
>> stack_trace_save+0x80/0xb8 kernel/stacktrace.c:121
>> kasan_save_stack mm/kasan/common.c:48 [inline]
>> kasan_set_track+0x38/0x6c mm/kasan/common.c:56
>> kasan_set_free_info+0x20/0x40 mm/kasan/generic.c:355
>> __kasan_slab_free+0x124/0x150 mm/kasan/common.c:422
>> kasan_slab_free+0x10/0x1c mm/kasan/common.c:431
>> slab_free_hook mm/slub.c:1544 [inline]
>> slab_free_freelist_hook mm/slub.c:1577 [inline]
>> slab_free mm/slub.c:3142 [inline]
>> kfree+0x104/0x38c mm/slub.c:4124
>> io_dismantle_req fs/io_uring.c:1855 [inline]
>> __io_free_req+0x70/0x254 fs/io_uring.c:1867
>> io_put_req_find_next fs/io_uring.c:2173 [inline]
>> __io_queue_sqe+0x1fc/0x520 fs/io_uring.c:6279
>> __io_req_task_submit+0x154/0x21c fs/io_uring.c:2051
>> io_req_task_submit+0x2c/0x44 fs/io_uring.c:2063
>> task_work_run+0xdc/0x128 kernel/task_work.c:151
>> get_signal+0x6f8/0x980 kernel/signal.c:2562
>> do_signal+0x108/0x3a4 arch/arm64/kernel/signal.c:658
>> do_notify_resume+0xbc/0x25c arch/arm64/kernel/signal.c:722
>> work_pending+0xc/0x180
>>
>> blkdev_read_iter can truncate the iov_iter's count, since count + pos
>> may exceed the size of the blkdev. This misleads io_read into thinking
>> the iovec has already been consumed, and once io_read then does the
>> iov_iter_revert, we trigger the slab-out-of-bounds. Fix it by
>> reexpanding the count by the amount that was truncated.
>>
>> blkdev_write_iter can trigger the problem too.
>>
>> Signed-off-by: yangerkun <[email protected]>
>> ---
>> fs/block_dev.c | 20 +++++++++++++++++---
>> 1 file changed, 17 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/block_dev.c b/fs/block_dev.c
>> index 92ed7d5df677..788e1014576f 100644
>> --- a/fs/block_dev.c
>> +++ b/fs/block_dev.c
>> @@ -1680,6 +1680,7 @@ ssize_t blkdev_write_iter(struct kiocb *iocb, struct iov_iter *from)
>> struct inode *bd_inode = bdev_file_inode(file);
>> loff_t size = i_size_read(bd_inode);
>> struct blk_plug plug;
>> + size_t shorted = 0;
>> ssize_t ret;
>> if (bdev_read_only(I_BDEV(bd_inode)))
>> @@ -1697,12 +1698,17 @@ ssize_t blkdev_write_iter(struct kiocb *iocb, struct iov_iter *from)
>> if ((iocb->ki_flags & (IOCB_NOWAIT | IOCB_DIRECT)) == IOCB_NOWAIT)
>> return -EOPNOTSUPP;
>> - iov_iter_truncate(from, size - iocb->ki_pos);
>> + size -= iocb->ki_pos;
>> + if (iov_iter_count(from) > size) {
>> + shorted = iov_iter_count(from) - size;
>> + iov_iter_truncate(from, size);
>> + }
>> blk_start_plug(&plug);
>> ret = __generic_file_write_iter(iocb, from);
>> if (ret > 0)
>> ret = generic_write_sync(iocb, ret);
>> + iov_iter_reexpand(from, iov_iter_count(from) + shorted);
>> blk_finish_plug(&plug);
>> return ret;
>> }
>> @@ -1714,13 +1720,21 @@ ssize_t blkdev_read_iter(struct kiocb *iocb, struct iov_iter *to)
>> struct inode *bd_inode = bdev_file_inode(file);
>> loff_t size = i_size_read(bd_inode);
>> loff_t pos = iocb->ki_pos;
>> + size_t shorted = 0;
>> + ssize_t ret;
>> if (pos >= size)
>> return 0;
>> size -= pos;
>> - iov_iter_truncate(to, size);
>> - return generic_file_read_iter(iocb, to);
>> + if (iov_iter_count(to) > size) {
>> + shorted = iov_iter_count(to) - size;
>> + iov_iter_truncate(to, size);
>> + }
>> +
>> + ret = generic_file_read_iter(iocb, to);
>> + iov_iter_reexpand(to, iov_iter_count(to) + shorted);
>> + return ret;
>> }
>> EXPORT_SYMBOL_GPL(blkdev_read_iter);
>>
--
Pavel Begunkov
* Re: [PATCH] block: reexpand iov_iter after read/write
2021-04-06 11:04 ` Pavel Begunkov
@ 2021-04-07 14:16 ` yangerkun
0 siblings, 0 replies; 18+ messages in thread
From: yangerkun @ 2021-04-07 14:16 UTC (permalink / raw)
To: Pavel Begunkov, viro, axboe; +Cc: linux-fsdevel, linux-block, io-uring
On 2021/4/6 19:04, Pavel Begunkov wrote:
> On 06/04/2021 02:28, yangerkun wrote:
>> Ping...
>
> It wasn't forgotten, but it wouldn't have worked, for other
> reasons. With these two already queued, that's a
> different story.
>
> https://git.kernel.dk/cgit/linux-block/commit/?h=io_uring-5.12&id=07204f21577a1d882f0259590c3553fe6a476381
> https://git.kernel.dk/cgit/linux-block/commit/?h=io_uring-5.12&id=230d50d448acb6639991440913299e50cacf1daf
>
> Can you re-confirm that the bug is still there (it should be)
> and that your patch fixes it?
Hi,
This problem still exists in mainline (2d743660786e "Merge branch
'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs"),
and this patch fixes it.
The read loop in io_read can return -EAGAIN, which leads to an
iov_iter_revert in io_read. Once the iov_iter has been truncated in
blkdev_read_iter, that revert walks out of bounds and we see this bug...
[ 181.204371][ T4241] loop0: detected capacity change from 0 to 232
[ 181.253683][ T4241]
==================================================================
[ 181.255313][ T4241] BUG: KASAN: slab-out-of-bounds in
iov_iter_revert+0xd0/0x3e0
[ 181.256723][ T4241] Read of size 8 at addr ffff0000cfbc8ff8 by task
a.out/4241
[ 181.257776][ T4241]
[ 181.258749][ T4241] CPU: 5 PID: 4241 Comm: a.out Not tainted
5.12.0-rc6-00006-g2d743660786e
#1
[ 181.260149][ T4241] Hardware name: linux,dummy-virt (DT)
[ 181.261468][ T4241] Call trace:
[ 181.262052][ T4241] dump_backtrace+0x0/0x348
[ 181.263139][ T4241] show_stack+0x28/0x38
[ 181.264234][ T4241] dump_stack+0x134/0x1a4
[ 181.265175][ T4241] print_address_description.constprop.0+0x68/0x304
[ 181.266430][ T4241] kasan_report+0x1d0/0x238
[ 181.267308][ T4241] __asan_load8+0x88/0xc0
[ 181.268317][ T4241] iov_iter_revert+0xd0/0x3e0
[ 181.269251][ T4241] io_read+0x310/0x5c0
[ 181.270208][ T4241] io_issue_sqe+0x3fc/0x25d8
[ 181.271134][ T4241] __io_queue_sqe+0xf8/0x480
[ 181.272142][ T4241] io_queue_sqe+0x3a4/0x4c8
[ 181.273053][ T4241] io_submit_sqes+0xd9c/0x22d0
[ 181.274375][ T4241] __arm64_sys_io_uring_enter+0x3d0/0xce0
[ 181.275554][ T4241] do_el0_svc+0xc4/0x228
[ 181.276411][ T4241] el0_svc+0x24/0x30
[ 181.277323][ T4241] el0_sync_handler+0x158/0x160
[ 181.278241][ T4241] el0_sync+0x13c/0x140
[ 181.279287][ T4241]
[ 181.279820][ T4241] Allocated by task 4241:
[ 181.280699][ T4241] kasan_save_stack+0x24/0x50
[ 181.281626][ T4241] __kasan_kmalloc+0x84/0xa8
[ 181.282578][ T4241] io_wq_create+0x94/0x668
[ 181.283469][ T4241] io_uring_alloc_task_context+0x164/0x368
[ 181.284748][ T4241] io_uring_add_task_file+0x1b0/0x208
[ 181.285865][ T4241] io_uring_setup+0xaac/0x12a0
[ 181.286823][ T4241] __arm64_sys_io_uring_setup+0x34/0x40
[ 181.287957][ T4241] do_el0_svc+0xc4/0x228
[ 181.288906][ T4241] el0_svc+0x24/0x30
[ 181.289816][ T4241] el0_sync_handler+0x158/0x160
[ 181.290751][ T4241] el0_sync+0x13c/0x140
[ 181.291697][ T4241]
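For reference, a rough userspace approximation of the kind of request
that exercises this path; the real reproducer is a fuzzer program and
is not shown here, so the device path, buffer size and use of liburing
are assumptions, and this alone is not guaranteed to force the -EAGAIN
retry described above:
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <sys/uio.h>
#include <liburing.h>

int main(void)
{
        struct io_uring ring;
        struct io_uring_sqe *sqe;
        struct io_uring_cqe *cqe;
        struct iovec iov;
        void *buf;
        int fd;

        if (io_uring_queue_init(8, &ring, 0) < 0)
                return 1;
        /* a small block device, e.g. the 232-sector loop0 from the log above */
        fd = open("/dev/loop0", O_RDONLY | O_DIRECT);
        if (fd < 0)
                return 1;
        if (posix_memalign(&buf, 4096, 1 << 20))
                return 1;

        iov.iov_base = buf;
        iov.iov_len = 1 << 20;  /* far more than is left on the device */

        sqe = io_uring_get_sqe(&ring);
        /* request extends past the device end, so blkdev_read_iter() truncates the iter */
        io_uring_prep_readv(sqe, fd, &iov, 1, 0);
        io_uring_submit(&ring);
        io_uring_wait_cqe(&ring, &cqe);
        io_uring_cqe_seen(&ring, cqe);
        io_uring_queue_exit(&ring);
        return 0;
}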
>
>>
>> On 2021/4/1 15:18, yangerkun wrote:
>>> We get a bug:
>>>
>>> BUG: KASAN: slab-out-of-bounds in iov_iter_revert+0x11c/0x404
>>> lib/iov_iter.c:1139
>>> Read of size 8 at addr ffff0000d3fb11f8 by task
>>>
>>> CPU: 0 PID: 12582 Comm: syz-executor.2 Not tainted
>>> 5.10.0-00843-g352c8610ccd2 #2
>>> Hardware name: linux,dummy-virt (DT)
>>> Call trace:
>>> dump_backtrace+0x0/0x2d0 arch/arm64/kernel/stacktrace.c:132
>>> show_stack+0x28/0x34 arch/arm64/kernel/stacktrace.c:196
>>> __dump_stack lib/dump_stack.c:77 [inline]
>>> dump_stack+0x110/0x164 lib/dump_stack.c:118
>>> print_address_description+0x78/0x5c8 mm/kasan/report.c:385
>>> __kasan_report mm/kasan/report.c:545 [inline]
>>> kasan_report+0x148/0x1e4 mm/kasan/report.c:562
>>> check_memory_region_inline mm/kasan/generic.c:183 [inline]
>>> __asan_load8+0xb4/0xbc mm/kasan/generic.c:252
>>> iov_iter_revert+0x11c/0x404 lib/iov_iter.c:1139
>>> io_read fs/io_uring.c:3421 [inline]
>>> io_issue_sqe+0x2344/0x2d64 fs/io_uring.c:5943
>>> __io_queue_sqe+0x19c/0x520 fs/io_uring.c:6260
>>> io_queue_sqe+0x2a4/0x590 fs/io_uring.c:6326
>>> io_submit_sqe fs/io_uring.c:6395 [inline]
>>> io_submit_sqes+0x4c0/0xa04 fs/io_uring.c:6624
>>> __do_sys_io_uring_enter fs/io_uring.c:9013 [inline]
>>> __se_sys_io_uring_enter fs/io_uring.c:8960 [inline]
>>> __arm64_sys_io_uring_enter+0x190/0x708 fs/io_uring.c:8960
>>> __invoke_syscall arch/arm64/kernel/syscall.c:36 [inline]
>>> invoke_syscall arch/arm64/kernel/syscall.c:48 [inline]
>>> el0_svc_common arch/arm64/kernel/syscall.c:158 [inline]
>>> do_el0_svc+0x120/0x290 arch/arm64/kernel/syscall.c:227
>>> el0_svc+0x1c/0x28 arch/arm64/kernel/entry-common.c:367
>>> el0_sync_handler+0x98/0x170 arch/arm64/kernel/entry-common.c:383
>>> el0_sync+0x140/0x180 arch/arm64/kernel/entry.S:670
>>>
>>> Allocated by task 12570:
>>> stack_trace_save+0x80/0xb8 kernel/stacktrace.c:121
>>> kasan_save_stack mm/kasan/common.c:48 [inline]
>>> kasan_set_track mm/kasan/common.c:56 [inline]
>>> __kasan_kmalloc+0xdc/0x120 mm/kasan/common.c:461
>>> kasan_kmalloc+0xc/0x14 mm/kasan/common.c:475
>>> __kmalloc+0x23c/0x334 mm/slub.c:3970
>>> kmalloc include/linux/slab.h:557 [inline]
>>> __io_alloc_async_data+0x68/0x9c fs/io_uring.c:3210
>>> io_setup_async_rw fs/io_uring.c:3229 [inline]
>>> io_read fs/io_uring.c:3436 [inline]
>>> io_issue_sqe+0x2954/0x2d64 fs/io_uring.c:5943
>>> __io_queue_sqe+0x19c/0x520 fs/io_uring.c:6260
>>> io_queue_sqe+0x2a4/0x590 fs/io_uring.c:6326
>>> io_submit_sqe fs/io_uring.c:6395 [inline]
>>> io_submit_sqes+0x4c0/0xa04 fs/io_uring.c:6624
>>> __do_sys_io_uring_enter fs/io_uring.c:9013 [inline]
>>> __se_sys_io_uring_enter fs/io_uring.c:8960 [inline]
>>> __arm64_sys_io_uring_enter+0x190/0x708 fs/io_uring.c:8960
>>> __invoke_syscall arch/arm64/kernel/syscall.c:36 [inline]
>>> invoke_syscall arch/arm64/kernel/syscall.c:48 [inline]
>>> el0_svc_common arch/arm64/kernel/syscall.c:158 [inline]
>>> do_el0_svc+0x120/0x290 arch/arm64/kernel/syscall.c:227
>>> el0_svc+0x1c/0x28 arch/arm64/kernel/entry-common.c:367
>>> el0_sync_handler+0x98/0x170 arch/arm64/kernel/entry-common.c:383
>>> el0_sync+0x140/0x180 arch/arm64/kernel/entry.S:670
>>>
>>> Freed by task 12570:
>>> stack_trace_save+0x80/0xb8 kernel/stacktrace.c:121
>>> kasan_save_stack mm/kasan/common.c:48 [inline]
>>> kasan_set_track+0x38/0x6c mm/kasan/common.c:56
>>> kasan_set_free_info+0x20/0x40 mm/kasan/generic.c:355
>>> __kasan_slab_free+0x124/0x150 mm/kasan/common.c:422
>>> kasan_slab_free+0x10/0x1c mm/kasan/common.c:431
>>> slab_free_hook mm/slub.c:1544 [inline]
>>> slab_free_freelist_hook mm/slub.c:1577 [inline]
>>> slab_free mm/slub.c:3142 [inline]
>>> kfree+0x104/0x38c mm/slub.c:4124
>>> io_dismantle_req fs/io_uring.c:1855 [inline]
>>> __io_free_req+0x70/0x254 fs/io_uring.c:1867
>>> io_put_req_find_next fs/io_uring.c:2173 [inline]
>>> __io_queue_sqe+0x1fc/0x520 fs/io_uring.c:6279
>>> __io_req_task_submit+0x154/0x21c fs/io_uring.c:2051
>>> io_req_task_submit+0x2c/0x44 fs/io_uring.c:2063
>>> task_work_run+0xdc/0x128 kernel/task_work.c:151
>>> get_signal+0x6f8/0x980 kernel/signal.c:2562
>>> do_signal+0x108/0x3a4 arch/arm64/kernel/signal.c:658
>>> do_notify_resume+0xbc/0x25c arch/arm64/kernel/signal.c:722
>>> work_pending+0xc/0x180
>>>
>>> blkdev_read_iter can truncate the iov_iter's count, since count + pos
>>> may exceed the size of the blkdev. This misleads io_read into thinking
>>> the iovec has already been consumed, and once io_read then does the
>>> iov_iter_revert, we trigger the slab-out-of-bounds. Fix it by
>>> reexpanding the count by the amount that was truncated.
>>>
>>> blkdev_write_iter can trigger the problem too.
>>>
>>> Signed-off-by: yangerkun <[email protected]>
>>> ---
>>> fs/block_dev.c | 20 +++++++++++++++++---
>>> 1 file changed, 17 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/fs/block_dev.c b/fs/block_dev.c
>>> index 92ed7d5df677..788e1014576f 100644
>>> --- a/fs/block_dev.c
>>> +++ b/fs/block_dev.c
>>> @@ -1680,6 +1680,7 @@ ssize_t blkdev_write_iter(struct kiocb *iocb, struct iov_iter *from)
>>> struct inode *bd_inode = bdev_file_inode(file);
>>> loff_t size = i_size_read(bd_inode);
>>> struct blk_plug plug;
>>> + size_t shorted = 0;
>>> ssize_t ret;
>>> if (bdev_read_only(I_BDEV(bd_inode)))
>>> @@ -1697,12 +1698,17 @@ ssize_t blkdev_write_iter(struct kiocb *iocb, struct iov_iter *from)
>>> if ((iocb->ki_flags & (IOCB_NOWAIT | IOCB_DIRECT)) == IOCB_NOWAIT)
>>> return -EOPNOTSUPP;
>>> - iov_iter_truncate(from, size - iocb->ki_pos);
>>> + size -= iocb->ki_pos;
>>> + if (iov_iter_count(from) > size) {
>>> + shorted = iov_iter_count(from) - size;
>>> + iov_iter_truncate(from, size);
>>> + }
>>> blk_start_plug(&plug);
>>> ret = __generic_file_write_iter(iocb, from);
>>> if (ret > 0)
>>> ret = generic_write_sync(iocb, ret);
>>> + iov_iter_reexpand(from, iov_iter_count(from) + shorted);
>>> blk_finish_plug(&plug);
>>> return ret;
>>> }
>>> @@ -1714,13 +1720,21 @@ ssize_t blkdev_read_iter(struct kiocb *iocb, struct iov_iter *to)
>>> struct inode *bd_inode = bdev_file_inode(file);
>>> loff_t size = i_size_read(bd_inode);
>>> loff_t pos = iocb->ki_pos;
>>> + size_t shorted = 0;
>>> + ssize_t ret;
>>> if (pos >= size)
>>> return 0;
>>> size -= pos;
>>> - iov_iter_truncate(to, size);
>>> - return generic_file_read_iter(iocb, to);
>>> + if (iov_iter_count(to) > size) {
>>> + shorted = iov_iter_count(to) - size;
>>> + iov_iter_truncate(to, size);
>>> + }
>>> +
>>> + ret = generic_file_read_iter(iocb, to);
>>> + iov_iter_reexpand(to, iov_iter_count(to) + shorted);
>>> + return ret;
>>> }
>>> EXPORT_SYMBOL_GPL(blkdev_read_iter);
>>>
>
* Re: [PATCH] block: reexpand iov_iter after read/write
2021-04-01 7:18 [PATCH] block: reexpand iov_iter after read/write yangerkun
2021-04-06 1:28 ` yangerkun
@ 2021-04-09 14:49 ` Pavel Begunkov
2021-04-15 17:37 ` Pavel Begunkov
1 sibling, 1 reply; 18+ messages in thread
From: Pavel Begunkov @ 2021-04-09 14:49 UTC (permalink / raw)
To: yangerkun, axboe; +Cc: viro, linux-fsdevel, linux-block, io-uring
On 01/04/2021 08:18, yangerkun wrote:
> We get a bug:
>
> BUG: KASAN: slab-out-of-bounds in iov_iter_revert+0x11c/0x404
> lib/iov_iter.c:1139
> Read of size 8 at addr ffff0000d3fb11f8 by task
>
> CPU: 0 PID: 12582 Comm: syz-executor.2 Not tainted
> 5.10.0-00843-g352c8610ccd2 #2
> Hardware name: linux,dummy-virt (DT)
> Call trace:
> dump_backtrace+0x0/0x2d0 arch/arm64/kernel/stacktrace.c:132
> show_stack+0x28/0x34 arch/arm64/kernel/stacktrace.c:196
> __dump_stack lib/dump_stack.c:77 [inline]
> dump_stack+0x110/0x164 lib/dump_stack.c:118
> print_address_description+0x78/0x5c8 mm/kasan/report.c:385
> __kasan_report mm/kasan/report.c:545 [inline]
> kasan_report+0x148/0x1e4 mm/kasan/report.c:562
> check_memory_region_inline mm/kasan/generic.c:183 [inline]
> __asan_load8+0xb4/0xbc mm/kasan/generic.c:252
> iov_iter_revert+0x11c/0x404 lib/iov_iter.c:1139
> io_read fs/io_uring.c:3421 [inline]
> io_issue_sqe+0x2344/0x2d64 fs/io_uring.c:5943
> __io_queue_sqe+0x19c/0x520 fs/io_uring.c:6260
> io_queue_sqe+0x2a4/0x590 fs/io_uring.c:6326
> io_submit_sqe fs/io_uring.c:6395 [inline]
> io_submit_sqes+0x4c0/0xa04 fs/io_uring.c:6624
> __do_sys_io_uring_enter fs/io_uring.c:9013 [inline]
> __se_sys_io_uring_enter fs/io_uring.c:8960 [inline]
> __arm64_sys_io_uring_enter+0x190/0x708 fs/io_uring.c:8960
> __invoke_syscall arch/arm64/kernel/syscall.c:36 [inline]
> invoke_syscall arch/arm64/kernel/syscall.c:48 [inline]
> el0_svc_common arch/arm64/kernel/syscall.c:158 [inline]
> do_el0_svc+0x120/0x290 arch/arm64/kernel/syscall.c:227
> el0_svc+0x1c/0x28 arch/arm64/kernel/entry-common.c:367
> el0_sync_handler+0x98/0x170 arch/arm64/kernel/entry-common.c:383
> el0_sync+0x140/0x180 arch/arm64/kernel/entry.S:670
>
> Allocated by task 12570:
> stack_trace_save+0x80/0xb8 kernel/stacktrace.c:121
> kasan_save_stack mm/kasan/common.c:48 [inline]
> kasan_set_track mm/kasan/common.c:56 [inline]
> __kasan_kmalloc+0xdc/0x120 mm/kasan/common.c:461
> kasan_kmalloc+0xc/0x14 mm/kasan/common.c:475
> __kmalloc+0x23c/0x334 mm/slub.c:3970
> kmalloc include/linux/slab.h:557 [inline]
> __io_alloc_async_data+0x68/0x9c fs/io_uring.c:3210
> io_setup_async_rw fs/io_uring.c:3229 [inline]
> io_read fs/io_uring.c:3436 [inline]
> io_issue_sqe+0x2954/0x2d64 fs/io_uring.c:5943
> __io_queue_sqe+0x19c/0x520 fs/io_uring.c:6260
> io_queue_sqe+0x2a4/0x590 fs/io_uring.c:6326
> io_submit_sqe fs/io_uring.c:6395 [inline]
> io_submit_sqes+0x4c0/0xa04 fs/io_uring.c:6624
> __do_sys_io_uring_enter fs/io_uring.c:9013 [inline]
> __se_sys_io_uring_enter fs/io_uring.c:8960 [inline]
> __arm64_sys_io_uring_enter+0x190/0x708 fs/io_uring.c:8960
> __invoke_syscall arch/arm64/kernel/syscall.c:36 [inline]
> invoke_syscall arch/arm64/kernel/syscall.c:48 [inline]
> el0_svc_common arch/arm64/kernel/syscall.c:158 [inline]
> do_el0_svc+0x120/0x290 arch/arm64/kernel/syscall.c:227
> el0_svc+0x1c/0x28 arch/arm64/kernel/entry-common.c:367
> el0_sync_handler+0x98/0x170 arch/arm64/kernel/entry-common.c:383
> el0_sync+0x140/0x180 arch/arm64/kernel/entry.S:670
>
> Freed by task 12570:
> stack_trace_save+0x80/0xb8 kernel/stacktrace.c:121
> kasan_save_stack mm/kasan/common.c:48 [inline]
> kasan_set_track+0x38/0x6c mm/kasan/common.c:56
> kasan_set_free_info+0x20/0x40 mm/kasan/generic.c:355
> __kasan_slab_free+0x124/0x150 mm/kasan/common.c:422
> kasan_slab_free+0x10/0x1c mm/kasan/common.c:431
> slab_free_hook mm/slub.c:1544 [inline]
> slab_free_freelist_hook mm/slub.c:1577 [inline]
> slab_free mm/slub.c:3142 [inline]
> kfree+0x104/0x38c mm/slub.c:4124
> io_dismantle_req fs/io_uring.c:1855 [inline]
> __io_free_req+0x70/0x254 fs/io_uring.c:1867
> io_put_req_find_next fs/io_uring.c:2173 [inline]
> __io_queue_sqe+0x1fc/0x520 fs/io_uring.c:6279
> __io_req_task_submit+0x154/0x21c fs/io_uring.c:2051
> io_req_task_submit+0x2c/0x44 fs/io_uring.c:2063
> task_work_run+0xdc/0x128 kernel/task_work.c:151
> get_signal+0x6f8/0x980 kernel/signal.c:2562
> do_signal+0x108/0x3a4 arch/arm64/kernel/signal.c:658
> do_notify_resume+0xbc/0x25c arch/arm64/kernel/signal.c:722
> work_pending+0xc/0x180
>
> blkdev_read_iter can truncate the iov_iter's count, since count + pos
> may exceed the size of the blkdev. This misleads io_read into thinking
> the iovec has already been consumed, and once io_read then does the
> iov_iter_revert, we trigger the slab-out-of-bounds. Fix it by
> reexpanding the count by the amount that was truncated.
Looks right,
Acked-by: Pavel Begunkov <[email protected]>
>
> blkdev_write_iter can trigger the problem too.
>
> Signed-off-by: yangerkun <[email protected]>
> ---
> fs/block_dev.c | 20 +++++++++++++++++---
> 1 file changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/fs/block_dev.c b/fs/block_dev.c
> index 92ed7d5df677..788e1014576f 100644
> --- a/fs/block_dev.c
> +++ b/fs/block_dev.c
> @@ -1680,6 +1680,7 @@ ssize_t blkdev_write_iter(struct kiocb *iocb, struct iov_iter *from)
> struct inode *bd_inode = bdev_file_inode(file);
> loff_t size = i_size_read(bd_inode);
> struct blk_plug plug;
> + size_t shorted = 0;
> ssize_t ret;
>
> if (bdev_read_only(I_BDEV(bd_inode)))
> @@ -1697,12 +1698,17 @@ ssize_t blkdev_write_iter(struct kiocb *iocb, struct iov_iter *from)
> if ((iocb->ki_flags & (IOCB_NOWAIT | IOCB_DIRECT)) == IOCB_NOWAIT)
> return -EOPNOTSUPP;
>
> - iov_iter_truncate(from, size - iocb->ki_pos);
> + size -= iocb->ki_pos;
> + if (iov_iter_count(from) > size) {
> + shorted = iov_iter_count(from) - size;
> + iov_iter_truncate(from, size);
> + }
>
> blk_start_plug(&plug);
> ret = __generic_file_write_iter(iocb, from);
> if (ret > 0)
> ret = generic_write_sync(iocb, ret);
> + iov_iter_reexpand(from, iov_iter_count(from) + shorted);
> blk_finish_plug(&plug);
> return ret;
> }
> @@ -1714,13 +1720,21 @@ ssize_t blkdev_read_iter(struct kiocb *iocb, struct iov_iter *to)
> struct inode *bd_inode = bdev_file_inode(file);
> loff_t size = i_size_read(bd_inode);
> loff_t pos = iocb->ki_pos;
> + size_t shorted = 0;
> + ssize_t ret;
>
> if (pos >= size)
> return 0;
>
> size -= pos;
> - iov_iter_truncate(to, size);
> - return generic_file_read_iter(iocb, to);
> + if (iov_iter_count(to) > size) {
> + shorted = iov_iter_count(to) - size;
> + iov_iter_truncate(to, size);
> + }
> +
> + ret = generic_file_read_iter(iocb, to);
> + iov_iter_reexpand(to, iov_iter_count(to) + shorted);
> + return ret;
> }
> EXPORT_SYMBOL_GPL(blkdev_read_iter);
>
>
--
Pavel Begunkov
* Re: [PATCH] block: reexpand iov_iter after read/write
2021-04-09 14:49 ` Pavel Begunkov
@ 2021-04-15 17:37 ` Pavel Begunkov
2021-04-15 17:39 ` Pavel Begunkov
0 siblings, 1 reply; 18+ messages in thread
From: Pavel Begunkov @ 2021-04-15 17:37 UTC (permalink / raw)
To: yangerkun, axboe; +Cc: viro, linux-fsdevel, linux-block, io-uring
On 09/04/2021 15:49, Pavel Begunkov wrote:
> On 01/04/2021 08:18, yangerkun wrote:
>> We get a bug:
>>
>> BUG: KASAN: slab-out-of-bounds in iov_iter_revert+0x11c/0x404
>> lib/iov_iter.c:1139
>> Read of size 8 at addr ffff0000d3fb11f8 by task
>>
>> CPU: 0 PID: 12582 Comm: syz-executor.2 Not tainted
>> 5.10.0-00843-g352c8610ccd2 #2
>> Hardware name: linux,dummy-virt (DT)
>> Call trace:
...
>> __asan_load8+0xb4/0xbc mm/kasan/generic.c:252
>> iov_iter_revert+0x11c/0x404 lib/iov_iter.c:1139
>> io_read fs/io_uring.c:3421 [inline]
>> io_issue_sqe+0x2344/0x2d64 fs/io_uring.c:5943
>> __io_queue_sqe+0x19c/0x520 fs/io_uring.c:6260
>> io_queue_sqe+0x2a4/0x590 fs/io_uring.c:6326
>> io_submit_sqe fs/io_uring.c:6395 [inline]
>> io_submit_sqes+0x4c0/0xa04 fs/io_uring.c:6624
...
>>
>> blkdev_read_iter can truncate the iov_iter's count, since count + pos
>> may exceed the size of the blkdev. This misleads io_read into thinking
>> the iovec has already been consumed, and once io_read then does the
>> iov_iter_revert, we trigger the slab-out-of-bounds. Fix it by
>> reexpanding the count by the amount that was truncated.
>
> Looks right,
>
> Acked-by: Pavel Begunkov <[email protected]>
Fwiw, we need to forget to drag it through 5.13 + stable
>>
>> blkdev_write_iter can trigger the problem too.
>>
>> Signed-off-by: yangerkun <[email protected]>
>> ---
>> fs/block_dev.c | 20 +++++++++++++++++---
>> 1 file changed, 17 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/block_dev.c b/fs/block_dev.c
>> index 92ed7d5df677..788e1014576f 100644
>> --- a/fs/block_dev.c
>> +++ b/fs/block_dev.c
>> @@ -1680,6 +1680,7 @@ ssize_t blkdev_write_iter(struct kiocb *iocb, struct iov_iter *from)
>> struct inode *bd_inode = bdev_file_inode(file);
>> loff_t size = i_size_read(bd_inode);
>> struct blk_plug plug;
>> + size_t shorted = 0;
>> ssize_t ret;
>>
>> if (bdev_read_only(I_BDEV(bd_inode)))
>> @@ -1697,12 +1698,17 @@ ssize_t blkdev_write_iter(struct kiocb *iocb, struct iov_iter *from)
>> if ((iocb->ki_flags & (IOCB_NOWAIT | IOCB_DIRECT)) == IOCB_NOWAIT)
>> return -EOPNOTSUPP;
>>
>> - iov_iter_truncate(from, size - iocb->ki_pos);
>> + size -= iocb->ki_pos;
>> + if (iov_iter_count(from) > size) {
>> + shorted = iov_iter_count(from) - size;
>> + iov_iter_truncate(from, size);
>> + }
>>
>> blk_start_plug(&plug);
>> ret = __generic_file_write_iter(iocb, from);
>> if (ret > 0)
>> ret = generic_write_sync(iocb, ret);
>> + iov_iter_reexpand(from, iov_iter_count(from) + shorted);
>> blk_finish_plug(&plug);
>> return ret;
>> }
>> @@ -1714,13 +1720,21 @@ ssize_t blkdev_read_iter(struct kiocb *iocb, struct iov_iter *to)
>> struct inode *bd_inode = bdev_file_inode(file);
>> loff_t size = i_size_read(bd_inode);
>> loff_t pos = iocb->ki_pos;
>> + size_t shorted = 0;
>> + ssize_t ret;
>>
>> if (pos >= size)
>> return 0;
>>
>> size -= pos;
>> - iov_iter_truncate(to, size);
>> - return generic_file_read_iter(iocb, to);
>> + if (iov_iter_count(to) > size) {
>> + shorted = iov_iter_count(to) - size;
>> + iov_iter_truncate(to, size);
>> + }
>> +
>> + ret = generic_file_read_iter(iocb, to);
>> + iov_iter_reexpand(to, iov_iter_count(to) + shorted);
>> + return ret;
>> }
>> EXPORT_SYMBOL_GPL(blkdev_read_iter);
>>
>>
>
--
Pavel Begunkov
* Re: [PATCH] block: reexpand iov_iter after read/write
2021-04-15 17:37 ` Pavel Begunkov
@ 2021-04-15 17:39 ` Pavel Begunkov
2021-04-28 6:16 ` yangerkun
0 siblings, 1 reply; 18+ messages in thread
From: Pavel Begunkov @ 2021-04-15 17:39 UTC (permalink / raw)
To: yangerkun, axboe; +Cc: viro, linux-fsdevel, linux-block, io-uring
On 15/04/2021 18:37, Pavel Begunkov wrote:
> On 09/04/2021 15:49, Pavel Begunkov wrote:
>> On 01/04/2021 08:18, yangerkun wrote:
>>> We get a bug:
>>>
>>> BUG: KASAN: slab-out-of-bounds in iov_iter_revert+0x11c/0x404
>>> lib/iov_iter.c:1139
>>> Read of size 8 at addr ffff0000d3fb11f8 by task
>>>
>>> CPU: 0 PID: 12582 Comm: syz-executor.2 Not tainted
>>> 5.10.0-00843-g352c8610ccd2 #2
>>> Hardware name: linux,dummy-virt (DT)
>>> Call trace:
> ...
>>> __asan_load8+0xb4/0xbc mm/kasan/generic.c:252
>>> iov_iter_revert+0x11c/0x404 lib/iov_iter.c:1139
>>> io_read fs/io_uring.c:3421 [inline]
>>> io_issue_sqe+0x2344/0x2d64 fs/io_uring.c:5943
>>> __io_queue_sqe+0x19c/0x520 fs/io_uring.c:6260
>>> io_queue_sqe+0x2a4/0x590 fs/io_uring.c:6326
>>> io_submit_sqe fs/io_uring.c:6395 [inline]
>>> io_submit_sqes+0x4c0/0xa04 fs/io_uring.c:6624
> ...
>>>
>>> blkdev_read_iter can truncate the iov_iter's count, since count + pos
>>> may exceed the size of the blkdev. This misleads io_read into thinking
>>> the iovec has already been consumed, and once io_read then does the
>>> iov_iter_revert, we trigger the slab-out-of-bounds. Fix it by
>>> reexpanding the count by the amount that was truncated.
>>
>> Looks right,
>>
>> Acked-by: Pavel Begunkov <[email protected]>
>
> Fwiw, we need to forget to drag it through 5.13 + stable
Err, typo: to _not_ forget to drag it through 5.13 + stable...
>
>
>>>
>>> blkdev_write_iter can trigger the problem too.
>>>
>>> Signed-off-by: yangerkun <[email protected]>
>>> ---
>>> fs/block_dev.c | 20 +++++++++++++++++---
>>> 1 file changed, 17 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/fs/block_dev.c b/fs/block_dev.c
>>> index 92ed7d5df677..788e1014576f 100644
>>> --- a/fs/block_dev.c
>>> +++ b/fs/block_dev.c
>>> @@ -1680,6 +1680,7 @@ ssize_t blkdev_write_iter(struct kiocb *iocb, struct iov_iter *from)
>>> struct inode *bd_inode = bdev_file_inode(file);
>>> loff_t size = i_size_read(bd_inode);
>>> struct blk_plug plug;
>>> + size_t shorted = 0;
>>> ssize_t ret;
>>>
>>> if (bdev_read_only(I_BDEV(bd_inode)))
>>> @@ -1697,12 +1698,17 @@ ssize_t blkdev_write_iter(struct kiocb *iocb, struct iov_iter *from)
>>> if ((iocb->ki_flags & (IOCB_NOWAIT | IOCB_DIRECT)) == IOCB_NOWAIT)
>>> return -EOPNOTSUPP;
>>>
>>> - iov_iter_truncate(from, size - iocb->ki_pos);
>>> + size -= iocb->ki_pos;
>>> + if (iov_iter_count(from) > size) {
>>> + shorted = iov_iter_count(from) - size;
>>> + iov_iter_truncate(from, size);
>>> + }
>>>
>>> blk_start_plug(&plug);
>>> ret = __generic_file_write_iter(iocb, from);
>>> if (ret > 0)
>>> ret = generic_write_sync(iocb, ret);
>>> + iov_iter_reexpand(from, iov_iter_count(from) + shorted);
>>> blk_finish_plug(&plug);
>>> return ret;
>>> }
>>> @@ -1714,13 +1720,21 @@ ssize_t blkdev_read_iter(struct kiocb *iocb, struct iov_iter *to)
>>> struct inode *bd_inode = bdev_file_inode(file);
>>> loff_t size = i_size_read(bd_inode);
>>> loff_t pos = iocb->ki_pos;
>>> + size_t shorted = 0;
>>> + ssize_t ret;
>>>
>>> if (pos >= size)
>>> return 0;
>>>
>>> size -= pos;
>>> - iov_iter_truncate(to, size);
>>> - return generic_file_read_iter(iocb, to);
>>> + if (iov_iter_count(to) > size) {
>>> + shorted = iov_iter_count(to) - size;
>>> + iov_iter_truncate(to, size);
>>> + }
>>> +
>>> + ret = generic_file_read_iter(iocb, to);
>>> + iov_iter_reexpand(to, iov_iter_count(to) + shorted);
>>> + return ret;
>>> }
>>> EXPORT_SYMBOL_GPL(blkdev_read_iter);
>>>
>>>
>>
>
--
Pavel Begunkov
* Re: [PATCH] block: reexpand iov_iter after read/write
2021-04-15 17:39 ` Pavel Begunkov
@ 2021-04-28 6:16 ` yangerkun
2021-04-30 12:57 ` Pavel Begunkov
0 siblings, 1 reply; 18+ messages in thread
From: yangerkun @ 2021-04-28 6:16 UTC (permalink / raw)
To: Pavel Begunkov, axboe; +Cc: viro, linux-fsdevel, linux-block, io-uring
Hi,
Should we pick this patch for 5.13?
On 2021/4/16 1:39, Pavel Begunkov wrote:
> On 15/04/2021 18:37, Pavel Begunkov wrote:
>> On 09/04/2021 15:49, Pavel Begunkov wrote:
>>> On 01/04/2021 08:18, yangerkun wrote:
>>>> We get a bug:
>>>>
>>>> BUG: KASAN: slab-out-of-bounds in iov_iter_revert+0x11c/0x404
>>>> lib/iov_iter.c:1139
>>>> Read of size 8 at addr ffff0000d3fb11f8 by task
>>>>
>>>> CPU: 0 PID: 12582 Comm: syz-executor.2 Not tainted
>>>> 5.10.0-00843-g352c8610ccd2 #2
>>>> Hardware name: linux,dummy-virt (DT)
>>>> Call trace:
>> ...
>>>> __asan_load8+0xb4/0xbc mm/kasan/generic.c:252
>>>> iov_iter_revert+0x11c/0x404 lib/iov_iter.c:1139
>>>> io_read fs/io_uring.c:3421 [inline]
>>>> io_issue_sqe+0x2344/0x2d64 fs/io_uring.c:5943
>>>> __io_queue_sqe+0x19c/0x520 fs/io_uring.c:6260
>>>> io_queue_sqe+0x2a4/0x590 fs/io_uring.c:6326
>>>> io_submit_sqe fs/io_uring.c:6395 [inline]
>>>> io_submit_sqes+0x4c0/0xa04 fs/io_uring.c:6624
>> ...
>>>>
>>>> blkdev_read_iter can truncate the iov_iter's count, since count + pos
>>>> may exceed the size of the blkdev. This misleads io_read into thinking
>>>> the iovec has already been consumed, and once io_read then does the
>>>> iov_iter_revert, we trigger the slab-out-of-bounds. Fix it by
>>>> reexpanding the count by the amount that was truncated.
>>>
>>> Looks right,
>>>
>>> Acked-by: Pavel Begunkov <[email protected]>
>>
>> Fwiw, we need to forget to drag it through 5.13 + stable
>
> Err, typo: to _not_ forget to drag it through 5.13 + stable...
>
>>
>>
>>>>
>>>> blkdev_write_iter can trigger the problem too.
>>>>
>>>> Signed-off-by: yangerkun <[email protected]>
>>>> ---
>>>> fs/block_dev.c | 20 +++++++++++++++++---
>>>> 1 file changed, 17 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/fs/block_dev.c b/fs/block_dev.c
>>>> index 92ed7d5df677..788e1014576f 100644
>>>> --- a/fs/block_dev.c
>>>> +++ b/fs/block_dev.c
>>>> @@ -1680,6 +1680,7 @@ ssize_t blkdev_write_iter(struct kiocb *iocb, struct iov_iter *from)
>>>> struct inode *bd_inode = bdev_file_inode(file);
>>>> loff_t size = i_size_read(bd_inode);
>>>> struct blk_plug plug;
>>>> + size_t shorted = 0;
>>>> ssize_t ret;
>>>>
>>>> if (bdev_read_only(I_BDEV(bd_inode)))
>>>> @@ -1697,12 +1698,17 @@ ssize_t blkdev_write_iter(struct kiocb *iocb, struct iov_iter *from)
>>>> if ((iocb->ki_flags & (IOCB_NOWAIT | IOCB_DIRECT)) == IOCB_NOWAIT)
>>>> return -EOPNOTSUPP;
>>>>
>>>> - iov_iter_truncate(from, size - iocb->ki_pos);
>>>> + size -= iocb->ki_pos;
>>>> + if (iov_iter_count(from) > size) {
>>>> + shorted = iov_iter_count(from) - size;
>>>> + iov_iter_truncate(from, size);
>>>> + }
>>>>
>>>> blk_start_plug(&plug);
>>>> ret = __generic_file_write_iter(iocb, from);
>>>> if (ret > 0)
>>>> ret = generic_write_sync(iocb, ret);
>>>> + iov_iter_reexpand(from, iov_iter_count(from) + shorted);
>>>> blk_finish_plug(&plug);
>>>> return ret;
>>>> }
>>>> @@ -1714,13 +1720,21 @@ ssize_t blkdev_read_iter(struct kiocb *iocb, struct iov_iter *to)
>>>> struct inode *bd_inode = bdev_file_inode(file);
>>>> loff_t size = i_size_read(bd_inode);
>>>> loff_t pos = iocb->ki_pos;
>>>> + size_t shorted = 0;
>>>> + ssize_t ret;
>>>>
>>>> if (pos >= size)
>>>> return 0;
>>>>
>>>> size -= pos;
>>>> - iov_iter_truncate(to, size);
>>>> - return generic_file_read_iter(iocb, to);
>>>> + if (iov_iter_count(to) > size) {
>>>> + shorted = iov_iter_count(to) - size;
>>>> + iov_iter_truncate(to, size);
>>>> + }
>>>> +
>>>> + ret = generic_file_read_iter(iocb, to);
>>>> + iov_iter_reexpand(to, iov_iter_count(to) + shorted);
>>>> + return ret;
>>>> }
>>>> EXPORT_SYMBOL_GPL(blkdev_read_iter);
>>>>
>>>>
>>>
>>
>
* Re: [PATCH] block: reexpand iov_iter after read/write
2021-04-28 6:16 ` yangerkun
@ 2021-04-30 12:57 ` Pavel Begunkov
2021-04-30 14:35 ` Al Viro
0 siblings, 1 reply; 18+ messages in thread
From: Pavel Begunkov @ 2021-04-30 12:57 UTC (permalink / raw)
To: yangerkun, axboe; +Cc: viro, linux-fsdevel, linux-block, io-uring
On 4/28/21 7:16 AM, yangerkun wrote:
> Hi,
>
> Should we pick this patch for 5.13?
Looks ok to me
>
> On 2021/4/16 1:39, Pavel Begunkov wrote:
>> On 15/04/2021 18:37, Pavel Begunkov wrote:
>>> On 09/04/2021 15:49, Pavel Begunkov wrote:
>>>> On 01/04/2021 08:18, yangerkun wrote:
>>>>> We get a bug:
>>>>>
>>>>> BUG: KASAN: slab-out-of-bounds in iov_iter_revert+0x11c/0x404
>>>>> lib/iov_iter.c:1139
>>>>> Read of size 8 at addr ffff0000d3fb11f8 by task
>>>>>
>>>>> CPU: 0 PID: 12582 Comm: syz-executor.2 Not tainted
>>>>> 5.10.0-00843-g352c8610ccd2 #2
>>>>> Hardware name: linux,dummy-virt (DT)
>>>>> Call trace:
>>> ...
>>>>> __asan_load8+0xb4/0xbc mm/kasan/generic.c:252
>>>>> iov_iter_revert+0x11c/0x404 lib/iov_iter.c:1139
>>>>> io_read fs/io_uring.c:3421 [inline]
>>>>> io_issue_sqe+0x2344/0x2d64 fs/io_uring.c:5943
>>>>> __io_queue_sqe+0x19c/0x520 fs/io_uring.c:6260
>>>>> io_queue_sqe+0x2a4/0x590 fs/io_uring.c:6326
>>>>> io_submit_sqe fs/io_uring.c:6395 [inline]
>>>>> io_submit_sqes+0x4c0/0xa04 fs/io_uring.c:6624
>>> ...
>>>>>
>>>>> blkdev_read_iter can truncate the iov_iter's count, since count + pos
>>>>> may exceed the size of the blkdev. This misleads io_read into thinking
>>>>> the iovec has already been consumed, and once io_read then does the
>>>>> iov_iter_revert, we trigger the slab-out-of-bounds. Fix it by
>>>>> reexpanding the count by the amount that was truncated.
>>>>
>>>> Looks right,
>>>>
>>>> Acked-by: Pavel Begunkov <[email protected]>
>>>
>>> Fwiw, we need to forget to drag it through 5.13 + stable
>>
>> Err, typo: to _not_ forget to drag it through 5.13 + stable...
>>
>>>
>>>
>>>>>
>>>>> blkdev_write_iter can trigger the problem too.
>>>>>
>>>>> Signed-off-by: yangerkun <[email protected]>
>>>>> ---
>>>>> fs/block_dev.c | 20 +++++++++++++++++---
>>>>> 1 file changed, 17 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/fs/block_dev.c b/fs/block_dev.c
>>>>> index 92ed7d5df677..788e1014576f 100644
>>>>> --- a/fs/block_dev.c
>>>>> +++ b/fs/block_dev.c
>>>>> @@ -1680,6 +1680,7 @@ ssize_t blkdev_write_iter(struct kiocb *iocb, struct iov_iter *from)
>>>>> struct inode *bd_inode = bdev_file_inode(file);
>>>>> loff_t size = i_size_read(bd_inode);
>>>>> struct blk_plug plug;
>>>>> + size_t shorted = 0;
>>>>> ssize_t ret;
>>>>> if (bdev_read_only(I_BDEV(bd_inode)))
>>>>> @@ -1697,12 +1698,17 @@ ssize_t blkdev_write_iter(struct kiocb *iocb, struct iov_iter *from)
>>>>> if ((iocb->ki_flags & (IOCB_NOWAIT | IOCB_DIRECT)) == IOCB_NOWAIT)
>>>>> return -EOPNOTSUPP;
>>>>> - iov_iter_truncate(from, size - iocb->ki_pos);
>>>>> + size -= iocb->ki_pos;
>>>>> + if (iov_iter_count(from) > size) {
>>>>> + shorted = iov_iter_count(from) - size;
>>>>> + iov_iter_truncate(from, size);
>>>>> + }
>>>>> blk_start_plug(&plug);
>>>>> ret = __generic_file_write_iter(iocb, from);
>>>>> if (ret > 0)
>>>>> ret = generic_write_sync(iocb, ret);
>>>>> + iov_iter_reexpand(from, iov_iter_count(from) + shorted);
>>>>> blk_finish_plug(&plug);
>>>>> return ret;
>>>>> }
>>>>> @@ -1714,13 +1720,21 @@ ssize_t blkdev_read_iter(struct kiocb *iocb, struct iov_iter *to)
>>>>> struct inode *bd_inode = bdev_file_inode(file);
>>>>> loff_t size = i_size_read(bd_inode);
>>>>> loff_t pos = iocb->ki_pos;
>>>>> + size_t shorted = 0;
>>>>> + ssize_t ret;
>>>>> if (pos >= size)
>>>>> return 0;
>>>>> size -= pos;
>>>>> - iov_iter_truncate(to, size);
>>>>> - return generic_file_read_iter(iocb, to);
>>>>> + if (iov_iter_count(to) > size) {
>>>>> + shorted = iov_iter_count(to) - size;
>>>>> + iov_iter_truncate(to, size);
>>>>> + }
>>>>> +
>>>>> + ret = generic_file_read_iter(iocb, to);
>>>>> + iov_iter_reexpand(to, iov_iter_count(to) + shorted);
>>>>> + return ret;
>>>>> }
>>>>> EXPORT_SYMBOL_GPL(blkdev_read_iter);
>>>>>
>>>>
>>>
>>
--
Pavel Begunkov
* Re: [PATCH] block: reexpand iov_iter after read/write
2021-04-30 12:57 ` Pavel Begunkov
@ 2021-04-30 14:35 ` Al Viro
2021-05-06 16:57 ` Pavel Begunkov
2021-05-06 17:19 ` Jens Axboe
0 siblings, 2 replies; 18+ messages in thread
From: Al Viro @ 2021-04-30 14:35 UTC (permalink / raw)
To: Pavel Begunkov; +Cc: yangerkun, axboe, linux-fsdevel, linux-block, io-uring
On Fri, Apr 30, 2021 at 01:57:22PM +0100, Pavel Begunkov wrote:
> On 4/28/21 7:16 AM, yangerkun wrote:
> > Hi,
> >
> > Should we pick this patch for 5.13?
>
> Looks ok to me
Looks sane. BTW, Pavel, could you go over #untested.iov_iter
and give it some beating? Ideally with per-commit profiling, to see
what speedups/slowdowns they come with...
It's not in the final state (if nothing else, it needs to be
rebased on top of xarray stuff, and there will be followup cleanups
as well), but I'd appreciate testing and profiling data...
It does survive xfstests + LTP syscall tests, but that's about
it.
* Re: [PATCH] block: reexpand iov_iter after read/write
2021-04-30 14:35 ` Al Viro
@ 2021-05-06 16:57 ` Pavel Begunkov
2021-05-06 17:17 ` Al Viro
2021-05-06 17:19 ` Jens Axboe
1 sibling, 1 reply; 18+ messages in thread
From: Pavel Begunkov @ 2021-05-06 16:57 UTC (permalink / raw)
To: Al Viro, Jens Axboe; +Cc: yangerkun, linux-fsdevel, linux-block, io-uring
On 4/30/21 3:35 PM, Al Viro wrote:
> On Fri, Apr 30, 2021 at 01:57:22PM +0100, Pavel Begunkov wrote:
>> On 4/28/21 7:16 AM, yangerkun wrote:
>>> Hi,
>>>
>>> Should we pick this patch for 5.13?
>>
>> Looks ok to me
>
> Looks sane. BTW, Pavel, could you go over #untested.iov_iter
> and give it some beating? Ideally with per-commit profiling, to see
> what speedups/slowdowns they come with...
I've heard Jens already tested it out. Jens, is that right? Can you
share, especially since you have much better-suited hardware?
>
> It's not in the final state (if nothing else, it needs to be
> rebased on top of xarray stuff, and there will be followup cleanups
> as well), but I'd appreciate testing and profiling data...
>
> It does survive xfstests + LTP syscall tests, but that's about
> it.
>
--
Pavel Begunkov
* Re: [PATCH] block: reexpand iov_iter after read/write
2021-05-06 16:57 ` Pavel Begunkov
@ 2021-05-06 17:17 ` Al Viro
0 siblings, 0 replies; 18+ messages in thread
From: Al Viro @ 2021-05-06 17:17 UTC (permalink / raw)
To: Pavel Begunkov
Cc: Jens Axboe, yangerkun, linux-fsdevel, linux-block, io-uring
On Thu, May 06, 2021 at 05:57:02PM +0100, Pavel Begunkov wrote:
> On 4/30/21 3:35 PM, Al Viro wrote:
> > On Fri, Apr 30, 2021 at 01:57:22PM +0100, Pavel Begunkov wrote:
> >> On 4/28/21 7:16 AM, yangerkun wrote:
> >>> Hi,
> >>>
> >>> Should we pick this patch for 5.13?
> >>
> >> Looks ok to me
> >
> > Looks sane. BTW, Pavel, could you go over #untested.iov_iter
> > and give it some beating? Ideally with per-commit profiling, to see
> > what speedups/slowdowns they come with...
>
> I've heard Jens already tested it out. Jens, is that right? Can you
> share, especially since you have much better-suited hardware?
FWIW, the current branch is #untested.iov_iter-3 and the code generated
by it at least _looks_ better than with mainline; how much of an
improvement it makes would have to be found by profiling...
* Re: [PATCH] block: reexpand iov_iter after read/write
2021-04-30 14:35 ` Al Viro
2021-05-06 16:57 ` Pavel Begunkov
@ 2021-05-06 17:19 ` Jens Axboe
2021-05-06 18:55 ` Al Viro
1 sibling, 1 reply; 18+ messages in thread
From: Jens Axboe @ 2021-05-06 17:19 UTC (permalink / raw)
To: Al Viro, Pavel Begunkov; +Cc: yangerkun, linux-fsdevel, linux-block, io-uring
On 4/30/21 8:35 AM, Al Viro wrote:
> On Fri, Apr 30, 2021 at 01:57:22PM +0100, Pavel Begunkov wrote:
>> On 4/28/21 7:16 AM, yangerkun wrote:
>>> Hi,
>>>
>>> Should we pick this patch for 5.13?
>>
>> Looks ok to me
>
> Looks sane. BTW, Pavel, could you go over #untested.iov_iter
> and give it some beating? Ideally with per-commit profiling, to see
> what speedups/slowdowns they come with...
>
> It's not in the final state (if nothing else, it needs to be
> rebased on top of xarray stuff, and there will be followup cleanups
> as well), but I'd appreciate testing and profiling data...
>
> It does survive xfstests + LTP syscall tests, but that's about
> it.
Al, I ran your v3 branch of that and I didn't see anything in terms
of speedups. The test case is something that just writes to eventfd
a ton of times, enough to get a picture of the overall runtime. First
I ran with the existing baseline, which is eventfd using ->write():
Executed in  436.58 millis    fish            external
   usr time  106.21 millis   121.00 micros   106.09 millis
   sys time  331.32 millis    33.00 micros   331.29 millis

Executed in  436.84 millis    fish            external
   usr time  113.38 millis     0.00 micros   113.38 millis
   sys time  324.32 millis   226.00 micros   324.10 millis
Then I ran it with the eventfd ->write_iter() patch I posted:
Executed in  484.54 millis    fish            external
   usr time   93.19 millis   119.00 micros    93.07 millis
   sys time  391.35 millis    46.00 micros   391.30 millis

Executed in  485.45 millis    fish            external
   usr time   96.05 millis     0.00 micros    96.05 millis
   sys time  389.42 millis   216.00 micros   389.20 millis
Doing a quick profile, on the latter run with ->write_iter() we're
spending 8% of the time in _copy_from_iter(), and 4% in
new_sync_write(). That's obviously not there at all for the first case.
Both have about 4% in eventfd_write(). Non-iter case spends 1% in
copy_from_user().
Finally with your branch pulled in as well, iow using ->write_iter() for
eventfd and your iov changes:
Executed in  485.26 millis    fish            external
   usr time  103.09 millis    70.00 micros   103.03 millis
   sys time  382.18 millis    83.00 micros   382.09 millis

Executed in  485.16 millis    fish            external
   usr time  104.07 millis    69.00 micros   104.00 millis
   sys time  381.09 millis    94.00 micros   381.00 millis
and there's no real difference there. We're spending less time in
_copy_from_iter() (8% -> 6%) and less time in new_sync_write(), but it
doesn't seem to manifest itself in reduced runtime.
--
Jens Axboe
^ permalink raw reply [flat|nested] 18+ messages in thread
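
The benchmark itself isn't posted in the thread; below is a minimal
userspace sketch of the kind of eventfd write loop described above.
The iteration count, timing method, and lack of any reader are
assumptions, not the actual harness (which was timed with fish's
`time`).

/*
 * Minimal sketch of an eventfd write micro-benchmark: hammer write(2)
 * on an eventfd and report the wall-clock time.  With a value of 1 per
 * write, 10M iterations stay far below the eventfd counter limit, so
 * write() never blocks.
 */
#include <stdint.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/eventfd.h>

int main(void)
{
	int efd = eventfd(0, 0);
	uint64_t val = 1;
	struct timespec t0, t1;
	long iters = 10 * 1000 * 1000;

	if (efd < 0) {
		perror("eventfd");
		return 1;
	}

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (long i = 0; i < iters; i++) {
		if (write(efd, &val, sizeof(val)) != sizeof(val)) {
			perror("write");
			return 1;
		}
	}
	clock_gettime(CLOCK_MONOTONIC, &t1);

	double ms = (t1.tv_sec - t0.tv_sec) * 1e3 +
		    (t1.tv_nsec - t0.tv_nsec) / 1e6;
	printf("%ld writes in %.2f ms\n", iters, ms);
	close(efd);
	return 0;
}
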
* Re: [PATCH] block: reexpand iov_iter after read/write
2021-05-06 17:19 ` Jens Axboe
@ 2021-05-06 18:55 ` Al Viro
2021-05-06 19:15 ` Jens Axboe
0 siblings, 1 reply; 18+ messages in thread
From: Al Viro @ 2021-05-06 18:55 UTC (permalink / raw)
To: Jens Axboe
Cc: Pavel Begunkov, yangerkun, linux-fsdevel, linux-block, io-uring
On Thu, May 06, 2021 at 11:19:03AM -0600, Jens Axboe wrote:
> Doing a quick profile, on the latter run with ->write_iter() we're
> spending 8% of the time in _copy_from_iter(), and 4% in
> new_sync_write(). That's obviously not there at all for the first case.
> Both have about 4% in eventfd_write(). Non-iter case spends 1% in
> copy_from_user().
>
> Finally with your branch pulled in as well, iow using ->write_iter() for
> eventfd and your iov changes:
>
> Executed in  485.26 millis    fish            external
>    usr time  103.09 millis    70.00 micros   103.03 millis
>    sys time  382.18 millis    83.00 micros   382.09 millis
>
> Executed in  485.16 millis    fish            external
>    usr time  104.07 millis    69.00 micros   104.00 millis
>    sys time  381.09 millis    94.00 micros   381.00 millis
>
> and there's no real difference there. We're spending less time in
> _copy_from_iter() (8% -> 6%) and less time in new_sync_write(), but it
> doesn't seem to manifest itself in reduced runtime.
Interesting... do you have instruction-level profiles for _copy_from_iter()
and new_sync_write() on the last of those trees?
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] block: reexpand iov_iter after read/write
2021-05-06 18:55 ` Al Viro
@ 2021-05-06 19:15 ` Jens Axboe
2021-05-06 21:08 ` Al Viro
0 siblings, 1 reply; 18+ messages in thread
From: Jens Axboe @ 2021-05-06 19:15 UTC (permalink / raw)
To: Al Viro; +Cc: Pavel Begunkov, yangerkun, linux-fsdevel, linux-block, io-uring
[-- Attachment #1: Type: text/plain, Size: 1273 bytes --]
On 5/6/21 12:55 PM, Al Viro wrote:
> On Thu, May 06, 2021 at 11:19:03AM -0600, Jens Axboe wrote:
>
>> Doing a quick profile, on the latter run with ->write_iter() we're
>> spending 8% of the time in _copy_from_iter(), and 4% in
>> new_sync_write(). That's obviously not there at all for the first case.
>> Both have about 4% in eventfd_write(). Non-iter case spends 1% in
>> copy_from_user().
>>
>> Finally with your branch pulled in as well, iow using ->write_iter() for
>> eventfd and your iov changes:
>>
>> Executed in  485.26 millis    fish            external
>>    usr time  103.09 millis    70.00 micros   103.03 millis
>>    sys time  382.18 millis    83.00 micros   382.09 millis
>>
>> Executed in  485.16 millis    fish            external
>>    usr time  104.07 millis    69.00 micros   104.00 millis
>>    sys time  381.09 millis    94.00 micros   381.00 millis
>>
>> and there's no real difference there. We're spending less time in
>> _copy_from_iter() (8% -> 6%) and less time in new_sync_write(), but it
>> doesn't seem to manifest itself in reduced runtime.
>
> Interesting... do you have instruction-level profiles for _copy_from_iter()
> and new_sync_write() on the last of those trees?
Attached output of perf annotate <func> for that last run.
--
Jens Axboe
[-- Attachment #2: nsw --]
[-- Type: text/plain, Size: 10648 bytes --]
Percent | Source code & Disassembly of vmlinux for cycles (72 samples, percent: local period)
---------------------------------------------------------------------------------------------------
:
:
:
: Disassembly of section .text:
:
: ffffffff812cef20 <new_sync_write>:
: new_sync_write():
: inc_syscr(current);
: return ret;
: }
:
: static ssize_t new_sync_write(struct file *filp, const char __user *buf, size_t len, loff_t *ppos)
: {
0.00 : ffffffff812cef20: callq ffffffff8103a8a0 <__fentry__>
0.00 : ffffffff812cef25: push %rbp
0.00 : ffffffff812cef26: mov %rdx,%r8
5.55 : ffffffff812cef29: mov %rsp,%rbp
0.00 : ffffffff812cef2c: push %r12
0.00 : ffffffff812cef2e: push %rbx
0.00 : ffffffff812cef2f: mov %rcx,%r12
0.00 : ffffffff812cef32: sub $0x68,%rsp
: struct iovec iov = { .iov_base = (void __user *)buf, .iov_len = len };
0.00 : ffffffff812cef36: mov %rdx,-0x70(%rbp)
: iocb_flags():
: }
:
: static inline int iocb_flags(struct file *file)
: {
: int res = 0;
: if (file->f_flags & O_APPEND)
0.00 : ffffffff812cef3a: mov 0x40(%rdi),%edx
: new_sync_write():
: {
8.33 : ffffffff812cef3d: mov %rdi,%rbx
: struct iovec iov = { .iov_base = (void __user *)buf, .iov_len = len };
0.00 : ffffffff812cef40: mov %rsi,-0x78(%rbp)
: iocb_flags():
0.00 : ffffffff812cef44: mov %edx,%eax
0.00 : ffffffff812cef46: shr $0x6,%eax
0.00 : ffffffff812cef49: and $0x10,%eax
: res |= IOCB_APPEND;
: if (file->f_flags & O_DIRECT)
: res |= IOCB_DIRECT;
0.00 : ffffffff812cef4c: mov %eax,%ecx
0.00 : ffffffff812cef4e: or $0x20000,%ecx
0.00 : ffffffff812cef54: test $0x40,%dh
6.94 : ffffffff812cef57: cmovne %ecx,%eax
: if ((file->f_flags & O_DSYNC) || IS_SYNC(file->f_mapping->host))
0.00 : ffffffff812cef5a: test $0x10,%dh
0.00 : ffffffff812cef5d: jne ffffffff812cef77 <new_sync_write+0x57>
0.00 : ffffffff812cef5f: mov 0xd0(%rdi),%rcx
0.00 : ffffffff812cef66: mov (%rcx),%rcx
0.00 : ffffffff812cef69: mov 0x28(%rcx),%rsi
0.00 : ffffffff812cef6d: testb $0x10,0x50(%rsi)
13.89 : ffffffff812cef71: je ffffffff812cf04c <new_sync_write+0x12c>
: res |= IOCB_DSYNC;
0.00 : ffffffff812cef77: or $0x2,%eax
: if (file->f_flags & __O_SYNC)
: res |= IOCB_SYNC;
0.00 : ffffffff812cef7a: mov %eax,%ecx
0.00 : ffffffff812cef7c: or $0x4,%ecx
0.00 : ffffffff812cef7f: and $0x100000,%edx
: file_write_hint():
: if (file->f_write_hint != WRITE_LIFE_NOT_SET)
0.00 : ffffffff812cef85: mov 0x34(%rbx),%edx
: iocb_flags():
: res |= IOCB_SYNC;
0.00 : ffffffff812cef88: cmovne %ecx,%eax
: file_write_hint():
: if (file->f_write_hint != WRITE_LIFE_NOT_SET)
0.00 : ffffffff812cef8b: test %edx,%edx
6.97 : ffffffff812cef8d: jne ffffffff812cf03c <new_sync_write+0x11c>
: return file_inode(file)->i_write_hint;
0.00 : ffffffff812cef93: mov 0x20(%rbx),%rdx
0.00 : ffffffff812cef97: movzbl 0x87(%rdx),%edx
: get_current():
:
: DECLARE_PER_CPU(struct task_struct *, current_task);
:
: static __always_inline struct task_struct *get_current(void)
: {
: return this_cpu_read_stable(current_task);
0.00 : ffffffff812cef9e: mov %gs:0x126c0,%rcx
: get_current_ioprio():
: * If the calling process has set an I/O priority, use that. Otherwise, return
: * the default I/O priority.
: */
: static inline int get_current_ioprio(void)
: {
: struct io_context *ioc = current->io_context;
0.00 : ffffffff812cefa7: mov 0x860(%rcx),%rsi
:
: if (ioc)
0.00 : ffffffff812cefae: xor %ecx,%ecx
0.00 : ffffffff812cefb0: test %rsi,%rsi
0.00 : ffffffff812cefb3: je ffffffff812cefb9 <new_sync_write+0x99>
: return ioc->ioprio;
0.00 : ffffffff812cefb5: movzwl 0x14(%rsi),%ecx
: init_sync_kiocb():
: *kiocb = (struct kiocb) {
0.00 : ffffffff812cefb9: shl $0x10,%ecx
12.50 : ffffffff812cefbc: movzwl %dx,%edx
0.00 : ffffffff812cefbf: movq $0x0,-0x38(%rbp)
0.00 : ffffffff812cefc7: movq $0x0,-0x30(%rbp)
0.00 : ffffffff812cefcf: or %ecx,%edx
0.00 : ffffffff812cefd1: movq $0x0,-0x28(%rbp)
0.00 : ffffffff812cefd9: movq $0x0,-0x18(%rbp)
0.00 : ffffffff812cefe1: mov %rbx,-0x40(%rbp)
0.00 : ffffffff812cefe5: mov %eax,-0x20(%rbp)
6.93 : ffffffff812cefe8: mov %edx,-0x1c(%rbp)
: new_sync_write():
: struct kiocb kiocb;
: struct iov_iter iter;
: ssize_t ret;
:
: init_sync_kiocb(&kiocb, filp);
: kiocb.ki_pos = (ppos ? *ppos : 0);
0.00 : ffffffff812cefeb: test %r12,%r12
0.00 : ffffffff812cefee: je ffffffff812cf05b <new_sync_write+0x13b>
: iov_iter_init(&iter, WRITE, &iov, 1, len);
0.00 : ffffffff812ceff0: mov $0x1,%esi
0.00 : ffffffff812ceff5: lea -0x68(%rbp),%rdi
0.00 : ffffffff812ceff9: mov $0x1,%ecx
0.00 : ffffffff812ceffe: lea -0x78(%rbp),%rdx
: kiocb.ki_pos = (ppos ? *ppos : 0);
0.00 : ffffffff812cf002: mov (%r12),%rax
0.00 : ffffffff812cf006: mov %rax,-0x38(%rbp)
: iov_iter_init(&iter, WRITE, &iov, 1, len);
8.33 : ffffffff812cf00a: callq ffffffff814c45e0 <iov_iter_init>
: call_write_iter():
: return file->f_op->write_iter(kio, iter);
12.51 : ffffffff812cf00f: mov 0x28(%rbx),%rax
0.00 : ffffffff812cf013: lea -0x68(%rbp),%rsi
0.00 : ffffffff812cf017: lea -0x40(%rbp),%rdi
0.00 : ffffffff812cf01b: callq *0x28(%rax)
: new_sync_write():
:
: ret = call_write_iter(filp, &kiocb, &iter);
: BUG_ON(ret == -EIOCBQUEUED);
0.00 : ffffffff812cf01e: cmp $0xfffffffffffffdef,%rax
0.00 : ffffffff812cf024: je ffffffff812cf089 <new_sync_write+0x169>
: if (ret > 0 && ppos)
0.00 : ffffffff812cf026: test %rax,%rax
0.00 : ffffffff812cf029: jle ffffffff812cf033 <new_sync_write+0x113>
: *ppos = kiocb.ki_pos;
0.00 : ffffffff812cf02b: mov -0x38(%rbp),%rdx
12.49 : ffffffff812cf02f: mov %rdx,(%r12)
: return ret;
: }
0.00 : ffffffff812cf033: add $0x68,%rsp
0.00 : ffffffff812cf037: pop %rbx
0.00 : ffffffff812cf038: pop %r12
0.00 : ffffffff812cf03a: pop %rbp
0.00 : ffffffff812cf03b: retq
: ki_hint_validate():
: if (hint <= max_hint)
0.00 : ffffffff812cf03c: xor %ecx,%ecx
0.00 : ffffffff812cf03e: cmp $0xffff,%edx
0.00 : ffffffff812cf044: cmova %ecx,%edx
0.00 : ffffffff812cf047: jmpq ffffffff812cef9e <new_sync_write+0x7e>
: iocb_flags():
: if ((file->f_flags & O_DSYNC) || IS_SYNC(file->f_mapping->host))
5.55 : ffffffff812cf04c: testb $0x1,0xc(%rcx)
0.00 : ffffffff812cf050: je ffffffff812cef7a <new_sync_write+0x5a>
0.00 : ffffffff812cf056: jmpq ffffffff812cef77 <new_sync_write+0x57>
: new_sync_write():
: iov_iter_init(&iter, WRITE, &iov, 1, len);
0.00 : ffffffff812cf05b: mov $0x1,%esi
0.00 : ffffffff812cf060: lea -0x68(%rbp),%rdi
0.00 : ffffffff812cf064: mov $0x1,%ecx
0.00 : ffffffff812cf069: lea -0x78(%rbp),%rdx
0.00 : ffffffff812cf06d: callq ffffffff814c45e0 <iov_iter_init>
: call_write_iter():
: return file->f_op->write_iter(kio, iter);
0.00 : ffffffff812cf072: mov 0x28(%rbx),%rax
0.00 : ffffffff812cf076: lea -0x68(%rbp),%rsi
0.00 : ffffffff812cf07a: lea -0x40(%rbp),%rdi
0.00 : ffffffff812cf07e: callq *0x28(%rax)
: new_sync_write():
: BUG_ON(ret == -EIOCBQUEUED);
0.00 : ffffffff812cf081: cmp $0xfffffffffffffdef,%rax
0.00 : ffffffff812cf087: jne ffffffff812cf033 <new_sync_write+0x113>
0.00 : ffffffff812cf089: ud2
[-- Attachment #3: cfi --]
[-- Type: text/plain, Size: 30346 bytes --]
Percent | Source code & Disassembly of vmlinux for cycles (113 samples, percent: local period)
----------------------------------------------------------------------------------------------------
:
:
:
: Disassembly of section .text:
:
: ffffffff814c6aa0 <_copy_from_iter>:
: _copy_from_iter():
: }
: EXPORT_SYMBOL_GPL(_copy_mc_to_iter);
: #endif /* CONFIG_ARCH_HAS_COPY_MC */
:
: size_t _copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
: {
0.00 : ffffffff814c6aa0: push %rbp
7.07 : ffffffff814c6aa1: mov %rdx,%rax
0.00 : ffffffff814c6aa4: mov %rsp,%rbp
0.00 : ffffffff814c6aa7: push %r15
0.00 : ffffffff814c6aa9: push %r14
0.00 : ffffffff814c6aab: push %r13
3.54 : ffffffff814c6aad: push %r12
0.00 : ffffffff814c6aaf: push %rbx
0.00 : ffffffff814c6ab0: sub $0x50,%rsp
0.00 : ffffffff814c6ab4: mov %rdx,-0x78(%rbp)
: iov_iter_type():
: };
: };
:
: static inline enum iter_type iov_iter_type(const struct iov_iter *i)
: {
: return i->iter_type;
0.89 : ffffffff814c6ab8: movzbl (%rdx),%edx
: _copy_from_iter():
0.00 : ffffffff814c6abb: mov %rdi,-0x68(%rbp)
: if (unlikely(iov_iter_is_pipe(i))) {
0.00 : ffffffff814c6abf: cmp $0x3,%dl
0.00 : ffffffff814c6ac2: je ffffffff814c6bd6 <_copy_from_iter+0x136>
0.00 : ffffffff814c6ac8: mov %rax,%rdi
: WARN_ON(1);
: return 0;
: }
: if (iter_is_iovec(i))
: might_fault();
: iterate_and_advance(i, bytes, base, len, off,
0.00 : ffffffff814c6acb: mov 0x10(%rax),%rax
0.00 : ffffffff814c6acf: cmp %rsi,%rax
0.00 : ffffffff814c6ad2: cmovbe %rax,%rsi
2.65 : ffffffff814c6ad6: mov %rsi,%r13
0.00 : ffffffff814c6ad9: test %rsi,%rsi
3.52 : ffffffff814c6adc: je ffffffff814c6bdd <_copy_from_iter+0x13d>
1.76 : ffffffff814c6ae2: test %dl,%dl
0.00 : ffffffff814c6ae4: jne ffffffff814c6be2 <_copy_from_iter+0x142>
0.00 : ffffffff814c6aea: mov 0x18(%rdi),%rax
0.00 : ffffffff814c6aee: mov 0x8(%rdi),%r14
0.00 : ffffffff814c6af2: xor %r15d,%r15d
0.00 : ffffffff814c6af5: mov -0x68(%rbp),%rdi
25.58 : ffffffff814c6af9: lea 0x10(%rax),%r12
0.00 : ffffffff814c6afd: jmp ffffffff814c6b0e <_copy_from_iter+0x6e>
0.00 : ffffffff814c6aff: mov -0x68(%rbp),%rax
0.00 : ffffffff814c6b03: lea (%rax,%r15,1),%rdi
0.00 : ffffffff814c6b07: add $0x10,%r12
: {
0.00 : ffffffff814c6b0b: xor %r14d,%r14d
: iterate_and_advance(i, bytes, base, len, off,
0.00 : ffffffff814c6b0e: mov -0x8(%r12),%rcx
1.09 : ffffffff814c6b13: lea -0x10(%r12),%rax
0.00 : ffffffff814c6b18: mov %r12,-0x60(%rbp)
0.00 : ffffffff814c6b1c: mov %rax,-0x70(%rbp)
0.00 : ffffffff814c6b20: mov %rcx,%rbx
0.00 : ffffffff814c6b23: sub %r14,%rbx
1.76 : ffffffff814c6b26: cmp %r13,%rbx
0.00 : ffffffff814c6b29: cmova %r13,%rbx
0.00 : ffffffff814c6b2d: test %rbx,%rbx
0.00 : ffffffff814c6b30: je ffffffff814c6b07 <_copy_from_iter+0x67>
0.00 : ffffffff814c6b32: mov -0x10(%r12),%rsi
0.00 : ffffffff814c6b37: mov %rbx,%rax
0.00 : ffffffff814c6b3a: add %r14,%rsi
: __chk_range_not_ok():
: */
: if (__builtin_constant_p(size))
: return unlikely(addr > limit - size);
:
: /* Arbitrary sizes? Be careful about overflow */
: addr += size;
0.00 : ffffffff814c6b3d: add %rsi,%rax
4.42 : ffffffff814c6b40: jb ffffffff814c6bd1 <_copy_from_iter+0x131>
: copyin():
: if (access_ok(from, n)) {
0.00 : ffffffff814c6b46: movabs $0x7ffffffff000,%rdx
0.00 : ffffffff814c6b50: cmp %rdx,%rax
3.52 : ffffffff814c6b53: ja ffffffff814c6bd1 <_copy_from_iter+0x131>
: copy_user_generic():
: /*
: * If CPU has ERMS feature, use copy_user_enhanced_fast_string.
: * Otherwise, if CPU has rep_good feature, use copy_user_generic_string.
: * Otherwise, use copy_user_generic_unrolled.
: */
: alternative_call_2(copy_user_generic_unrolled,
0.00 : ffffffff814c6b55: mov %ebx,%edx
0.00 : ffffffff814c6b57: callq ffffffff81523880 <copy_user_generic_unrolled>
: _copy_from_iter():
: iterate_and_advance(i, bytes, base, len, off,
6.18 : ffffffff814c6b5c: mov -0x8(%r12),%rcx
: copy_user_generic():
: X86_FEATURE_ERMS,
: ASM_OUTPUT2("=a" (ret), "=D" (to), "=S" (from),
: "=d" (len)),
: "1" (to), "2" (from), "3" (len)
: : "memory", "rcx", "r8", "r9", "r10", "r11");
: return ret;
0.00 : ffffffff814c6b61: mov %eax,%eax
: _copy_from_iter():
0.00 : ffffffff814c6b63: cltq
0.00 : ffffffff814c6b65: mov %rbx,%rdx
0.00 : ffffffff814c6b68: sub %rbx,%r13
0.00 : ffffffff814c6b6b: sub %rax,%rdx
0.00 : ffffffff814c6b6e: add %rax,%r13
0.00 : ffffffff814c6b71: add %rdx,%r15
3.53 : ffffffff814c6b74: add %r14,%rdx
0.00 : ffffffff814c6b77: cmp %rcx,%rdx
0.00 : ffffffff814c6b7a: jb ffffffff814c6bc4 <_copy_from_iter+0x124>
0.00 : ffffffff814c6b7c: test %r13,%r13
0.00 : ffffffff814c6b7f: jne ffffffff814c6aff <_copy_from_iter+0x5f>
0.00 : ffffffff814c6b85: mov -0x78(%rbp),%rcx
2.66 : ffffffff814c6b89: mov -0x60(%rbp),%rdi
2.65 : ffffffff814c6b8d: mov %rdi,%rax
0.00 : ffffffff814c6b90: sub 0x18(%rcx),%rax
11.54 : ffffffff814c6b94: mov %r13,0x8(%rcx)
0.00 : ffffffff814c6b98: mov %rdi,0x18(%rcx)
12.36 : ffffffff814c6b9c: mov %rcx,%rdi
0.00 : ffffffff814c6b9f: sar $0x4,%rax
0.00 : ffffffff814c6ba3: sub %rax,0x20(%rcx)
0.00 : ffffffff814c6ba7: mov 0x10(%rcx),%rax
0.00 : ffffffff814c6bab: sub %r15,%rax
0.00 : ffffffff814c6bae: mov %rax,0x10(%rdi)
: copyin(addr + off, base, len),
: memcpy(addr + off, base, len)
: )
:
: return bytes;
: }
3.53 : ffffffff814c6bb2: add $0x50,%rsp
0.00 : ffffffff814c6bb6: mov %r15,%rax
0.00 : ffffffff814c6bb9: pop %rbx
0.00 : ffffffff814c6bba: pop %r12
0.00 : ffffffff814c6bbc: pop %r13
0.00 : ffffffff814c6bbe: pop %r14
0.00 : ffffffff814c6bc0: pop %r15
1.76 : ffffffff814c6bc2: pop %rbp
0.00 : ffffffff814c6bc3: retq
0.00 : ffffffff814c6bc4: mov -0x70(%rbp),%rax
0.00 : ffffffff814c6bc8: mov %rdx,%r13
0.00 : ffffffff814c6bcb: mov %rax,-0x60(%rbp)
0.00 : ffffffff814c6bcf: jmp ffffffff814c6b85 <_copy_from_iter+0xe5>
: copyin():
0.00 : ffffffff814c6bd1: mov %rbx,%rax
0.00 : ffffffff814c6bd4: jmp ffffffff814c6b63 <_copy_from_iter+0xc3>
: _copy_from_iter():
: WARN_ON(1);
0.00 : ffffffff814c6bd6: ud2
: return 0;
0.00 : ffffffff814c6bd8: xor %r15d,%r15d
0.00 : ffffffff814c6bdb: jmp ffffffff814c6bb2 <_copy_from_iter+0x112>
0.00 : ffffffff814c6bdd: xor %r15d,%r15d
0.00 : ffffffff814c6be0: jmp ffffffff814c6bb2 <_copy_from_iter+0x112>
: iterate_and_advance(i, bytes, base, len, off,
0.00 : ffffffff814c6be2: cmp $0x2,%dl
0.00 : ffffffff814c6be5: je ffffffff814c6e09 <_copy_from_iter+0x369>
0.00 : ffffffff814c6beb: cmp $0x1,%dl
0.00 : ffffffff814c6bee: je ffffffff814c6d6b <_copy_from_iter+0x2cb>
0.00 : ffffffff814c6bf4: mov %rsi,%r15
0.00 : ffffffff814c6bf7: cmp $0x4,%dl
0.00 : ffffffff814c6bfa: jne ffffffff814c6bab <_copy_from_iter+0x10b>
0.00 : ffffffff814c6bfc: mov 0x8(%rdi),%rax
0.00 : ffffffff814c6c00: add 0x20(%rdi),%rax
0.00 : ffffffff814c6c04: movl $0x0,-0x48(%rbp)
0.00 : ffffffff814c6c0b: movq $0x3,-0x40(%rbp)
0.00 : ffffffff814c6c13: movq $0x0,-0x38(%rbp)
0.00 : ffffffff814c6c1b: movq $0x0,-0x30(%rbp)
0.00 : ffffffff814c6c23: mov %eax,%ebx
0.00 : ffffffff814c6c25: shr $0xc,%rax
0.00 : ffffffff814c6c29: mov %rax,%rcx
0.00 : ffffffff814c6c2c: mov %rax,-0x60(%rbp)
0.00 : ffffffff814c6c30: mov 0x18(%rdi),%rax
0.00 : ffffffff814c6c34: and $0xfff,%ebx
0.00 : ffffffff814c6c3a: mov %rcx,-0x50(%rbp)
0.00 : ffffffff814c6c3e: mov %rax,-0x58(%rbp)
0.00 : ffffffff814c6c42: mov $0xffffffffffffffff,%rsi
0.00 : ffffffff814c6c49: lea -0x58(%rbp),%rdi
0.00 : ffffffff814c6c4d: xor %r15d,%r15d
0.00 : ffffffff814c6c50: callq ffffffff8151fe40 <xas_find>
0.00 : ffffffff814c6c55: mov %rax,%r14
0.00 : ffffffff814c6c58: test %rax,%rax
0.00 : ffffffff814c6c5b: je ffffffff814c6d51 <_copy_from_iter+0x2b1>
0.00 : ffffffff814c6c61: mov %ebx,%r12d
: xas_retry():
: * Context: Any context.
: * Return: true if the operation needs to be retried.
: */
: static inline bool xas_retry(struct xa_state *xas, const void *entry)
: {
: if (xa_is_zero(entry))
0.00 : ffffffff814c6c64: cmp $0x406,%r14
0.00 : ffffffff814c6c6b: je ffffffff814c6d1c <_copy_from_iter+0x27c>
: return true;
: if (!xa_is_retry(entry))
0.00 : ffffffff814c6c71: cmp $0x402,%r14
0.00 : ffffffff814c6c78: je ffffffff814c6f7f <_copy_from_iter+0x4df>
: _copy_from_iter():
0.00 : ffffffff814c6c7e: test $0x1,%r14b
0.00 : ffffffff814c6c82: jne ffffffff814c6f78 <_copy_from_iter+0x4d8>
0.00 : ffffffff814c6c88: mov %r14,%rdi
0.00 : ffffffff814c6c8b: callq ffffffff81296c00 <PageHuge>
0.00 : ffffffff814c6c90: mov %eax,%ebx
0.00 : ffffffff814c6c92: test %eax,%eax
0.00 : ffffffff814c6c94: jne ffffffff814c6f30 <_copy_from_iter+0x490>
0.00 : ffffffff814c6c9a: mov -0x60(%rbp),%rdi
0.00 : ffffffff814c6c9e: mov 0x20(%r14),%rax
0.00 : ffffffff814c6ca2: mov %edi,%ecx
0.00 : ffffffff814c6ca4: sub %eax,%ecx
0.00 : ffffffff814c6ca6: cmp %rdi,%rax
0.00 : ffffffff814c6ca9: cmovb %ecx,%ebx
0.00 : ffffffff814c6cac: jmp ffffffff814c6d00 <_copy_from_iter+0x260>
0.00 : ffffffff814c6cae: mov %r12d,%eax
0.00 : ffffffff814c6cb1: mov $0x1000,%edx
0.00 : ffffffff814c6cb6: movslq %ebx,%rsi
0.00 : ffffffff814c6cb9: mov -0x68(%rbp),%rcx
0.00 : ffffffff814c6cbd: sub %rax,%rdx
0.00 : ffffffff814c6cc0: cmp %r13,%rdx
0.00 : ffffffff814c6cc3: cmova %r13,%rdx
0.00 : ffffffff814c6cc7: shl $0x6,%rsi
0.00 : ffffffff814c6ccb: add %r14,%rsi
: lowmem_page_address():
: */
: #include <linux/vmstat.h>
:
: static __always_inline void *lowmem_page_address(const struct page *page)
: {
: return page_to_virt(page);
0.00 : ffffffff814c6cce: sub 0xebda73(%rip),%rsi # ffffffff82384748 <vmemmap_base>
: _copy_from_iter():
0.00 : ffffffff814c6cd5: mov %rdx,%r12
0.00 : ffffffff814c6cd8: lea (%rcx,%r15,1),%rdi
0.00 : ffffffff814c6cdc: add %r12,%r15
: lowmem_page_address():
0.00 : ffffffff814c6cdf: sar $0x6,%rsi
0.00 : ffffffff814c6ce3: shl $0xc,%rsi
0.00 : ffffffff814c6ce7: add 0xebda6a(%rip),%rsi # ffffffff82384758 <page_offset_base>
: _copy_from_iter():
0.00 : ffffffff814c6cee: add %rax,%rsi
: memcpy():
: if (q_size < size)
: __read_overflow2();
: }
: if (p_size < size || q_size < size)
: fortify_panic(__func__);
: return __underlying_memcpy(p, q, size);
0.00 : ffffffff814c6cf1: callq ffffffff81a22620 <__memcpy>
: _copy_from_iter():
0.00 : ffffffff814c6cf6: sub %r12,%r13
0.00 : ffffffff814c6cf9: je ffffffff814c6d51 <_copy_from_iter+0x2b1>
0.00 : ffffffff814c6cfb: inc %ebx
0.00 : ffffffff814c6cfd: xor %r12d,%r12d
: constant_test_bit():
: }
:
: static __always_inline bool constant_test_bit(long nr, const volatile unsigned long *addr)
: {
: return ((1UL << (nr & (BITS_PER_LONG-1))) &
: (addr[nr >> _BITOPS_LONG_SHIFT])) != 0;
0.00 : ffffffff814c6d00: mov (%r14),%rax
0.00 : ffffffff814c6d03: shr $0x10,%rax
0.00 : ffffffff814c6d07: and $0x1,%eax
: thp_nr_pages():
: */
: static inline int thp_nr_pages(struct page *page)
: {
: VM_BUG_ON_PGFLAGS(PageTail(page), page);
: if (PageHead(page))
: return HPAGE_PMD_NR;
0.00 : ffffffff814c6d0a: cmp $0x1,%al
0.00 : ffffffff814c6d0c: sbb %eax,%eax
0.00 : ffffffff814c6d0e: and $0xfffffe01,%eax
0.00 : ffffffff814c6d13: add $0x200,%eax
: _copy_from_iter():
0.00 : ffffffff814c6d18: cmp %eax,%ebx
0.00 : ffffffff814c6d1a: jl ffffffff814c6cae <_copy_from_iter+0x20e>
: xas_next_entry():
: *
: * Return: The next present entry after the one currently referred to by @xas.
: */
: static inline void *xas_next_entry(struct xa_state *xas, unsigned long max)
: {
: struct xa_node *node = xas->xa_node;
0.00 : ffffffff814c6d1c: mov -0x40(%rbp),%rdi
: xas_not_node():
: return ((unsigned long)node & 3) || !node;
0.00 : ffffffff814c6d20: test $0x3,%dil
0.00 : ffffffff814c6d24: setne %cl
0.00 : ffffffff814c6d27: test %rdi,%rdi
0.00 : ffffffff814c6d2a: sete %al
0.00 : ffffffff814c6d2d: or %al,%cl
0.00 : ffffffff814c6d2f: je ffffffff814c6ecc <_copy_from_iter+0x42c>
: xas_next_entry():
: return xas_find(xas, max);
: if (unlikely(xas->xa_offset == XA_CHUNK_MASK))
: return xas_find(xas, max);
: entry = xa_entry(xas->xa, node, xas->xa_offset + 1);
: if (unlikely(xa_is_internal(entry)))
: return xas_find(xas, max);
0.00 : ffffffff814c6d35: mov $0xffffffffffffffff,%rsi
0.00 : ffffffff814c6d3c: lea -0x58(%rbp),%rdi
0.00 : ffffffff814c6d40: callq ffffffff8151fe40 <xas_find>
0.00 : ffffffff814c6d45: mov %rax,%r14
: _copy_from_iter():
0.00 : ffffffff814c6d48: test %rax,%rax
0.00 : ffffffff814c6d4b: jne ffffffff814c6c64 <_copy_from_iter+0x1c4>
: __rcu_read_unlock():
: }
:
: static inline void __rcu_read_unlock(void)
: {
: preempt_enable();
: rcu_read_unlock_strict();
0.00 : ffffffff814c6d51: callq ffffffff810e12c0 <rcu_read_unlock_strict>
: _copy_from_iter():
0.00 : ffffffff814c6d56: mov -0x78(%rbp),%rax
0.00 : ffffffff814c6d5a: mov -0x78(%rbp),%rdi
0.00 : ffffffff814c6d5e: add %r15,0x8(%rax)
0.00 : ffffffff814c6d62: mov 0x10(%rax),%rax
0.00 : ffffffff814c6d66: jmpq ffffffff814c6bab <_copy_from_iter+0x10b>
0.00 : ffffffff814c6d6b: mov 0x18(%rdi),%rax
0.00 : ffffffff814c6d6f: xor %r15d,%r15d
0.00 : ffffffff814c6d72: mov 0x8(%rdi),%rbx
0.00 : ffffffff814c6d76: mov -0x68(%rbp),%rdi
0.00 : ffffffff814c6d7a: lea 0x10(%rax),%r12
0.00 : ffffffff814c6d7e: mov %r15,%rax
0.00 : ffffffff814c6d81: mov %r12,%r15
0.00 : ffffffff814c6d84: mov %rax,%r12
0.00 : ffffffff814c6d87: jmp ffffffff814c6d97 <_copy_from_iter+0x2f7>
0.00 : ffffffff814c6d89: mov -0x68(%rbp),%rax
0.00 : ffffffff814c6d8d: lea (%rax,%r12,1),%rdi
0.00 : ffffffff814c6d91: add $0x10,%r15
0.00 : ffffffff814c6d95: xor %ebx,%ebx
0.00 : ffffffff814c6d97: mov -0x8(%r15),%r14
0.00 : ffffffff814c6d9b: lea -0x10(%r15),%rax
0.00 : ffffffff814c6d9f: mov %r15,-0x60(%rbp)
0.00 : ffffffff814c6da3: mov %rax,-0x70(%rbp)
0.00 : ffffffff814c6da7: sub %rbx,%r14
0.00 : ffffffff814c6daa: cmp %r13,%r14
0.00 : ffffffff814c6dad: cmova %r13,%r14
0.00 : ffffffff814c6db1: test %r14,%r14
0.00 : ffffffff814c6db4: je ffffffff814c6d91 <_copy_from_iter+0x2f1>
0.00 : ffffffff814c6db6: mov -0x10(%r15),%rsi
: memcpy():
0.00 : ffffffff814c6dba: mov %r14,%rdx
: _copy_from_iter():
0.00 : ffffffff814c6dbd: add %r14,%r12
0.00 : ffffffff814c6dc0: sub %r14,%r13
0.00 : ffffffff814c6dc3: add %rbx,%rsi
: memcpy():
0.00 : ffffffff814c6dc6: callq ffffffff81a22620 <__memcpy>
: _copy_from_iter():
0.00 : ffffffff814c6dcb: lea (%rbx,%r14,1),%rcx
0.00 : ffffffff814c6dcf: cmp %rcx,-0x8(%r15)
0.00 : ffffffff814c6dd3: ja ffffffff814c6eb9 <_copy_from_iter+0x419>
0.00 : ffffffff814c6dd9: test %r13,%r13
0.00 : ffffffff814c6ddc: jne ffffffff814c6d89 <_copy_from_iter+0x2e9>
0.00 : ffffffff814c6dde: mov %r12,%r15
0.00 : ffffffff814c6de1: mov -0x78(%rbp),%rdi
0.00 : ffffffff814c6de5: mov -0x60(%rbp),%rcx
0.00 : ffffffff814c6de9: mov %rcx,%rax
0.00 : ffffffff814c6dec: sub 0x18(%rdi),%rax
0.00 : ffffffff814c6df0: mov %r13,0x8(%rdi)
0.00 : ffffffff814c6df4: mov %rcx,0x18(%rdi)
0.00 : ffffffff814c6df8: sar $0x4,%rax
0.00 : ffffffff814c6dfc: sub %rax,0x20(%rdi)
0.00 : ffffffff814c6e00: mov 0x10(%rdi),%rax
0.00 : ffffffff814c6e04: jmpq ffffffff814c6bab <_copy_from_iter+0x10b>
0.00 : ffffffff814c6e09: mov 0x18(%rdi),%r14
0.00 : ffffffff814c6e0d: mov 0x8(%rdi),%r12d
0.00 : ffffffff814c6e11: xor %r15d,%r15d
0.00 : ffffffff814c6e14: mov 0xc(%r14),%eax
0.00 : ffffffff814c6e18: mov 0x8(%r14),%edx
0.00 : ffffffff814c6e1c: mov $0x1000,%esi
0.00 : ffffffff814c6e21: mov -0x68(%rbp),%rdi
0.00 : ffffffff814c6e25: add %r12d,%eax
0.00 : ffffffff814c6e28: sub %r12d,%edx
0.00 : ffffffff814c6e2b: mov %eax,%ecx
0.00 : ffffffff814c6e2d: and $0xfff,%ecx
0.00 : ffffffff814c6e33: cmp %r13,%rdx
0.00 : ffffffff814c6e36: cmova %r13,%rdx
0.00 : ffffffff814c6e3a: sub %rcx,%rsi
0.00 : ffffffff814c6e3d: cmp %rsi,%rdx
0.00 : ffffffff814c6e40: cmovbe %rdx,%rsi
0.00 : ffffffff814c6e44: shr $0xc,%eax
0.00 : ffffffff814c6e47: add %r15,%rdi
0.00 : ffffffff814c6e4a: mov %rsi,%rbx
0.00 : ffffffff814c6e4d: mov %eax,%esi
0.00 : ffffffff814c6e4f: shl $0x6,%rsi
0.00 : ffffffff814c6e53: add (%r14),%rsi
: memcpy():
0.00 : ffffffff814c6e56: mov %rbx,%rdx
: _copy_from_iter():
0.00 : ffffffff814c6e59: add %rbx,%r15
: lowmem_page_address():
0.00 : ffffffff814c6e5c: sub 0xebd8e5(%rip),%rsi # ffffffff82384748 <vmemmap_base>
: _copy_from_iter():
0.00 : ffffffff814c6e63: add %ebx,%r12d
: lowmem_page_address():
0.00 : ffffffff814c6e66: sar $0x6,%rsi
0.00 : ffffffff814c6e6a: shl $0xc,%rsi
0.00 : ffffffff814c6e6e: add 0xebd8e3(%rip),%rsi # ffffffff82384758 <page_offset_base>
: _copy_from_iter():
0.00 : ffffffff814c6e75: add %rcx,%rsi
: memcpy():
0.00 : ffffffff814c6e78: callq ffffffff81a22620 <__memcpy>
: _copy_from_iter():
0.00 : ffffffff814c6e7d: cmp %r12d,0x8(%r14)
0.00 : ffffffff814c6e81: jne ffffffff814c6e8a <_copy_from_iter+0x3ea>
0.00 : ffffffff814c6e83: add $0x10,%r14
0.00 : ffffffff814c6e87: xor %r12d,%r12d
0.00 : ffffffff814c6e8a: sub %rbx,%r13
0.00 : ffffffff814c6e8d: jne ffffffff814c6e14 <_copy_from_iter+0x374>
0.00 : ffffffff814c6e8f: mov -0x78(%rbp),%rcx
0.00 : ffffffff814c6e93: mov %r12d,%eax
0.00 : ffffffff814c6e96: mov %rax,0x8(%rcx)
0.00 : ffffffff814c6e9a: mov %r14,%rax
0.00 : ffffffff814c6e9d: sub 0x18(%rcx),%rax
0.00 : ffffffff814c6ea1: mov %rcx,%rdi
0.00 : ffffffff814c6ea4: mov %r14,0x18(%rcx)
0.00 : ffffffff814c6ea8: sar $0x4,%rax
0.00 : ffffffff814c6eac: sub %rax,0x20(%rcx)
0.00 : ffffffff814c6eb0: mov 0x10(%rcx),%rax
0.00 : ffffffff814c6eb4: jmpq ffffffff814c6bab <_copy_from_iter+0x10b>
0.00 : ffffffff814c6eb9: mov -0x70(%rbp),%rax
0.00 : ffffffff814c6ebd: mov %r12,%r15
0.00 : ffffffff814c6ec0: mov %rcx,%r13
0.00 : ffffffff814c6ec3: mov %rax,-0x60(%rbp)
0.00 : ffffffff814c6ec7: jmpq ffffffff814c6de1 <_copy_from_iter+0x341>
: xas_next_entry():
: if (unlikely(xas_not_node(node) || node->shift ||
0.00 : ffffffff814c6ecc: cmpb $0x0,(%rdi)
0.00 : ffffffff814c6ecf: jne ffffffff814c6d35 <_copy_from_iter+0x295>
0.00 : ffffffff814c6ed5: mov -0x50(%rbp),%rsi
0.00 : ffffffff814c6ed9: movzbl -0x46(%rbp),%r9d
0.00 : ffffffff814c6ede: mov %rsi,%r8
0.00 : ffffffff814c6ee1: mov %r9,%rax
0.00 : ffffffff814c6ee4: and $0x3f,%r8d
0.00 : ffffffff814c6ee8: cmp %r8,%r9
0.00 : ffffffff814c6eeb: jne ffffffff814c6d35 <_copy_from_iter+0x295>
: if (unlikely(xas->xa_index >= max))
0.00 : ffffffff814c6ef1: cmp $0xffffffffffffffff,%rsi
0.00 : ffffffff814c6ef5: je ffffffff814c6f60 <_copy_from_iter+0x4c0>
: if (unlikely(xas->xa_offset == XA_CHUNK_MASK))
0.00 : ffffffff814c6ef7: cmp $0x3f,%al
0.00 : ffffffff814c6ef9: je ffffffff814c6f4b <_copy_from_iter+0x4ab>
: entry = xa_entry(xas->xa, node, xas->xa_offset + 1);
0.00 : ffffffff814c6efb: movzbl %al,%r8d
: xa_entry():
: return rcu_dereference_check(node->slots[offset],
0.00 : ffffffff814c6eff: add $0x5,%r8
0.00 : ffffffff814c6f03: mov 0x8(%rdi,%r8,8),%r14
: xa_is_internal():
: return ((unsigned long)entry & 3) == 2;
0.00 : ffffffff814c6f08: mov %r14,%r8
0.00 : ffffffff814c6f0b: and $0x3,%r8d
: xas_next_entry():
: if (unlikely(xa_is_internal(entry)))
0.00 : ffffffff814c6f0f: cmp $0x2,%r8
0.00 : ffffffff814c6f13: je ffffffff814c6f37 <_copy_from_iter+0x497>
: xas->xa_offset++;
0.00 : ffffffff814c6f15: inc %eax
: xas->xa_index++;
0.00 : ffffffff814c6f17: inc %rsi
: } while (!entry);
0.00 : ffffffff814c6f1a: mov $0x1,%ecx
0.00 : ffffffff814c6f1f: test %r14,%r14
0.00 : ffffffff814c6f22: je ffffffff814c6ef1 <_copy_from_iter+0x451>
0.00 : ffffffff814c6f24: mov %al,-0x46(%rbp)
0.00 : ffffffff814c6f27: mov %rsi,-0x50(%rbp)
0.00 : ffffffff814c6f2b: jmpq ffffffff814c6c64 <_copy_from_iter+0x1c4>
: _copy_from_iter():
0.00 : ffffffff814c6f30: ud2
0.00 : ffffffff814c6f32: jmpq ffffffff814c6d51 <_copy_from_iter+0x2b1>
0.00 : ffffffff814c6f37: test %cl,%cl
0.00 : ffffffff814c6f39: je ffffffff814c6d35 <_copy_from_iter+0x295>
0.00 : ffffffff814c6f3f: mov %al,-0x46(%rbp)
0.00 : ffffffff814c6f42: mov %rsi,-0x50(%rbp)
0.00 : ffffffff814c6f46: jmpq ffffffff814c6d35 <_copy_from_iter+0x295>
0.00 : ffffffff814c6f4b: test %cl,%cl
0.00 : ffffffff814c6f4d: je ffffffff814c6d35 <_copy_from_iter+0x295>
0.00 : ffffffff814c6f53: movb $0x3f,-0x46(%rbp)
0.00 : ffffffff814c6f57: mov %rsi,-0x50(%rbp)
: xas_next_entry():
: return xas_find(xas, max);
0.00 : ffffffff814c6f5b: jmpq ffffffff814c6d35 <_copy_from_iter+0x295>
0.00 : ffffffff814c6f60: test %cl,%cl
0.00 : ffffffff814c6f62: je ffffffff814c6d35 <_copy_from_iter+0x295>
0.00 : ffffffff814c6f68: mov %al,-0x46(%rbp)
0.00 : ffffffff814c6f6b: movq $0xffffffffffffffff,-0x50(%rbp)
: return xas_find(xas, max);
0.00 : ffffffff814c6f73: jmpq ffffffff814c6d35 <_copy_from_iter+0x295>
: _copy_from_iter():
0.00 : ffffffff814c6f78: ud2
0.00 : ffffffff814c6f7a: jmpq ffffffff814c6d51 <_copy_from_iter+0x2b1>
: xas_reset():
: xas->xa_node = XAS_RESTART;
0.00 : ffffffff814c6f7f: movq $0x3,-0x40(%rbp)
: xas_not_node():
: return ((unsigned long)node & 3) || !node;
0.00 : ffffffff814c6f87: jmpq ffffffff814c6d35 <_copy_from_iter+0x295>
^ permalink raw reply [flat|nested] 18+ messages in thread
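
For readers not fluent in the annotation above: for a plain iovec-backed
iterator, the hot path of _copy_from_iter() boils down to walking the
iovec array, copying up to the remaining byte count from each segment
and advancing the iterator as it goes. Below is a rough userspace sketch
of that walk; it omits fault handling and the kvec/bvec/pipe/xarray
cases, and every name is local to the example.

#include <stdio.h>
#include <string.h>
#include <sys/uio.h>

/* Toy iterator over an iovec array; fields mirror the idea, not the
 * kernel's struct iov_iter layout. */
struct toy_iter {
	const struct iovec *iov;
	size_t nr_segs;
	size_t iov_offset;   /* offset into the current segment */
	size_t count;        /* bytes left in the iterator      */
};

/* Copy up to 'bytes' from the iterator into 'addr', advancing it. */
static size_t toy_copy_from_iter(void *addr, size_t bytes, struct toy_iter *i)
{
	size_t copied = 0;

	if (bytes > i->count)
		bytes = i->count;
	while (bytes && i->nr_segs) {
		size_t chunk = i->iov->iov_len - i->iov_offset;

		if (chunk > bytes)
			chunk = bytes;
		memcpy((char *)addr + copied,
		       (char *)i->iov->iov_base + i->iov_offset, chunk);
		copied += chunk;
		bytes -= chunk;
		i->count -= chunk;
		i->iov_offset += chunk;
		if (i->iov_offset == i->iov->iov_len) {
			/* Segment exhausted, move to the next one. */
			i->iov++;
			i->nr_segs--;
			i->iov_offset = 0;
		}
	}
	return copied;
}

int main(void)
{
	char a[] = "hello, ", b[] = "iov_iter";
	struct iovec v[2] = {
		{ .iov_base = a, .iov_len = 7 },
		{ .iov_base = b, .iov_len = 8 },
	};
	struct toy_iter it = { .iov = v, .nr_segs = 2, .count = 15 };
	char out[16] = "";

	size_t n = toy_copy_from_iter(out, sizeof(out) - 1, &it);
	printf("copied %zu bytes: \"%s\"\n", n, out);
	return 0;
}
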
* Re: [PATCH] block: reexpand iov_iter after read/write
2021-05-06 19:15 ` Jens Axboe
@ 2021-05-06 21:08 ` Al Viro
2021-05-06 21:17 ` Matthew Wilcox
2021-05-07 14:59 ` Jens Axboe
0 siblings, 2 replies; 18+ messages in thread
From: Al Viro @ 2021-05-06 21:08 UTC (permalink / raw)
To: Jens Axboe
Cc: Pavel Begunkov, yangerkun, linux-fsdevel, linux-block, io-uring
On Thu, May 06, 2021 at 01:15:01PM -0600, Jens Axboe wrote:
> Attached output of perf annotate <func> for that last run.
Heh... I wonder if keeping the value of iocb_flags(file) in
struct file itself would have a visible effect...
^ permalink raw reply [flat|nested] 18+ messages in thread
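
A small userspace illustration of the caching idea floated here (every
name below is invented for the illustration; the real kernel field, flag
values and update points would differ): derive the per-file kiocb flags
once when the file flags are set or changed, so the per-write path reads
one cached field instead of rebuilding the flags from f_flags and the
inode on every call. The catch with any such cache is that it has to be
refreshed wherever the file flags can change (open, fcntl(F_SETFL), ...).

#include <stdio.h>

/* Invented flag values, purely for the illustration. */
#define XO_APPEND     0x0400
#define XO_DIRECT     0x4000
#define XIOCB_APPEND  0x1
#define XIOCB_DIRECT  0x2

struct xfile {
	unsigned int f_flags;       /* open()/fcntl()-style flags       */
	unsigned int f_iocb_flags;  /* cached result of xiocb_flags()   */
};

static unsigned int xiocb_flags(const struct xfile *f)
{
	unsigned int res = 0;

	if (f->f_flags & XO_APPEND)
		res |= XIOCB_APPEND;
	if (f->f_flags & XO_DIRECT)
		res |= XIOCB_DIRECT;
	return res;
}

/* Call at "open" time and whenever f_flags changes ("fcntl(F_SETFL)"). */
static void xfile_set_flags(struct xfile *f, unsigned int flags)
{
	f->f_flags = flags;
	f->f_iocb_flags = xiocb_flags(f);
}

/* The per-write fast path: one load instead of rebuilding the flags. */
static unsigned int xinit_kiocb_flags(const struct xfile *f)
{
	return f->f_iocb_flags;
}

int main(void)
{
	struct xfile f;

	xfile_set_flags(&f, XO_APPEND);
	printf("per-write kiocb flags: %#x\n", xinit_kiocb_flags(&f));
	xfile_set_flags(&f, XO_APPEND | XO_DIRECT);  /* e.g. after fcntl() */
	printf("per-write kiocb flags: %#x\n", xinit_kiocb_flags(&f));
	return 0;
}
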
* Re: [PATCH] block: reexpand iov_iter after read/write
2021-05-06 21:08 ` Al Viro
@ 2021-05-06 21:17 ` Matthew Wilcox
2021-05-07 14:59 ` Jens Axboe
1 sibling, 0 replies; 18+ messages in thread
From: Matthew Wilcox @ 2021-05-06 21:17 UTC (permalink / raw)
To: Al Viro
Cc: Jens Axboe, Pavel Begunkov, yangerkun, linux-fsdevel, linux-block,
io-uring
On Thu, May 06, 2021 at 09:08:50PM +0000, Al Viro wrote:
> On Thu, May 06, 2021 at 01:15:01PM -0600, Jens Axboe wrote:
>
> > Attached output of perf annotate <func> for that last run.
>
> Heh... I wonder if keeping the value of iocb_flags(file) in
> struct file itself would have a visible effect...
I suggested that ...
https://lore.kernel.org/linux-fsdevel/[email protected]/
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] block: reexpand iov_iter after read/write
2021-05-06 21:08 ` Al Viro
2021-05-06 21:17 ` Matthew Wilcox
@ 2021-05-07 14:59 ` Jens Axboe
1 sibling, 0 replies; 18+ messages in thread
From: Jens Axboe @ 2021-05-07 14:59 UTC (permalink / raw)
To: Al Viro; +Cc: Pavel Begunkov, yangerkun, linux-fsdevel, linux-block, io-uring
On 5/6/21 3:08 PM, Al Viro wrote:
> On Thu, May 06, 2021 at 01:15:01PM -0600, Jens Axboe wrote:
>
>> Attached output of perf annotate <func> for that last run.
>
> Heh... I wonder if keeping the value of iocb_flags(file) in
> struct file itself would have a visible effect...
I did a quick hack to get rid of the init_sync_kiocb() in new_sync_write()
and to just eliminate the ki_flags read in eventfd_write(), as the test
case is blocking. That brings us closer to the ->write() method, down to
7% overhead vs the previous 10%:
Executed in  468.23 millis    fish            external
   usr time   95.09 millis   114.00 micros    94.98 millis
   sys time  372.98 millis    76.00 micros   372.90 millis

Executed in  468.97 millis    fish            external
   usr time   91.05 millis    89.00 micros    90.96 millis
   sys time  377.92 millis    69.00 micros   377.85 millis
--
Jens Axboe
^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2021-05-07 14:59 UTC | newest]
Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-04-01 7:18 [PATCH] block: reexpand iov_iter after read/write yangerkun
2021-04-06 1:28 ` yangerkun
2021-04-06 11:04 ` Pavel Begunkov
2021-04-07 14:16 ` yangerkun
2021-04-09 14:49 ` Pavel Begunkov
2021-04-15 17:37 ` Pavel Begunkov
2021-04-15 17:39 ` Pavel Begunkov
2021-04-28 6:16 ` yangerkun
2021-04-30 12:57 ` Pavel Begunkov
2021-04-30 14:35 ` Al Viro
2021-05-06 16:57 ` Pavel Begunkov
2021-05-06 17:17 ` Al Viro
2021-05-06 17:19 ` Jens Axboe
2021-05-06 18:55 ` Al Viro
2021-05-06 19:15 ` Jens Axboe
2021-05-06 21:08 ` Al Viro
2021-05-06 21:17 ` Matthew Wilcox
2021-05-07 14:59 ` Jens Axboe
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox