On Fri, Oct 21, 2022 at 11:34:04AM +0100, Pavel Begunkov wrote:
>Add bio pcpu caching for normal / IRQ-driven I/O extending REQ_ALLOC_CACHE,
>which was limited to iopoll.

So can the comment below (stating process context as a MUST) also be
removed as part of this series now?

 * If REQ_ALLOC_CACHE is set, the final put of the bio MUST be done from process
 * context, not hard/soft IRQ.
 *
 * Returns: Pointer to new bio on success, NULL on failure.
 */
struct bio *bio_alloc_bioset(struct block_device *bdev, unsigned short nr_vecs,
			     blk_opf_t opf, gfp_t gfp_mask,
			     struct bio_set *bs)
{

>t/io_uring with an Optane SSD setup showed +7%
>for batches of 32 requests and +4.3% for batches of 8.
>
>IRQ, 128/32/32, cache off
>IOPS=59.08M, BW=28.84GiB/s, IOS/call=31/31
>IOPS=59.30M, BW=28.96GiB/s, IOS/call=32/32
>IOPS=59.97M, BW=29.28GiB/s, IOS/call=31/31
>IOPS=59.92M, BW=29.26GiB/s, IOS/call=32/32
>IOPS=59.81M, BW=29.20GiB/s, IOS/call=32/31
>
>IRQ, 128/32/32, cache on
>IOPS=64.05M, BW=31.27GiB/s, IOS/call=32/31
>IOPS=64.22M, BW=31.36GiB/s, IOS/call=32/32
>IOPS=64.04M, BW=31.27GiB/s, IOS/call=31/31
>IOPS=63.16M, BW=30.84GiB/s, IOS/call=32/32
>
>IRQ, 32/8/8, cache off
>IOPS=50.60M, BW=24.71GiB/s, IOS/call=7/8
>IOPS=50.22M, BW=24.52GiB/s, IOS/call=8/7
>IOPS=49.54M, BW=24.19GiB/s, IOS/call=8/8
>IOPS=50.07M, BW=24.45GiB/s, IOS/call=7/7
>IOPS=50.46M, BW=24.64GiB/s, IOS/call=8/8
>
>IRQ, 32/8/8, cache on
>IOPS=51.39M, BW=25.09GiB/s, IOS/call=8/7
>IOPS=52.52M, BW=25.64GiB/s, IOS/call=7/8
>IOPS=52.57M, BW=25.67GiB/s, IOS/call=8/8
>IOPS=52.58M, BW=25.67GiB/s, IOS/call=8/7
>IOPS=52.61M, BW=25.69GiB/s, IOS/call=8/8
>
>The next step will be turning it on for other users, hopefully by default.
>The only restriction we currently have is that the allocations can't be
>done from non-irq context and so needs auditing.

Isn't the allocation (of the bio) already happening in non-irq context?

And
Reviewed-by: Kanchan Joshi
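
As a sanity check on the quoted figures: averaging the per-run IOPS values above and comparing cache on against cache off appears to reproduce the stated gains. Below is a minimal sketch in C, assuming a plain arithmetic mean over the runs shown. The helper names mean_iops() and gain_pct() and the array names are made up for illustration; they are not part of t/io_uring or the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Per-run IOPS (in millions), copied from the quoted t/io_uring output. */
static const double irq_128_off[] = { 59.08, 59.30, 59.97, 59.92, 59.81 };
static const double irq_128_on[]  = { 64.05, 64.22, 64.04, 63.16 };
static const double irq_32_off[]  = { 50.60, 50.22, 49.54, 50.07, 50.46 };
static const double irq_32_on[]   = { 51.39, 52.52, 52.57, 52.58, 52.61 };

#define NR_RUNS(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical helper: arithmetic mean of n samples. */
static double mean_iops(const double *v, size_t n)
{
	double sum = 0.0;
	size_t i;

	assert(n > 0);
	for (i = 0; i < n; i++)
		sum += v[i];
	return sum / (double)n;
}

/* Hypothetical helper: relative gain of "on" over "off", in percent. */
static double gain_pct(const double *off, size_t n_off,
		       const double *on, size_t n_on)
{
	return (mean_iops(on, n_on) / mean_iops(off, n_off) - 1.0) * 100.0;
}
```

With these inputs, gain_pct() comes out at roughly 7.13 for the 128/32/32 runs and roughly 4.30 for the 32/8/8 runs, which is in line with the +7% and +4.3% quoted above. Note that the 128/32/32 cache-on set has only four runs in the quoted output, so the two means are taken over different sample counts.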