* [PATCH for-next v3 0/3] implement pcpu bio caching for IRQ I/O
@ 2022-10-21 10:34 ` Pavel Begunkov
2022-10-21 10:34 ` [PATCH for-next v3 1/3] bio: split pcpu cache part of bio_put into a helper Pavel Begunkov
` (4 more replies)
0 siblings, 5 replies; 7+ messages in thread
From: Pavel Begunkov @ 2022-10-21 10:34 UTC
To: Jens Axboe, linux-block
Cc: io-uring, linux-kernel, Christoph Hellwig, Pavel Begunkov
Add bio pcpu caching for normal / IRQ-driven I/O extending REQ_ALLOC_CACHE,
which was limited to iopoll. t/io_uring with an Optane SSD setup showed +7%
for batches of 32 requests and +4.3% for batches of 8.
IRQ, 128/32/32, cache off
IOPS=59.08M, BW=28.84GiB/s, IOS/call=31/31
IOPS=59.30M, BW=28.96GiB/s, IOS/call=32/32
IOPS=59.97M, BW=29.28GiB/s, IOS/call=31/31
IOPS=59.92M, BW=29.26GiB/s, IOS/call=32/32
IOPS=59.81M, BW=29.20GiB/s, IOS/call=32/31
IRQ, 128/32/32, cache on
IOPS=64.05M, BW=31.27GiB/s, IOS/call=32/31
IOPS=64.22M, BW=31.36GiB/s, IOS/call=32/32
IOPS=64.04M, BW=31.27GiB/s, IOS/call=31/31
IOPS=63.16M, BW=30.84GiB/s, IOS/call=32/32
IRQ, 32/8/8, cache off
IOPS=50.60M, BW=24.71GiB/s, IOS/call=7/8
IOPS=50.22M, BW=24.52GiB/s, IOS/call=8/7
IOPS=49.54M, BW=24.19GiB/s, IOS/call=8/8
IOPS=50.07M, BW=24.45GiB/s, IOS/call=7/7
IOPS=50.46M, BW=24.64GiB/s, IOS/call=8/8
IRQ, 32/8/8, cache on
IOPS=51.39M, BW=25.09GiB/s, IOS/call=8/7
IOPS=52.52M, BW=25.64GiB/s, IOS/call=7/8
IOPS=52.57M, BW=25.67GiB/s, IOS/call=8/8
IOPS=52.58M, BW=25.67GiB/s, IOS/call=8/7
IOPS=52.61M, BW=25.69GiB/s, IOS/call=8/8
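As a sanity check on the percentages quoted above, here is a small C sketch, not part of the series, that averages the IOPS samples and computes the relative gain. Taking a plain arithmetic mean is an assumption (the cover letter does not say how the figures were derived), and `mean`, `gain_pct` and the array names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n samples (assumed averaging method). */
double mean(const double *v, size_t n)
{
	double sum = 0.0;

	assert(n > 0);
	for (size_t i = 0; i < n; i++)
		sum += v[i];
	return sum / (double)n;
}

/* Relative gain of the "cache on" runs over the "cache off" runs, in percent. */
double gain_pct(const double *off, size_t n_off,
		const double *on, size_t n_on)
{
	return (mean(on, n_on) / mean(off, n_off) - 1.0) * 100.0;
}

/* IOPS samples in millions, copied from the runs listed above. */
const double batch32_off[] = { 59.08, 59.30, 59.97, 59.92, 59.81 };
const double batch32_on[]  = { 64.05, 64.22, 64.04, 63.16 };
const double batch8_off[]  = { 50.60, 50.22, 49.54, 50.07, 50.46 };
const double batch8_on[]   = { 51.39, 52.52, 52.57, 52.58, 52.61 };
```

With these inputs `gain_pct()` gives roughly 7.1 for the 32-request batches and roughly 4.3 for the 8-request batches, in line with the quoted +7% and +4.3%.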
The next step will be turning it on for other users, hopefully by default.
The only restriction we currently have is that the allocations can't be
done from IRQ context, so the remaining callers need auditing.
note: depends on "bio: safeguard REQ_ALLOC_CACHE bio put", which is missing in for-6.2/block
v2: fix botched splicing threshold checks
v3: remove merged patch
limit scope of flags var in bio_put_percpu_cache (Christoph Hellwig)
Pavel Begunkov (3):
bio: split pcpu cache part of bio_put into a helper
block/bio: add pcpu caching for non-polling bio_put
io_uring/rw: enable bio caches for IRQ rw
block/bio.c | 93 +++++++++++++++++++++++++++++++++++++++------------
io_uring/rw.c | 3 +-
2 files changed, 74 insertions(+), 22 deletions(-)
--
2.38.0
* [PATCH for-next v3 1/3] bio: split pcpu cache part of bio_put into a helper
2022-10-21 10:34 ` [PATCH for-next v3 0/3] implement pcpu bio caching for IRQ I/O Pavel Begunkov
@ 2022-10-21 10:34 ` Pavel Begunkov
2022-10-21 10:34 ` [PATCH for-next v3 2/3] block/bio: add pcpu caching for non-polling bio_put Pavel Begunkov
` (3 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Pavel Begunkov @ 2022-10-21 10:34 UTC
To: Jens Axboe, linux-block
Cc: io-uring, linux-kernel, Christoph Hellwig, Pavel Begunkov
Extract a helper out of bio_put for recycling into percpu caches.
It's a preparation patch without functional changes.
Signed-off-by: Pavel Begunkov <[email protected]>
---
block/bio.c | 38 +++++++++++++++++++++++++-------------
1 file changed, 25 insertions(+), 13 deletions(-)
diff --git a/block/bio.c b/block/bio.c
index 0a14af923738..7a573e0f5f52 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -725,6 +725,28 @@ static void bio_alloc_cache_destroy(struct bio_set *bs)
bs->cache = NULL;
}
+static inline void bio_put_percpu_cache(struct bio *bio)
+{
+ struct bio_alloc_cache *cache;
+
+ cache = per_cpu_ptr(bio->bi_pool->cache, get_cpu());
+ bio_uninit(bio);
+
+ if ((bio->bi_opf & REQ_POLLED) && !WARN_ON_ONCE(in_interrupt())) {
+ bio->bi_next = cache->free_list;
+ cache->free_list = bio;
+ cache->nr++;
+ } else {
+ put_cpu();
+ bio_free(bio);
+ return;
+ }
+
+ if (cache->nr > ALLOC_CACHE_MAX + ALLOC_CACHE_SLACK)
+ bio_alloc_cache_prune(cache, ALLOC_CACHE_SLACK);
+ put_cpu();
+}
+
/**
* bio_put - release a reference to a bio
* @bio: bio to release reference to
@@ -740,20 +762,10 @@ void bio_put(struct bio *bio)
if (!atomic_dec_and_test(&bio->__bi_cnt))
return;
}
-
- if ((bio->bi_opf & REQ_ALLOC_CACHE) && !WARN_ON_ONCE(in_interrupt())) {
- struct bio_alloc_cache *cache;
-
- bio_uninit(bio);
- cache = per_cpu_ptr(bio->bi_pool->cache, get_cpu());
- bio->bi_next = cache->free_list;
- cache->free_list = bio;
- if (++cache->nr > ALLOC_CACHE_MAX + ALLOC_CACHE_SLACK)
- bio_alloc_cache_prune(cache, ALLOC_CACHE_SLACK);
- put_cpu();
- } else {
+ if (bio->bi_opf & REQ_ALLOC_CACHE)
+ bio_put_percpu_cache(bio);
+ else
bio_free(bio);
- }
}
EXPORT_SYMBOL(bio_put);
--
2.38.0
* [PATCH for-next v3 2/3] block/bio: add pcpu caching for non-polling bio_put
2022-10-21 10:34 ` [PATCH for-next v3 0/3] implement pcpu bio caching for IRQ I/O Pavel Begunkov
2022-10-21 10:34 ` [PATCH for-next v3 1/3] bio: split pcpu cache part of bio_put into a helper Pavel Begunkov
@ 2022-10-21 10:34 ` Pavel Begunkov
2022-10-21 10:34 ` [PATCH for-next v3 3/3] io_uring/rw: enable bio caches for IRQ rw Pavel Begunkov
` (2 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Pavel Begunkov @ 2022-10-21 10:34 UTC
To: Jens Axboe, linux-block
Cc: io-uring, linux-kernel, Christoph Hellwig, Pavel Begunkov
This patch extends REQ_ALLOC_CACHE to IRQ completions, whereas currently
it is limited to iopoll. Instead of guarding the list with irq toggling
on alloc, which is expensive, it keeps an additional irq-safe list from
which bios are spliced in batches to amortise the overhead. The put side
does toggle irqs, but in many cases they are already disabled, so the
toggle is cheap.
Signed-off-by: Pavel Begunkov <[email protected]>
---
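The two-list scheme described in the commit message can be illustrated with a userspace C sketch. This is a simplified model under stated assumptions, not the kernel code: `struct node` and `struct cache` stand in for `struct bio` and `struct bio_alloc_cache`, `irq_off()`/`irq_on()` are hypothetical no-op placeholders for `local_irq_save()`/`local_irq_restore()`, and per-CPU handling and pruning are left out.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_THRESHOLD 16	/* mirrors ALLOC_CACHE_THRESHOLD */

/* Stand-in for struct bio: only the link field matters here. */
struct node {
	struct node *next;
};

/* Stand-in for struct bio_alloc_cache. */
struct cache {
	struct node *free_list;		/* used only by the allocating side */
	struct node *free_list_irq;	/* filled by the completion side */
	unsigned int nr;
	unsigned int nr_irq;
};

/* Hypothetical placeholders for local_irq_save()/local_irq_restore(). */
void irq_off(void) {}
void irq_on(void) {}

/* Completion side: push onto the irq-safe list with "irqs" disabled. */
void cache_put_irq(struct cache *c, struct node *n)
{
	irq_off();
	n->next = c->free_list_irq;
	c->free_list_irq = n;
	c->nr_irq++;
	irq_on();
}

/* Move the whole irq-safe list over in one short critical section. */
void cache_splice(struct cache *c)
{
	assert(c->free_list == NULL);	/* private list must be empty */
	irq_off();
	c->free_list = c->free_list_irq;
	c->free_list_irq = NULL;
	c->nr += c->nr_irq;
	c->nr_irq = 0;
	irq_on();
}

/* Allocation side: no irq toggling unless a batch is worth splicing. */
struct node *cache_get(struct cache *c)
{
	struct node *n;

	if (!c->free_list) {
		if (c->nr_irq < CACHE_THRESHOLD)
			return NULL;	/* caller falls back to the slow path */
		cache_splice(c);
	}
	n = c->free_list;
	c->free_list = n->next;
	c->nr--;
	return n;
}
```

The point of the threshold is that the allocating side pays for one irq toggle per batch of at least 16 entries rather than one per allocation; below the threshold it simply reports a miss.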
block/bio.c | 63 +++++++++++++++++++++++++++++++++++++++++++----------
1 file changed, 51 insertions(+), 12 deletions(-)
diff --git a/block/bio.c b/block/bio.c
index 7a573e0f5f52..f7c57352f306 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -25,9 +25,15 @@
#include "blk-rq-qos.h"
#include "blk-cgroup.h"
+#define ALLOC_CACHE_THRESHOLD 16
+#define ALLOC_CACHE_SLACK 64
+#define ALLOC_CACHE_MAX 512
+
struct bio_alloc_cache {
struct bio *free_list;
+ struct bio *free_list_irq;
unsigned int nr;
+ unsigned int nr_irq;
};
static struct biovec_slab {
@@ -408,6 +414,22 @@ static void punt_bios_to_rescuer(struct bio_set *bs)
queue_work(bs->rescue_workqueue, &bs->rescue_work);
}
+static void bio_alloc_irq_cache_splice(struct bio_alloc_cache *cache)
+{
+ unsigned long flags;
+
+ /* cache->free_list must be empty */
+ if (WARN_ON_ONCE(cache->free_list))
+ return;
+
+ local_irq_save(flags);
+ cache->free_list = cache->free_list_irq;
+ cache->free_list_irq = NULL;
+ cache->nr += cache->nr_irq;
+ cache->nr_irq = 0;
+ local_irq_restore(flags);
+}
+
static struct bio *bio_alloc_percpu_cache(struct block_device *bdev,
unsigned short nr_vecs, blk_opf_t opf, gfp_t gfp,
struct bio_set *bs)
@@ -416,9 +438,13 @@ static struct bio *bio_alloc_percpu_cache(struct block_device *bdev,
struct bio *bio;
cache = per_cpu_ptr(bs->cache, get_cpu());
- if (!cache->free_list) {
- put_cpu();
- return NULL;
+ if (!cache->free_list) {
+ if (READ_ONCE(cache->nr_irq) >= ALLOC_CACHE_THRESHOLD)
+ bio_alloc_irq_cache_splice(cache);
+ if (!cache->free_list) {
+ put_cpu();
+ return NULL;
+ }
}
bio = cache->free_list;
cache->free_list = bio->bi_next;
@@ -676,11 +702,8 @@ void guard_bio_eod(struct bio *bio)
bio_truncate(bio, maxsector << 9);
}
-#define ALLOC_CACHE_MAX 512
-#define ALLOC_CACHE_SLACK 64
-
-static void bio_alloc_cache_prune(struct bio_alloc_cache *cache,
- unsigned int nr)
+static int __bio_alloc_cache_prune(struct bio_alloc_cache *cache,
+ unsigned int nr)
{
unsigned int i = 0;
struct bio *bio;
@@ -692,6 +715,17 @@ static void bio_alloc_cache_prune(struct bio_alloc_cache *cache,
if (++i == nr)
break;
}
+ return i;
+}
+
+static void bio_alloc_cache_prune(struct bio_alloc_cache *cache,
+ unsigned int nr)
+{
+ nr -= __bio_alloc_cache_prune(cache, nr);
+ if (!READ_ONCE(cache->free_list)) {
+ bio_alloc_irq_cache_splice(cache);
+ __bio_alloc_cache_prune(cache, nr);
+ }
}
static int bio_cpu_dead(unsigned int cpu, struct hlist_node *node)
@@ -737,12 +771,17 @@ static inline void bio_put_percpu_cache(struct bio *bio)
cache->free_list = bio;
cache->nr++;
} else {
- put_cpu();
- bio_free(bio);
- return;
+ unsigned long flags;
+
+ local_irq_save(flags);
+ bio->bi_next = cache->free_list_irq;
+ cache->free_list_irq = bio;
+ cache->nr_irq++;
+ local_irq_restore(flags);
}
- if (cache->nr > ALLOC_CACHE_MAX + ALLOC_CACHE_SLACK)
+ if (READ_ONCE(cache->nr_irq) + cache->nr >
+ ALLOC_CACHE_MAX + ALLOC_CACHE_SLACK)
bio_alloc_cache_prune(cache, ALLOC_CACHE_SLACK);
put_cpu();
}
--
2.38.0
* [PATCH for-next v3 3/3] io_uring/rw: enable bio caches for IRQ rw
2022-10-21 10:34 ` [PATCH for-next v3 0/3] implement pcpu bio caching for IRQ I/O Pavel Begunkov
2022-10-21 10:34 ` [PATCH for-next v3 1/3] bio: split pcpu cache part of bio_put into a helper Pavel Begunkov
2022-10-21 10:34 ` [PATCH for-next v3 2/3] block/bio: add pcpu caching for non-polling bio_put Pavel Begunkov
@ 2022-10-21 10:34 ` Pavel Begunkov
2022-10-25 13:25 ` [PATCH for-next v3 0/3] implement pcpu bio caching for IRQ I/O Kanchan Joshi
2022-10-25 19:42 ` Jens Axboe
4 siblings, 0 replies; 7+ messages in thread
From: Pavel Begunkov @ 2022-10-21 10:34 UTC
To: Jens Axboe, linux-block
Cc: io-uring, linux-kernel, Christoph Hellwig, Pavel Begunkov
Now we can use IOCB_ALLOC_CACHE not only for iopoll'ed reads/writes but
also for normal IRQ-driven I/O.
Signed-off-by: Pavel Begunkov <[email protected]>
---
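The behavioural change in this patch comes down to which flags a request ends up with. Here is a minimal C sketch, assuming hypothetical flag values (`FLAG_HIPRI` and `FLAG_ALLOC_CACHE` are placeholders, not the kernel's `IOCB_*` definitions) and made-up helper names:

```c
#include <assert.h>

/* Placeholder bit values for illustration only. */
#define FLAG_HIPRI		(1u << 0)
#define FLAG_ALLOC_CACHE	(1u << 1)

static_assert(FLAG_HIPRI != FLAG_ALLOC_CACHE, "flags must be distinct bits");

/* Before the patch: the bio cache is requested only for iopoll requests. */
unsigned int rw_flags_before(int iopoll)
{
	return iopoll ? (FLAG_HIPRI | FLAG_ALLOC_CACHE) : 0;
}

/* After the patch: every request asks for the cache, iopoll adds HIPRI. */
unsigned int rw_flags_after(int iopoll)
{
	unsigned int flags = FLAG_ALLOC_CACHE;

	if (iopoll)
		flags |= FLAG_HIPRI;
	return flags;
}
```

For iopoll requests the two helpers return the same flags; only the IRQ-driven case changes.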
io_uring/rw.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/io_uring/rw.c b/io_uring/rw.c
index a25cd44cd415..009ed489cfa0 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -647,6 +647,7 @@ static int io_rw_init_file(struct io_kiocb *req, fmode_t mode)
ret = kiocb_set_rw_flags(kiocb, rw->flags);
if (unlikely(ret))
return ret;
+ kiocb->ki_flags |= IOCB_ALLOC_CACHE;
/*
* If the file is marked O_NONBLOCK, still allow retry for it if it
@@ -662,7 +663,7 @@ static int io_rw_init_file(struct io_kiocb *req, fmode_t mode)
return -EOPNOTSUPP;
kiocb->private = NULL;
- kiocb->ki_flags |= IOCB_HIPRI | IOCB_ALLOC_CACHE;
+ kiocb->ki_flags |= IOCB_HIPRI;
kiocb->ki_complete = io_complete_rw_iopoll;
req->iopoll_completed = 0;
} else {
--
2.38.0
* Re: [PATCH for-next v3 0/3] implement pcpu bio caching for IRQ I/O
2022-10-21 10:34 ` [PATCH for-next v3 0/3] implement pcpu bio caching for IRQ I/O Pavel Begunkov
` (2 preceding siblings ...)
2022-10-21 10:34 ` [PATCH for-next v3 3/3] io_uring/rw: enable bio caches for IRQ rw Pavel Begunkov
@ 2022-10-25 13:25 ` Kanchan Joshi
2022-10-25 14:51 ` Pavel Begunkov
2022-10-25 19:42 ` Jens Axboe
4 siblings, 1 reply; 7+ messages in thread
From: Kanchan Joshi @ 2022-10-25 13:25 UTC
To: Pavel Begunkov
Cc: Jens Axboe, linux-block, io-uring, linux-kernel,
Christoph Hellwig
On Fri, Oct 21, 2022 at 11:34:04AM +0100, Pavel Begunkov wrote:
>Add bio pcpu caching for normal / IRQ-driven I/O extending REQ_ALLOC_CACHE,
>which was limited to iopoll.
So the comment below (stating that process context is a MUST) can also be
removed as part of this series now?
 * If REQ_ALLOC_CACHE is set, the final put of the bio MUST be done from process
 * context, not hard/soft IRQ.
 *
 * Returns: Pointer to new bio on success, NULL on failure.
 */
struct bio *bio_alloc_bioset(struct block_device *bdev, unsigned short nr_vecs,
			     blk_opf_t opf, gfp_t gfp_mask,
			     struct bio_set *bs)
{
>t/io_uring with an Optane SSD setup showed +7%
>for batches of 32 requests and +4.3% for batches of 8.
>
>IRQ, 128/32/32, cache off
>IOPS=59.08M, BW=28.84GiB/s, IOS/call=31/31
>IOPS=59.30M, BW=28.96GiB/s, IOS/call=32/32
>IOPS=59.97M, BW=29.28GiB/s, IOS/call=31/31
>IOPS=59.92M, BW=29.26GiB/s, IOS/call=32/32
>IOPS=59.81M, BW=29.20GiB/s, IOS/call=32/31
>
>IRQ, 128/32/32, cache on
>IOPS=64.05M, BW=31.27GiB/s, IOS/call=32/31
>IOPS=64.22M, BW=31.36GiB/s, IOS/call=32/32
>IOPS=64.04M, BW=31.27GiB/s, IOS/call=31/31
>IOPS=63.16M, BW=30.84GiB/s, IOS/call=32/32
>
>IRQ, 32/8/8, cache off
>IOPS=50.60M, BW=24.71GiB/s, IOS/call=7/8
>IOPS=50.22M, BW=24.52GiB/s, IOS/call=8/7
>IOPS=49.54M, BW=24.19GiB/s, IOS/call=8/8
>IOPS=50.07M, BW=24.45GiB/s, IOS/call=7/7
>IOPS=50.46M, BW=24.64GiB/s, IOS/call=8/8
>
>IRQ, 32/8/8, cache on
>IOPS=51.39M, BW=25.09GiB/s, IOS/call=8/7
>IOPS=52.52M, BW=25.64GiB/s, IOS/call=7/8
>IOPS=52.57M, BW=25.67GiB/s, IOS/call=8/8
>IOPS=52.58M, BW=25.67GiB/s, IOS/call=8/7
>IOPS=52.61M, BW=25.69GiB/s, IOS/call=8/8
>
>The next step will be turning it on for other users, hopefully by default.
>The only restriction we currently have is that the allocations can't be
>done from IRQ context, so the remaining callers need auditing.
Isn't allocation (of bio) happening in non-irq context already?
And
Reviewed-by: Kanchan Joshi <[email protected]>
* Re: [PATCH for-next v3 0/3] implement pcpu bio caching for IRQ I/O
2022-10-25 13:25 ` [PATCH for-next v3 0/3] implement pcpu bio caching for IRQ I/O Kanchan Joshi
@ 2022-10-25 14:51 ` Pavel Begunkov
0 siblings, 0 replies; 7+ messages in thread
From: Pavel Begunkov @ 2022-10-25 14:51 UTC
To: Kanchan Joshi
Cc: Jens Axboe, linux-block, io-uring, linux-kernel,
Christoph Hellwig
On 10/25/22 14:25, Kanchan Joshi wrote:
> On Fri, Oct 21, 2022 at 11:34:04AM +0100, Pavel Begunkov wrote:
>> Add bio pcpu caching for normal / IRQ-driven I/O extending REQ_ALLOC_CACHE,
>> which was limited to iopoll.
>
> So the comment below (stating that process context is a MUST) can also be
> removed as part of this series now?
Right, good point
>  * If REQ_ALLOC_CACHE is set, the final put of the bio MUST be done from process
>  * context, not hard/soft IRQ.
>  *
>  * Returns: Pointer to new bio on success, NULL on failure.
>  */
> struct bio *bio_alloc_bioset(struct block_device *bdev, unsigned short nr_vecs,
>			       blk_opf_t opf, gfp_t gfp_mask,
>			       struct bio_set *bs)
> {
[...]
>> The next step will be turning it on for other users, hopefully by default.
>> The only restriction we currently have is that the allocations can't be
>> done from IRQ context, so the remaining callers need auditing.
>
> Isn't allocation (of bio) happening in non-irq context already?
That's my assumption, true for most of them, but I need to actually
check that. Will be following up after this series is merged.
> Reviewed-by: Kanchan Joshi <[email protected]>
thanks
--
Pavel Begunkov
* Re: [PATCH for-next v3 0/3] implement pcpu bio caching for IRQ I/O
2022-10-21 10:34 ` [PATCH for-next v3 0/3] implement pcpu bio caching for IRQ I/O Pavel Begunkov
` (3 preceding siblings ...)
2022-10-25 13:25 ` [PATCH for-next v3 0/3] implement pcpu bio caching for IRQ I/O Kanchan Joshi
@ 2022-10-25 19:42 ` Jens Axboe
4 siblings, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2022-10-25 19:42 UTC
To: Pavel Begunkov, linux-block; +Cc: linux-kernel, Christoph Hellwig, io-uring
On Fri, 21 Oct 2022 11:34:04 +0100, Pavel Begunkov wrote:
> Add bio pcpu caching for normal / IRQ-driven I/O extending REQ_ALLOC_CACHE,
> which was limited to iopoll. t/io_uring with an Optane SSD setup showed +7%
> for batches of 32 requests and +4.3% for batches of 8.
>
> IRQ, 128/32/32, cache off
> IOPS=59.08M, BW=28.84GiB/s, IOS/call=31/31
> IOPS=59.30M, BW=28.96GiB/s, IOS/call=32/32
> IOPS=59.97M, BW=29.28GiB/s, IOS/call=31/31
> IOPS=59.92M, BW=29.26GiB/s, IOS/call=32/32
> IOPS=59.81M, BW=29.20GiB/s, IOS/call=32/31
>
> [...]
Applied, thanks!
[1/3] bio: split pcpu cache part of bio_put into a helper
commit: 0b0735a8c24f006d2d9d8b2b408b8c90f3163abd
[2/3] block/bio: add pcpu caching for non-polling bio_put
commit: 13a184e269656994180e8c64ff56db03ed737902
[3/3] io_uring/rw: enable bio caches for IRQ rw
commit: 93dad04746ea1340dec267f0e98ac42e8bc67160
Best regards,
--
Jens Axboe