From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on gnuweeb.org X-Spam-Level: X-Spam-Status: No, score=-0.1 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_PASS,SPF_SOFTFAIL,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C5A90C433F5 for ; Thu, 21 Apr 2022 09:17:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234709AbiDUJUq (ORCPT ); Thu, 21 Apr 2022 05:20:46 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58884 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1387464AbiDUJUl (ORCPT ); Thu, 21 Apr 2022 05:20:41 -0400 Received: from mx0a-00082601.pphosted.com (mx0a-00082601.pphosted.com [67.231.145.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A938325E88 for ; Thu, 21 Apr 2022 02:17:51 -0700 (PDT) Received: from pps.filterd (m0044010.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.16.1.2/8.16.1.2) with ESMTP id 23L8AdKG000913 for ; Thu, 21 Apr 2022 02:17:51 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=facebook; bh=iW5NDkEejOpL0nBrmEH9IDfG176DPGk8txa3aDVve88=; b=K2qFV8KYIfKlfzrK0bxmDoRM5syBVjGBau7everfxnqORwPHjY+Sec0pbm/rd7BdlkoC lY+Sqe+6gmjoLSTV5RvToabpQycZCr3z8TKm0vsI8u0mMTmVnGHrBVe/qggQ22VLpDaO kXo9vrXf9kMwo3YjPERITlb/e9oA7w/TwRQ= Received: from mail.thefacebook.com ([163.114.132.120]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3fj36tb6v4-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 21 Apr 2022 02:17:51 -0700 Received: from twshared8053.07.ash9.facebook.com (2620:10d:c085:108::4) by mail.thefacebook.com (2620:10d:c085:11d::7) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.24; Thu, 21 Apr 2022 02:17:50 -0700 Received: by devbig039.lla1.facebook.com (Postfix, from userid 572232) id 016F77CA7725; Thu, 21 Apr 2022 02:14:56 -0700 (PDT) From: Dylan Yudaken To: CC: , , , Dylan Yudaken Subject: [PATCH liburing 5/5] overflow: add tests Date: Thu, 21 Apr 2022 02:14:27 -0700 Message-ID: <20220421091427.2118151-6-dylany@fb.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220421091427.2118151-1-dylany@fb.com> References: <20220421091427.2118151-1-dylany@fb.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-FB-Internal: Safe Content-Type: text/plain X-Proofpoint-GUID: LY2_NyNcYU6VQxbVQ741xcFMdzhrghDU X-Proofpoint-ORIG-GUID: LY2_NyNcYU6VQxbVQ741xcFMdzhrghDU X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.858,Hydra:6.0.486,FMLib:17.11.64.514 definitions=2022-04-20_06,2022-04-20_01,2022-02-23_01 Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org Add tests that verify that overflow conditions behave appropriately. Specifically: * if overflow is continually flushed, then CQEs should arrive mostly in order to prevent starvation of some completions * if CQEs are dropped due to GFP_ATOMIC allocation failures it is possible to terminate cleanly. This is not tested by default as it requires debug kernel config, and also has system-wide effects Signed-off-by: Dylan Yudaken --- test/cq-overflow.c | 240 +++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 233 insertions(+), 7 deletions(-) diff --git a/test/cq-overflow.c b/test/cq-overflow.c index 057570e..067308a 100644 --- a/test/cq-overflow.c +++ b/test/cq-overflow.c @@ -9,6 +9,7 @@ #include #include #include +#include =20 #include "helpers.h" #include "liburing.h" @@ -21,6 +22,32 @@ static struct iovec *vecs; =20 #define ENTRIES 8 =20 +/* + * io_uring has rare cases where CQEs are lost. + * This happens when there is no space in the CQ ring, and also there is= no + * GFP_ATOMIC memory available. In reality this probably means that the = process + * is about to be killed as many other things might start failing, but w= e still + * want to test that liburing and the kernel deal with this properly. Th= e fault + * injection framework allows us to test this scenario. Unfortunately th= is + * requires some system wide changes and so we do not enable this by def= ault. + * The tests in this file should work in both cases (where overflows are= queued + * and where they are dropped) on recent kernels. + * + * In order to test dropped CQEs you should enable fault injection in th= e kernel + * config: + * + * CONFIG_FAULT_INJECTION=3Dy + * CONFIG_FAILSLAB=3Dy + * CONFIG_FAULT_INJECTION_DEBUG_FS=3Dy + * + * and then run the test as follows: + * echo Y > /sys/kernel/debug/failslab/task-filter + * echo 100 > /sys/kernel/debug/failslab/probability + * echo 0 > /sys/kernel/debug/failslab/verbose + * echo 100000 > /sys/kernel/debug/failslab/times + * bash -c "echo 1 > /proc/self/make-it-fail && exec ./cq-overflow.t" + */ + static int test_io(const char *file, unsigned long usecs, unsigned *drop= s, int fault) { struct io_uring_sqe *sqe; @@ -29,6 +56,7 @@ static int test_io(const char *file, unsigned long usec= s, unsigned *drops, int f unsigned reaped, total; struct io_uring ring; int nodrop, i, fd, ret; + bool cqe_dropped =3D false; =20 fd =3D open(file, O_RDONLY | O_DIRECT); if (fd < 0) { @@ -103,8 +131,8 @@ static int test_io(const char *file, unsigned long us= ecs, unsigned *drops, int f reap_it: reaped =3D 0; do { - if (nodrop) { - /* nodrop should never lose events */ + if (nodrop && !cqe_dropped) { + /* nodrop should never lose events unless cqe_dropped */ if (reaped =3D=3D total) break; } else { @@ -112,7 +140,10 @@ reap_it: break; } ret =3D io_uring_wait_cqe(&ring, &cqe); - if (ret) { + if (nodrop && ret =3D=3D -EBADR) { + cqe_dropped =3D true; + continue; + } else if (ret) { fprintf(stderr, "wait_cqe=3D%d\n", ret); goto err; } @@ -132,7 +163,7 @@ reap_it: goto err; } =20 - if (!nodrop) { + if (!nodrop || cqe_dropped) { *drops =3D *ring.cq.koverflow; } else if (*ring.cq.koverflow) { fprintf(stderr, "Found %u overflows\n", *ring.cq.koverflow); @@ -153,18 +184,31 @@ static int reap_events(struct io_uring *ring, unsig= ned nr_events, int do_wait) { struct io_uring_cqe *cqe; int i, ret =3D 0, seq =3D 0; + unsigned int start_overflow =3D *ring->cq.koverflow; + unsigned int drop_count =3D 0; + bool dropped =3D false; =20 for (i =3D 0; i < nr_events; i++) { if (do_wait) ret =3D io_uring_wait_cqe(ring, &cqe); else ret =3D io_uring_peek_cqe(ring, &cqe); - if (ret) { + if (do_wait && ret =3D=3D -EBADR) { + unsigned int this_drop =3D *ring->cq.koverflow - + start_overflow; + + dropped =3D true; + drop_count +=3D this_drop; + start_overflow =3D *ring->cq.koverflow; + assert(this_drop > 0); + i +=3D (this_drop - 1); + continue; + } else if (ret) { if (ret !=3D -EAGAIN) fprintf(stderr, "cqe peek failed: %d\n", ret); break; } - if (cqe->user_data !=3D seq) { + if (!dropped && cqe->user_data !=3D seq) { fprintf(stderr, "cqe sequence out-of-order\n"); fprintf(stderr, "got %d, wanted %d\n", (int) cqe->user_data, seq); @@ -241,19 +285,201 @@ err: return 1; } =20 + +static void submit_one_nop(struct io_uring *ring, int ud) +{ + struct io_uring_sqe *sqe; + int ret; + + sqe =3D io_uring_get_sqe(ring); + assert(sqe); + io_uring_prep_nop(sqe); + sqe->user_data =3D ud; + ret =3D io_uring_submit(ring); + assert(ret =3D=3D 1); +} + +/* + * Create an overflow condition and ensure that SQEs are still processed + */ +static int test_overflow_handling( + bool batch, + int cqe_multiple, + bool poll) +{ + struct io_uring ring; + struct io_uring_params p; + int ret, i, j, ud, cqe_count; + unsigned int count; + int const N =3D 8; + int const LOOPS =3D 128; + int const QUEUE_LENGTH =3D 1024; + int completions[N]; + int queue[QUEUE_LENGTH]; + int queued =3D 0; + int outstanding =3D 0; + bool cqe_dropped =3D false; + + memset(&completions, 0, sizeof(int) * N); + memset(&p, 0, sizeof(p)); + p.cq_entries =3D 2 * cqe_multiple; + p.flags |=3D IORING_SETUP_CQSIZE; + + if (poll) + p.flags |=3D IORING_SETUP_IOPOLL; + + ret =3D io_uring_queue_init_params(2, &ring, &p); + if (ret) { + fprintf(stderr, "io_uring_queue_init failed %d\n", ret); + return 1; + } + + assert(p.cq_entries < N); + /* submit N SQEs, some should overflow */ + for (i =3D 0; i < N; i++) { + submit_one_nop(&ring, i); + outstanding++; + } + + for (i =3D 0; i < LOOPS; i++) { + struct io_uring_cqe *cqes[N]; + + if (io_uring_cq_has_overflow(&ring)) { + /* + * Flush any overflowed CQEs and process those. Actively + * flush these to make sure CQEs arrive in vague order + * of being sent. + */ + ret =3D io_uring_flush_overflow(&ring); + if (ret !=3D 0) { + fprintf(stderr, + "io_uring_flush_overflow returned %d\n", + ret); + goto err; + } + } else if (!cqe_dropped) { + for (j =3D 0; j < queued; j++) { + submit_one_nop(&ring, queue[j]); + outstanding++; + } + queued =3D 0; + } + + /* We have lost some random cqes, stop if no remaining. */ + if (cqe_dropped && outstanding =3D=3D *ring.cq.koverflow) + break; + + ret =3D io_uring_wait_cqe(&ring, &cqes[0]); + if (ret =3D=3D -EBADR) { + cqe_dropped =3D true; + fprintf(stderr, "CQE dropped\n"); + continue; + } else if (ret !=3D 0) { + fprintf(stderr, "io_uring_wait_cqes failed %d\n", ret); + goto err; + } + cqe_count =3D 1; + if (batch) { + ret =3D io_uring_peek_batch_cqe(&ring, &cqes[0], 2); + if (ret < 0) { + fprintf(stderr, + "io_uring_peek_batch_cqe failed %d\n", + ret); + goto err; + } + cqe_count =3D ret; + } + for (j =3D 0; j < cqe_count; j++) { + assert(cqes[j]->user_data < N); + ud =3D cqes[j]->user_data; + completions[ud]++; + assert(queued < QUEUE_LENGTH); + queue[queued++] =3D (int)ud; + } + io_uring_cq_advance(&ring, cqe_count); + outstanding -=3D cqe_count; + } + + /* See if there were any drops by flushing the CQ ring *and* overflow *= / + do { + struct io_uring_cqe *cqe; + + ret =3D io_uring_flush_overflow(&ring); + if (ret < 0) { + if (ret =3D=3D -EBADR) { + fprintf(stderr, "CQE dropped\n"); + cqe_dropped =3D true; + break; + } + goto err; + } + if (outstanding && !io_uring_cq_ready(&ring)) + ret =3D io_uring_wait_cqe_timeout(&ring, &cqe, NULL); + + if (ret && ret !=3D -ETIME) { + if (ret =3D=3D -EBADR) { + fprintf(stderr, "CQE dropped\n"); + cqe_dropped =3D true; + break; + } + fprintf(stderr, "wait_cqe_timeout =3D %d\n", ret); + goto err; + } + count =3D io_uring_cq_ready(&ring); + io_uring_cq_advance(&ring, count); + outstanding -=3D count; + } while (count); + + io_uring_queue_exit(&ring); + + /* Make sure that completions come back in the same order they were + * sent. If they come back unfairly then this will concentrate on a + * couple of indices. + */ + for (i =3D 1; !cqe_dropped && i < N; i++) { + if (abs(completions[i] - completions[i - 1]) > 1) { + fprintf( + stderr, + "bad completion size %d %d\n", + completions[i], + completions[i - 1]); + goto err; + } + } + return 0; +err: + io_uring_queue_exit(&ring); + return 1; +} + int main(int argc, char *argv[]) { const char *fname =3D ".cq-overflow"; unsigned iters, drops; unsigned long usecs; int ret; + int i; =20 if (argc > 1) return 0; =20 + for (i =3D 0; i < 8; i++) { + bool batch =3D i & 1; + int mult =3D (i & 2) ? 1 : 2; + bool poll =3D i & 4; + + ret =3D test_overflow_handling(batch, mult, poll); + if (ret) { + fprintf(stderr, "test_overflow_handling(" + "batch=3D%d, mult=3D%d, poll=3D%d) failed\n", + batch, mult, poll); + goto err; + } + } + ret =3D test_overflow(); if (ret) { - printf("test_overflow failed\n"); + fprintf(stderr, "test_overflow failed\n"); return ret; } =20 --=20 2.30.2