* [RFC PATCH v2 0/4] liburing: add api for napi busy poll timeout
@ 2022-11-07 17:53 Stefan Roesch
  2022-11-07 17:53 ` [RFC PATCH v2 1/4] liburing: add api to set " Stefan Roesch
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Stefan Roesch @ 2022-11-07 17:53 UTC
To: kernel-team; +Cc: shr, axboe, olivier, netdev, io-uring, kuba, ammarfaizi2

This adds two new api's to set and clear the napi busy poll timeout. The
two new functions are called:
- io_uring_register_busy_poll_timeout and
- io_uring_unregister_busy_poll_timeout.

The patch series also contains the documentation for the two new functions
and two test programs. The client program is called napi-busy-poll-client
and the server program napi-busy-poll-server. The client measures the
roundtrip times of requests.

There is also a kernel patch "io-uring: support napi busy poll" to enable
this feature on the kernel side.

Changes:
- V2:
  - Updated the liburing.map file for the two new functions.
    (added a 2.4 section)
  - Added a description of the new feature to the changelog file
  - Fixed the indentation of the longopts structure
  - Used defined exit constants
  - Fixed encodeUserData to support 32 bit builds

Signed-off-by: Stefan Roesch <[email protected]>

Stefan Roesch (4):
  liburing: add api to set napi busy poll timeout
  liburing: add documentation for new napi busy polling
  liburing: add test programs for napi busy poll
  liburing: update changelog with new feature

 CHANGELOG                       |   3 +
 man/io_uring_register_napi.3    |  35 +++
 man/io_uring_unregister_napi.3  |  26 ++
 src/include/liburing.h          |   4 +
 src/include/liburing/io_uring.h |   4 +
 src/liburing.map                |   8 +
 src/register.c                  |  15 ++
 test/Makefile                   |   2 +
 test/napi-busy-poll-client.c    | 422 ++++++++++++++++++++++++++++++++
 test/napi-busy-poll-server.c    | 372 ++++++++++++++++++++++++++++
 10 files changed, 891 insertions(+)
 create mode 100644 man/io_uring_register_napi.3
 create mode 100644 man/io_uring_unregister_napi.3
 create mode 100644 test/napi-busy-poll-client.c
 create mode 100644 test/napi-busy-poll-server.c

base-commit: 754bc068ec482c5338a07dd74b7d3892729bb847
-- 
2.30.2
* [RFC PATCH v2 1/4] liburing: add api to set napi busy poll timeout 2022-11-07 17:53 [RFC PATCH v2 0/4] liburing: add api for napi busy poll timeout Stefan Roesch @ 2022-11-07 17:53 ` Stefan Roesch 2022-11-08 7:14 ` Ammar Faizi 2022-11-07 17:53 ` [RFC PATCH v2 2/4] liburing: add documentation for new napi busy polling Stefan Roesch ` (2 subsequent siblings) 3 siblings, 1 reply; 9+ messages in thread From: Stefan Roesch @ 2022-11-07 17:53 UTC (permalink / raw) To: kernel-team; +Cc: shr, axboe, olivier, netdev, io-uring, kuba, ammarfaizi2 This adds the two functions to register and unregister the napi busy poll timeout: - io_uring_register_napi_busy_poll_timeout - io_uring_unregister_napi_busy_poll_timeout Signed-off-by: Stefan Roesch <[email protected]> --- src/include/liburing.h | 4 ++++ src/include/liburing/io_uring.h | 4 ++++ src/liburing.map | 8 ++++++++ src/register.c | 15 +++++++++++++++ 4 files changed, 31 insertions(+) diff --git a/src/include/liburing.h b/src/include/liburing.h index 12a703f..6722fa2 100644 --- a/src/include/liburing.h +++ b/src/include/liburing.h @@ -235,6 +235,10 @@ int io_uring_register_sync_cancel(struct io_uring *ring, int io_uring_register_file_alloc_range(struct io_uring *ring, unsigned off, unsigned len); +int io_uring_register_napi_busy_poll_timeout(struct io_uring *ring, + unsigned int to); +int io_uring_unregister_napi_busy_poll_timeout(struct io_uring *ring); + int io_uring_get_events(struct io_uring *ring); int io_uring_submit_and_get_events(struct io_uring *ring); diff --git a/src/include/liburing/io_uring.h b/src/include/liburing/io_uring.h index a3e0920..0919d9e 100644 --- a/src/include/liburing/io_uring.h +++ b/src/include/liburing/io_uring.h @@ -499,6 +499,10 @@ enum { /* register a range of fixed file slots for automatic slot allocation */ IORING_REGISTER_FILE_ALLOC_RANGE = 25, + /* set/clear busy poll timeout */ + IORING_REGISTER_NAPI_BUSY_POLL_TIMEOUT = 26, + IORING_UNREGISTER_NAPI_BUSY_POLL_TIMEOUT= 27, + /* this goes 
last */ IORING_REGISTER_LAST }; diff --git a/src/liburing.map b/src/liburing.map index 06c64f8..793766e 100644 --- a/src/liburing.map +++ b/src/liburing.map @@ -60,6 +60,8 @@ LIBURING_2.3 { global: io_uring_register_sync_cancel; io_uring_register_file_alloc_range; + io_uring_register_busy_poll_timeout; + io_uring_unregister_busy_poll_timeout; io_uring_enter; io_uring_enter2; io_uring_setup; @@ -67,3 +69,9 @@ LIBURING_2.3 { io_uring_get_events; io_uring_submit_and_get_events; } LIBURING_2.2; + +LIBURING_2.4 { + global: + io_uring_napi_register_busy_poll_timeout; + io_uring_napi_unregister_busy_poll_timeout; +} LIBURING_2.3; diff --git a/src/register.c b/src/register.c index e849825..ffbfb8a 100644 --- a/src/register.c +++ b/src/register.c @@ -367,3 +367,18 @@ int io_uring_register_file_alloc_range(struct io_uring *ring, IORING_REGISTER_FILE_ALLOC_RANGE, &range, 0); } + +int io_uring_register_napi_busy_poll_timeout(struct io_uring *ring, + unsigned int to) +{ + return __sys_io_uring_register(ring->ring_fd, + IORING_REGISTER_NAPI_BUSY_POLL_TIMEOUT, + NULL, to); +} + +int io_uring_unregister_napi_busy_poll_timeout(struct io_uring *ring) +{ + return __sys_io_uring_register(ring->ring_fd, + IORING_UNREGISTER_NAPI_BUSY_POLL_TIMEOUT, + NULL, 0); +} -- 2.30.2 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [RFC PATCH v2 1/4] liburing: add api to set napi busy poll timeout
  2022-11-07 17:53 ` [RFC PATCH v2 1/4] liburing: add api to set " Stefan Roesch
@ 2022-11-08  7:14   ` Ammar Faizi
  0 siblings, 0 replies; 9+ messages in thread
From: Ammar Faizi @ 2022-11-08 7:14 UTC
To: Stefan Roesch, Facebook Kernel Team
Cc: Jens Axboe, Olivier Langlois, netdev Mailing List, io-uring Mailing List, Jakub Kicinski

On 11/8/22 12:53 AM, Stefan Roesch wrote:
> diff --git a/src/liburing.map b/src/liburing.map
> index 06c64f8..793766e 100644
> --- a/src/liburing.map
> +++ b/src/liburing.map
> @@ -60,6 +60,8 @@ LIBURING_2.3 {
>  	global:
>  		io_uring_register_sync_cancel;
>  		io_uring_register_file_alloc_range;
> +		io_uring_register_busy_poll_timeout;
> +		io_uring_unregister_busy_poll_timeout;
>  		io_uring_enter;
>  		io_uring_enter2;
>  		io_uring_setup;
> @@ -67,3 +69,9 @@ LIBURING_2.3 {
>  		io_uring_get_events;
>  		io_uring_submit_and_get_events;
>  } LIBURING_2.2;

I don't understand this part. You add:

   io_uring_register_busy_poll_timeout
   io_uring_unregister_busy_poll_timeout

in the LIBURING_2.3 section. What are they? I don't find their
declaration and definition. How do they differ from:

   io_uring_napi_register_busy_poll_timeout
   io_uring_napi_unregister_busy_poll_timeout

that you add in the LIBURING_2.4 section?

-- 
Ammar Faizi
* [RFC PATCH v2 2/4] liburing: add documentation for new napi busy polling 2022-11-07 17:53 [RFC PATCH v2 0/4] liburing: add api for napi busy poll timeout Stefan Roesch 2022-11-07 17:53 ` [RFC PATCH v2 1/4] liburing: add api to set " Stefan Roesch @ 2022-11-07 17:53 ` Stefan Roesch 2022-11-08 8:04 ` Ammar Faizi 2022-11-07 17:53 ` [RFC PATCH v2 3/4] liburing: add test programs for napi busy poll Stefan Roesch 2022-11-07 17:53 ` [RFC PATCH v2 4/4] liburing: update changelog with new feature Stefan Roesch 3 siblings, 1 reply; 9+ messages in thread From: Stefan Roesch @ 2022-11-07 17:53 UTC (permalink / raw) To: kernel-team; +Cc: shr, axboe, olivier, netdev, io-uring, kuba, ammarfaizi2 This adds two man pages for the two new functions: - io_uring_register_napi_busy_poll_timeout - io_uring_unregister_napi_busy_poll_timeout Signed-off-by: Stefan Roesch <[email protected]> --- man/io_uring_register_napi.3 | 35 ++++++++++++++++++++++++++++++++++ man/io_uring_unregister_napi.3 | 26 +++++++++++++++++++++++++ 2 files changed, 61 insertions(+) create mode 100644 man/io_uring_register_napi.3 create mode 100644 man/io_uring_unregister_napi.3 diff --git a/man/io_uring_register_napi.3 b/man/io_uring_register_napi.3 new file mode 100644 index 0000000..4ef591c --- /dev/null +++ b/man/io_uring_register_napi.3 @@ -0,0 +1,35 @@ +.\" Copyright (C) 2022 Stefan Roesch <[email protected]> +.\" +.\" SPDX-License-Identifier: LGPL-2.0-or-later +.\" +.TH io_uring_register_napi_busy_poll_timeout 3 "November 1, 2022" "liburing-2.3" "liburing Manual" +.SH NAME +io_uring_register_napi_busy_poll_timeout \- register NAPI busy poll timeout +.SH SYNOPSIS +.nf +.B #include <liburing.h> +.PP +.BI "int io_uring_register_napi_busy_poll_timeout(struct io_uring *" ring "," +.BI " unsigned int " timeout) +.PP +.fi +.SH DESCRIPTION +.PP +The +.BR io_uring_register_napi_busy_poll_timeout (3) +function registers the NAPI busy poll +.I timeout +for subsequent operations. 
+ +Registering a NAPI busy poll timeout is a requirement to be able to use +NAPI busy polling. The other way to enable NAPI busy polling is to set the +proc setting /proc/sys/net/core/busy_poll. + +NAPI busy poll can reduce the network roundtrip time. + + +.SH RETURN VALUE +On success +.BR io_uring_register_napi_busy_poll_timeout (3) +return 0. On failure they return +.BR -errno . diff --git a/man/io_uring_unregister_napi.3 b/man/io_uring_unregister_napi.3 new file mode 100644 index 0000000..3a73327 --- /dev/null +++ b/man/io_uring_unregister_napi.3 @@ -0,0 +1,26 @@ +.\" Copyright (C) 2022 Stefan Roesch <[email protected]> +.\" +.\" SPDX-License-Identifier: LGPL-2.0-or-later +.\" +.TH io_uring_unregister_napi_busy_poll_timeout 3 "November 1, 2022" "liburing-2.3" "liburing Manual" +.SH NAME +io_uring_unregister_napi_busy_poll_timeout \- unregister NAPI busy poll timeout +.SH SYNOPSIS +.nf +.B #include <liburing.h> +.PP +.BI "int io_uring_unregister_napi_busy_poll_timeout(struct io_uring *" ring ") +.PP +.fi +.SH DESCRIPTION +.PP +The +.BR io_uring_unregister_napi_busy_poll_timeout (3) +function unregisters the NAPI busy poll +for subsequent operations. + +.SH RETURN VALUE +On success +.BR io_uring_unregister_napi_busy_poll_timeout (3) +return 0. On failure they return +.BR -errno . -- 2.30.2 ^ permalink raw reply related [flat|nested] 9+ messages in thread
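The RETURN VALUE sections above describe a "0 on success, -errno on failure" convention, which means the error code travels in the return value rather than in the global errno. A minimal sketch of how a caller might turn such a return value into a message follows. The helper name result_message is hypothetical and not part of liburing or of this series.

```c
#include <errno.h>
#include <string.h>

/*
 * Hypothetical helper, not part of liburing: describe a return value that
 * follows the "0 on success, -errno on failure" convention from the man
 * pages above. The error code is taken from the return value itself, so
 * the global errno is never consulted.
 */
const char *result_message(int ret)
{
	if (ret >= 0)
		return "success";

	/* Negate the value to recover the positive errno code. */
	return strerror(-ret);
}
```

A caller would typically store the return value of the register call in an int and print result_message(ret) when it is negative.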
* Re: [RFC PATCH v2 2/4] liburing: add documentation for new napi busy polling
  2022-11-07 17:53 ` [RFC PATCH v2 2/4] liburing: add documentation for new napi busy polling Stefan Roesch
@ 2022-11-08  8:04   ` Ammar Faizi
  0 siblings, 0 replies; 9+ messages in thread
From: Ammar Faizi @ 2022-11-08 8:04 UTC
To: Stefan Roesch, Facebook Kernel Team
Cc: Jens Axboe, Olivier Langlois, netdev Mailing List, io-uring Mailing List, Jakub Kicinski

On 11/8/22 12:53 AM, Stefan Roesch wrote:
> +.TH io_uring_register_napi_busy_poll_timeout 3 "November 1, 2022" "liburing-2.3" "liburing Manual"

[...]

> +.TH io_uring_unregister_napi_busy_poll_timeout 3 "November 1, 2022" "liburing-2.3" "liburing Manual"

liburing-2.3 has already been released. These two should go to
liburing-2.4.

See: https://github.com/axboe/liburing/tags for the liburing version list.

-- 
Ammar Faizi
* [RFC PATCH v2 3/4] liburing: add test programs for napi busy poll 2022-11-07 17:53 [RFC PATCH v2 0/4] liburing: add api for napi busy poll timeout Stefan Roesch 2022-11-07 17:53 ` [RFC PATCH v2 1/4] liburing: add api to set " Stefan Roesch 2022-11-07 17:53 ` [RFC PATCH v2 2/4] liburing: add documentation for new napi busy polling Stefan Roesch @ 2022-11-07 17:53 ` Stefan Roesch 2022-11-08 7:01 ` Ammar Faizi 2022-11-07 17:53 ` [RFC PATCH v2 4/4] liburing: update changelog with new feature Stefan Roesch 3 siblings, 1 reply; 9+ messages in thread From: Stefan Roesch @ 2022-11-07 17:53 UTC (permalink / raw) To: kernel-team; +Cc: shr, axboe, olivier, netdev, io-uring, kuba, ammarfaizi2 This adds two test programs to test the napi busy poll functionality. It consists of a client program and a server program. To get a napi id, the client and the server program need to be run on different hosts. To test the napi busy poll timeout, the -t needs to be specified. A reasonable value for the busy poll timeout is 100. By specifying the busy poll timeout on the server and the client the best results are accomplished. 
Signed-off-by: Stefan Roesch <[email protected]> --- test/Makefile | 2 + test/napi-busy-poll-client.c | 422 +++++++++++++++++++++++++++++++++++ test/napi-busy-poll-server.c | 372 ++++++++++++++++++++++++++++++ 3 files changed, 796 insertions(+) create mode 100644 test/napi-busy-poll-client.c create mode 100644 test/napi-busy-poll-server.c diff --git a/test/Makefile b/test/Makefile index 8263e9f..8c606f9 100644 --- a/test/Makefile +++ b/test/Makefile @@ -105,6 +105,8 @@ test_srcs := \ mkdir.c \ msg-ring.c \ multicqes_drain.c \ + napi-busy-poll-client.c \ + napi-busy-poll-server.c \ nolibc.c \ nop-all-sizes.c \ nop.c \ diff --git a/test/napi-busy-poll-client.c b/test/napi-busy-poll-client.c new file mode 100644 index 0000000..0ae4afa --- /dev/null +++ b/test/napi-busy-poll-client.c @@ -0,0 +1,422 @@ +#include <ctype.h> +#include <errno.h> +#include <float.h> +#include <getopt.h> +#include <liburing.h> +#include <math.h> +#include <sched.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <sys/types.h> +#include <sys/socket.h> +#include <time.h> +#include <unistd.h> +#include <arpa/inet.h> +#include <netdb.h> +#include <netinet/in.h> + +#include "helpers.h" + +#define MAXBUFLEN 100 +#define PORTNOLEN 10 +#define ADDRLEN 80 +#define RINGSIZE 1024 + +#define printable(ch) (isprint((unsigned char)ch) ? 
ch : '#') + +enum { + IOURING_RECV, + IOURING_SEND, + IOURING_RECVMSG, + IOURING_SENDMSG +}; + +struct ctx +{ + struct io_uring ring; + struct sockaddr_in6 saddr; + + int sockfd; + int buffer_len; + int num_pings; + bool napi_check; + + union { + char buffer[MAXBUFLEN]; + struct timespec ts; + }; + + int rtt_index; + double *rtt; +} ctx; + +struct options +{ + int num_pings; + int timeout; + bool sq_poll; + bool busy_loop; + char port[PORTNOLEN]; + char addr[ADDRLEN]; +} options; + +struct option longopts[] = +{ + {"address" , 1, NULL, 'a'}, + {"busy" , 0, NULL, 'b'}, + {"help" , 0, NULL, 'h'}, + {"num_pings", 1, NULL, 'n'}, + {"port" , 1, NULL, 'p'}, + {"sqpoll" , 0, NULL, 's'}, + {"timeout" , 1, NULL, 't'}, + {NULL , 0, NULL, 0 } +}; + +void printUsage(const char *name) +{ + fprintf(stderr, + "Usage: %s [-l|--listen] [-a|--address ip_address] [-p|--port port-no] [-s|--sqpoll]" + " [-b|--busy] [-n|--num pings] [-t|--timeout busy-poll-timeout] [-h|--help]\n" + "--address\n" + "-a : remote or local ipv6 address\n" + "--busy\n" + "-b : busy poll io_uring instead of blocking.\n" + "--num_pings\n" + "-n : number of pings\n" + "--port\n" + "-p : port\n" + "--sqpoll\n" + "-s : Configure io_uring to use SQPOLL thread\n" + "--timeout\n" + "-t : Configure NAPI busy poll timeoutn" + "--help\n" + "-h : Display this usage message\n\n", + name); +} + +void printError(const char *msg, int opt) +{ + if (msg && opt) + fprintf(stderr, "%s (-%c)\n", msg, printable(opt)); +} + +void setProcessScheduler(void) +{ + struct sched_param param; + + param.sched_priority = sched_get_priority_max(SCHED_FIFO); + if (sched_setscheduler(0, SCHED_FIFO, &param) < 0) + fprintf(stderr, "sched_setscheduler() failed: (%d) %s\n", + errno, strerror(errno)); +} + +double diffTimespec(const struct timespec *time1, const struct timespec *time0) +{ + return (time1->tv_sec - time0->tv_sec) + + (time1->tv_nsec - time0->tv_nsec) / 1000000000.0; +} + +uint64_t encodeUserData(char type, int fd) +{ + return 
(uint32_t)fd | ((uint64_t)type << 56); +} + +void decodeUserData(uint64_t data, char *type, int *fd) +{ + *type = data >> 56; + *fd = data & 0xffffffffU; +} + +const char *opTypeToStr(char type) +{ + const char *res; + + switch (type) { + case IOURING_RECV: + res = "IOURING_RECV"; + break; + case IOURING_SEND: + res = "IOURING_SEND"; + break; + case IOURING_RECVMSG: + res = "IOURING_RECVMSG"; + break; + case IOURING_SENDMSG: + res = "IOURING_SENDMSG"; + break; + default: + res = "Unknown"; + } + + return res; +} + +void reportNapi(struct ctx *ctx) +{ + unsigned int napi_id = 0; + socklen_t len = sizeof(napi_id); + + getsockopt(ctx->sockfd, SOL_SOCKET, SO_INCOMING_NAPI_ID, &napi_id, &len); + if (napi_id) + printf(" napi id: %d\n", napi_id); + else + printf(" unassigned napi id\n"); + + ctx->napi_check = true; +} + +void sendPing(struct ctx *ctx) +{ + struct io_uring_sqe *sqe = io_uring_get_sqe(&ctx->ring); + + clock_gettime(CLOCK_REALTIME, (struct timespec *)ctx->buffer); + + io_uring_prep_send(sqe, ctx->sockfd, ctx->buffer, sizeof(struct timespec), 0); + sqe->user_data = encodeUserData(IOURING_SEND, ctx->sockfd); +} + +void receivePing(struct ctx *ctx) +{ + struct io_uring_sqe *sqe = io_uring_get_sqe(&ctx->ring); + + io_uring_prep_recv(sqe, ctx->sockfd, ctx->buffer, MAXBUFLEN, 0); + sqe->user_data = encodeUserData(IOURING_RECV, ctx->sockfd); +} + +void recordRTT(struct ctx *ctx) +{ + struct timespec startTs = ctx->ts; + + // Send next ping. + sendPing(ctx); + + // Store round-trip time. + ctx->rtt[ctx->rtt_index] = diffTimespec(&ctx->ts, &startTs); + ctx->rtt_index++; +} + +void printStats(struct ctx *ctx) +{ + double minRTT = DBL_MAX; + double maxRTT = 0.0; + double avgRTT = 0.0; + double stddevRTT = 0.0; + + // Calculate min, max, avg. 
+ for (int i = 0; i < ctx->rtt_index; i++) { + if (ctx->rtt[i] < minRTT) + minRTT = ctx->rtt[i]; + if (ctx->rtt[i] > maxRTT) + maxRTT = ctx->rtt[i]; + + avgRTT += ctx->rtt[i]; + } + avgRTT /= ctx->rtt_index; + + // Calculate stddev. + for (int i = 0; i < ctx->rtt_index; i++) + stddevRTT += fabs(ctx->rtt[i] - avgRTT); + stddevRTT /= ctx->rtt_index; + + fprintf(stdout, " rtt(us) min/avg/max/mdev = %.3f/%.3f/%.3f/%.3f\n", + minRTT * 1000000, avgRTT * 1000000, maxRTT * 1000000, stddevRTT * 1000000); +} + +void completion(struct ctx *ctx, struct io_uring_cqe *cqe) +{ + char type; + int fd; + int res = cqe->res; + + decodeUserData(cqe->user_data, &type, &fd); + if (res < 0) { + fprintf(stderr, "unexpected %s failure: (%d) %s\n", + opTypeToStr(type), -res, strerror(-res)); + abort(); + } + + switch (type) { + case IOURING_SEND: + receivePing(ctx); + break; + case IOURING_RECV: + if (res != sizeof(struct timespec)) { + fprintf(stderr, "unexpected ping reply len: %d\n", res); + abort(); + } + + if (!ctx->napi_check) { + reportNapi(ctx); + sendPing(ctx); + } else { + recordRTT(ctx); + } + + --ctx->num_pings; + break; + + default: + fprintf(stderr, "unexpected %s completion\n", + opTypeToStr(type)); + abort(); + break; + } +} + +int main(int argc, char *argv[]) +{ + struct ctx ctx; + struct options opt; + struct __kernel_timespec *tsPtr; + struct __kernel_timespec ts; + struct io_uring_params params; + int flag; + + memset(&opt, 0, sizeof(struct options)); + + // Process flags. 
+ while ((flag = getopt_long(argc, argv, ":hsba:n:p:t:", longopts, NULL)) != -1) { + switch (flag) { + case 'a': + strcpy(opt.addr, optarg); + break; + case 'b': + opt.busy_loop = true; + break; + case 'h': + printUsage(argv[0]); + exit(0); + break; + case 'n': + opt.num_pings = atoi(optarg) + 1; + break; + case 'p': + strcpy(opt.port, optarg); + break; + case 's': + opt.sq_poll = true; + break; + case 't': + opt.timeout = atoi(optarg); + break; + case ':': + printError("Missing argument", optopt); + printUsage(argv[0]); + exit(-1); + break; + case '?': + printError("Unrecognized option", optopt); + printUsage(argv[0]); + exit(-1); + break; + + default: + fprintf(stderr, "Fatal: Unexpected case in CmdLineProcessor switch()\n"); + exit(-1); + break; + } + } + + if (strlen(opt.addr) == 0) { + fprintf(stderr, "address option is mandatory\n"); + printUsage(argv[0]); + exit(T_EXIT_FAIL); + } + + ctx.saddr.sin6_port = htons(atoi(opt.port)); + ctx.saddr.sin6_family = AF_INET6; + + if (inet_pton(AF_INET6, opt.addr, &ctx.saddr.sin6_addr) <= 0) { + fprintf(stderr, "inet_pton error for %s\n", optarg); + printUsage(argv[0]); + exit(T_EXIT_FAIL); + } + + // Connect to server. + fprintf(stdout, "Connecting to %s... (port=%s) to send %d pings\n", opt.addr, opt.port, opt.num_pings - 1); + + if ((ctx.sockfd = socket(AF_INET6, SOCK_DGRAM, 0)) < 0) { + fprintf(stderr, "socket() failed: (%d) %s\n", errno, strerror(errno)); + exit(T_EXIT_FAIL); + } + + if (connect(ctx.sockfd, (struct sockaddr *)&ctx.saddr, sizeof(struct sockaddr_in6)) < 0) { + fprintf(stderr, "connect() failed: (%d) %s\n", errno, strerror(errno)); + exit(T_EXIT_FAIL); + } + + // Setup ring. 
+ memset(&params, 0, sizeof(params)); + memset(&ts, 0, sizeof(ts)); + + if (opt.sq_poll) { + params.flags = IORING_SETUP_SQPOLL; + params.sq_thread_idle = 50; + } + + if (io_uring_queue_init_params(RINGSIZE, &ctx.ring, &params) < 0) { + fprintf(stderr, "io_uring_queue_init_params() failed: (%d) %s\n", + errno, strerror(errno)); + exit(T_EXIT_FAIL); + } + + if (opt.timeout) + io_uring_register_napi_busy_poll_timeout(&ctx.ring, opt.timeout); + + if (opt.busy_loop) + tsPtr = &ts; + else + tsPtr = NULL; + + + // Use realtime scheduler. + setProcessScheduler(); + + // Copy payload. + clock_gettime(CLOCK_REALTIME, &ctx.ts); + + // Setup context. + ctx.napi_check = false; + ctx.buffer_len = sizeof(struct timespec); + ctx.num_pings = opt.num_pings; + + ctx.rtt_index = 0; + ctx.rtt = (double *)malloc(sizeof(double) * opt.num_pings); + if (!ctx.rtt) { + fprintf(stderr, "Cannot allocate results array\n"); + exit(T_EXIT_FAIL); + } + + // Send initial message to get napi id. + sendPing(&ctx); + + while (ctx.num_pings != 0) { + int res; + unsigned num_completed = 0; + unsigned head; + struct io_uring_cqe *cqe; + + do { + res = io_uring_submit_and_wait_timeout(&ctx.ring, &cqe, 1, tsPtr, NULL); + } + while (res < 0 && errno == ETIME); + + io_uring_for_each_cqe(&ctx.ring, head, cqe) { + ++num_completed; + completion(&ctx, cqe); + } + + if (num_completed) + io_uring_cq_advance(&ctx.ring, num_completed); + } + + printStats(&ctx); + free(ctx.rtt); + io_uring_queue_exit(&ctx.ring); + + // Clean up. 
+ close(ctx.sockfd); + + return 0; +} diff --git a/test/napi-busy-poll-server.c b/test/napi-busy-poll-server.c new file mode 100644 index 0000000..535f556 --- /dev/null +++ b/test/napi-busy-poll-server.c @@ -0,0 +1,372 @@ +#include <ctype.h> +#include <errno.h> +#include <getopt.h> +#include <liburing.h> +#include <math.h> +#include <sched.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <sys/types.h> +#include <sys/socket.h> +#include <time.h> +#include <unistd.h> +#include <arpa/inet.h> +#include <netdb.h> +#include <netinet/in.h> + +#include "helpers.h" + +#define MAXBUFLEN 100 +#define PORTNOLEN 10 +#define ADDRLEN 80 +#define RINGSIZE 1024 + +#define printable(ch) (isprint((unsigned char)ch) ? ch : '#') + +enum { + IOURING_RECV, + IOURING_SEND, + IOURING_RECVMSG, + IOURING_SENDMSG +}; + +struct ctx +{ + struct io_uring ring; + struct sockaddr_in6 saddr; + struct iovec iov; + struct msghdr msg; + + int sockfd; + int buffer_len; + int num_pings; + bool napi_check; + + union { + char buffer[MAXBUFLEN]; + struct timespec ts; + }; +} ctx; + +struct options +{ + int num_pings; + int timeout; + + bool listen; + bool sq_poll; + bool busy_loop; + + char port[PORTNOLEN]; + char addr[ADDRLEN]; +} options; + +struct option longopts[] = +{ + {"address" , 1, NULL, 'a'}, + {"busy" , 0, NULL, 'b'}, + {"help" , 0, NULL, 'h'}, + {"listen" , 0, NULL, 'l'}, + {"num_pings", 1, NULL, 'n'}, + {"port" , 1, NULL, 'p'}, + {"sqpoll" , 0, NULL, 's'}, + {"timeout" , 1, NULL, 't'}, + {NULL , 0, NULL, 0 } +}; + +void printUsage(const char *name) +{ + fprintf(stderr, + "Usage: %s [-l|--listen] [-a|--address ip_address] [-p|--port port-no] [-s|--sqpoll]" + " [-b|--busy] [-n|--num pings] [-t|--timeout busy-poll-timeout] [-h|--help]\n" + " --listen\n" + "-l : Server mode\n" + "--address\n" + "-a : remote or local ipv6 address\n" + "--busy\n" + "-b : busy poll io_uring instead of blocking.\n" + "--num_pings\n" + "-n : number of pings\n" + "--port\n" + "-p : port\n" + 
"--sqpoll\n" + "-s : Configure io_uring to use SQPOLL thread\n" + "--timeout\n" + "-t : Configure NAPI busy poll timeoutn" + "--help\n" + "-h : Display this usage message\n\n", + name); +} + +void printError(const char *msg, int opt) +{ + if (msg && opt) + fprintf(stderr, "%s (-%c)\n", msg, printable(opt)); +} + +void setProcessScheduler() +{ + struct sched_param param; + + param.sched_priority = sched_get_priority_max(SCHED_FIFO); + if (sched_setscheduler(0, SCHED_FIFO, &param) < 0) + fprintf(stderr, "sched_setscheduler() failed: (%d) %s\n", + errno, strerror(errno)); +} + +uint64_t encodeUserData(char type, int fd) +{ + return (uint32_t)fd | ((__u64)type << 56); +} + +void decodeUserData(uint64_t data, char *type, int *fd) +{ + *type = data >> 56; + *fd = data & 0xffffffffU; +} + +const char *opTypeToStr(char type) +{ + const char *res; + + switch (type) { + case IOURING_RECV: + res = "IOURING_RECV"; + break; + case IOURING_SEND: + res = "IOURING_SEND"; + break; + case IOURING_RECVMSG: + res = "IOURING_RECVMSG"; + break; + case IOURING_SENDMSG: + res = "IOURING_SENDMSG"; + break; + default: + res = "Unknown"; + } + + return res; +} + +void reportNapi(struct ctx *ctx) +{ + unsigned int napi_id = 0; + socklen_t len = sizeof(napi_id); + + getsockopt(ctx->sockfd, SOL_SOCKET, SO_INCOMING_NAPI_ID, &napi_id, &len); + if (napi_id) + printf(" napi id: %d\n", napi_id); + else + printf(" unassigned napi id\n"); + + ctx->napi_check = true; +} + +void sendPing(struct ctx *ctx) +{ + + struct io_uring_sqe *sqe = io_uring_get_sqe(&ctx->ring); + + io_uring_prep_sendmsg(sqe, ctx->sockfd, &ctx->msg, 0); + sqe->user_data = encodeUserData(IOURING_SENDMSG, ctx->sockfd); +} + +void receivePing(struct ctx *ctx) +{ + bzero(&ctx->msg, sizeof(struct msghdr)); + ctx->msg.msg_name = &ctx->saddr; + ctx->msg.msg_namelen = sizeof(struct sockaddr_in6); + ctx->iov.iov_base = ctx->buffer; + ctx->iov.iov_len = MAXBUFLEN; + ctx->msg.msg_iov = &ctx->iov; + ctx->msg.msg_iovlen = 1; + + struct io_uring_sqe 
*sqe = io_uring_get_sqe(&ctx->ring); + io_uring_prep_recvmsg(sqe, ctx->sockfd, &ctx->msg, 0); + sqe->user_data = encodeUserData(IOURING_RECVMSG, ctx->sockfd); +} + +void completion(struct ctx *ctx, struct io_uring_cqe *cqe) +{ + char type; + int fd; + int res = cqe->res; + + decodeUserData(cqe->user_data, &type, &fd); + if (res < 0) { + fprintf(stderr, "unexpected %s failure: (%d) %s\n", + opTypeToStr(type), -res, strerror(-res)); + abort(); + } + + switch (type) { + case IOURING_SENDMSG: + receivePing(ctx); + --ctx->num_pings; + break; + case IOURING_RECVMSG: + ctx->iov.iov_len = res; + sendPing(ctx); + if (!ctx->napi_check) + reportNapi(ctx); + break; + default: + fprintf(stderr, "unexpected %s completion\n", + opTypeToStr(type)); + abort(); + break; + } +} + +int main(int argc, char *argv[]) +{ + int flag; + struct ctx ctx; + struct options opt; + struct __kernel_timespec *tsPtr; + struct __kernel_timespec ts; + struct io_uring_params params; + + memset(&opt, 0, sizeof(struct options)); + + // Process flags. 
+ while ((flag = getopt_long(argc, argv, ":lhsba:n:p:t:", longopts, NULL)) != -1) { + switch (flag) { + case 'a': + strcpy(opt.addr, optarg); + break; + case 'b': + opt.busy_loop = true; + break; + case 'h': + printUsage(argv[0]); + exit(0); + break; + case 'l': + opt.listen = true; + break; + case 'n': + opt.num_pings = atoi(optarg) + 1; + break; + case 'p': + strcpy(opt.port, optarg); + break; + case 's': + opt.sq_poll = true; + break; + case 't': + opt.timeout = atoi(optarg); + break; + case ':': + printError("Missing argument", optopt); + printUsage(argv[0]); + exit(-1); + break; + case '?': + printError("Unrecognized option", optopt); + printUsage(argv[0]); + exit(-1); + break; + + default: + fprintf(stderr, "Fatal: Unexpected case in CmdLineProcessor switch()\n"); + exit(-1); + break; + } + } + + if (strlen(opt.addr) == 0) { + fprintf(stderr, "address option is mandatory\n"); + printUsage(argv[0]); + exit(T_EXIT_FAIL); + } + + ctx.saddr.sin6_port = htons(atoi(opt.port)); + ctx.saddr.sin6_family = AF_INET6; + + if (inet_pton(AF_INET6, opt.addr, &ctx.saddr.sin6_addr) <= 0) { + fprintf(stderr, "inet_pton error for %s\n", optarg); + printUsage(argv[0]); + exit(T_EXIT_FAIL); + } + + // Connect to server. + fprintf(stdout, "Listening %s : %s...\n", opt.addr, opt.port); + + if ((ctx.sockfd = socket(AF_INET6, SOCK_DGRAM, 0)) < 0) { + fprintf(stderr, "socket() failed: (%d) %s\n", errno, strerror(errno)); + exit(T_EXIT_FAIL); + } + + if (bind(ctx.sockfd, (struct sockaddr *)&ctx.saddr, sizeof(struct sockaddr_in6)) < 0) { + fprintf(stderr, "bind() failed: (%d) %s\n", errno, strerror(errno)); + exit(T_EXIT_FAIL); + } + + // Setup ring. 
+ memset(&params, 0, sizeof(params)); + memset(&ts, 0, sizeof(ts)); + + if (opt.sq_poll) { + params.flags = IORING_SETUP_SQPOLL; + params.sq_thread_idle = 50; + } + + if (io_uring_queue_init_params(RINGSIZE, &ctx.ring, &params) < 0) { + fprintf(stderr, "io_uring_queue_init_params() failed: (%d) %s\n", + errno, strerror(errno)); + exit(T_EXIT_FAIL); + } + + if (opt.timeout) + io_uring_register_napi_busy_poll_timeout(&ctx.ring, opt.timeout); + + if (opt.busy_loop) + tsPtr = &ts; + else + tsPtr = NULL; + + + // Use realtime scheduler. + setProcessScheduler(); + + // Copy payload. + clock_gettime(CLOCK_REALTIME, &ctx.ts); + + // Setup context. + ctx.napi_check = false; + ctx.buffer_len = sizeof(struct timespec); + ctx.num_pings = opt.num_pings; + + // Receive initial message to get napi id. + receivePing(&ctx); + + while (ctx.num_pings != 0) { + int res; + unsigned int num_completed = 0; + unsigned int head; + struct io_uring_cqe *cqe; + + do { + res = io_uring_submit_and_wait_timeout(&ctx.ring, &cqe, 1, tsPtr, NULL); + } + while (res < 0 && errno == ETIME); + + io_uring_for_each_cqe(&ctx.ring, head, cqe) { + ++num_completed; + completion(&ctx, cqe); + } + + if (num_completed) { + io_uring_cq_advance(&ctx.ring, num_completed); + } + } + + // Clean up. + io_uring_queue_exit(&ctx.ring); + close(ctx.sockfd); + + return 0; +} -- 2.30.2 ^ permalink raw reply related [flat|nested] 9+ messages in thread
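Both test programs in the patch above pack an operation tag and a file descriptor into the 64-bit user_data field of each SQE, and unpack them again when the CQE arrives. The standalone sketch below illustrates that bit layout (tag in bits 56..63, descriptor in bits 0..31) outside of io_uring. The snake_case names, the uint8_t tag type, and the round_trips helper are assumptions made for this illustration, not the patch's own code.

```c
#include <assert.h>
#include <stdint.h>

/* Operation tags, in the same order as the enum in the test programs. */
enum { IOURING_RECV, IOURING_SEND, IOURING_RECVMSG, IOURING_SENDMSG };

/*
 * Illustrative variant of the patch's encodeUserData(): the tag goes into
 * the top byte and the descriptor into the low 32 bits. Casting fd through
 * uint32_t before widening keeps the result identical on 32-bit and 64-bit
 * builds. A uint8_t tag is used here (an assumption of this sketch) so the
 * shift never sees a negative value.
 */
uint64_t encode_user_data(uint8_t type, int fd)
{
	assert(fd >= 0);
	return (uint64_t)(uint32_t)fd | ((uint64_t)type << 56);
}

/* Inverse of encode_user_data(): recover the tag and the descriptor. */
void decode_user_data(uint64_t data, uint8_t *type, int *fd)
{
	*type = (uint8_t)(data >> 56);
	*fd = (int)(data & 0xffffffffU);
}

/* Hypothetical check helper: returns 1 if a value survives a round trip. */
int round_trips(uint8_t type, int fd)
{
	uint8_t t;
	int f;

	decode_user_data(encode_user_data(type, fd), &t, &f);
	return t == type && f == fd;
}
```

Bits 32..55 stay unused in this layout, which leaves room for extra per-request state if a test ever needs it.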
* Re: [RFC PATCH v2 3/4] liburing: add test programs for napi busy poll
  2022-11-07 17:53 ` [RFC PATCH v2 3/4] liburing: add test programs for napi busy poll Stefan Roesch
@ 2022-11-08  7:01   ` Ammar Faizi
  2022-11-08  9:47     ` Ammar Faizi
  0 siblings, 1 reply; 9+ messages in thread
From: Ammar Faizi @ 2022-11-08  7:01 UTC (permalink / raw)
  To: Stefan Roesch, Facebook Kernel Team
  Cc: Jens Axboe, Olivier Langlois, netdev Mailing List,
	io-uring Mailing List, Jakub Kicinski

On 11/8/22 12:53 AM, Stefan Roesch wrote:
> This adds two test programs to test the napi busy poll functionality. It
> consists of a client program and a server program. To get a napi id, the
> client and the server program need to be run on different hosts.
>
> To test the napi busy poll timeout, the -t needs to be specified. A
> reasonable value for the busy poll timeout is 100. By specifying the
> busy poll timeout on the server and the client the best results are
> accomplished.
>
> Signed-off-by: Stefan Roesch <[email protected]>
> ---
>  test/Makefile                |   2 +
>  test/napi-busy-poll-client.c | 422 +++++++++++++++++++++++++++++++++++
>  test/napi-busy-poll-server.c | 372 ++++++++++++++++++++++++++++++
>  3 files changed, 796 insertions(+)
>  create mode 100644 test/napi-busy-poll-client.c
>  create mode 100644 test/napi-busy-poll-server.c

Hi Stefan,

We don't write liburing tests this way. Your new tests break the "make runtests"
command:

   ...
   ...
   Running test napi-busy-poll-client.t address option is mandatory
   Usage: ./napi-busy-poll-client.t [-l|--listen] [-a|--address ip_address] [-p|--port port-no] [-s|--sqpoll] [-b|--busy] [-n|--num pings] [-t|--timeout busy-poll-timeout] [-h|--help]
   ... snip ...

   Test napi-busy-poll-client.t failed with ret 1
   Running test napi-busy-poll-server.t address option is mandatory
   Usage: ./napi-busy-poll-server.t [-l|--listen] [-a|--address ip_address] [-p|--port port-no] [-s|--sqpoll] [-b|--busy] [-n|--num pings] [-t|--timeout busy-poll-timeout] [-h|--help]
   ... snip ...
   ...
   ...
   Tests failed (3): <napi-busy-poll-client.t> <napi-busy-poll-server.t> <pipe-bug.t>
   make[1]: *** [Makefile:235: runtests] Error 1
   make[1]: Leaving directory '/home/ammarfaizi2/app/liburing/test'
   make: *** [Makefile:21: runtests] Error 2

All test programs in the "test/" directory are run by the "make runtests"
command, which invokes them without arguments. Please try to run them with the
"make runtests" command.

If you want to test several combinations of arguments, you can do something
like this:

   https://github.com/axboe/liburing/blob/754bc068ec482/test/socket.c#L369-L409

Note: Since you're adding a new feature, your test program should check whether
the running kernel supports the new feature. If the running kernel doesn't
support it, use T_EXIT_SKIP as the exit code.

-- 
Ammar Faizi

^ permalink raw reply [flat|nested] 9+ messages in thread
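The suggestion above, covering several argument combinations from a single run that needs no command-line options, could look roughly like the following. This is only a sketch and is not taken from the linked socket.c: `struct test_opts`, `case_fn`, `run_all_cases`, `count_case` and `cases_seen` are hypothetical names, and the timeout of 100 simply reuses the value the commit message calls reasonable.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical option set mirroring the -s, -b and -t switches. */
struct test_opts {
	bool sq_poll;
	bool busy_loop;
	unsigned int timeout;	/* busy poll timeout, 0 means "not set" */
};

/* Hypothetical per-case callback: returns 0 on success, nonzero on failure. */
typedef int (*case_fn)(const struct test_opts *opts);

/* Run all eight combinations of the three switches and return the number
 * of failing cases, so the caller can turn that into one exit code. */
static int run_all_cases(case_fn run_case)
{
	int failures = 0;

	for (unsigned int mask = 0; mask < 8; mask++) {
		struct test_opts opts = {
			.sq_poll = mask & 1,
			.busy_loop = mask & 2,
			.timeout = (mask & 4) ? 100 : 0,
		};

		if (run_case(&opts) != 0) {
			fprintf(stderr, "case sqpoll=%d busy=%d timeout=%u failed\n",
				opts.sq_poll, opts.busy_loop, opts.timeout);
			failures++;
		}
	}

	return failures;
}

/* Illustrative stand-in for a real test case: it only counts how often it
 * was invoked and always reports success. */
static unsigned int cases_seen;

static int count_case(const struct test_opts *opts)
{
	(void)opts;
	cases_seen++;
	return 0;
}
```

A test's main() could call `run_all_cases()` with its real case function and exit with a failure code if the returned count is nonzero. That lets the program run under "make runtests" with no options while still exercising each switch.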
* Re: [RFC PATCH v2 3/4] liburing: add test programs for napi busy poll
  2022-11-08  7:01   ` Ammar Faizi
@ 2022-11-08  9:47     ` Ammar Faizi
  0 siblings, 0 replies; 9+ messages in thread
From: Ammar Faizi @ 2022-11-08  9:47 UTC (permalink / raw)
  To: Stefan Roesch, Facebook Kernel Team
  Cc: Jens Axboe, Olivier Langlois, netdev Mailing List,
	io-uring Mailing List, Jakub Kicinski

On 11/8/22 2:01 PM, Ammar Faizi wrote:
> On 11/8/22 12:53 AM, Stefan Roesch wrote:
>> This adds two test programs to test the napi busy poll functionality. It
>> consists of a client program and a server program. To get a napi id, the
>> client and the server program need to be run on different hosts.
>>
>> To test the napi busy poll timeout, the -t needs to be specified. A
>> reasonable value for the busy poll timeout is 100. By specifying the
>> busy poll timeout on the server and the client the best results are
>> accomplished.
>>
>> Signed-off-by: Stefan Roesch <[email protected]>
>> ---
>>  test/Makefile                |   2 +
>>  test/napi-busy-poll-client.c | 422 +++++++++++++++++++++++++++++++++++
>>  test/napi-busy-poll-server.c | 372 ++++++++++++++++++++++++++++++
>>  3 files changed, 796 insertions(+)
>>  create mode 100644 test/napi-busy-poll-client.c
>>  create mode 100644 test/napi-busy-poll-server.c
>
> Hi Stefan,
>
> We don't write liburing tests this way. Your new tests break the "make runtests"
> command:
>
>    ...
>    ...
>    Running test napi-busy-poll-client.t address option is mandatory
>    Usage: ./napi-busy-poll-client.t [-l|--listen] [-a|--address ip_address] [-p|--port port-no] [-s|--sqpoll] [-b|--busy] [-n|--num pings] [-t|--timeout busy-poll-timeout] [-h|--help]
>    ... snip ...
>
>    Test napi-busy-poll-client.t failed with ret 1
>    Running test napi-busy-poll-server.t address option is mandatory
>    Usage: ./napi-busy-poll-server.t [-l|--listen] [-a|--address ip_address] [-p|--port port-no] [-s|--sqpoll] [-b|--busy] [-n|--num pings] [-t|--timeout busy-poll-timeout] [-h|--help]
>    ... snip ...
>    ...
>    ...
>    Tests failed (3): <napi-busy-poll-client.t> <napi-busy-poll-server.t> <pipe-bug.t>
>    make[1]: *** [Makefile:235: runtests] Error 1
>    make[1]: Leaving directory '/home/ammarfaizi2/app/liburing/test'
>    make: *** [Makefile:21: runtests] Error 2
>
> All test programs in the "test/" directory are run by "make runtests" command.
> Please try to run them with "make runtests" command.
>
> If you want to test several arguments combination variants, you can do something
> like this:
>
>    https://github.com/axboe/liburing/blob/754bc068ec482/test/socket.c#L369-L409
>
> Note: Since you're adding a new feature, your test program should check whether
> the running kernel supports the new feature. If the running kernel doesn't
> support it, use T_EXIT_SKIP as the exit code.

Note: That example doesn't use T_EXIT_* as the exit code because this exit
code protocol is a new framework. We haven't finished porting the old tests to
follow this rule yet. It was introduced in this commit (in liburing-2.3):

   ed430fbeb3336 ("tests: migrate some tests to use enum-based exit codes")

New tests should follow the T_EXIT_* convention for exit codes.

-- 
Ammar Faizi

^ permalink raw reply [flat|nested] 9+ messages in thread
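To make the T_EXIT_* convention and the "skip on unsupported kernel" advice concrete, here is a hedged sketch of how a test might turn the result of the napi registration call into an exit code. The enum values are meant to mirror liburing's test/helpers.h but are reproduced here as an assumption so that the snippet stands alone. `exit_code_for_register` is a hypothetical helper, and treating -EINVAL as "feature not supported" is also an assumption about how an older kernel would answer.

```c
#include <errno.h>

/* Exit codes in the style of liburing's test/helpers.h. The numeric
 * values are an assumption; a real test would include helpers.h. */
enum t_test_result {
	T_EXIT_PASS = 0,
	T_EXIT_FAIL = 1,
	T_EXIT_SKIP = 77,
};

/* Hypothetical helper: map the return value of the registration call
 * (0 on success, negative errno value on failure) to a test exit code. */
static enum t_test_result exit_code_for_register(int ret)
{
	if (ret == 0)
		return T_EXIT_PASS;

	/* Assumption: a kernel without the feature rejects the request
	 * with -EINVAL, so the test is skipped rather than failed. */
	if (ret == -EINVAL)
		return T_EXIT_SKIP;

	return T_EXIT_FAIL;
}
```

A test could perform the registration once at startup and return early when the helper yields T_EXIT_SKIP, so that "make runtests" on an older kernel reports a skip instead of a failure.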
* [RFC PATCH v2 4/4] liburing: update changelog with new feature
  2022-11-07 17:53 [RFC PATCH v2 0/4] liburing: add api for napi busy poll timeout Stefan Roesch
                   ` (2 preceding siblings ...)
  2022-11-07 17:53 ` [RFC PATCH v2 3/4] liburing: add test programs for napi busy poll Stefan Roesch
@ 2022-11-07 17:53 ` Stefan Roesch
  3 siblings, 0 replies; 9+ messages in thread
From: Stefan Roesch @ 2022-11-07 17:53 UTC (permalink / raw)
  To: kernel-team; +Cc: shr, axboe, olivier, netdev, io-uring, kuba, ammarfaizi2

Add a new entry to the changelog file for the napi busy poll feature.

Signed-off-by: Stefan Roesch <[email protected]>
---
 CHANGELOG | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/CHANGELOG b/CHANGELOG
index 09511af..1db0269 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -1,3 +1,6 @@
+liburing-2.4 release
+- Support for napi busy polling
+
 liburing-2.3 release
 
 - Support non-libc build for aarch64.
-- 
2.30.2

^ permalink raw reply related [flat|nested] 9+ messages in thread
end of thread, other threads:[~2022-11-08  9:47 UTC | newest]

Thread overview: 9+ messages
2022-11-07 17:53 [RFC PATCH v2 0/4] liburing: add api for napi busy poll timeout Stefan Roesch
2022-11-07 17:53 ` [RFC PATCH v2 1/4] liburing: add api to set " Stefan Roesch
2022-11-08  7:14   ` Ammar Faizi
2022-11-07 17:53 ` [RFC PATCH v2 2/4] liburing: add documentation for new napi busy polling Stefan Roesch
2022-11-08  8:04   ` Ammar Faizi
2022-11-07 17:53 ` [RFC PATCH v2 3/4] liburing: add test programs for napi busy poll Stefan Roesch
2022-11-08  7:01   ` Ammar Faizi
2022-11-08  9:47     ` Ammar Faizi
2022-11-07 17:53 ` [RFC PATCH v2 4/4] liburing: update changelog with new feature Stefan Roesch