public inbox for [email protected]
* [RFC PATCH v3 0/4] liburing: add api for napi busy poll timeout
@ 2022-11-15  7:09 Stefan Roesch
  2022-11-15  7:09 ` [RFC PATCH v3 1/4] liburing: add api to set napi busy poll settings Stefan Roesch
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Stefan Roesch @ 2022-11-15  7:09 UTC (permalink / raw)
  To: kernel-team; +Cc: shr, axboe, olivier, netdev, io-uring, kuba, ammarfaizi2

This adds three new APIs to set/clear the napi busy poll timeout and to set
the napi prefer busy poll setting. The three new functions are called:
- io_uring_register_napi_busy_poll_timeout,
- io_uring_unregister_napi_busy_poll_timeout,
- io_uring_register_napi_prefer_busy_poll.

The patch series also contains the documentation for the three new functions
and two example programs. The client program is called napi-busy-poll-client
and the server program napi-busy-poll-server. The client measures the
roundtrip times of requests.
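
As a rough illustration of the roundtrip measurement (a minimal sketch, not
the code from the example program; rtt_stats, timespec_diff and rtt_summarize
are hypothetical names), each ping yields the difference of two timestamps,
and the samples are summarized as min/avg/max plus a mean absolute deviation:

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical summary record; all values are in seconds. */
struct rtt_stats {
	double min, avg, max, mdev;
};

/* Elapsed time t1 - t0 in seconds. */
static double timespec_diff(const struct timespec *t1,
			    const struct timespec *t0)
{
	return (double)(t1->tv_sec - t0->tv_sec)
		+ (double)(t1->tv_nsec - t0->tv_nsec) / 1000000000.0;
}

/* Summarize n roundtrip samples; mdev is the mean absolute deviation. */
static struct rtt_stats rtt_summarize(const double *rtt, size_t n)
{
	struct rtt_stats s = { DBL_MAX, 0.0, 0.0, 0.0 };
	size_t i;

	assert(n > 0);

	for (i = 0; i < n; i++) {
		if (rtt[i] < s.min)
			s.min = rtt[i];
		if (rtt[i] > s.max)
			s.max = rtt[i];
		s.avg += rtt[i];
	}
	s.avg /= (double)n;

	for (i = 0; i < n; i++) {
		double d = rtt[i] - s.avg;

		s.mdev += d < 0 ? -d : d;
	}
	s.mdev /= (double)n;

	return s;
}
```

Multiplying the results by 1000000 gives microseconds, which is the unit the
client prints.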

There is also a kernel patch "io-uring: support napi busy poll" to enable
this feature on the kernel side.

Changes:
- V3:
  - Updated liburing.map file
  - Moved example programs from the test directory to the example directory.
    The two example programs don't fit well in the test category and need to
    be run from separate hosts.
  - Added the io_uring_register_napi_prefer_busy_poll API.
  - Added the call to io_uring_register_napi_prefer_busy_poll to the example
    programs
  - Updated the documentation
- V2:
  - Updated the liburing.map file for the two new functions.
    (added a 2.4 section)
  - Added a description of the new feature to the changelog file
  - Fixed the indentation of the longopts structure
  - Used defined exit constants
  - Fixed encodeUserData to support 32 bit builds
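
For reference, the user_data packing used by the example programs can be
sketched as follows. This is only an illustration under the assumption that
the 32 bit issue was about widening before the shift; encode_user_data and
decode_user_data are hypothetical names, not the functions in the patch:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pack an operation type into the top byte and the file descriptor into
 * the low 32 bits of a 64-bit user_data value. Both operands are widened
 * to 64 bits first, so the result does not depend on the width of int or
 * long on the build target.
 */
static uint64_t encode_user_data(uint8_t type, int fd)
{
	assert(fd >= 0);
	return (uint64_t)(uint32_t)fd | ((uint64_t)type << 56);
}

/* Reverse of encode_user_data(). */
static void decode_user_data(uint64_t data, uint8_t *type, int *fd)
{
	*type = (uint8_t)(data >> 56);
	*fd   = (int)(uint32_t)(data & 0xffffffffU);
}
```

The completion handler can then recover both values from cqe->user_data
without keeping a separate per-request structure.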


Signed-off-by: Stefan Roesch <[email protected]>

Stefan Roesch (4):
  liburing: add api to set napi busy poll settings
  liburing: add documentation for new napi busy polling
  liburing: add test programs for napi busy poll
  liburing: update changelog with new feature

 .gitignore                                    |   2 +
 CHANGELOG                                     |   3 +
 examples/Makefile                             |   2 +
 examples/napi-busy-poll-client.c              | 432 ++++++++++++++++++
 examples/napi-busy-poll-server.c              | 380 +++++++++++++++
 ...io_uring_register_napi_busy_poll_timeout.3 |  35 ++
 man/io_uring_register_napi_prefer_busy_poll.3 |  35 ++
 ..._uring_unregister_napi_busy_poll_timeout.3 |  26 ++
 src/include/liburing.h                        |   6 +
 src/include/liburing/io_uring.h               |   4 +
 src/liburing.map                              |   7 +
 src/register.c                                |  23 +
 12 files changed, 955 insertions(+)
 create mode 100644 examples/napi-busy-poll-client.c
 create mode 100644 examples/napi-busy-poll-server.c
 create mode 100644 man/io_uring_register_napi_busy_poll_timeout.3
 create mode 100644 man/io_uring_register_napi_prefer_busy_poll.3
 create mode 100644 man/io_uring_unregister_napi_busy_poll_timeout.3


base-commit: 8fc22e3b3348c0a6384ec926e0b19b6707622e58
-- 
2.30.2



* [RFC PATCH v3 1/4] liburing: add api to set napi busy poll settings
  2022-11-15  7:09 [RFC PATCH v3 0/4] liburing: add api for napi busy poll timeout Stefan Roesch
@ 2022-11-15  7:09 ` Stefan Roesch
  2022-11-15  7:09 ` [RFC PATCH v3 2/4] liburing: add documentation for new napi busy polling Stefan Roesch
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Stefan Roesch @ 2022-11-15  7:09 UTC (permalink / raw)
  To: kernel-team; +Cc: shr, axboe, olivier, netdev, io-uring, kuba, ammarfaizi2

This adds three functions to manage the napi busy poll settings:
- io_uring_register_napi_busy_poll_timeout
- io_uring_unregister_napi_busy_poll_timeout
- io_uring_register_napi_prefer_busy_poll

Signed-off-by: Stefan Roesch <[email protected]>
---
 src/include/liburing.h          |  6 ++++++
 src/include/liburing/io_uring.h |  4 ++++
 src/liburing.map                |  7 +++++++
 src/register.c                  | 23 +++++++++++++++++++++++
 4 files changed, 40 insertions(+)

diff --git a/src/include/liburing.h b/src/include/liburing.h
index 12a703f..47bbced 100644
--- a/src/include/liburing.h
+++ b/src/include/liburing.h
@@ -235,6 +235,12 @@ int io_uring_register_sync_cancel(struct io_uring *ring,
 int io_uring_register_file_alloc_range(struct io_uring *ring,
 					unsigned off, unsigned len);
 
+int io_uring_register_napi_prefer_busy_poll(struct io_uring *ring,
+					    bool prefer_busy_poll);
+int io_uring_register_napi_busy_poll_timeout(struct io_uring *ring,
+					     unsigned int to);
+int io_uring_unregister_napi_busy_poll_timeout(struct io_uring *ring);
+
 int io_uring_get_events(struct io_uring *ring);
 int io_uring_submit_and_get_events(struct io_uring *ring);
 
diff --git a/src/include/liburing/io_uring.h b/src/include/liburing/io_uring.h
index a3e0920..2e53f52 100644
--- a/src/include/liburing/io_uring.h
+++ b/src/include/liburing/io_uring.h
@@ -499,6 +499,10 @@ enum {
 	/* register a range of fixed file slots for automatic slot allocation */
 	IORING_REGISTER_FILE_ALLOC_RANGE	= 25,
 
+	/* set/clear busy poll settings */
+	IORING_REGISTER_NAPI_PREFER_BUSY_POLL	= 26,
+	IORING_REGISTER_NAPI_BUSY_POLL_TIMEOUT	= 27,
+
 	/* this goes last */
 	IORING_REGISTER_LAST
 };
diff --git a/src/liburing.map b/src/liburing.map
index 06c64f8..2e41a40 100644
--- a/src/liburing.map
+++ b/src/liburing.map
@@ -67,3 +67,10 @@ LIBURING_2.3 {
 		io_uring_get_events;
 		io_uring_submit_and_get_events;
 } LIBURING_2.2;
+
+LIBURING_2.4 {
+	global:
+		io_uring_register_napi_prefer_busy_poll;
+		io_uring_register_napi_busy_poll_timeout;
+		io_uring_unregister_napi_busy_poll_timeout;
+} LIBURING_2.3;
diff --git a/src/register.c b/src/register.c
index e849825..50250b8 100644
--- a/src/register.c
+++ b/src/register.c
@@ -367,3 +367,26 @@ int io_uring_register_file_alloc_range(struct io_uring *ring,
 				       IORING_REGISTER_FILE_ALLOC_RANGE, &range,
 				       0);
 }
+
+int io_uring_register_napi_prefer_busy_poll(struct io_uring *ring,
+					    bool prefer_busy_poll)
+{
+	return __sys_io_uring_register(ring->ring_fd,
+				IORING_REGISTER_NAPI_PREFER_BUSY_POLL,
+				NULL, prefer_busy_poll);
+}
+
+int io_uring_register_napi_busy_poll_timeout(struct io_uring *ring,
+					     unsigned int to)
+{
+	return __sys_io_uring_register(ring->ring_fd,
+				IORING_REGISTER_NAPI_BUSY_POLL_TIMEOUT,
+				NULL, to);
+}
+
+int io_uring_unregister_napi_busy_poll_timeout(struct io_uring *ring)
+{
+	return __sys_io_uring_register(ring->ring_fd,
+				IORING_REGISTER_NAPI_BUSY_POLL_TIMEOUT,
+				NULL, 0);
+}
-- 
2.30.2



* [RFC PATCH v3 2/4] liburing: add documentation for new napi busy polling
  2022-11-15  7:09 [RFC PATCH v3 0/4] liburing: add api for napi busy poll timeout Stefan Roesch
  2022-11-15  7:09 ` [RFC PATCH v3 1/4] liburing: add api to set napi busy poll settings Stefan Roesch
@ 2022-11-15  7:09 ` Stefan Roesch
  2022-11-15  7:09 ` [RFC PATCH v3 3/4] liburing: add test programs for napi busy poll Stefan Roesch
  2022-11-15  7:09 ` [RFC PATCH v3 4/4] liburing: update changelog with new feature Stefan Roesch
  3 siblings, 0 replies; 5+ messages in thread
From: Stefan Roesch @ 2022-11-15  7:09 UTC (permalink / raw)
  To: kernel-team; +Cc: shr, axboe, olivier, netdev, io-uring, kuba, ammarfaizi2

This adds three man pages for the three new functions:
- io_uring_register_napi_busy_poll_timeout
- io_uring_register_napi_prefer_busy_poll
- io_uring_unregister_napi_busy_poll_timeout

Signed-off-by: Stefan Roesch <[email protected]>
---
 ...io_uring_register_napi_busy_poll_timeout.3 | 35 +++++++++++++++++++
 man/io_uring_register_napi_prefer_busy_poll.3 | 35 +++++++++++++++++++
 ..._uring_unregister_napi_busy_poll_timeout.3 | 26 ++++++++++++++
 3 files changed, 96 insertions(+)
 create mode 100644 man/io_uring_register_napi_busy_poll_timeout.3
 create mode 100644 man/io_uring_register_napi_prefer_busy_poll.3
 create mode 100644 man/io_uring_unregister_napi_busy_poll_timeout.3

diff --git a/man/io_uring_register_napi_busy_poll_timeout.3 b/man/io_uring_register_napi_busy_poll_timeout.3
new file mode 100644
index 0000000..3acce60
--- /dev/null
+++ b/man/io_uring_register_napi_busy_poll_timeout.3
@@ -0,0 +1,35 @@
+.\" Copyright (C) 2022 Stefan Roesch <[email protected]>
+.\"
+.\" SPDX-License-Identifier: LGPL-2.0-or-later
+.\"
+.TH io_uring_register_napi_busy_poll_timeout 3 "November 10, 2022" "liburing-2.4" "liburing Manual"
+.SH NAME
+io_uring_register_napi_busy_poll_timeout \- register NAPI busy poll timeout
+.SH SYNOPSIS
+.nf
+.B #include <liburing.h>
+.PP
+.BI "int io_uring_register_napi_busy_poll_timeout(struct io_uring *" ring ","
+.BI "                                             unsigned int " timeout)
+.PP
+.fi
+.SH DESCRIPTION
+.PP
+The
+.BR io_uring_register_napi_busy_poll_timeout (3)
+function registers the NAPI busy poll
+.I timeout
+for subsequent operations.
+
+Registering a NAPI busy poll timeout enables NAPI busy polling for the ring.
+The other way to enable NAPI busy polling is to set the sysctl
+/proc/sys/net/core/busy_poll.
+
+NAPI busy poll can reduce the network roundtrip time.
+
+
+.SH RETURN VALUE
+On success
+.BR io_uring_register_napi_busy_poll_timeout (3)
+returns 0. On failure it returns
+.BR -errno .
diff --git a/man/io_uring_register_napi_prefer_busy_poll.3 b/man/io_uring_register_napi_prefer_busy_poll.3
new file mode 100644
index 0000000..713840e
--- /dev/null
+++ b/man/io_uring_register_napi_prefer_busy_poll.3
@@ -0,0 +1,35 @@
+.\" Copyright (C) 2022 Stefan Roesch <[email protected]>
+.\"
+.\" SPDX-License-Identifier: LGPL-2.0-or-later
+.\"
+.TH io_uring_register_napi_prefer_busy_poll 3 "November 11, 2022" "liburing-2.4" "liburing Manual"
+.SH NAME
+io_uring_register_napi_prefer_busy_poll \- register NAPI prefer busy poll setting
+.SH SYNOPSIS
+.nf
+.B #include <liburing.h>
+.PP
+.BI "int io_uring_register_napi_prefer_busy_poll(struct io_uring *" ring ","
+.BI "                                            bool " prefer_busy_poll)
+.PP
+.fi
+.SH DESCRIPTION
+.PP
+The
+.BR io_uring_register_napi_prefer_busy_poll (3)
+function registers the NAPI
+.I prefer_busy_poll
+setting for subsequent operations.
+
+Registering a NAPI prefer busy poll setting sets the mode used when calling
+the function napi_busy_loop and corresponds to the SO_PREFER_BUSY_POLL socket
+option.
+
+NAPI prefer busy poll can help in reducing the network roundtrip time.
+
+
+.SH RETURN VALUE
+On success
+.BR io_uring_register_napi_prefer_busy_poll (3)
+returns 0. On failure it returns
+.BR -errno .
diff --git a/man/io_uring_unregister_napi_busy_poll_timeout.3 b/man/io_uring_unregister_napi_busy_poll_timeout.3
new file mode 100644
index 0000000..666e006
--- /dev/null
+++ b/man/io_uring_unregister_napi_busy_poll_timeout.3
@@ -0,0 +1,26 @@
+.\" Copyright (C) 2022 Stefan Roesch <[email protected]>
+.\"
+.\" SPDX-License-Identifier: LGPL-2.0-or-later
+.\"
+.TH io_uring_unregister_napi_busy_poll_timeout 3 "November 10, 2022" "liburing-2.4" "liburing Manual"
+.SH NAME
+io_uring_unregister_napi_busy_poll_timeout \- unregister NAPI busy poll timeout
+.SH SYNOPSIS
+.nf
+.B #include <liburing.h>
+.PP
+.BI "int io_uring_unregister_napi_busy_poll_timeout(struct io_uring *" ring ")"
+.PP
+.fi
+.SH DESCRIPTION
+.PP
+The
+.BR io_uring_unregister_napi_busy_poll_timeout (3)
+function unregisters the NAPI busy poll timeout
+for subsequent operations.
+
+.SH RETURN VALUE
+On success
+.BR io_uring_unregister_napi_busy_poll_timeout (3)
+returns 0. On failure it returns
+.BR -errno .
-- 
2.30.2



* [RFC PATCH v3 3/4] liburing: add test programs for napi busy poll
  2022-11-15  7:09 [RFC PATCH v3 0/4] liburing: add api for napi busy poll timeout Stefan Roesch
  2022-11-15  7:09 ` [RFC PATCH v3 1/4] liburing: add api to set napi busy poll settings Stefan Roesch
  2022-11-15  7:09 ` [RFC PATCH v3 2/4] liburing: add documentation for new napi busy polling Stefan Roesch
@ 2022-11-15  7:09 ` Stefan Roesch
  2022-11-15  7:09 ` [RFC PATCH v3 4/4] liburing: update changelog with new feature Stefan Roesch
  3 siblings, 0 replies; 5+ messages in thread
From: Stefan Roesch @ 2022-11-15  7:09 UTC (permalink / raw)
  To: kernel-team; +Cc: shr, axboe, olivier, netdev, io-uring, kuba, ammarfaizi2

This adds two example programs to exercise the napi busy poll functionality:
a client program and a server program. To get a napi id, the client and the
server program need to be run on different hosts.

To test the napi busy poll timeout, the -t option needs to be specified. A
reasonable value for the busy poll timeout is 100. The best results are
achieved by specifying the busy poll timeout on both the server and the
client.
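
The napi id mentioned above can be inspected with the SO_INCOMING_NAPI_ID
socket option, which is what both programs report after the first message.
A minimal sketch follows; query_napi_id is a hypothetical helper, and the
fallback value 56 for SO_INCOMING_NAPI_ID is an assumption taken from the
generic socket header, for builds against older libc headers:

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>

#ifndef SO_INCOMING_NAPI_ID
#define SO_INCOMING_NAPI_ID 56	/* assumption: asm-generic value */
#endif

/*
 * Return the napi id recorded for the socket, 0 if none has been
 * assigned yet (no packet received through a NAPI-enabled device),
 * or -errno if the option cannot be queried.
 */
static long query_napi_id(int sockfd)
{
	unsigned int napi_id = 0;
	socklen_t len = sizeof(napi_id);

	if (getsockopt(sockfd, SOL_SOCKET, SO_INCOMING_NAPI_ID,
		       &napi_id, &len) < 0)
		return -errno;

	assert(len == sizeof(napi_id));
	return (long)napi_id;
}
```

A result of 0 after traffic has been exchanged usually means the packets did
not arrive through a NAPI-enabled device, for example when client and server
run on the same host.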

Signed-off-by: Stefan Roesch <[email protected]>
---
 .gitignore                       |   2 +
 examples/Makefile                |   2 +
 examples/napi-busy-poll-client.c | 432 +++++++++++++++++++++++++++++++
 examples/napi-busy-poll-server.c | 380 +++++++++++++++++++++++++++
 4 files changed, 816 insertions(+)
 create mode 100644 examples/napi-busy-poll-client.c
 create mode 100644 examples/napi-busy-poll-server.c

diff --git a/.gitignore b/.gitignore
index 6e8a2f7..89b5a41 100644
--- a/.gitignore
+++ b/.gitignore
@@ -15,6 +15,8 @@
 /examples/io_uring-test
 /examples/io_uring-udp
 /examples/link-cp
+/examples/napi-busy-poll-client
+/examples/napi-busy-poll-server
 /examples/ucontext-cp
 /examples/poll-bench
 /examples/send-zerocopy
diff --git a/examples/Makefile b/examples/Makefile
index e561e05..59f1260 100644
--- a/examples/Makefile
+++ b/examples/Makefile
@@ -15,6 +15,8 @@ example_srcs := \
 	io_uring-test.c \
 	io_uring-udp.c \
 	link-cp.c \
+	napi-busy-poll-client.c \
+	napi-busy-poll-server.c \
 	poll-bench.c \
 	send-zerocopy.c
 
diff --git a/examples/napi-busy-poll-client.c b/examples/napi-busy-poll-client.c
new file mode 100644
index 0000000..38c4798
--- /dev/null
+++ b/examples/napi-busy-poll-client.c
@@ -0,0 +1,432 @@
+#include <ctype.h>
+#include <errno.h>
+#include <float.h>
+#include <getopt.h>
+#include <liburing.h>
+#include <math.h>
+#include <sched.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <time.h>
+#include <unistd.h>
+#include <arpa/inet.h>
+#include <netdb.h>
+#include <netinet/in.h>
+
+#define MAXBUFLEN 100
+#define PORTNOLEN 10
+#define ADDRLEN   80
+#define RINGSIZE  1024
+
+#define printable(ch) (isprint((unsigned char)ch) ? ch : '#')
+
+enum {
+	IOURING_RECV,
+	IOURING_SEND,
+	IOURING_RECVMSG,
+	IOURING_SENDMSG
+};
+
+struct ctx
+{
+	struct io_uring ring;
+	struct sockaddr_in6 saddr;
+
+	int sockfd;
+	int buffer_len;
+	int num_pings;
+	bool napi_check;
+
+	union {
+		char buffer[MAXBUFLEN];
+		struct timespec ts;
+	};
+
+	int rtt_index;
+	double *rtt;
+} ctx;
+
+struct options
+{
+	int  num_pings;
+	int  timeout;
+
+	bool sq_poll;
+	bool busy_loop;
+	bool prefer_busy_poll;
+
+	char port[PORTNOLEN];
+	char addr[ADDRLEN];
+} options;
+
+struct option longopts[] =
+{
+	{"address"  , 1, NULL, 'a'},
+	{"busy"     , 0, NULL, 'b'},
+	{"help"     , 0, NULL, 'h'},
+	{"num_pings", 1, NULL, 'n'},
+	{"port"     , 1, NULL, 'p'},
+	{"prefer"   , 0, NULL, 'u'},
+	{"sqpoll"   , 0, NULL, 's'},
+	{"timeout"  , 1, NULL, 't'},
+	{NULL       , 0, NULL,  0 }
+};
+
+void printUsage(const char *name)
+{
+	fprintf(stderr,
+	"Usage: %s [-a|--address ip_address] [-p|--port port-no] [-s|--sqpoll] [-b|--busy]"
+	" [-n|--num_pings pings] [-t|--timeout busy-poll-timeout] [-u|--prefer] [-h|--help]\n"
+	"--address\n"
+	"-a        : remote or local ipv6 address\n"
+	"--busy\n"
+	"-b        : busy poll io_uring instead of blocking.\n"
+	"--num_pings\n"
+	"-n        : number of pings\n"
+	"--port\n"
+	"-p        : port\n"
+	"--sqpoll\n"
+	"-s        : Configure io_uring to use SQPOLL thread\n"
+	"--timeout\n"
+	"-t        : Configure NAPI busy poll timeout\n"
+	"--prefer\n"
+	"-u        : prefer NAPI busy poll\n"
+	"--help\n"
+	"-h        : Display this usage message\n\n",
+	name);
+}
+
+void printError(const char *msg, int opt)
+{
+	if (msg && opt)
+		fprintf(stderr, "%s (-%c)\n", msg, printable(opt));
+}
+
+void setProcessScheduler(void)
+{
+	struct sched_param param;
+
+	param.sched_priority = sched_get_priority_max(SCHED_FIFO);
+	if (sched_setscheduler(0, SCHED_FIFO, &param) < 0)
+		fprintf(stderr, "sched_setscheduler() failed: (%d) %s\n",
+			errno, strerror(errno));
+}
+
+double diffTimespec(const struct timespec *time1, const struct timespec *time0)
+{
+	return (time1->tv_sec - time0->tv_sec)
+		+ (time1->tv_nsec - time0->tv_nsec) / 1000000000.0;
+}
+
+uint64_t encodeUserData(char type, int fd)
+{
+	return (uint32_t)fd | ((uint64_t)type << 56);
+}
+
+void decodeUserData(uint64_t data, char *type, int *fd)
+{
+	*type = data >> 56;
+	*fd   = data & 0xffffffffU;
+}
+
+const char *opTypeToStr(char type)
+{
+	const char *res;
+
+	switch (type) {
+	case IOURING_RECV:
+		res = "IOURING_RECV";
+		break;
+	case IOURING_SEND:
+		res = "IOURING_SEND";
+		break;
+	case IOURING_RECVMSG:
+		res = "IOURING_RECVMSG";
+		break;
+	case IOURING_SENDMSG:
+		res = "IOURING_SENDMSG";
+		break;
+	default:
+		res = "Unknown";
+	}
+
+	return res;
+}
+
+void reportNapi(struct ctx *ctx)
+{
+	unsigned int napi_id = 0;
+	socklen_t len = sizeof(napi_id);
+
+	getsockopt(ctx->sockfd, SOL_SOCKET, SO_INCOMING_NAPI_ID, &napi_id, &len);
+	if (napi_id)
+		printf(" napi id: %u\n", napi_id);
+	else
+		printf(" unassigned napi id\n");
+
+	ctx->napi_check = true;
+}
+
+void sendPing(struct ctx *ctx)
+{
+	struct io_uring_sqe *sqe = io_uring_get_sqe(&ctx->ring);
+
+	clock_gettime(CLOCK_REALTIME, (struct timespec *)ctx->buffer);
+
+	io_uring_prep_send(sqe, ctx->sockfd, ctx->buffer, sizeof(struct timespec), 0);
+	sqe->user_data = encodeUserData(IOURING_SEND, ctx->sockfd);
+}
+
+void receivePing(struct ctx *ctx)
+{
+	struct io_uring_sqe *sqe = io_uring_get_sqe(&ctx->ring);
+
+	io_uring_prep_recv(sqe, ctx->sockfd, ctx->buffer, MAXBUFLEN, 0);
+	sqe->user_data = encodeUserData(IOURING_RECV, ctx->sockfd);
+}
+
+void recordRTT(struct ctx *ctx)
+{
+	struct timespec startTs = ctx->ts;
+
+	// Send next ping; this also stores the current time in ctx->ts.
+	sendPing(ctx);
+
+	// Store round-trip time.
+	ctx->rtt[ctx->rtt_index] = diffTimespec(&ctx->ts, &startTs);
+	ctx->rtt_index++;
+}
+
+void printStats(struct ctx *ctx)
+{
+	double minRTT    = DBL_MAX;
+	double maxRTT    = 0.0;
+	double avgRTT    = 0.0;
+	double stddevRTT = 0.0;
+
+	// Calculate min, max, avg.
+	for (int i = 0; i < ctx->rtt_index; i++) {
+		if (ctx->rtt[i] < minRTT)
+			minRTT = ctx->rtt[i];
+		if (ctx->rtt[i] > maxRTT)
+			maxRTT = ctx->rtt[i];
+
+		avgRTT += ctx->rtt[i];
+	}
+	avgRTT /= ctx->rtt_index;
+
+	// Calculate stddev.
+	for (int i = 0; i < ctx->rtt_index; i++)
+		stddevRTT += fabs(ctx->rtt[i] - avgRTT);
+	stddevRTT /= ctx->rtt_index;
+
+	fprintf(stdout, " rtt(us) min/avg/max/mdev = %.3f/%.3f/%.3f/%.3f\n",
+		minRTT * 1000000, avgRTT * 1000000, maxRTT * 1000000, stddevRTT * 1000000);
+}
+
+void completion(struct ctx *ctx, struct io_uring_cqe *cqe)
+{
+	char type;
+	int  fd;
+	int  res = cqe->res;
+
+	decodeUserData(cqe->user_data, &type, &fd);
+	if (res < 0) {
+		fprintf(stderr, "unexpected %s failure: (%d) %s\n",
+			opTypeToStr(type), -res, strerror(-res));
+		abort();
+	}
+
+	switch (type) {
+	case IOURING_SEND:
+		receivePing(ctx);
+		break;
+	case IOURING_RECV:
+		if (res != sizeof(struct timespec)) {
+			fprintf(stderr, "unexpected ping reply len: %d\n", res);
+			abort();
+		}
+
+		if (!ctx->napi_check) {
+			reportNapi(ctx);
+			sendPing(ctx);
+		} else {
+			recordRTT(ctx);
+		}
+
+		--ctx->num_pings;
+		break;
+
+	default:
+		fprintf(stderr, "unexpected %s completion\n",
+			opTypeToStr(type));
+		abort();
+		break;
+	}
+}
+
+int main(int argc, char *argv[])
+{
+	struct ctx       ctx = { 0 };
+	struct options   opt;
+	struct __kernel_timespec *tsPtr;
+	struct __kernel_timespec ts;
+	struct io_uring_params params;
+	int flag;
+
+	memset(&opt, 0, sizeof(struct options));
+
+	// Process flags.
+	while ((flag = getopt_long(argc, argv, ":hsbua:n:p:t:", longopts, NULL)) != -1) {
+		switch (flag) {
+		case 'a':
+			strcpy(opt.addr, optarg);
+			break;
+		case 'b':
+			opt.busy_loop = true;
+			break;
+		case 'h':
+			printUsage(argv[0]);
+			exit(0);
+			break;
+		case 'n':
+			opt.num_pings = atoi(optarg) + 1;
+			break;
+		case 'p':
+			strcpy(opt.port, optarg);
+			break;
+		case 's':
+			opt.sq_poll = true;
+			break;
+		case 't':
+			opt.timeout = atoi(optarg);
+			break;
+		case 'u':
+			opt.prefer_busy_poll = true;
+			break;
+		case ':':
+			printError("Missing argument", optopt);
+			printUsage(argv[0]);
+			exit(-1);
+			break;
+		case '?':
+			printError("Unrecognized option", optopt);
+			printUsage(argv[0]);
+			exit(-1);
+			break;
+
+		default:
+			fprintf(stderr, "Fatal: Unexpected case in CmdLineProcessor switch()\n");
+			exit(-1);
+			break;
+		}
+	}
+
+	if (strlen(opt.addr) == 0) {
+		fprintf(stderr, "address option is mandatory\n");
+		printUsage(argv[0]);
+		exit(1);
+	}
+
+	ctx.saddr.sin6_port   = htons(atoi(opt.port));
+	ctx.saddr.sin6_family = AF_INET6;
+
+	if (inet_pton(AF_INET6, opt.addr, &ctx.saddr.sin6_addr) <= 0) {
+		fprintf(stderr, "inet_pton error for %s\n", opt.addr);
+		printUsage(argv[0]);
+		exit(1);
+	}
+
+	// Connect to server.
+	fprintf(stdout, "Connecting to %s... (port=%s) to send %d pings\n", opt.addr, opt.port, opt.num_pings - 1);
+
+	if ((ctx.sockfd = socket(AF_INET6, SOCK_DGRAM, 0)) < 0) {
+		fprintf(stderr, "socket() failed: (%d) %s\n", errno, strerror(errno));
+		exit(1);
+	}
+
+	if (connect(ctx.sockfd, (struct sockaddr *)&ctx.saddr, sizeof(struct sockaddr_in6)) < 0) {
+		fprintf(stderr, "connect() failed: (%d) %s\n", errno, strerror(errno));
+		exit(1);
+	}
+
+	// Setup ring.
+	memset(&params, 0, sizeof(params));
+	memset(&ts, 0, sizeof(ts));
+
+	if (opt.sq_poll) {
+		params.flags = IORING_SETUP_SQPOLL;
+		params.sq_thread_idle = 50;
+	}
+
+	if (io_uring_queue_init_params(RINGSIZE, &ctx.ring, &params) < 0) {
+		fprintf(stderr, "io_uring_queue_init_params() failed: (%d) %s\n",
+			errno, strerror(errno));
+		exit(1);
+	}
+
+	if (opt.prefer_busy_poll)
+		io_uring_register_napi_prefer_busy_poll(&ctx.ring, opt.prefer_busy_poll);
+
+	if (opt.timeout)
+		io_uring_register_napi_busy_poll_timeout(&ctx.ring, opt.timeout);
+
+	if (opt.busy_loop)
+		tsPtr = &ts;
+	else
+		tsPtr = NULL;
+
+
+	// Use realtime scheduler.
+	setProcessScheduler();
+
+	// Copy payload.
+	clock_gettime(CLOCK_REALTIME, &ctx.ts);
+
+	// Setup context.
+	ctx.napi_check = false;
+	ctx.buffer_len = sizeof(struct timespec);
+	ctx.num_pings  = opt.num_pings;
+
+	ctx.rtt_index = 0;
+	ctx.rtt = (double *)malloc(sizeof(double) * opt.num_pings);
+	if (!ctx.rtt) {
+		fprintf(stderr, "Cannot allocate results array\n");
+		exit(1);
+	}
+
+	// Send initial message to get napi id.
+	sendPing(&ctx);
+
+	while (ctx.num_pings != 0) {
+		int res;
+		unsigned num_completed = 0;
+		unsigned head;
+		struct io_uring_cqe *cqe;
+
+		do {
+			res = io_uring_submit_and_wait_timeout(&ctx.ring, &cqe, 1, tsPtr, NULL);
+		}
+		while (res == -ETIME);
+
+		io_uring_for_each_cqe(&ctx.ring, head, cqe) {
+			++num_completed;
+			completion(&ctx, cqe);
+		}
+
+		if (num_completed)
+			io_uring_cq_advance(&ctx.ring, num_completed);
+	}
+
+	printStats(&ctx);
+	free(ctx.rtt);
+	io_uring_queue_exit(&ctx.ring);
+
+	// Clean up.
+	close(ctx.sockfd);
+
+	return 0;
+}
diff --git a/examples/napi-busy-poll-server.c b/examples/napi-busy-poll-server.c
new file mode 100644
index 0000000..11acf44
--- /dev/null
+++ b/examples/napi-busy-poll-server.c
@@ -0,0 +1,380 @@
+#include <ctype.h>
+#include <errno.h>
+#include <getopt.h>
+#include <liburing.h>
+#include <math.h>
+#include <sched.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <time.h>
+#include <unistd.h>
+#include <arpa/inet.h>
+#include <netdb.h>
+#include <netinet/in.h>
+
+#define MAXBUFLEN 100
+#define PORTNOLEN 10
+#define ADDRLEN   80
+#define RINGSIZE  1024
+
+#define printable(ch) (isprint((unsigned char)ch) ? ch : '#')
+
+enum {
+	IOURING_RECV,
+	IOURING_SEND,
+	IOURING_RECVMSG,
+	IOURING_SENDMSG
+};
+
+struct ctx
+{
+	struct io_uring     ring;
+	struct sockaddr_in6 saddr;
+	struct iovec        iov;
+	struct msghdr       msg;
+
+	int sockfd;
+	int buffer_len;
+	int num_pings;
+	bool napi_check;
+
+	union {
+		char buffer[MAXBUFLEN];
+		struct timespec ts;
+	};
+} ctx;
+
+struct options
+{
+	int  num_pings;
+	int  timeout;
+
+	bool listen;
+	bool sq_poll;
+	bool busy_loop;
+	bool prefer_busy_poll;
+
+	char port[PORTNOLEN];
+	char addr[ADDRLEN];
+} options;
+
+struct option longopts[] =
+{
+	{"address"  , 1, NULL, 'a'},
+	{"busy"     , 0, NULL, 'b'},
+	{"help"     , 0, NULL, 'h'},
+	{"listen"   , 0, NULL, 'l'},
+	{"num_pings", 1, NULL, 'n'},
+	{"port"     , 1, NULL, 'p'},
+	{"prefer"   , 0, NULL, 'u'},
+	{"sqpoll"   , 0, NULL, 's'},
+	{"timeout"  , 1, NULL, 't'},
+	{NULL       , 0, NULL,  0 }
+};
+
+void printUsage(const char *name)
+{
+	fprintf(stderr,
+	"Usage: %s [-l|--listen] [-a|--address ip_address] [-p|--port port-no] [-s|--sqpoll]"
+	" [-b|--busy] [-n|--num_pings pings] [-t|--timeout busy-poll-timeout] [-u|--prefer] [-h|--help]\n"
+	"--listen\n"
+	"-l        : Server mode\n"
+	"--address\n"
+	"-a        : remote or local ipv6 address\n"
+	"--busy\n"
+	"-b        : busy poll io_uring instead of blocking.\n"
+	"--num_pings\n"
+	"-n        : number of pings\n"
+	"--port\n"
+	"-p        : port\n"
+	"--sqpoll\n"
+	"-s        : Configure io_uring to use SQPOLL thread\n"
+	"--timeout\n"
+	"-t        : Configure NAPI busy poll timeout\n"
+	"--prefer\n"
+	"-u        : prefer NAPI busy poll\n"
+	"--help\n"
+	"-h        : Display this usage message\n\n",
+	name);
+}
+
+void printError(const char *msg, int opt)
+{
+	if (msg && opt)
+		fprintf(stderr, "%s (-%c)\n", msg, printable(opt));
+}
+
+void setProcessScheduler(void)
+{
+	struct sched_param param;
+
+	param.sched_priority = sched_get_priority_max(SCHED_FIFO);
+	if (sched_setscheduler(0, SCHED_FIFO, &param) < 0)
+		fprintf(stderr, "sched_setscheduler() failed: (%d) %s\n",
+			errno, strerror(errno));
+}
+
+uint64_t encodeUserData(char type, int fd)
+{
+	return (uint32_t)fd | ((uint64_t)type << 56);
+}
+
+void decodeUserData(uint64_t data, char *type, int *fd)
+{
+	*type = data >> 56;
+	*fd   = data & 0xffffffffU;
+}
+
+const char *opTypeToStr(char type)
+{
+	const char *res;
+
+	switch (type) {
+	case IOURING_RECV:
+		res = "IOURING_RECV";
+		break;
+	case IOURING_SEND:
+		res = "IOURING_SEND";
+		break;
+	case IOURING_RECVMSG:
+		res = "IOURING_RECVMSG";
+		break;
+	case IOURING_SENDMSG:
+		res = "IOURING_SENDMSG";
+		break;
+	default:
+		res = "Unknown";
+	}
+
+	return res;
+}
+
+void reportNapi(struct ctx *ctx)
+{
+	unsigned int napi_id = 0;
+	socklen_t len = sizeof(napi_id);
+
+	getsockopt(ctx->sockfd, SOL_SOCKET, SO_INCOMING_NAPI_ID, &napi_id, &len);
+	if (napi_id)
+		printf(" napi id: %u\n", napi_id);
+	else
+		printf(" unassigned napi id\n");
+
+	ctx->napi_check = true;
+}
+
+void sendPing(struct ctx *ctx)
+{
+
+	struct io_uring_sqe *sqe = io_uring_get_sqe(&ctx->ring);
+
+	io_uring_prep_sendmsg(sqe, ctx->sockfd, &ctx->msg, 0);
+	sqe->user_data = encodeUserData(IOURING_SENDMSG, ctx->sockfd);
+}
+
+void receivePing(struct ctx *ctx)
+{
+	bzero(&ctx->msg, sizeof(struct msghdr));
+	ctx->msg.msg_name    = &ctx->saddr;
+	ctx->msg.msg_namelen = sizeof(struct sockaddr_in6);
+	ctx->iov.iov_base    = ctx->buffer;
+	ctx->iov.iov_len     = MAXBUFLEN;
+	ctx->msg.msg_iov     = &ctx->iov;
+	ctx->msg.msg_iovlen  = 1;
+
+	struct io_uring_sqe *sqe = io_uring_get_sqe(&ctx->ring);
+	io_uring_prep_recvmsg(sqe, ctx->sockfd, &ctx->msg, 0);
+	sqe->user_data = encodeUserData(IOURING_RECVMSG, ctx->sockfd);
+}
+
+void completion(struct ctx *ctx, struct io_uring_cqe *cqe)
+{
+	char type;
+	int  fd;
+	int  res = cqe->res;
+
+	decodeUserData(cqe->user_data, &type, &fd);
+	if (res < 0) {
+		fprintf(stderr, "unexpected %s failure: (%d) %s\n",
+			opTypeToStr(type), -res, strerror(-res));
+		abort();
+	}
+
+	switch (type) {
+	case IOURING_SENDMSG:
+		receivePing(ctx);
+		--ctx->num_pings;
+		break;
+	case IOURING_RECVMSG:
+		ctx->iov.iov_len = res;
+		sendPing(ctx);
+		if (!ctx->napi_check)
+			reportNapi(ctx);
+		break;
+	default:
+		fprintf(stderr, "unexpected %s completion\n",
+			opTypeToStr(type));
+		abort();
+		break;
+	}
+}
+
+int main(int argc, char *argv[])
+{
+	int flag;
+	struct ctx       ctx = { 0 };
+	struct options   opt;
+	struct __kernel_timespec *tsPtr;
+	struct __kernel_timespec ts;
+	struct io_uring_params params;
+
+	memset(&opt, 0, sizeof(struct options));
+
+	// Process flags.
+	while ((flag = getopt_long(argc, argv, ":lhsbua:n:p:t:", longopts, NULL)) != -1) {
+		switch (flag) {
+		case 'a':
+			strcpy(opt.addr, optarg);
+			break;
+		case 'b':
+			opt.busy_loop = true;
+			break;
+		case 'h':
+			printUsage(argv[0]);
+			exit(0);
+			break;
+		case 'l':
+			opt.listen = true;
+			break;
+		case 'n':
+			opt.num_pings = atoi(optarg) + 1;
+			break;
+		case 'p':
+			strcpy(opt.port, optarg);
+			break;
+		case 's':
+			opt.sq_poll = true;
+			break;
+		case 't':
+			opt.timeout = atoi(optarg);
+			break;
+		case 'u':
+			opt.prefer_busy_poll = true;
+			break;
+		case ':':
+			printError("Missing argument", optopt);
+			printUsage(argv[0]);
+			exit(-1);
+			break;
+		case '?':
+			printError("Unrecognized option", optopt);
+			printUsage(argv[0]);
+			exit(-1);
+			break;
+
+		default:
+			fprintf(stderr, "Fatal: Unexpected case in CmdLineProcessor switch()\n");
+			exit(-1);
+			break;
+		}
+	}
+
+	if (strlen(opt.addr) == 0) {
+		fprintf(stderr, "address option is mandatory\n");
+		printUsage(argv[0]);
+		exit(1);
+	}
+
+	ctx.saddr.sin6_port   = htons(atoi(opt.port));
+	ctx.saddr.sin6_family = AF_INET6;
+
+	if (inet_pton(AF_INET6, opt.addr, &ctx.saddr.sin6_addr) <= 0) {
+		fprintf(stderr, "inet_pton error for %s\n", opt.addr);
+		printUsage(argv[0]);
+		exit(1);
+	}
+
+	// Bind to the local address.
+	fprintf(stdout, "Listening %s : %s...\n", opt.addr, opt.port);
+
+	if ((ctx.sockfd = socket(AF_INET6, SOCK_DGRAM, 0)) < 0) {
+		fprintf(stderr, "socket() failed: (%d) %s\n", errno, strerror(errno));
+		exit(1);
+	}
+
+	if (bind(ctx.sockfd, (struct sockaddr *)&ctx.saddr, sizeof(struct sockaddr_in6)) < 0) {
+		fprintf(stderr, "bind() failed: (%d) %s\n", errno, strerror(errno));
+		exit(1);
+	}
+
+	// Setup ring.
+	memset(&params, 0, sizeof(params));
+	memset(&ts, 0, sizeof(ts));
+
+	if (opt.sq_poll) {
+		params.flags = IORING_SETUP_SQPOLL;
+		params.sq_thread_idle = 50;
+	}
+
+	if (io_uring_queue_init_params(RINGSIZE, &ctx.ring, &params) < 0) {
+		fprintf(stderr, "io_uring_queue_init_params() failed: (%d) %s\n",
+			errno, strerror(errno));
+		exit(1);
+	}
+
+	if (opt.prefer_busy_poll)
+		io_uring_register_napi_prefer_busy_poll(&ctx.ring, opt.prefer_busy_poll);
+
+	if (opt.timeout)
+		io_uring_register_napi_busy_poll_timeout(&ctx.ring, opt.timeout);
+
+	if (opt.busy_loop)
+		tsPtr = &ts;
+	else
+		tsPtr = NULL;
+
+
+	// Use realtime scheduler.
+	setProcessScheduler();
+
+	// Copy payload.
+	clock_gettime(CLOCK_REALTIME, &ctx.ts);
+
+	// Setup context.
+	ctx.napi_check = false;
+	ctx.buffer_len = sizeof(struct timespec);
+	ctx.num_pings  = opt.num_pings;
+
+	// Receive initial message to get napi id.
+	receivePing(&ctx);
+
+	while (ctx.num_pings != 0) {
+		int res;
+		unsigned int num_completed = 0;
+		unsigned int head;
+		struct io_uring_cqe *cqe;
+
+		do {
+			res = io_uring_submit_and_wait_timeout(&ctx.ring, &cqe, 1, tsPtr, NULL);
+		}
+		while (res == -ETIME);
+
+		io_uring_for_each_cqe(&ctx.ring, head, cqe) {
+			++num_completed;
+			completion(&ctx, cqe);
+		}
+
+		if (num_completed) {
+			io_uring_cq_advance(&ctx.ring, num_completed);
+		}
+	}
+
+	// Clean up.
+	io_uring_queue_exit(&ctx.ring);
+	close(ctx.sockfd);
+
+	return 0;
+}
-- 
2.30.2



* [RFC PATCH v3 4/4] liburing: update changelog with new feature
  2022-11-15  7:09 [RFC PATCH v3 0/4] liburing: add api for napi busy poll timeout Stefan Roesch
                   ` (2 preceding siblings ...)
  2022-11-15  7:09 ` [RFC PATCH v3 3/4] liburing: add test programs for napi busy poll Stefan Roesch
@ 2022-11-15  7:09 ` Stefan Roesch
  3 siblings, 0 replies; 5+ messages in thread
From: Stefan Roesch @ 2022-11-15  7:09 UTC (permalink / raw)
  To: kernel-team; +Cc: shr, axboe, olivier, netdev, io-uring, kuba, ammarfaizi2

Add a new entry to the changelog file for the napi busy poll feature.

Signed-off-by: Stefan Roesch <[email protected]>
---
 CHANGELOG | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/CHANGELOG b/CHANGELOG
index 09511af..1db0269 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -1,3 +1,6 @@
+liburing-2.4 release
+- Support for napi busy polling
+
 liburing-2.3 release
 
 - Support non-libc build for aarch64.
-- 
2.30.2


