public inbox for [email protected]
 help / color / mirror / Atom feed
* [PATCH v9 0/4] liburing: add api for napi busy poll
@ 2023-04-25 18:20 Stefan Roesch
  2023-04-25 18:20 ` [PATCH v9 1/4] liburing: add api to set napi busy poll settings Stefan Roesch
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Stefan Roesch @ 2023-04-25 18:20 UTC (permalink / raw)
  To: io-uring, kernel-team; +Cc: shr, axboe, ammarfaizi2

This adds two new api's to set/clear the napi busy poll settings. The two
new functions are called:
- io_uring_register_napi
- io_uring_unregister_napi

The patch series also contains the documentation for the two new functions
and two example programs. The client program is called napi-busy-poll-client
and the server program napi-busy-poll-server. The client measures the
roundtrip times of requests.

There is also a kernel patch "io-uring: support napi busy poll" to enable
this feature on the kernel side.

Changes:
- V9:
  - Update to library V2.4 and kernel 6.4
- V8:
  - Use tab in liburing-ffi.map file
  - Small fix to io_uring_register_napi man page
- V7:
  - Add new functions to liburing-ffi.map file.
  - Fix some whitespace issues.
  - Address the compile errors from CI in the two files
    napi-busy-poll-client.c and napi-busy-poll-server.c
- V6:
  - Check return value of unregister napi call and verify that busy poll
    timeout and prefer busy poll return the expected values.
- V5:
  - Fixes to documentation.
  - Correct opcode for unregister call
  - Initialize napi structure in example programs
  - Address tab issues in recordRTT()
- V4:
  - Modify functions to use a structure to pass the napi busy poll settings
    to the kernel.
  - Return previous values when returning from the above functions.
  - Rename the functions and remove one function (no longer needed as the
    data is passed as a structure)
- V3:
  - Updated liburing.map file
  - Moved example programs from the test directory to the example directory.
    The two example programs don't fit well in the test category and need to
    be run from separate hosts.
  - Added the io_uring_register_napi_prefer_busy_poll API.
  - Added the call to io_uring_register_napi_prefer_busy_poll to the example
    programs
  - Updated the documentation
- V2:
  - Updated the liburing.map file for the two new functions.
    (added a 2.4 section)
  - Added a description of the new feature to the changelog file
  - Fixed the indentation of the longopts structure
  - Used defined exit constants
  - Fixed encodeUserData to support 32 bit builds

*** BLURB HERE ***

Stefan Roesch (4):
  liburing: add api to set napi busy poll settings
  liburing: add documentation for new napi busy polling
  liburing: add example programs for napi busy poll
  liburing: update changelog with new feature

 .gitignore                       |   2 +
 CHANGELOG                        |   1 +
 examples/Makefile                |   2 +
 examples/napi-busy-poll-client.c | 451 +++++++++++++++++++++++++++++++
 examples/napi-busy-poll-server.c | 386 ++++++++++++++++++++++++++
 man/io_uring_register_napi.3     |  40 +++
 man/io_uring_unregister_napi.3   |  27 ++
 src/include/liburing.h           |   3 +
 src/include/liburing/io_uring.h  |  12 +
 src/liburing-ffi.map             |   2 +
 src/liburing.map                 |   2 +
 src/register.c                   |  12 +
 12 files changed, 940 insertions(+)
 create mode 100644 examples/napi-busy-poll-client.c
 create mode 100644 examples/napi-busy-poll-server.c
 create mode 100644 man/io_uring_register_napi.3
 create mode 100644 man/io_uring_unregister_napi.3


base-commit: b7f85996a5cb290fc2ad7d2f4d7341fc54321016
-- 
2.39.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v9 1/4] liburing: add api to set napi busy poll settings
  2023-04-25 18:20 [PATCH v9 0/4] liburing: add api for napi busy poll Stefan Roesch
@ 2023-04-25 18:20 ` Stefan Roesch
  2024-02-02 16:41   ` Olivier Langlois
  2023-04-25 18:20 ` [PATCH v9 2/4] liburing: add documentation for new napi busy polling Stefan Roesch
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 15+ messages in thread
From: Stefan Roesch @ 2023-04-25 18:20 UTC (permalink / raw)
  To: io-uring, kernel-team; +Cc: shr, axboe, ammarfaizi2

This adds two functions to manage the napi busy poll settings:
- io_uring_register_napi
- io_uring_unregister_napi

Signed-off-by: Stefan Roesch <[email protected]>
Reviewed-by: Ammar Faizi <[email protected]>
---
 src/include/liburing.h          |  3 +++
 src/include/liburing/io_uring.h | 12 ++++++++++++
 src/liburing-ffi.map            |  2 ++
 src/liburing.map                |  2 ++
 src/register.c                  | 12 ++++++++++++
 5 files changed, 31 insertions(+)

diff --git a/src/include/liburing.h b/src/include/liburing.h
index 70c1774..add17a0 100644
--- a/src/include/liburing.h
+++ b/src/include/liburing.h
@@ -241,6 +241,9 @@ int io_uring_register_sync_cancel(struct io_uring *ring,
 int io_uring_register_file_alloc_range(struct io_uring *ring,
 					unsigned off, unsigned len);
 
+int io_uring_register_napi(struct io_uring *ring, struct io_uring_napi *napi);
+int io_uring_unregister_napi(struct io_uring *ring, struct io_uring_napi *napi);
+
 int io_uring_get_events(struct io_uring *ring);
 int io_uring_submit_and_get_events(struct io_uring *ring);
 
diff --git a/src/include/liburing/io_uring.h b/src/include/liburing/io_uring.h
index 3d3a63b..0e6e0bd 100644
--- a/src/include/liburing/io_uring.h
+++ b/src/include/liburing/io_uring.h
@@ -523,6 +523,10 @@ enum {
 	/* register a range of fixed file slots for automatic slot allocation */
 	IORING_REGISTER_FILE_ALLOC_RANGE	= 25,
 
+	/* set/clear busy poll settings */
+	IORING_REGISTER_NAPI			= 26,
+	IORING_UNREGISTER_NAPI			= 27,
+
 	/* this goes last */
 	IORING_REGISTER_LAST,
 
@@ -662,6 +666,14 @@ struct io_uring_buf_reg {
 	__u64	resv[3];
 };
 
+/* argument for IORING_(UN)REGISTER_NAPI */
+struct io_uring_napi {
+	__u32   busy_poll_to;
+	__u8    prefer_busy_poll;
+	__u8    pad[3];
+	__u64   resv;
+};
+
 /*
  * io_uring_restriction->opcode values
  */
diff --git a/src/liburing-ffi.map b/src/liburing-ffi.map
index 0a5e12c..2d9ab9f 100644
--- a/src/liburing-ffi.map
+++ b/src/liburing-ffi.map
@@ -171,6 +171,8 @@ LIBURING_2.4 {
 		io_uring_prep_msg_ring_fd;
 		io_uring_prep_msg_ring_fd_alloc;
 		io_uring_prep_sendto;
+		io_uring_register_napi;
+		io_uring_unregister_napi;
 	local:
 		*;
 };
diff --git a/src/liburing.map b/src/liburing.map
index 3b37022..c775b61 100644
--- a/src/liburing.map
+++ b/src/liburing.map
@@ -79,4 +79,6 @@ LIBURING_2.4 {
 		io_uring_register_restrictions;
 		io_uring_setup_buf_ring;
 		io_uring_free_buf_ring;
+		io_uring_register_napi;
+		io_uring_unregister_napi;
 } LIBURING_2.3;
diff --git a/src/register.c b/src/register.c
index a3fcc78..d494be7 100644
--- a/src/register.c
+++ b/src/register.c
@@ -337,3 +337,15 @@ int io_uring_register_file_alloc_range(struct io_uring *ring,
 
 	return do_register(ring, IORING_REGISTER_FILE_ALLOC_RANGE, &range, 0);
 }
+
+int io_uring_register_napi(struct io_uring *ring, struct io_uring_napi *napi)
+{
+	return __sys_io_uring_register(ring->ring_fd,
+				IORING_REGISTER_NAPI, napi, 0);
+}
+
+int io_uring_unregister_napi(struct io_uring *ring, struct io_uring_napi *napi)
+{
+	return __sys_io_uring_register(ring->ring_fd,
+				IORING_UNREGISTER_NAPI, napi, 0);
+}
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v9 2/4] liburing: add documentation for new napi busy polling
  2023-04-25 18:20 [PATCH v9 0/4] liburing: add api for napi busy poll Stefan Roesch
  2023-04-25 18:20 ` [PATCH v9 1/4] liburing: add api to set napi busy poll settings Stefan Roesch
@ 2023-04-25 18:20 ` Stefan Roesch
  2023-04-25 18:20 ` [PATCH v9 3/4] liburing: add example programs for napi busy poll Stefan Roesch
  2023-04-25 18:20 ` [PATCH v9 4/4] liburing: update changelog with new feature Stefan Roesch
  3 siblings, 0 replies; 15+ messages in thread
From: Stefan Roesch @ 2023-04-25 18:20 UTC (permalink / raw)
  To: io-uring, kernel-team; +Cc: shr, axboe, ammarfaizi2

This adds two man pages for the two new functions:
- io_uring_register_nap
- io_uring_unregister_napi

Signed-off-by: Stefan Roesch <[email protected]>
Reviewed-by: Ammar Faizi <[email protected]>
---
 man/io_uring_register_napi.3   | 40 ++++++++++++++++++++++++++++++++++
 man/io_uring_unregister_napi.3 | 27 +++++++++++++++++++++++
 2 files changed, 67 insertions(+)
 create mode 100644 man/io_uring_register_napi.3
 create mode 100644 man/io_uring_unregister_napi.3

diff --git a/man/io_uring_register_napi.3 b/man/io_uring_register_napi.3
new file mode 100644
index 0000000..6ce8cff
--- /dev/null
+++ b/man/io_uring_register_napi.3
@@ -0,0 +1,40 @@
+.\" Copyright (C) 2022 Stefan Roesch <[email protected]>
+.\"
+.\" SPDX-License-Identifier: LGPL-2.0-or-later
+.\"
+.TH io_uring_register_napi 3 "November 16, 2022" "liburing-2.4" "liburing Manual"
+.SH NAME
+io_uring_register_napi \- register NAPI busy poll settings
+.SH SYNOPSIS
+.nf
+.B #include <liburing.h>
+.PP
+.BI "int io_uring_register_napi(struct io_uring *" ring ","
+.BI "                           struct io_uring_napi *" napi)
+.PP
+.fi
+.SH DESCRIPTION
+.PP
+The
+.BR io_uring_register_napi (3)
+function registers the NAPI settings for subsequent operations. The NAPI
+settings are specified in the structure that is passed in the
+.I napi
+parameter. The structure consists of the napi timeout
+.I busy_poll_to
+(napi busy poll timeout in us) and
+.IR prefer_busy_poll .
+
+Registering a NAPI settings sets the mode when calling the function
+napi_busy_loop and corresponds to the SO_PREFER_BUSY_POLL socket
+option.
+
+NAPI busy poll can reduce the network roundtrip time.
+
+
+.SH RETURN VALUE
+On success
+.BR io_uring_register_napi (3)
+return 0. On failure they return
+.BR -errno .
+It also updates the napi structure with the current values.
diff --git a/man/io_uring_unregister_napi.3 b/man/io_uring_unregister_napi.3
new file mode 100644
index 0000000..f7087ef
--- /dev/null
+++ b/man/io_uring_unregister_napi.3
@@ -0,0 +1,27 @@
+.\" Copyright (C) 2022 Stefan Roesch <[email protected]>
+.\"
+.\" SPDX-License-Identifier: LGPL-2.0-or-later
+.\"
+.TH io_uring_unregister_napi 3 "November 16, 2022" "liburing-2.4" "liburing Manual"
+.SH NAME
+io_uring_unregister_napi \- unregister NAPI busy poll settings
+.SH SYNOPSIS
+.nf
+.B #include <liburing.h>
+.PP
+.BI "int io_uring_unregister_napi(struct io_uring *" ring ","
+.BI "                             struct io_uring_napi *" napi)
+.PP
+.fi
+.SH DESCRIPTION
+.PP
+The
+.BR io_uring_unregister_napi (3)
+function unregisters the NAPI busy poll settings for subsequent operations.
+
+.SH RETURN VALUE
+On success
+.BR io_uring_unregister_napi (3)
+return 0. On failure they return
+.BR -errno .
+It also updates the napi structure with the current values.
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v9 3/4] liburing: add example programs for napi busy poll
  2023-04-25 18:20 [PATCH v9 0/4] liburing: add api for napi busy poll Stefan Roesch
  2023-04-25 18:20 ` [PATCH v9 1/4] liburing: add api to set napi busy poll settings Stefan Roesch
  2023-04-25 18:20 ` [PATCH v9 2/4] liburing: add documentation for new napi busy polling Stefan Roesch
@ 2023-04-25 18:20 ` Stefan Roesch
  2023-04-25 18:20 ` [PATCH v9 4/4] liburing: update changelog with new feature Stefan Roesch
  3 siblings, 0 replies; 15+ messages in thread
From: Stefan Roesch @ 2023-04-25 18:20 UTC (permalink / raw)
  To: io-uring, kernel-team; +Cc: shr, axboe, ammarfaizi2

This adds two example programs to test the napi busy poll functionality.
It consists of a client program and a server program. To get a napi id,
the client and the server program need to be run on different hosts.

To test the napi busy poll timeout, the -t needs to be specified. A
reasonable value for the busy poll timeout is 100. By specifying the
busy poll timeout on the server and the client the best results are
accomplished.

Signed-off-by: Stefan Roesch <[email protected]>
Acked-by: Ammar Faizi <[email protected]>
---
 .gitignore                       |   2 +
 examples/Makefile                |   2 +
 examples/napi-busy-poll-client.c | 451 +++++++++++++++++++++++++++++++
 examples/napi-busy-poll-server.c | 386 ++++++++++++++++++++++++++
 4 files changed, 841 insertions(+)
 create mode 100644 examples/napi-busy-poll-client.c
 create mode 100644 examples/napi-busy-poll-server.c

diff --git a/.gitignore b/.gitignore
index bfb2224..1c3623d 100644
--- a/.gitignore
+++ b/.gitignore
@@ -19,6 +19,8 @@
 /examples/io_uring-test
 /examples/io_uring-udp
 /examples/link-cp
+/examples/napi-busy-poll-client
+/examples/napi-busy-poll-server
 /examples/ucontext-cp
 /examples/poll-bench
 /examples/send-zerocopy
diff --git a/examples/Makefile b/examples/Makefile
index 715ac4c..6e3d1f8 100644
--- a/examples/Makefile
+++ b/examples/Makefile
@@ -19,6 +19,8 @@ example_srcs := \
 	io_uring-test.c \
 	io_uring-udp.c \
 	link-cp.c \
+	napi-busy-poll-client.c \
+	napi-busy-poll-server.c \
 	poll-bench.c \
 	send-zerocopy.c \
 	rsrc-update-bench.c
diff --git a/examples/napi-busy-poll-client.c b/examples/napi-busy-poll-client.c
new file mode 100644
index 0000000..fda1da7
--- /dev/null
+++ b/examples/napi-busy-poll-client.c
@@ -0,0 +1,451 @@
+#include <ctype.h>
+#include <errno.h>
+#include <float.h>
+#include <getopt.h>
+#include <liburing.h>
+#include <math.h>
+#include <sched.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <time.h>
+#include <unistd.h>
+#include <arpa/inet.h>
+#include <netdb.h>
+#include <netinet/in.h>
+
+#define MAXBUFLEN 100
+#define PORTNOLEN 10
+#define ADDRLEN   80
+#define RINGSIZE  1024
+
+#define printable(ch) (isprint((unsigned char)ch) ? ch : '#')
+
+enum {
+	IOURING_RECV,
+	IOURING_SEND,
+	IOURING_RECVMSG,
+	IOURING_SENDMSG
+};
+
+struct ctx
+{
+	struct io_uring ring;
+	struct sockaddr_in6 saddr;
+
+	int sockfd;
+	int buffer_len;
+	int num_pings;
+	bool napi_check;
+
+	union {
+		char buffer[MAXBUFLEN];
+		struct timespec ts;
+	};
+
+	int rtt_index;
+	double *rtt;
+};
+
+struct options
+{
+	int  num_pings;
+	__u32  timeout;
+
+	bool sq_poll;
+	bool busy_loop;
+	bool prefer_busy_poll;
+
+	char port[PORTNOLEN];
+	char addr[ADDRLEN];
+};
+
+static struct option longopts[] =
+{
+	{"address"  , 1, NULL, 'a'},
+	{"busy"     , 0, NULL, 'b'},
+	{"help"     , 0, NULL, 'h'},
+	{"num_pings", 1, NULL, 'n'},
+	{"port"     , 1, NULL, 'p'},
+	{"prefer"   , 1, NULL, 'u'},
+	{"sqpoll"   , 0, NULL, 's'},
+	{"timeout"  , 1, NULL, 't'},
+	{NULL       , 0, NULL,  0 }
+};
+
+static void printUsage(const char *name)
+{
+	fprintf(stderr,
+	"Usage: %s [-l|--listen] [-a|--address ip_address] [-p|--port port-no] [-s|--sqpoll]"
+	" [-b|--busy] [-n|--num pings] [-t|--timeout busy-poll-timeout] [-u||--prefer] [-h|--help]\n"
+	"--address\n"
+	"-a        : remote or local ipv6 address\n"
+	"--busy\n"
+	"-b        : busy poll io_uring instead of blocking.\n"
+	"--num_pings\n"
+	"-n        : number of pings\n"
+	"--port\n"
+	"-p        : port\n"
+	"--sqpoll\n"
+	"-s        : Configure io_uring to use SQPOLL thread\n"
+	"--timeout\n"
+	"-t        : Configure NAPI busy poll timeout"
+	"--prefer\n"
+	"-u        : prefer NAPI busy poll\n"
+	"--help\n"
+	"-h        : Display this usage message\n\n",
+	name);
+}
+
+static void printError(const char *msg, int opt)
+{
+	if (msg && opt)
+		fprintf(stderr, "%s (-%c)\n", msg, printable(opt));
+}
+
+static void setProcessScheduler(void)
+{
+	struct sched_param param;
+
+	param.sched_priority = sched_get_priority_max(SCHED_FIFO);
+	if (sched_setscheduler(0, SCHED_FIFO, &param) < 0)
+		fprintf(stderr, "sched_setscheduler() failed: (%d) %s\n",
+			errno, strerror(errno));
+}
+
+static double diffTimespec(const struct timespec *time1, const struct timespec *time0)
+{
+	return (time1->tv_sec - time0->tv_sec)
+		+ (time1->tv_nsec - time0->tv_nsec) / 1000000000.0;
+}
+
+static uint64_t encodeUserData(char type, int fd)
+{
+	return (uint32_t)fd | ((uint64_t)type << 56);
+}
+
+static void decodeUserData(uint64_t data, char *type, int *fd)
+{
+	*type = data >> 56;
+	*fd   = data & 0xffffffffU;
+}
+
+static const char *opTypeToStr(char type)
+{
+	const char *res;
+
+	switch (type) {
+	case IOURING_RECV:
+		res = "IOURING_RECV";
+		break;
+	case IOURING_SEND:
+		res = "IOURING_SEND";
+		break;
+	case IOURING_RECVMSG:
+		res = "IOURING_RECVMSG";
+		break;
+	case IOURING_SENDMSG:
+		res = "IOURING_SENDMSG";
+		break;
+	default:
+		res = "Unknown";
+	}
+
+	return res;
+}
+
+static void reportNapi(struct ctx *ctx)
+{
+	unsigned int napi_id = 0;
+	socklen_t len = sizeof(napi_id);
+
+	getsockopt(ctx->sockfd, SOL_SOCKET, SO_INCOMING_NAPI_ID, &napi_id, &len);
+	if (napi_id)
+		printf(" napi id: %d\n", napi_id);
+	else
+		printf(" unassigned napi id\n");
+
+	ctx->napi_check = true;
+}
+
+static void sendPing(struct ctx *ctx)
+{
+	struct io_uring_sqe *sqe = io_uring_get_sqe(&ctx->ring);
+
+	clock_gettime(CLOCK_REALTIME, (struct timespec *)ctx->buffer);
+
+	io_uring_prep_send(sqe, ctx->sockfd, ctx->buffer, sizeof(struct timespec), 0);
+	sqe->user_data = encodeUserData(IOURING_SEND, ctx->sockfd);
+}
+
+static void receivePing(struct ctx *ctx)
+{
+	struct io_uring_sqe *sqe = io_uring_get_sqe(&ctx->ring);
+
+	io_uring_prep_recv(sqe, ctx->sockfd, ctx->buffer, MAXBUFLEN, 0);
+	sqe->user_data = encodeUserData(IOURING_RECV, ctx->sockfd);
+}
+
+static void recordRTT(struct ctx *ctx)
+{
+	struct timespec startTs = ctx->ts;
+
+	// Send next ping.
+	sendPing(ctx);
+
+	// Store round-trip time.
+	ctx->rtt[ctx->rtt_index] = diffTimespec(&ctx->ts, &startTs);
+	ctx->rtt_index++;
+}
+
+static void printStats(struct ctx *ctx)
+{
+	double minRTT    = DBL_MAX;
+	double maxRTT    = 0.0;
+	double avgRTT    = 0.0;
+	double stddevRTT = 0.0;
+
+	// Calculate min, max, avg.
+	for (int i = 0; i < ctx->rtt_index; i++) {
+		if (ctx->rtt[i] < minRTT)
+			minRTT = ctx->rtt[i];
+		if (ctx->rtt[i] > maxRTT)
+			maxRTT = ctx->rtt[i];
+
+		avgRTT += ctx->rtt[i];
+	}
+	avgRTT /= ctx->rtt_index;
+
+	// Calculate stddev.
+	for (int i = 0; i < ctx->rtt_index; i++)
+		stddevRTT += fabs(ctx->rtt[i] - avgRTT);
+	stddevRTT /= ctx->rtt_index;
+
+	fprintf(stdout, " rtt(us) min/avg/max/mdev = %.3f/%.3f/%.3f/%.3f\n",
+		minRTT * 1000000, avgRTT * 1000000, maxRTT * 1000000, stddevRTT * 1000000);
+}
+
+static int completion(struct ctx *ctx, struct io_uring_cqe *cqe)
+{
+	char type;
+	int  fd;
+	int  res = cqe->res;
+
+	decodeUserData(cqe->user_data, &type, &fd);
+	if (res < 0) {
+		fprintf(stderr, "unexpected %s failure: (%d) %s\n",
+			opTypeToStr(type), -res, strerror(-res));
+		return -1;
+	}
+
+	switch (type) {
+	case IOURING_SEND:
+		receivePing(ctx);
+		break;
+	case IOURING_RECV:
+		if (res != sizeof(struct timespec)) {
+			fprintf(stderr, "unexpected ping reply len: %d\n", res);
+			abort();
+		}
+
+		if (!ctx->napi_check) {
+			reportNapi(ctx);
+			sendPing(ctx);
+		} else {
+			recordRTT(ctx);
+		}
+
+		--ctx->num_pings;
+		break;
+
+	default:
+		fprintf(stderr, "unexpected %s completion\n",
+			opTypeToStr(type));
+		return -1;
+		break;
+	}
+
+	return 0;
+}
+
+int main(int argc, char *argv[])
+{
+	struct ctx       ctx;
+	struct options   opt;
+	struct __kernel_timespec *tsPtr;
+	struct __kernel_timespec ts;
+	struct io_uring_params params;
+	struct io_uring_napi napi;
+	int flag;
+
+	memset(&opt, 0, sizeof(struct options));
+
+	// Process flags.
+	while ((flag = getopt_long(argc, argv, ":hsbua:n:p:t:", longopts, NULL)) != -1) {
+		switch (flag) {
+		case 'a':
+			strcpy(opt.addr, optarg);
+			break;
+		case 'b':
+			opt.busy_loop = true;
+			break;
+		case 'h':
+			printUsage(argv[0]);
+			exit(0);
+			break;
+		case 'n':
+			opt.num_pings = atoi(optarg) + 1;
+			break;
+		case 'p':
+			strcpy(opt.port, optarg);
+			break;
+		case 's':
+			opt.sq_poll = true;
+			break;
+		case 't':
+			opt.timeout = atoi(optarg);
+			break;
+		case 'u':
+			opt.prefer_busy_poll = true;
+			break;
+		case ':':
+			printError("Missing argument", optopt);
+			printUsage(argv[0]);
+			exit(-1);
+			break;
+		case '?':
+			printError("Unrecognized option", optopt);
+			printUsage(argv[0]);
+			exit(-1);
+			break;
+
+		default:
+			fprintf(stderr, "Fatal: Unexpected case in CmdLineProcessor switch()\n");
+			exit(-1);
+			break;
+		}
+	}
+
+	if (strlen(opt.addr) == 0) {
+		fprintf(stderr, "address option is mandatory\n");
+		printUsage(argv[0]);
+		exit(1);
+	}
+
+	ctx.saddr.sin6_port   = htons(atoi(opt.port));
+	ctx.saddr.sin6_family = AF_INET6;
+
+	if (inet_pton(AF_INET6, opt.addr, &ctx.saddr.sin6_addr) <= 0) {
+		fprintf(stderr, "inet_pton error for %s\n", optarg);
+		printUsage(argv[0]);
+		exit(1);
+	}
+
+	// Connect to server.
+	fprintf(stdout, "Connecting to %s... (port=%s) to send %d pings\n", opt.addr, opt.port, opt.num_pings - 1);
+
+	if ((ctx.sockfd = socket(AF_INET6, SOCK_DGRAM, 0)) < 0) {
+		fprintf(stderr, "socket() failed: (%d) %s\n", errno, strerror(errno));
+		exit(1);
+	}
+
+	if (connect(ctx.sockfd, (struct sockaddr *)&ctx.saddr, sizeof(struct sockaddr_in6)) < 0) {
+		fprintf(stderr, "connect() failed: (%d) %s\n", errno, strerror(errno));
+		exit(1);
+	}
+
+	// Setup ring.
+	memset(&params, 0, sizeof(params));
+	memset(&ts, 0, sizeof(ts));
+	memset(&napi, 0, sizeof(napi));
+
+	if (opt.sq_poll) {
+		params.flags = IORING_SETUP_SQPOLL;
+		params.sq_thread_idle = 50;
+	}
+
+	if (io_uring_queue_init_params(RINGSIZE, &ctx.ring, &params) < 0) {
+		fprintf(stderr, "io_uring_queue_init_params() failed: (%d) %s\n",
+			errno, strerror(errno));
+		exit(1);
+	}
+
+	if (opt.timeout || opt.prefer_busy_poll) {
+		napi.prefer_busy_poll = opt.prefer_busy_poll;
+		napi.busy_poll_to = opt.timeout;
+
+		io_uring_register_napi(&ctx.ring, &napi);
+	}
+
+	if (opt.busy_loop)
+		tsPtr = &ts;
+	else
+		tsPtr = NULL;
+
+	// Use realtime scheduler.
+	setProcessScheduler();
+
+	// Copy payload.
+	clock_gettime(CLOCK_REALTIME, &ctx.ts);
+
+	// Setup context.
+	ctx.napi_check = false;
+	ctx.buffer_len = sizeof(struct timespec);
+	ctx.num_pings  = opt.num_pings;
+
+	ctx.rtt_index = 0;
+	ctx.rtt = (double *)malloc(sizeof(double) * opt.num_pings);
+	if (!ctx.rtt) {
+		fprintf(stderr, "Cannot allocate results array\n");
+		exit(1);
+	}
+
+	// Send initial message to get napi id.
+	sendPing(&ctx);
+
+	while (ctx.num_pings != 0) {
+		int res;
+		unsigned num_completed = 0;
+		unsigned head;
+		struct io_uring_cqe *cqe;
+
+		do {
+			res = io_uring_submit_and_wait_timeout(&ctx.ring, &cqe, 1, tsPtr, NULL);
+		}
+		while (res < 0 && errno == ETIME);
+
+		io_uring_for_each_cqe(&ctx.ring, head, cqe) {
+			++num_completed;
+			if (completion(&ctx, cqe))
+				goto out;
+		}
+
+		if (num_completed)
+			io_uring_cq_advance(&ctx.ring, num_completed);
+	}
+
+	printStats(&ctx);
+
+out:
+	// Clean up.
+	if (opt.timeout || opt.prefer_busy_poll) {
+		io_uring_unregister_napi(&ctx.ring, &napi);
+		if (opt.timeout          != napi.busy_poll_to ||
+		    opt.prefer_busy_poll != napi.prefer_busy_poll) {
+			fprintf(stderr, "Expected busy poll to = %d, got %d\n",
+				opt.timeout, napi.busy_poll_to);
+			fprintf(stderr, "Expected prefer busy poll = %d, got %d\n",
+				opt.prefer_busy_poll, napi.prefer_busy_poll);
+		}
+	} else {
+		io_uring_unregister_napi(&ctx.ring, NULL);
+	}
+	io_uring_queue_exit(&ctx.ring);
+
+	free(ctx.rtt);
+	close(ctx.sockfd);
+
+	return 0;
+}
diff --git a/examples/napi-busy-poll-server.c b/examples/napi-busy-poll-server.c
new file mode 100644
index 0000000..d3c919f
--- /dev/null
+++ b/examples/napi-busy-poll-server.c
@@ -0,0 +1,386 @@
+#include <ctype.h>
+#include <errno.h>
+#include <getopt.h>
+#include <liburing.h>
+#include <math.h>
+#include <sched.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <time.h>
+#include <unistd.h>
+#include <arpa/inet.h>
+#include <netdb.h>
+#include <netinet/in.h>
+
+#define MAXBUFLEN 100
+#define PORTNOLEN 10
+#define ADDRLEN   80
+#define RINGSIZE  1024
+
+#define printable(ch) (isprint((unsigned char)ch) ? ch : '#')
+
+enum {
+	IOURING_RECV,
+	IOURING_SEND,
+	IOURING_RECVMSG,
+	IOURING_SENDMSG
+};
+
+struct ctx
+{
+	struct io_uring     ring;
+	struct sockaddr_in6 saddr;
+	struct iovec        iov;
+	struct msghdr       msg;
+
+	int sockfd;
+	int buffer_len;
+	int num_pings;
+	bool napi_check;
+
+	union {
+		char buffer[MAXBUFLEN];
+		struct timespec ts;
+	};
+};
+
+struct options
+{
+	int  num_pings;
+	__u32 timeout;
+
+	bool listen;
+	bool sq_poll;
+	bool busy_loop;
+	bool prefer_busy_poll;
+
+	char port[PORTNOLEN];
+	char addr[ADDRLEN];
+};
+
+static struct option longopts[] =
+{
+	{"address"  , 1, NULL, 'a'},
+	{"busy"     , 0, NULL, 'b'},
+	{"help"     , 0, NULL, 'h'},
+	{"listen"   , 0, NULL, 'l'},
+	{"num_pings", 1, NULL, 'n'},
+	{"port"     , 1, NULL, 'p'},
+	{"prefer"   , 1, NULL, 'u'},
+	{"sqpoll"   , 0, NULL, 's'},
+	{"timeout"  , 1, NULL, 't'},
+	{NULL       , 0, NULL,  0 }
+};
+
+static void printUsage(const char *name)
+{
+	fprintf(stderr,
+	"Usage: %s [-l|--listen] [-a|--address ip_address] [-p|--port port-no] [-s|--sqpoll]"
+	" [-b|--busy] [-n|--num pings] [-t|--timeout busy-poll-timeout] [-u|--prefer] [-h|--help]\n"
+	" --listen\n"
+	"-l        : Server mode\n"
+	"--address\n"
+	"-a        : remote or local ipv6 address\n"
+	"--busy\n"
+	"-b        : busy poll io_uring instead of blocking.\n"
+	"--num_pings\n"
+	"-n        : number of pings\n"
+	"--port\n"
+	"-p        : port\n"
+	"--sqpoll\n"
+	"-s        : Configure io_uring to use SQPOLL thread\n"
+	"--timeout\n"
+	"-t        : Configure NAPI busy poll timeout"
+	"--prefer\n"
+	"-u        : prefer NAPI busy poll\n"
+	"--help\n"
+	"-h        : Display this usage message\n\n",
+	name);
+}
+
+static void printError(const char *msg, int opt)
+{
+	if (msg && opt)
+		fprintf(stderr, "%s (-%c)\n", msg, printable(opt));
+}
+
+static void setProcessScheduler(void)
+{
+	struct sched_param param;
+
+	param.sched_priority = sched_get_priority_max(SCHED_FIFO);
+	if (sched_setscheduler(0, SCHED_FIFO, &param) < 0)
+		fprintf(stderr, "sched_setscheduler() failed: (%d) %s\n",
+			errno, strerror(errno));
+}
+
+static uint64_t encodeUserData(char type, int fd)
+{
+	return (uint32_t)fd | ((__u64)type << 56);
+}
+
+static void decodeUserData(uint64_t data, char *type, int *fd)
+{
+	*type = data >> 56;
+	*fd   = data & 0xffffffffU;
+}
+
+static const char *opTypeToStr(char type)
+{
+	const char *res;
+
+	switch (type) {
+	case IOURING_RECV:
+		res = "IOURING_RECV";
+		break;
+	case IOURING_SEND:
+		res = "IOURING_SEND";
+		break;
+	case IOURING_RECVMSG:
+		res = "IOURING_RECVMSG";
+		break;
+	case IOURING_SENDMSG:
+		res = "IOURING_SENDMSG";
+		break;
+	default:
+		res = "Unknown";
+	}
+
+	return res;
+}
+
+static void reportNapi(struct ctx *ctx)
+{
+	unsigned int napi_id = 0;
+	socklen_t len = sizeof(napi_id);
+
+	getsockopt(ctx->sockfd, SOL_SOCKET, SO_INCOMING_NAPI_ID, &napi_id, &len);
+	if (napi_id)
+		printf(" napi id: %d\n", napi_id);
+	else
+		printf(" unassigned napi id\n");
+
+	ctx->napi_check = true;
+}
+
+static void sendPing(struct ctx *ctx)
+{
+
+	struct io_uring_sqe *sqe = io_uring_get_sqe(&ctx->ring);
+
+	io_uring_prep_sendmsg(sqe, ctx->sockfd, &ctx->msg, 0);
+	sqe->user_data = encodeUserData(IOURING_SENDMSG, ctx->sockfd);
+}
+
+static void receivePing(struct ctx *ctx)
+{
+	bzero(&ctx->msg, sizeof(struct msghdr));
+	ctx->msg.msg_name    = &ctx->saddr;
+	ctx->msg.msg_namelen = sizeof(struct sockaddr_in6);
+	ctx->iov.iov_base    = ctx->buffer;
+	ctx->iov.iov_len     = MAXBUFLEN;
+	ctx->msg.msg_iov     = &ctx->iov;
+	ctx->msg.msg_iovlen  = 1;
+
+	struct io_uring_sqe *sqe = io_uring_get_sqe(&ctx->ring);
+	io_uring_prep_recvmsg(sqe, ctx->sockfd, &ctx->msg, 0);
+	sqe->user_data = encodeUserData(IOURING_RECVMSG, ctx->sockfd);
+}
+
+static void completion(struct ctx *ctx, struct io_uring_cqe *cqe)
+{
+	char type;
+	int  fd;
+	int  res = cqe->res;
+
+	decodeUserData(cqe->user_data, &type, &fd);
+	if (res < 0) {
+		fprintf(stderr, "unexpected %s failure: (%d) %s\n",
+			opTypeToStr(type), -res, strerror(-res));
+		abort();
+	}
+
+	switch (type) {
+	case IOURING_SENDMSG:
+		receivePing(ctx);
+		--ctx->num_pings;
+		break;
+	case IOURING_RECVMSG:
+		ctx->iov.iov_len = res;
+		sendPing(ctx);
+		if (!ctx->napi_check)
+			reportNapi(ctx);
+		break;
+	default:
+		fprintf(stderr, "unexpected %s completion\n",
+			opTypeToStr(type));
+		abort();
+		break;
+	}
+}
+
+int main(int argc, char *argv[])
+{
+	int flag;
+	struct ctx       ctx;
+	struct options   opt;
+	struct __kernel_timespec *tsPtr;
+	struct __kernel_timespec ts;
+	struct io_uring_params params;
+	struct io_uring_napi napi;
+
+	memset(&opt, 0, sizeof(struct options));
+
+	// Process flags.
+	while ((flag = getopt_long(argc, argv, ":lhsbua:n:p:t:", longopts, NULL)) != -1) {
+		switch (flag) {
+		case 'a':
+			strcpy(opt.addr, optarg);
+			break;
+		case 'b':
+			opt.busy_loop = true;
+			break;
+		case 'h':
+			printUsage(argv[0]);
+			exit(0);
+			break;
+		case 'l':
+			opt.listen = true;
+			break;
+		case 'n':
+			opt.num_pings = atoi(optarg) + 1;
+			break;
+		case 'p':
+			strcpy(opt.port, optarg);
+			break;
+		case 's':
+			opt.sq_poll = true;
+			break;
+		case 't':
+			opt.timeout = atoi(optarg);
+			break;
+		case 'u':
+			opt.prefer_busy_poll = true;
+			break;
+		case ':':
+			printError("Missing argument", optopt);
+			printUsage(argv[0]);
+			exit(-1);
+			break;
+		case '?':
+			printError("Unrecognized option", optopt);
+			printUsage(argv[0]);
+			exit(-1);
+			break;
+
+		default:
+			fprintf(stderr, "Fatal: Unexpected case in CmdLineProcessor switch()\n");
+			exit(-1);
+			break;
+		}
+	}
+
+	if (strlen(opt.addr) == 0) {
+		fprintf(stderr, "address option is mandatory\n");
+		printUsage(argv[0]);
+		exit(1);
+	}
+
+	ctx.saddr.sin6_port   = htons(atoi(opt.port));
+	ctx.saddr.sin6_family = AF_INET6;
+
+	if (inet_pton(AF_INET6, opt.addr, &ctx.saddr.sin6_addr) <= 0) {
+		fprintf(stderr, "inet_pton error for %s\n", optarg);
+		printUsage(argv[0]);
+		exit(1);
+	}
+
+	// Connect to server.
+	fprintf(stdout, "Listening %s : %s...\n", opt.addr, opt.port);
+
+	if ((ctx.sockfd = socket(AF_INET6, SOCK_DGRAM, 0)) < 0) {
+		fprintf(stderr, "socket() failed: (%d) %s\n", errno, strerror(errno));
+		exit(1);
+	}
+
+	if (bind(ctx.sockfd, (struct sockaddr *)&ctx.saddr, sizeof(struct sockaddr_in6)) < 0) {
+		fprintf(stderr, "bind() failed: (%d) %s\n", errno, strerror(errno));
+		exit(1);
+	}
+
+	// Setup ring.
+	memset(&params, 0, sizeof(params));
+	memset(&ts, 0, sizeof(ts));
+	memset(&napi, 0, sizeof(napi));
+
+	if (opt.sq_poll) {
+		params.flags = IORING_SETUP_SQPOLL;
+		params.sq_thread_idle = 50;
+	}
+
+	if (io_uring_queue_init_params(RINGSIZE, &ctx.ring, &params) < 0) {
+		fprintf(stderr, "io_uring_queue_init_params() failed: (%d) %s\n",
+			errno, strerror(errno));
+		exit(1);
+	}
+
+	if (opt.timeout || opt.prefer_busy_poll) {
+		napi.prefer_busy_poll = opt.prefer_busy_poll;
+		napi.busy_poll_to = opt.timeout;
+
+		io_uring_register_napi(&ctx.ring, &napi);
+	}
+
+	if (opt.busy_loop)
+		tsPtr = &ts;
+	else
+		tsPtr = NULL;
+
+
+	// Use realtime scheduler.
+	setProcessScheduler();
+
+	// Copy payload.
+	clock_gettime(CLOCK_REALTIME, &ctx.ts);
+
+	// Setup context.
+	ctx.napi_check = false;
+	ctx.buffer_len = sizeof(struct timespec);
+	ctx.num_pings  = opt.num_pings;
+
+	// Receive initial message to get napi id.
+	receivePing(&ctx);
+
+	while (ctx.num_pings != 0) {
+		int res;
+		unsigned int num_completed = 0;
+		unsigned int head;
+		struct io_uring_cqe *cqe;
+
+		do {
+			res = io_uring_submit_and_wait_timeout(&ctx.ring, &cqe, 1, tsPtr, NULL);
+		}
+		while (res < 0 && errno == ETIME);
+
+		io_uring_for_each_cqe(&ctx.ring, head, cqe) {
+			++num_completed;
+			completion(&ctx, cqe);
+		}
+
+		if (num_completed) {
+			io_uring_cq_advance(&ctx.ring, num_completed);
+		}
+	}
+
+	// Clean up.
+	if (opt.timeout || opt.prefer_busy_poll)
+		io_uring_unregister_napi(&ctx.ring, &napi);
+
+	io_uring_queue_exit(&ctx.ring);
+	close(ctx.sockfd);
+
+	return 0;
+}
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v9 4/4] liburing: update changelog with new feature
  2023-04-25 18:20 [PATCH v9 0/4] liburing: add api for napi busy poll Stefan Roesch
                   ` (2 preceding siblings ...)
  2023-04-25 18:20 ` [PATCH v9 3/4] liburing: add example programs for napi busy poll Stefan Roesch
@ 2023-04-25 18:20 ` Stefan Roesch
  3 siblings, 0 replies; 15+ messages in thread
From: Stefan Roesch @ 2023-04-25 18:20 UTC (permalink / raw)
  To: io-uring, kernel-team; +Cc: shr, axboe, ammarfaizi2

Add a new entry to the changelog file for the napi busy poll feature.

Signed-off-by: Stefan Roesch <[email protected]>
Acked-by: Ammar Faizi <[email protected]>
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index 71ca391..3b4144f 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -15,6 +15,7 @@ liburing-2.4 release
   io_uring_prep_socket_direct() factor in being called with
   IORING_FILE_INDEX_ALLOC for allocating a direct descriptor.
 - Add io_uring_prep_sendto() function.
+- Support for napi busy polling
 
 liburing-2.3 release
 
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v9 1/4] liburing: add api to set napi busy poll settings
  2023-04-25 18:20 ` [PATCH v9 1/4] liburing: add api to set napi busy poll settings Stefan Roesch
@ 2024-02-02 16:41   ` Olivier Langlois
  2024-02-02 16:42     ` Jens Axboe
  0 siblings, 1 reply; 15+ messages in thread
From: Olivier Langlois @ 2024-02-02 16:41 UTC (permalink / raw)
  To: Stefan Roesch, io-uring, kernel-team; +Cc: axboe, ammarfaizi2

On Tue, 2023-04-25 at 11:20 -0700, Stefan Roesch wrote:
> +
> +int io_uring_register_napi(struct io_uring *ring, struct
> io_uring_napi *napi)
> +{
> +	return __sys_io_uring_register(ring->ring_fd,
> +				IORING_REGISTER_NAPI, napi, 0);
> +}
> +
> +int io_uring_unregister_napi(struct io_uring *ring, struct
> io_uring_napi *napi)
> +{
> +	return __sys_io_uring_register(ring->ring_fd,
> +				IORING_UNREGISTER_NAPI, napi, 0);
> +}

my apologies if this is not the latest version of the patch but I think
that it is...

nr_args should be 1 to match what __io_uring_register() in the current
corresponding kernel patch expects:

https://git.kernel.dk/cgit/linux/commit/?h=io_uring-napi&id=787d81d3132aaf4eb4a4a5f24ff949e350e537d0


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v9 1/4] liburing: add api to set napi busy poll settings
  2024-02-02 16:41   ` Olivier Langlois
@ 2024-02-02 16:42     ` Jens Axboe
  2024-02-02 17:41       ` Olivier Langlois
  0 siblings, 1 reply; 15+ messages in thread
From: Jens Axboe @ 2024-02-02 16:42 UTC (permalink / raw)
  To: Olivier Langlois, Stefan Roesch, io-uring, kernel-team; +Cc: ammarfaizi2

On 2/2/24 9:41 AM, Olivier Langlois wrote:
> On Tue, 2023-04-25 at 11:20 -0700, Stefan Roesch wrote:
>> +
>> +int io_uring_register_napi(struct io_uring *ring, struct
>> io_uring_napi *napi)
>> +{
>> +	return __sys_io_uring_register(ring->ring_fd,
>> +				IORING_REGISTER_NAPI, napi, 0);
>> +}
>> +
>> +int io_uring_unregister_napi(struct io_uring *ring, struct
>> io_uring_napi *napi)
>> +{
>> +	return __sys_io_uring_register(ring->ring_fd,
>> +				IORING_UNREGISTER_NAPI, napi, 0);
>> +}
> 
> my apologies if this is not the latest version of the patch but I think
> that it is...
> 
> nr_args should be 1 to match what __io_uring_register() in the current
> corresponding kernel patch expects:
> 
> https://git.kernel.dk/cgit/linux/commit/?h=io_uring-napi&id=787d81d3132aaf4eb4a4a5f24ff949e350e537d0

Can you send a patch I can fold in?

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v9 1/4] liburing: add api to set napi busy poll settings
  2024-02-02 16:42     ` Jens Axboe
@ 2024-02-02 17:41       ` Olivier Langlois
  2024-02-02 17:48         ` Jens Axboe
  0 siblings, 1 reply; 15+ messages in thread
From: Olivier Langlois @ 2024-02-02 17:41 UTC (permalink / raw)
  To: Jens Axboe, Stefan Roesch, io-uring, kernel-team; +Cc: ammarfaizi2

On Fri, 2024-02-02 at 09:42 -0700, Jens Axboe wrote:
> On 2/2/24 9:41 AM, Olivier Langlois wrote:
> > On Tue, 2023-04-25 at 11:20 -0700, Stefan Roesch wrote:
> > > +
> > > +int io_uring_register_napi(struct io_uring *ring, struct
> > > io_uring_napi *napi)
> > > +{
> > > +	return __sys_io_uring_register(ring->ring_fd,
> > > +				IORING_REGISTER_NAPI, napi, 0);
> > > +}
> > > +
> > > +int io_uring_unregister_napi(struct io_uring *ring, struct
> > > io_uring_napi *napi)
> > > +{
> > > +	return __sys_io_uring_register(ring->ring_fd,
> > > +				IORING_UNREGISTER_NAPI, napi,
> > > 0);
> > > +}
> > 
> > my apologies if this is not the latest version of the patch but I
> > think
> > that it is...
> > 
> > nr_args should be 1 to match what __io_uring_register() in the
> > current
> > corresponding kernel patch expects:
> > 
> > https://git.kernel.dk/cgit/linux/commit/?h=io_uring-napi&id=787d81d3132aaf4eb4a4a5f24ff949e350e537d0
> 
> Can you send a patch I can fold in?
> 
Jens,

I am unsure of what you are asking me.

You would like me to send a patch to fix a liburing patch that has
never been applied AFAIK?

if the v9 patch has never been applied. Wouldn't just editing Stefan
patch by replacing the nr_args param from 0 to 1 directly do it?


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v9 1/4] liburing: add api to set napi busy poll settings
  2024-02-02 17:41       ` Olivier Langlois
@ 2024-02-02 17:48         ` Jens Axboe
  2024-02-02 20:11           ` Olivier Langlois
  0 siblings, 1 reply; 15+ messages in thread
From: Jens Axboe @ 2024-02-02 17:48 UTC (permalink / raw)
  To: Olivier Langlois; +Cc: Stefan Roesch, io-uring, kernel-team, ammarfaizi2

On Feb 2, 2024, at 10:41 AM, Olivier Langlois <[email protected]> wrote:
> 
> On Fri, 2024-02-02 at 09:42 -0700, Jens Axboe wrote:
>>> On 2/2/24 9:41 AM, Olivier Langlois wrote:
>>> On Tue, 2023-04-25 at 11:20 -0700, Stefan Roesch wrote:
>>>> +
>>>> +int io_uring_register_napi(struct io_uring *ring, struct
>>>> io_uring_napi *napi)
>>>> +{
>>>> +    return __sys_io_uring_register(ring->ring_fd,
>>>> +                IORING_REGISTER_NAPI, napi, 0);
>>>> +}
>>>> +
>>>> +int io_uring_unregister_napi(struct io_uring *ring, struct
>>>> io_uring_napi *napi)
>>>> +{
>>>> +    return __sys_io_uring_register(ring->ring_fd,
>>>> +                IORING_UNREGISTER_NAPI, napi,
>>>> 0);
>>>> +}
>>> 
>>> my apologies if this is not the latest version of the patch but I
>>> think
>>> that it is...
>>> 
>>> nr_args should be 1 to match what __io_uring_register() in the
>>> current
>>> corresponding kernel patch expects:
>>> 
>>> https://git.kernel.dk/cgit/linux/commit/?h=io_uring-napi&id=787d81d3132aaf4eb4a4a5f24ff949e350e537d0
>> 
>> Can you send a patch I can fold in?
>> 
> Jens,
> 
> I am unsure of what you are asking me.
> 
> You would like me to send a patch to fix a liburing patch that has
> never been applied AFAIK?
> 
> if the v9 patch has never been applied. Wouldn't just editing Stefan
> patch by replacing the nr_args param from 0 to 1 directly do it?

The current version is what is in that branch you referenced. If you see anything in there that needs changing, send a patch for that. 

I can send out the series again if needed.

— 
Jens Axboe


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v9 1/4] liburing: add api to set napi busy poll settings
  2024-02-02 17:48         ` Jens Axboe
@ 2024-02-02 20:11           ` Olivier Langlois
  2024-02-02 20:14             ` Jens Axboe
  0 siblings, 1 reply; 15+ messages in thread
From: Olivier Langlois @ 2024-02-02 20:11 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Stefan Roesch, io-uring, kernel-team, ammarfaizi2

On Fri, 2024-02-02 at 10:48 -0700, Jens Axboe wrote:
> On Feb 2, 2024, at 10:41 AM, Olivier Langlois
> <[email protected]> wrote:
> > 
> > On Fri, 2024-02-02 at 09:42 -0700, Jens Axboe wrote:
> > > > On 2/2/24 9:41 AM, Olivier Langlois wrote:
> > > > On Tue, 2023-04-25 at 11:20 -0700, Stefan Roesch wrote:
> > > > > +
> > > > > +int io_uring_register_napi(struct io_uring *ring, struct
> > > > > io_uring_napi *napi)
> > > > > +{
> > > > > +    return __sys_io_uring_register(ring->ring_fd,
> > > > > +                IORING_REGISTER_NAPI, napi, 0);
> > > > > +}
> > > > > +
> > > > > +int io_uring_unregister_napi(struct io_uring *ring, struct
> > > > > io_uring_napi *napi)
> > > > > +{
> > > > > +    return __sys_io_uring_register(ring->ring_fd,
> > > > > +                IORING_UNREGISTER_NAPI, napi,
> > > > > 0);
> > > > > +}
> > > > 
> > > > my apologies if this is not the latest version of the patch but
> > > > I
> > > > think
> > > > that it is...
> > > > 
> > > > nr_args should be 1 to match what __io_uring_register() in the
> > > > current
> > > > corresponding kernel patch expects:
> > > > 
> > > > https://git.kernel.dk/cgit/linux/commit/?h=io_uring-napi&id=787d81d3132aaf4eb4a4a5f24ff949e350e537d0
> > > 
> > > Can you send a patch I can fold in?
> > > 
> > Jens,
> > 
> > I am unsure of what you are asking me.
> > 
> > You would like me to send a patch to fix a liburing patch that has
> > never been applied AFAIK?
> > 
> > if the v9 patch has never been applied. Wouldn't just editing
> > Stefan
> > patch by replacing the nr_args param from 0 to 1 directly do it?
> 
> The current version is what is in that branch you referenced. If you
> see anything in there that needs changing, send a patch for that. 
> 
> I can send out the series again if needed.
> 
AFAIK, the kernel patch is fine.

my comment was referring the liburing patch that is calling
__sys_io_uring_register() by passing a nr_args value of 0.

This is problematic because on the kernel side, io_uring_register()
returns -EINVAL if nr_args is not 1.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v9 1/4] liburing: add api to set napi busy poll settings
  2024-02-02 20:11           ` Olivier Langlois
@ 2024-02-02 20:14             ` Jens Axboe
  2024-02-02 20:23               ` Olivier Langlois
  0 siblings, 1 reply; 15+ messages in thread
From: Jens Axboe @ 2024-02-02 20:14 UTC (permalink / raw)
  To: Olivier Langlois; +Cc: Stefan Roesch, io-uring, kernel-team, ammarfaizi2

On Feb 2, 2024, at 1:11 PM, Olivier Langlois <[email protected]> wrote:
> 
> On Fri, 2024-02-02 at 10:48 -0700, Jens Axboe wrote:
>>> On Feb 2, 2024, at 10:41 AM, Olivier Langlois
>>> <[email protected]> wrote:
>>> 
>>> On Fri, 2024-02-02 at 09:42 -0700, Jens Axboe wrote:
>>>>> On 2/2/24 9:41 AM, Olivier Langlois wrote:
>>>>> On Tue, 2023-04-25 at 11:20 -0700, Stefan Roesch wrote:
>>>>>> +
>>>>>> +int io_uring_register_napi(struct io_uring *ring, struct
>>>>>> io_uring_napi *napi)
>>>>>> +{
>>>>>> +    return __sys_io_uring_register(ring->ring_fd,
>>>>>> +                IORING_REGISTER_NAPI, napi, 0);
>>>>>> +}
>>>>>> +
>>>>>> +int io_uring_unregister_napi(struct io_uring *ring, struct
>>>>>> io_uring_napi *napi)
>>>>>> +{
>>>>>> +    return __sys_io_uring_register(ring->ring_fd,
>>>>>> +                IORING_UNREGISTER_NAPI, napi,
>>>>>> 0);
>>>>>> +}
>>>>> 
>>>>> my apologies if this is not the latest version of the patch but
>>>>> I
>>>>> think
>>>>> that it is...
>>>>> 
>>>>> nr_args should be 1 to match what __io_uring_register() in the
>>>>> current
>>>>> corresponding kernel patch expects:
>>>>> 
>>>>> https://git.kernel.dk/cgit/linux/commit/?h=io_uring-napi&id=787d81d3132aaf4eb4a4a5f24ff949e350e537d0
>>>> 
>>>> Can you send a patch I can fold in?
>>>> 
>>> Jens,
>>> 
>>> I am unsure of what you are asking me.
>>> 
>>> You would like me to send a patch to fix a liburing patch that has
>>> never been applied AFAIK?
>>> 
>>> if the v9 patch has never been applied. Wouldn't just editing
>>> Stefan
>>> patch by replacing the nr_args param from 0 to 1 directly do it?
>> 
>> The current version is what is in that branch you referenced. If you
>> see anything in there that needs changing, send a patch for that.
>> 
>> I can send out the series again if needed.
>> 
> AFAIK, the kernel patch is fine.
> 
> my comment was referring the liburing patch that is calling
> __sys_io_uring_register() by passing a nr_args value of 0.
> 
> This is problematic because on the kernel side, io_uring_register()
> returns -EINVAL if nr_args is not 1.

Ah gotcha, yeah that’s odd and could not ever have worked. I wonder how that was tested…

I’ll setup a liburing branch as well.


— 
Jens Axboe


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v9 1/4] liburing: add api to set napi busy poll settings
  2024-02-02 20:14             ` Jens Axboe
@ 2024-02-02 20:23               ` Olivier Langlois
  2024-02-02 22:20                 ` Jens Axboe
  0 siblings, 1 reply; 15+ messages in thread
From: Olivier Langlois @ 2024-02-02 20:23 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Stefan Roesch, io-uring, kernel-team, ammarfaizi2

On Fri, 2024-02-02 at 13:14 -0700, Jens Axboe wrote:
> 
> Ah gotcha, yeah that’s odd and could not ever have worked. I wonder
> how that was tested…
> 
> I’ll setup a liburing branch as well.
> 
It is easy. You omit to check the function return value by telling to
yourself that it cannot fail...

I caught my mistake on a second pass code review...

C++ has a very useful [[nodiscard]] attribute that can help to catch
this simple error... I am not sure if there is something similar to the
[[nodiscard]] in the ISO C standard...


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v9 1/4] liburing: add api to set napi busy poll settings
  2024-02-02 20:23               ` Olivier Langlois
@ 2024-02-02 22:20                 ` Jens Axboe
  2024-02-02 22:46                   ` Jens Axboe
  0 siblings, 1 reply; 15+ messages in thread
From: Jens Axboe @ 2024-02-02 22:20 UTC (permalink / raw)
  To: Olivier Langlois; +Cc: Stefan Roesch, io-uring, kernel-team, ammarfaizi2

On 2/2/24 1:23 PM, Olivier Langlois wrote:
> On Fri, 2024-02-02 at 13:14 -0700, Jens Axboe wrote:
>>
>> Ah gotcha, yeah that?s odd and could not ever have worked. I wonder
>> how that was tested?
>>
>> I?ll setup a liburing branch as well.
>>
> It is easy. You omit to check the function return value by telling to
> yourself that it cannot fail...
> 
> I caught my mistake on a second pass code review...

Oh I can see how that can happen, but then there should be no functional
changes in terms of latency... Which means that it was never tested. The
test results were from the original postings, so probably just fine.
It's just that later versions would've failed. Looking at the example
test case, it doesn't check the return value.

> C++ has a very useful [[nodiscard]] attribute that can help to catch
> this simple error... I am not sure if there is something similar to the
> [[nodiscard]] in the ISO C standard...

You can use __attribute__((__warn_unused_result__)) - the kernel does
that, for example.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v9 1/4] liburing: add api to set napi busy poll settings
  2024-02-02 22:20                 ` Jens Axboe
@ 2024-02-02 22:46                   ` Jens Axboe
  2024-02-03  2:30                     ` Jens Axboe
  0 siblings, 1 reply; 15+ messages in thread
From: Jens Axboe @ 2024-02-02 22:46 UTC (permalink / raw)
  To: Olivier Langlois; +Cc: Stefan Roesch, io-uring, kernel-team, ammarfaizi2

On 2/2/24 3:20 PM, Jens Axboe wrote:
> On 2/2/24 1:23 PM, Olivier Langlois wrote:
>> On Fri, 2024-02-02 at 13:14 -0700, Jens Axboe wrote:
>>>
>>> Ah gotcha, yeah that?s odd and could not ever have worked. I wonder
>>> how that was tested?
>>>
>>> I?ll setup a liburing branch as well.
>>>
>> It is easy. You omit to check the function return value by telling to
>> yourself that it cannot fail...
>>
>> I caught my mistake on a second pass code review...
> 
> Oh I can see how that can happen, but then there should be no functional
> changes in terms of latency... Which means that it was never tested. The
> test results were from the original postings, so probably just fine.
> It's just that later versions would've failed. Looking at the example
> test case, it doesn't check the return value.

Setup a 'napi' branch with the patches, and some fixes on top. It's a
start... I'll try the example ping test here, just need to get off a
plane and be able to access test boxes.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v9 1/4] liburing: add api to set napi busy poll settings
  2024-02-02 22:46                   ` Jens Axboe
@ 2024-02-03  2:30                     ` Jens Axboe
  0 siblings, 0 replies; 15+ messages in thread
From: Jens Axboe @ 2024-02-03  2:30 UTC (permalink / raw)
  To: Olivier Langlois; +Cc: Stefan Roesch, io-uring, kernel-team, ammarfaizi2

On 2/2/24 3:46 PM, Jens Axboe wrote:
> On 2/2/24 3:20 PM, Jens Axboe wrote:
>> On 2/2/24 1:23 PM, Olivier Langlois wrote:
>>> On Fri, 2024-02-02 at 13:14 -0700, Jens Axboe wrote:
>>>>
>>>> Ah gotcha, yeah that?s odd and could not ever have worked. I wonder
>>>> how that was tested?
>>>>
>>>> I?ll setup a liburing branch as well.
>>>>
>>> It is easy. You omit to check the function return value by telling to
>>> yourself that it cannot fail...
>>>
>>> I caught my mistake on a second pass code review...
>>
>> Oh I can see how that can happen, but then there should be no functional
>> changes in terms of latency... Which means that it was never tested. The
>> test results were from the original postings, so probably just fine.
>> It's just that later versions would've failed. Looking at the example
>> test case, it doesn't check the return value.
> 
> Setup a 'napi' branch with the patches, and some fixes on top. It's a
> start... I'll try the example ping test here, just need to get off a
> plane and be able to access test boxes.

Added ipv4 support to the napi example code, and here's what I get.
Stock settings, no NAPI, 100k packets:

 rtt(us) min/avg/max/mdev = 31.730/37.006/87.960/0.497

and with -t10 -b enabled:

 rtt(us) min/avg/max/mdev = 23.250/29.795/63.511/1.203

So it seems to work fine as-is, with the current branch. I've added some
tweaks on the liburing napi branch, but mostly just little fixes.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2024-02-03  2:30 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-04-25 18:20 [PATCH v9 0/4] liburing: add api for napi busy poll Stefan Roesch
2023-04-25 18:20 ` [PATCH v9 1/4] liburing: add api to set napi busy poll settings Stefan Roesch
2024-02-02 16:41   ` Olivier Langlois
2024-02-02 16:42     ` Jens Axboe
2024-02-02 17:41       ` Olivier Langlois
2024-02-02 17:48         ` Jens Axboe
2024-02-02 20:11           ` Olivier Langlois
2024-02-02 20:14             ` Jens Axboe
2024-02-02 20:23               ` Olivier Langlois
2024-02-02 22:20                 ` Jens Axboe
2024-02-02 22:46                   ` Jens Axboe
2024-02-03  2:30                     ` Jens Axboe
2023-04-25 18:20 ` [PATCH v9 2/4] liburing: add documentation for new napi busy polling Stefan Roesch
2023-04-25 18:20 ` [PATCH v9 3/4] liburing: add example programs for napi busy poll Stefan Roesch
2023-04-25 18:20 ` [PATCH v9 4/4] liburing: update changelog with new feature Stefan Roesch

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox