public inbox for [email protected]
 help / color / mirror / Atom feed
* [RFC 0/7] io_uring lseek
@ 2023-07-26 10:25 Hao Xu
  2023-07-26 10:25 ` [PATCH 1/7] iomap: merge iomap_seek_hole() and iomap_seek_data() Hao Xu
                   ` (8 more replies)
  0 siblings, 9 replies; 19+ messages in thread
From: Hao Xu @ 2023-07-26 10:25 UTC (permalink / raw)
  To: io-uring, Jens Axboe
  Cc: Dominique Martinet, Pavel Begunkov, Christian Brauner,
	Alexander Viro, Stefan Roesch, Clay Harris, Dave Chinner,
	Darrick J . Wong, linux-fsdevel, linux-xfs, linux-ext4,
	Wanpeng Li

From: Hao Xu <[email protected]>

This series adds lseek for io_uring, the motivation to import this
syscall is in previous io_uring getdents patchset, we lack a way to
rewind the file cursor when it goes to the end of file. Another reason
is lseek is a common syscall, it's good for coding consistency when
users use io_uring as their main loop.

Patch 1 is code clean for iomap
Patch 2 adds IOMAP_NOWAIT logic for iomap lseek
Patch 3 adds a nowait parameter to for IOMAP_NOWAIT control
Patch 4 adds llseek_nowait() for file_operations so that specific
        filesystem can implement it for nowait lseek
Patch 5 adds llseek_nowait() implementation for xfs
Patch 6 adds a new vfs wrapper for io_uring use
Patch 7 is the main io_uring lseek implementation

Note, this series depends on the previous io_uring getdents series.

This is marked RFC since there is (at least) an issue to be discussed:
The work in this series is mainly to reslove a problem that the current
llseek() in struct file_operations doesn't have a place to deliver
nowait info, and adding an argument to it results in update for llseek
implementation of all filesystems (35 functions), so here I introduce
a new llseek_nowait() as a workaround.

For performance, it has about 20%~30% improvement on iops.
The test program is just like the one for io_uring getdents, here is the
link to it: https://github.com/HowHsu/liburing/blob/llseek/test/lseek.c
- Each test runs about 30000 async requests/sync syscalls
- Each test runs 100 times and get the average value.
- offset is randomly generated value
- the file is a 1M all zero file

[howeyxu@~]$ python3 run_lseek.py
test args:  seek mode:SEEK_SET, offset: 334772
Average of  sync :  0.012300650000000002
Average of  iouring :  0.008528009999999999
30.67%

[howeyxu@~]$ python3 run_lseek.py
test args:  seek mode:SEEK_CUR, offset: 389292
Average of  sync :  0.012736129999999995
Average of  iouring :  0.00928725
27.08%

[howeyxu@~]$ python3 run_lseek.py
test args:  seek mode:SEEK_END, offset: 281141
Average of  sync :  0.01221595
Average of  iouring :  0.008442890000000003
30.89%

[howeyxu@~]$ python3 run_lseek.py
test args:  seek mode:SEEK_DATA, offset: 931103
Average of  sync :  0.015496230000000005
Average of  iouring :  0.012341509999999998
20.36%

[howeyxu@~]$ python3 run_lseek.py
test args:  seek mode:SEEK_HOLE, offset: 430194
Average of  sync :  0.01555663000000001
Average of  iouring :  0.012064940000000003
22.45%
 

Hao Xu (7):
  iomap: merge iomap_seek_hole() and iomap_seek_data()
  xfs: add nowait support for xfs_seek_iomap_begin()
  add nowait parameter for iomap_seek()
  add llseek_nowait() for struct file_operations
  add llseek_nowait support for xfs
  add vfs_lseek_nowait()
  add lseek for io_uring

 fs/ext4/file.c                |  9 ++---
 fs/gfs2/inode.c               |  4 +--
 fs/iomap/seek.c               | 42 ++++++-----------------
 fs/read_write.c               | 18 ++++++++++
 fs/xfs/xfs_file.c             | 34 ++++++++++++++++---
 fs/xfs/xfs_iomap.c            |  4 ++-
 include/linux/fs.h            |  4 +++
 include/linux/iomap.h         |  6 ++--
 include/uapi/linux/io_uring.h |  1 +
 io_uring/fs.c                 | 63 +++++++++++++++++++++++++++++++++++
 io_uring/fs.h                 |  3 ++
 io_uring/opdef.c              |  8 +++++
 12 files changed, 145 insertions(+), 51 deletions(-)


base-commit: 4a4b046082eca8ae90b654d772fccc30e9f23f4d
-- 
2.25.1


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/7] iomap: merge iomap_seek_hole() and iomap_seek_data()
  2023-07-26 10:25 [RFC 0/7] io_uring lseek Hao Xu
@ 2023-07-26 10:25 ` Hao Xu
  2023-07-26 21:50   ` Dave Chinner
  2023-07-26 10:25 ` [PATCH 2/7] xfs: add nowait support for xfs_seek_iomap_begin() Hao Xu
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 19+ messages in thread
From: Hao Xu @ 2023-07-26 10:25 UTC (permalink / raw)
  To: io-uring, Jens Axboe
  Cc: Dominique Martinet, Pavel Begunkov, Christian Brauner,
	Alexander Viro, Stefan Roesch, Clay Harris, Dave Chinner,
	Darrick J . Wong, linux-fsdevel, linux-xfs, linux-ext4,
	Wanpeng Li

From: Hao Xu <[email protected]>

The two functions share almost same code, merge them together.
No functional change in this patch.

Signed-off-by: Hao Xu <[email protected]>
---
 fs/ext4/file.c        |  9 ++-------
 fs/gfs2/inode.c       |  4 ++--
 fs/iomap/seek.c       | 40 ++++++++--------------------------------
 fs/xfs/xfs_file.c     |  5 ++---
 include/linux/iomap.h |  6 ++----
 5 files changed, 16 insertions(+), 48 deletions(-)

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index c457c8517f0f..3d59993bce56 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -936,15 +936,10 @@ loff_t ext4_llseek(struct file *file, loff_t offset, int whence)
 		return generic_file_llseek_size(file, offset, whence,
 						maxbytes, i_size_read(inode));
 	case SEEK_HOLE:
-		inode_lock_shared(inode);
-		offset = iomap_seek_hole(inode, offset,
-					 &ext4_iomap_report_ops);
-		inode_unlock_shared(inode);
-		break;
 	case SEEK_DATA:
 		inode_lock_shared(inode);
-		offset = iomap_seek_data(inode, offset,
-					 &ext4_iomap_report_ops);
+		offset = iomap_seek(inode, offset, &ext4_iomap_report_ops,
+				    whence == SEEK_HOLE);
 		inode_unlock_shared(inode);
 		break;
 	}
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 17c994a0c0d0..628f9d014491 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -2111,7 +2111,7 @@ loff_t gfs2_seek_data(struct file *file, loff_t offset)
 	inode_lock_shared(inode);
 	ret = gfs2_glock_nq_init(ip->i_gl, LM_ST_SHARED, 0, &gh);
 	if (!ret)
-		ret = iomap_seek_data(inode, offset, &gfs2_iomap_ops);
+		ret = iomap_seek(inode, offset, &gfs2_iomap_ops, false);
 	gfs2_glock_dq_uninit(&gh);
 	inode_unlock_shared(inode);
 
@@ -2130,7 +2130,7 @@ loff_t gfs2_seek_hole(struct file *file, loff_t offset)
 	inode_lock_shared(inode);
 	ret = gfs2_glock_nq_init(ip->i_gl, LM_ST_SHARED, 0, &gh);
 	if (!ret)
-		ret = iomap_seek_hole(inode, offset, &gfs2_iomap_ops);
+		ret = iomap_seek(inode, offset, &gfs2_iomap_ops, true);
 	gfs2_glock_dq_uninit(&gh);
 	inode_unlock_shared(inode);
 
diff --git a/fs/iomap/seek.c b/fs/iomap/seek.c
index a845c012b50c..5e8641c5f0da 100644
--- a/fs/iomap/seek.c
+++ b/fs/iomap/seek.c
@@ -30,32 +30,6 @@ static loff_t iomap_seek_hole_iter(const struct iomap_iter *iter,
 	}
 }
 
-loff_t
-iomap_seek_hole(struct inode *inode, loff_t pos, const struct iomap_ops *ops)
-{
-	loff_t size = i_size_read(inode);
-	struct iomap_iter iter = {
-		.inode	= inode,
-		.pos	= pos,
-		.flags	= IOMAP_REPORT,
-	};
-	int ret;
-
-	/* Nothing to be found before or beyond the end of the file. */
-	if (pos < 0 || pos >= size)
-		return -ENXIO;
-
-	iter.len = size - pos;
-	while ((ret = iomap_iter(&iter, ops)) > 0)
-		iter.processed = iomap_seek_hole_iter(&iter, &pos);
-	if (ret < 0)
-		return ret;
-	if (iter.len) /* found hole before EOF */
-		return pos;
-	return size;
-}
-EXPORT_SYMBOL_GPL(iomap_seek_hole);
-
 static loff_t iomap_seek_data_iter(const struct iomap_iter *iter,
 		loff_t *hole_pos)
 {
@@ -77,7 +51,8 @@ static loff_t iomap_seek_data_iter(const struct iomap_iter *iter,
 }
 
 loff_t
-iomap_seek_data(struct inode *inode, loff_t pos, const struct iomap_ops *ops)
+iomap_seek(struct inode *inode, loff_t pos, const struct iomap_ops *ops,
+	   bool hole)
 {
 	loff_t size = i_size_read(inode);
 	struct iomap_iter iter = {
@@ -93,12 +68,13 @@ iomap_seek_data(struct inode *inode, loff_t pos, const struct iomap_ops *ops)
 
 	iter.len = size - pos;
 	while ((ret = iomap_iter(&iter, ops)) > 0)
-		iter.processed = iomap_seek_data_iter(&iter, &pos);
+		iter.processed = hole ? iomap_seek_hole_iter(&iter, &pos) :
+				 iomap_seek_data_iter(&iter, &pos);
 	if (ret < 0)
 		return ret;
-	if (iter.len) /* found data before EOF */
+	if (iter.len) /* found hole/data before EOF */
 		return pos;
-	/* We've reached the end of the file without finding data */
-	return -ENXIO;
+	/* We've reached the end of the file without finding hole/data */
+	return hole ? size : -ENXIO;
 }
-EXPORT_SYMBOL_GPL(iomap_seek_data);
+EXPORT_SYMBOL_GPL(iomap_seek);
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 4f502219ae4f..d7d37f8fb6bc 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1271,10 +1271,9 @@ xfs_file_llseek(
 	default:
 		return generic_file_llseek(file, offset, whence);
 	case SEEK_HOLE:
-		offset = iomap_seek_hole(inode, offset, &xfs_seek_iomap_ops);
-		break;
 	case SEEK_DATA:
-		offset = iomap_seek_data(inode, offset, &xfs_seek_iomap_ops);
+		offset = iomap_seek(inode, offset, &xfs_seek_iomap_ops,
+				    whence == SEEK_HOLE);
 		break;
 	}
 
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index e2b836c2e119..22d5f9b19a22 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -274,10 +274,8 @@ vm_fault_t iomap_page_mkwrite(struct vm_fault *vmf,
 			const struct iomap_ops *ops);
 int iomap_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 		u64 start, u64 len, const struct iomap_ops *ops);
-loff_t iomap_seek_hole(struct inode *inode, loff_t offset,
-		const struct iomap_ops *ops);
-loff_t iomap_seek_data(struct inode *inode, loff_t offset,
-		const struct iomap_ops *ops);
+loff_t iomap_seek(struct inode *inode, loff_t offset,
+		  const struct iomap_ops *ops, bool hole);
 sector_t iomap_bmap(struct address_space *mapping, sector_t bno,
 		const struct iomap_ops *ops);
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 2/7] xfs: add nowait support for xfs_seek_iomap_begin()
  2023-07-26 10:25 [RFC 0/7] io_uring lseek Hao Xu
  2023-07-26 10:25 ` [PATCH 1/7] iomap: merge iomap_seek_hole() and iomap_seek_data() Hao Xu
@ 2023-07-26 10:25 ` Hao Xu
  2023-07-26 21:55   ` Dave Chinner
  2023-07-26 10:25 ` [PATCH 3/7] add nowait parameter for iomap_seek() Hao Xu
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 19+ messages in thread
From: Hao Xu @ 2023-07-26 10:25 UTC (permalink / raw)
  To: io-uring, Jens Axboe
  Cc: Dominique Martinet, Pavel Begunkov, Christian Brauner,
	Alexander Viro, Stefan Roesch, Clay Harris, Dave Chinner,
	Darrick J . Wong, linux-fsdevel, linux-xfs, linux-ext4,
	Wanpeng Li

From: Hao Xu <[email protected]>

To support nowait llseek(), IOMAP_NOWAIT semantics should be respected.
In xfs, xfs_seek_iomap_begin() is the only place which may be blocked
by ilock and extent loading. Let's turn it into trylock logic just like
what we've done in xfs_readdir().

Signed-off-by: Hao Xu <[email protected]>
---
 fs/xfs/xfs_iomap.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 18c8f168b153..bbd7c6b27701 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -1294,7 +1294,9 @@ xfs_seek_iomap_begin(
 	if (xfs_is_shutdown(mp))
 		return -EIO;
 
-	lockmode = xfs_ilock_data_map_shared(ip);
+	lockmode = xfs_ilock_data_map_shared_generic(ip, flags & IOMAP_NOWAIT);
+	if (!lockmode)
+		return -EAGAIN;
 	error = xfs_iread_extents(NULL, ip, XFS_DATA_FORK);
 	if (error)
 		goto out_unlock;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 3/7] add nowait parameter for iomap_seek()
  2023-07-26 10:25 [RFC 0/7] io_uring lseek Hao Xu
  2023-07-26 10:25 ` [PATCH 1/7] iomap: merge iomap_seek_hole() and iomap_seek_data() Hao Xu
  2023-07-26 10:25 ` [PATCH 2/7] xfs: add nowait support for xfs_seek_iomap_begin() Hao Xu
@ 2023-07-26 10:25 ` Hao Xu
  2023-07-26 22:01   ` Dave Chinner
  2023-07-26 10:26 ` [PATCH 4/7] add llseek_nowait() for struct file_operations Hao Xu
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 19+ messages in thread
From: Hao Xu @ 2023-07-26 10:25 UTC (permalink / raw)
  To: io-uring, Jens Axboe
  Cc: Dominique Martinet, Pavel Begunkov, Christian Brauner,
	Alexander Viro, Stefan Roesch, Clay Harris, Dave Chinner,
	Darrick J . Wong, linux-fsdevel, linux-xfs, linux-ext4,
	Wanpeng Li

From: Hao Xu <[email protected]>

Add a nowait parameter for iomap_seek(), later IOMAP_NOWAIT is set
according to this parameter's value.

Signed-off-by: Hao Xu <[email protected]>
---
 fs/ext4/file.c        | 2 +-
 fs/gfs2/inode.c       | 4 ++--
 fs/iomap/seek.c       | 4 +++-
 fs/xfs/xfs_file.c     | 2 +-
 include/linux/iomap.h | 2 +-
 5 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 3d59993bce56..c6c38c34148b 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -939,7 +939,7 @@ loff_t ext4_llseek(struct file *file, loff_t offset, int whence)
 	case SEEK_DATA:
 		inode_lock_shared(inode);
 		offset = iomap_seek(inode, offset, &ext4_iomap_report_ops,
-				    whence == SEEK_HOLE);
+				    whence == SEEK_HOLE, false);
 		inode_unlock_shared(inode);
 		break;
 	}
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 628f9d014491..5d6e7471cb07 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -2111,7 +2111,7 @@ loff_t gfs2_seek_data(struct file *file, loff_t offset)
 	inode_lock_shared(inode);
 	ret = gfs2_glock_nq_init(ip->i_gl, LM_ST_SHARED, 0, &gh);
 	if (!ret)
-		ret = iomap_seek(inode, offset, &gfs2_iomap_ops, false);
+		ret = iomap_seek(inode, offset, &gfs2_iomap_ops, false, false);
 	gfs2_glock_dq_uninit(&gh);
 	inode_unlock_shared(inode);
 
@@ -2130,7 +2130,7 @@ loff_t gfs2_seek_hole(struct file *file, loff_t offset)
 	inode_lock_shared(inode);
 	ret = gfs2_glock_nq_init(ip->i_gl, LM_ST_SHARED, 0, &gh);
 	if (!ret)
-		ret = iomap_seek(inode, offset, &gfs2_iomap_ops, true);
+		ret = iomap_seek(inode, offset, &gfs2_iomap_ops, true, false);
 	gfs2_glock_dq_uninit(&gh);
 	inode_unlock_shared(inode);
 
diff --git a/fs/iomap/seek.c b/fs/iomap/seek.c
index 5e8641c5f0da..319ea19fa90d 100644
--- a/fs/iomap/seek.c
+++ b/fs/iomap/seek.c
@@ -52,7 +52,7 @@ static loff_t iomap_seek_data_iter(const struct iomap_iter *iter,
 
 loff_t
 iomap_seek(struct inode *inode, loff_t pos, const struct iomap_ops *ops,
-	   bool hole)
+	   bool hole, bool nowait)
 {
 	loff_t size = i_size_read(inode);
 	struct iomap_iter iter = {
@@ -66,6 +66,8 @@ iomap_seek(struct inode *inode, loff_t pos, const struct iomap_ops *ops,
 	if (pos < 0 || pos >= size)
 		return -ENXIO;
 
+	if (nowait)
+		iter.flags |= IOMAP_NOWAIT;
 	iter.len = size - pos;
 	while ((ret = iomap_iter(&iter, ops)) > 0)
 		iter.processed = hole ? iomap_seek_hole_iter(&iter, &pos) :
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index d7d37f8fb6bc..73adc0aee2ff 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1273,7 +1273,7 @@ xfs_file_llseek(
 	case SEEK_HOLE:
 	case SEEK_DATA:
 		offset = iomap_seek(inode, offset, &xfs_seek_iomap_ops,
-				    whence == SEEK_HOLE);
+				    whence == SEEK_HOLE, nowait);
 		break;
 	}
 
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 22d5f9b19a22..f99769d4fc42 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -275,7 +275,7 @@ vm_fault_t iomap_page_mkwrite(struct vm_fault *vmf,
 int iomap_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 		u64 start, u64 len, const struct iomap_ops *ops);
 loff_t iomap_seek(struct inode *inode, loff_t offset,
-		  const struct iomap_ops *ops, bool hole);
+		  const struct iomap_ops *ops, bool hole, bool nowait);
 sector_t iomap_bmap(struct address_space *mapping, sector_t bno,
 		const struct iomap_ops *ops);
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 4/7] add llseek_nowait() for struct file_operations
  2023-07-26 10:25 [RFC 0/7] io_uring lseek Hao Xu
                   ` (2 preceding siblings ...)
  2023-07-26 10:25 ` [PATCH 3/7] add nowait parameter for iomap_seek() Hao Xu
@ 2023-07-26 10:26 ` Hao Xu
  2023-07-27 13:25   ` Matthew Wilcox
  2023-07-26 10:26 ` [PATCH 5/7] add llseek_nowait support for xfs Hao Xu
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 19+ messages in thread
From: Hao Xu @ 2023-07-26 10:26 UTC (permalink / raw)
  To: io-uring, Jens Axboe
  Cc: Dominique Martinet, Pavel Begunkov, Christian Brauner,
	Alexander Viro, Stefan Roesch, Clay Harris, Dave Chinner,
	Darrick J . Wong, linux-fsdevel, linux-xfs, linux-ext4,
	Wanpeng Li

From: Hao Xu <[email protected]>

Add a new function member llseek_nowait() in struct file_operations for
nowait llseek. It act just like llseek() but has an extra boolean
parameter called nowait to indicate if it's a nowait try, avoid IO and
locks if so.

Signed-off-by: Hao Xu <[email protected]>
---
 include/linux/fs.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index f3e315e8efdd..d37290da2d7e 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1823,6 +1823,7 @@ struct file_operations {
 	int (*uring_cmd)(struct io_uring_cmd *ioucmd, unsigned int issue_flags);
 	int (*uring_cmd_iopoll)(struct io_uring_cmd *, struct io_comp_batch *,
 				unsigned int poll_flags);
+	loff_t (*llseek_nowait)(struct file *, loff_t, int, bool);
 } __randomize_layout;
 
 struct inode_operations {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 5/7] add llseek_nowait support for xfs
  2023-07-26 10:25 [RFC 0/7] io_uring lseek Hao Xu
                   ` (3 preceding siblings ...)
  2023-07-26 10:26 ` [PATCH 4/7] add llseek_nowait() for struct file_operations Hao Xu
@ 2023-07-26 10:26 ` Hao Xu
  2023-07-26 22:14   ` Dave Chinner
  2023-07-26 10:26 ` [PATCH 6/7] add vfs_lseek_nowait() Hao Xu
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 19+ messages in thread
From: Hao Xu @ 2023-07-26 10:26 UTC (permalink / raw)
  To: io-uring, Jens Axboe
  Cc: Dominique Martinet, Pavel Begunkov, Christian Brauner,
	Alexander Viro, Stefan Roesch, Clay Harris, Dave Chinner,
	Darrick J . Wong, linux-fsdevel, linux-xfs, linux-ext4,
	Wanpeng Li

From: Hao Xu <[email protected]>

Add llseek_nowait() operation for xfs, it acts just like llseek(). The
thing different is it delivers nowait parameter to iomap layer.

Signed-off-by: Hao Xu <[email protected]>
---
 fs/xfs/xfs_file.c | 29 +++++++++++++++++++++++++++--
 1 file changed, 27 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 73adc0aee2ff..cba82264221d 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1257,10 +1257,11 @@ xfs_file_readdir(
 }
 
 STATIC loff_t
-xfs_file_llseek(
+__xfs_file_llseek(
 	struct file	*file,
 	loff_t		offset,
-	int		whence)
+	int		whence,
+	bool		nowait)
 {
 	struct inode		*inode = file->f_mapping->host;
 
@@ -1282,6 +1283,28 @@ xfs_file_llseek(
 	return vfs_setpos(file, offset, inode->i_sb->s_maxbytes);
 }
 
+STATIC loff_t
+xfs_file_llseek(
+	struct file	*file,
+	loff_t		offset,
+	int		whence)
+{
+	return __xfs_file_llseek(file, offset, whence, false);
+}
+
+STATIC loff_t
+xfs_file_llseek_nowait(
+	struct file	*file,
+	loff_t		offset,
+	int		whence,
+	bool		nowait)
+{
+	if (file->f_op == &xfs_file_operations)
+		return __xfs_file_llseek(file, offset, whence, nowait);
+	else
+		return generic_file_llseek(file, offset, whence);
+}
+
 #ifdef CONFIG_FS_DAX
 static inline vm_fault_t
 xfs_dax_fault(
@@ -1442,6 +1465,7 @@ xfs_file_mmap(
 
 const struct file_operations xfs_file_operations = {
 	.llseek		= xfs_file_llseek,
+	.llseek_nowait	= xfs_file_llseek_nowait,
 	.read_iter	= xfs_file_read_iter,
 	.write_iter	= xfs_file_write_iter,
 	.splice_read	= xfs_file_splice_read,
@@ -1467,6 +1491,7 @@ const struct file_operations xfs_dir_file_operations = {
 	.read		= generic_read_dir,
 	.iterate_shared	= xfs_file_readdir,
 	.llseek		= generic_file_llseek,
+	.llseek_nowait	= xfs_file_llseek_nowait,
 	.unlocked_ioctl	= xfs_file_ioctl,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	= xfs_file_compat_ioctl,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 6/7] add vfs_lseek_nowait()
  2023-07-26 10:25 [RFC 0/7] io_uring lseek Hao Xu
                   ` (4 preceding siblings ...)
  2023-07-26 10:26 ` [PATCH 5/7] add llseek_nowait support for xfs Hao Xu
@ 2023-07-26 10:26 ` Hao Xu
  2023-07-26 10:26 ` [PATCH 7/7] add lseek for io_uring Hao Xu
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 19+ messages in thread
From: Hao Xu @ 2023-07-26 10:26 UTC (permalink / raw)
  To: io-uring, Jens Axboe
  Cc: Dominique Martinet, Pavel Begunkov, Christian Brauner,
	Alexander Viro, Stefan Roesch, Clay Harris, Dave Chinner,
	Darrick J . Wong, linux-fsdevel, linux-xfs, linux-ext4,
	Wanpeng Li

From: Hao Xu <[email protected]>

Add a new vfs wrapper for io_uring lseek usage. The reason is the
current vfs_lseek() calls llseek() but what we need is llseek_nowait().

Signed-off-by: Hao Xu <[email protected]>
---
 fs/read_write.c    | 18 ++++++++++++++++++
 include/linux/fs.h |  3 +++
 2 files changed, 21 insertions(+)

diff --git a/fs/read_write.c b/fs/read_write.c
index b07de77ef126..b4c3bcf706e2 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -290,6 +290,24 @@ loff_t vfs_llseek(struct file *file, loff_t offset, int whence)
 }
 EXPORT_SYMBOL(vfs_llseek);
 
+loff_t vfs_lseek_nowait(struct file *file, off_t offset,
+			 int whence, bool nowait)
+{
+	if (!(file->f_mode & FMODE_LSEEK))
+		return -ESPIPE;
+	/*
+	 * This function is only used by io_uring, thus
+	 * returning -ENOTSUPP is not proper since doing
+	 * nonblock lseek as the first try is asked internally
+	 * by io_uring not by users. Return -ENOTSUPP to users
+	 * is not sane.
+	 */
+	if (!file->f_op->llseek_nowait)
+		return -EAGAIN;
+	return file->f_op->llseek_nowait(file, offset, whence, nowait);
+}
+EXPORT_SYMBOL(vfs_lseek_nowait);
+
 static off_t ksys_lseek(unsigned int fd, off_t offset, unsigned int whence)
 {
 	off_t retval;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index d37290da2d7e..cb804d1f1650 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2654,6 +2654,9 @@ extern loff_t default_llseek(struct file *file, loff_t offset, int whence);
 
 extern loff_t vfs_llseek(struct file *file, loff_t offset, int whence);
 
+extern loff_t vfs_lseek_nowait(struct file *file, off_t offset,
+			       int whence, bool nowait);
+
 extern int inode_init_always(struct super_block *, struct inode *);
 extern void inode_init_once(struct inode *);
 extern void address_space_init_once(struct address_space *mapping);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 7/7] add lseek for io_uring
  2023-07-26 10:25 [RFC 0/7] io_uring lseek Hao Xu
                   ` (5 preceding siblings ...)
  2023-07-26 10:26 ` [PATCH 6/7] add vfs_lseek_nowait() Hao Xu
@ 2023-07-26 10:26 ` Hao Xu
  2023-07-26 13:22 ` [RFC 0/7] io_uring lseek Christian Brauner
  2023-07-27 12:30 ` Hao Xu
  8 siblings, 0 replies; 19+ messages in thread
From: Hao Xu @ 2023-07-26 10:26 UTC (permalink / raw)
  To: io-uring, Jens Axboe
  Cc: Dominique Martinet, Pavel Begunkov, Christian Brauner,
	Alexander Viro, Stefan Roesch, Clay Harris, Dave Chinner,
	Darrick J . Wong, linux-fsdevel, linux-xfs, linux-ext4,
	Wanpeng Li

From: Hao Xu <[email protected]>

This is related with previous io_uring getdents patchset, we need a way
to rewind the cursor of file when it comes to the end of a file by
getdents. Introduce lseek to io_uring for this, besides, it's also a
common syscall users call. So it's good for coding consistency when
users use io_uring as their main loop.

Signed-off-by: Hao Xu <[email protected]>
---
 include/uapi/linux/io_uring.h |  1 +
 io_uring/fs.c                 | 63 +++++++++++++++++++++++++++++++++++
 io_uring/fs.h                 |  3 ++
 io_uring/opdef.c              |  8 +++++
 4 files changed, 75 insertions(+)

diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index c3efe241e310..d445876d4afc 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -236,6 +236,7 @@ enum io_uring_op {
 	IORING_OP_SEND_ZC,
 	IORING_OP_SENDMSG_ZC,
 	IORING_OP_GETDENTS,
+	IORING_OP_LSEEK,
 
 	/* this goes last, obviously */
 	IORING_OP_LAST,
diff --git a/io_uring/fs.c b/io_uring/fs.c
index 793eceb562a7..3992a19195ff 100644
--- a/io_uring/fs.c
+++ b/io_uring/fs.c
@@ -53,6 +53,12 @@ struct io_getdents {
 	unsigned int			count;
 };
 
+struct io_lseek {
+	struct file			*file;
+	off_t				offset;
+	unsigned int			whence;
+};
+
 int io_renameat_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 {
 	struct io_rename *ren = io_kiocb_to_cmd(req, struct io_rename);
@@ -348,3 +354,60 @@ int io_getdents(struct io_kiocb *req, unsigned int issue_flags)
 	return 0;
 }
 
+int io_lseek_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+{
+	struct io_lseek *lsk = io_kiocb_to_cmd(req, struct io_lseek);
+
+	if (unlikely(req->flags & REQ_F_FIXED_FILE))
+		return -EBADF;
+
+	lsk->offset = READ_ONCE(sqe->addr);
+	lsk->whence = READ_ONCE(sqe->len);
+
+	return 0;
+}
+
+int io_lseek(struct io_kiocb *req, unsigned int issue_flags)
+{
+	struct io_lseek *lsk = io_kiocb_to_cmd(req, struct io_lseek);
+	struct file *file = req->file;
+	bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
+	bool should_lock = file->f_mode & FMODE_ATOMIC_POS;
+	unsigned int whence = lsk->whence;
+	loff_t res;
+	off_t ret;
+
+	if (whence > SEEK_MAX)
+		return -EINVAL;
+
+	if (force_nonblock) {
+		if (!(file->f_flags & O_NONBLOCK) &&
+		    !(file->f_mode & FMODE_NOWAIT))
+			return -EAGAIN;
+	}
+
+	if (should_lock) {
+		if (!force_nonblock)
+			mutex_lock(&file->f_pos_lock);
+		else if (!mutex_trylock(&file->f_pos_lock))
+			return -EAGAIN;
+	}
+
+	res = vfs_lseek_nowait(file, lsk->offset, whence, force_nonblock);
+	if (res == -EAGAIN && force_nonblock) {
+		if (should_lock)
+			mutex_unlock(&file->f_pos_lock);
+		return -EAGAIN;
+	}
+
+	ret = res;
+	if (res != (loff_t)ret)
+		ret = -EOVERFLOW;
+
+	if (should_lock)
+		mutex_unlock(&file->f_pos_lock);
+
+	io_req_set_res(req, ret, 0);
+	return 0;
+}
+
diff --git a/io_uring/fs.h b/io_uring/fs.h
index f83a6f3a678d..32a8441c5142 100644
--- a/io_uring/fs.h
+++ b/io_uring/fs.h
@@ -21,3 +21,6 @@ void io_link_cleanup(struct io_kiocb *req);
 
 int io_getdents_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe);
 int io_getdents(struct io_kiocb *req, unsigned int issue_flags);
+
+int io_lseek_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe);
+int io_lseek(struct io_kiocb *req, unsigned int issue_flags);
diff --git a/io_uring/opdef.c b/io_uring/opdef.c
index 1bae6b2a8d0b..eb1f7ee4f079 100644
--- a/io_uring/opdef.c
+++ b/io_uring/opdef.c
@@ -433,6 +433,11 @@ const struct io_issue_def io_issue_defs[] = {
 		.prep			= io_getdents_prep,
 		.issue			= io_getdents,
 	},
+	[IORING_OP_LSEEK] = {
+		.needs_file		= 1,
+		.prep			= io_lseek_prep,
+		.issue			= io_lseek,
+	},
 };
 
 
@@ -656,6 +661,9 @@ const struct io_cold_def io_cold_defs[] = {
 	[IORING_OP_GETDENTS] = {
 		.name			= "GETDENTS",
 	},
+	[IORING_OP_LSEEK] = {
+		.name			= "LSEEK",
+	},
 };
 
 const char *io_uring_get_opcode(u8 opcode)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [RFC 0/7] io_uring lseek
  2023-07-26 10:25 [RFC 0/7] io_uring lseek Hao Xu
                   ` (6 preceding siblings ...)
  2023-07-26 10:26 ` [PATCH 7/7] add lseek for io_uring Hao Xu
@ 2023-07-26 13:22 ` Christian Brauner
  2023-07-27 12:30 ` Hao Xu
  8 siblings, 0 replies; 19+ messages in thread
From: Christian Brauner @ 2023-07-26 13:22 UTC (permalink / raw)
  To: Hao Xu
  Cc: io-uring, Jens Axboe, Dominique Martinet, Pavel Begunkov,
	Alexander Viro, Stefan Roesch, Clay Harris, Dave Chinner,
	Darrick J . Wong, linux-fsdevel, linux-xfs, linux-ext4,
	Wanpeng Li

On Wed, Jul 26, 2023 at 06:25:56PM +0800, Hao Xu wrote:
> From: Hao Xu <[email protected]>
> 
> This series adds lseek for io_uring, the motivation to import this
> syscall is in previous io_uring getdents patchset, we lack a way to
> rewind the file cursor when it goes to the end of file. Another reason
> is lseek is a common syscall, it's good for coding consistency when
> users use io_uring as their main loop.

While I understand this it is a time consuming review to make sure
things work correctly. So before we get this thing going we better get
getdents correct first.

> 
> Patch 1 is code clean for iomap
> Patch 2 adds IOMAP_NOWAIT logic for iomap lseek
> Patch 3 adds a nowait parameter to for IOMAP_NOWAIT control
> Patch 4 adds llseek_nowait() for file_operations so that specific
>         filesystem can implement it for nowait lseek
> Patch 5 adds llseek_nowait() implementation for xfs
> Patch 6 adds a new vfs wrapper for io_uring use
> Patch 7 is the main io_uring lseek implementation
> 
> Note, this series depends on the previous io_uring getdents series.
> 
> This is marked RFC since there is (at least) an issue to be discussed:
> The work in this series is mainly to reslove a problem that the current
> llseek() in struct file_operations doesn't have a place to deliver
> nowait info, and adding an argument to it results in update for llseek
> implementation of all filesystems (35 functions), so here I introduce
> a new llseek_nowait() as a workaround.

My intuition would be to update all filesystems. Adding new inode
operations always starts as a temporary thing and then we live with two
different methods for the next years or possibly forever.

But it'd be good to hear what others think.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/7] iomap: merge iomap_seek_hole() and iomap_seek_data()
  2023-07-26 10:25 ` [PATCH 1/7] iomap: merge iomap_seek_hole() and iomap_seek_data() Hao Xu
@ 2023-07-26 21:50   ` Dave Chinner
  2023-07-27 12:10     ` Hao Xu
  0 siblings, 1 reply; 19+ messages in thread
From: Dave Chinner @ 2023-07-26 21:50 UTC (permalink / raw)
  To: Hao Xu
  Cc: io-uring, Jens Axboe, Dominique Martinet, Pavel Begunkov,
	Christian Brauner, Alexander Viro, Stefan Roesch, Clay Harris,
	Darrick J . Wong, linux-fsdevel, linux-xfs, linux-ext4,
	Wanpeng Li

On Wed, Jul 26, 2023 at 06:25:57PM +0800, Hao Xu wrote:
> From: Hao Xu <[email protected]>
> 
> The two functions share almost same code, merge them together.
> No functional change in this patch.

No, please don't. seek data and seek hole have subtly different
semantics and return values, and we've explicitly kept them separate
because it's much easier to maintain them as separate functions with
separate semantics than combine them into a single function with
lots of non-obvious conditional behaviour.

The fact that the new iomap_seek() API requires a boolean "SEEK_DATA
or SEEK_HOLE" field to indicate that the caller wants makes it clear
that this isn't actually an improvement over explicit
iomap_seek_data() and iomap_seek_hole() API calls.

-Dave.
-- 
Dave Chinner
[email protected]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/7] xfs: add nowait support for xfs_seek_iomap_begin()
  2023-07-26 10:25 ` [PATCH 2/7] xfs: add nowait support for xfs_seek_iomap_begin() Hao Xu
@ 2023-07-26 21:55   ` Dave Chinner
  2023-07-26 22:14     ` Darrick J. Wong
  0 siblings, 1 reply; 19+ messages in thread
From: Dave Chinner @ 2023-07-26 21:55 UTC (permalink / raw)
  To: Hao Xu
  Cc: io-uring, Jens Axboe, Dominique Martinet, Pavel Begunkov,
	Christian Brauner, Alexander Viro, Stefan Roesch, Clay Harris,
	Darrick J . Wong, linux-fsdevel, linux-xfs, linux-ext4,
	Wanpeng Li

On Wed, Jul 26, 2023 at 06:25:58PM +0800, Hao Xu wrote:
> From: Hao Xu <[email protected]>
> 
> To support nowait llseek(), IOMAP_NOWAIT semantics should be respected.
> In xfs, xfs_seek_iomap_begin() is the only place which may be blocked
> by ilock and extent loading. Let's turn it into trylock logic just like
> what we've done in xfs_readdir().
> 
> Signed-off-by: Hao Xu <[email protected]>
> ---
>  fs/xfs/xfs_iomap.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> index 18c8f168b153..bbd7c6b27701 100644
> --- a/fs/xfs/xfs_iomap.c
> +++ b/fs/xfs/xfs_iomap.c
> @@ -1294,7 +1294,9 @@ xfs_seek_iomap_begin(
>  	if (xfs_is_shutdown(mp))
>  		return -EIO;
>  
> -	lockmode = xfs_ilock_data_map_shared(ip);
> +	lockmode = xfs_ilock_data_map_shared_generic(ip, flags & IOMAP_NOWAIT);

What does this magic XFS function I can't find anywhere in this
patch set do?

-Dave.
-- 
Dave Chinner
[email protected]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/7] add nowait parameter for iomap_seek()
  2023-07-26 10:25 ` [PATCH 3/7] add nowait parameter for iomap_seek() Hao Xu
@ 2023-07-26 22:01   ` Dave Chinner
  0 siblings, 0 replies; 19+ messages in thread
From: Dave Chinner @ 2023-07-26 22:01 UTC (permalink / raw)
  To: Hao Xu
  Cc: io-uring, Jens Axboe, Dominique Martinet, Pavel Begunkov,
	Christian Brauner, Alexander Viro, Stefan Roesch, Clay Harris,
	Darrick J . Wong, linux-fsdevel, linux-xfs, linux-ext4,
	Wanpeng Li

On Wed, Jul 26, 2023 at 06:25:59PM +0800, Hao Xu wrote:
> From: Hao Xu <[email protected]>
> 
> Add a nowait parameter for iomap_seek(), later IOMAP_NOWAIT is set
> according to this parameter's value.

This shows the problem with adding booleans as flags to
functions. Now there are -two- anonymous boolean flags to
iomap_seek(), and reading the code becomes really hard to know what
is actually being asked of iomap_seek().

This is another reason for not combining iomap_seek_data/hole()...

-Dave.
-- 
Dave Chinner
[email protected]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 5/7] add llseek_nowait support for xfs
  2023-07-26 10:26 ` [PATCH 5/7] add llseek_nowait support for xfs Hao Xu
@ 2023-07-26 22:14   ` Dave Chinner
  2023-07-27 12:26     ` Hao Xu
  0 siblings, 1 reply; 19+ messages in thread
From: Dave Chinner @ 2023-07-26 22:14 UTC (permalink / raw)
  To: Hao Xu
  Cc: io-uring, Jens Axboe, Dominique Martinet, Pavel Begunkov,
	Christian Brauner, Alexander Viro, Stefan Roesch, Clay Harris,
	Darrick J . Wong, linux-fsdevel, linux-xfs, linux-ext4,
	Wanpeng Li

On Wed, Jul 26, 2023 at 06:26:01PM +0800, Hao Xu wrote:
> From: Hao Xu <[email protected]>
> 
> Add llseek_nowait() operation for xfs, it acts just like llseek(). The
> thing different is it delivers nowait parameter to iomap layer.
> 
> Signed-off-by: Hao Xu <[email protected]>
> ---
>  fs/xfs/xfs_file.c | 29 +++++++++++++++++++++++++++--
>  1 file changed, 27 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 73adc0aee2ff..cba82264221d 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -1257,10 +1257,11 @@ xfs_file_readdir(
>  }
>  
>  STATIC loff_t
> -xfs_file_llseek(
> +__xfs_file_llseek(
>  	struct file	*file,
>  	loff_t		offset,
> -	int		whence)
> +	int		whence,
> +	bool		nowait)
>  {
>  	struct inode		*inode = file->f_mapping->host;
>  
> @@ -1282,6 +1283,28 @@ xfs_file_llseek(
>  	return vfs_setpos(file, offset, inode->i_sb->s_maxbytes);
>  }
>  
> +STATIC loff_t
> +xfs_file_llseek(
> +	struct file	*file,
> +	loff_t		offset,
> +	int		whence)
> +{
> +	return __xfs_file_llseek(file, offset, whence, false);
> +}
> +
> +STATIC loff_t
> +xfs_file_llseek_nowait(
> +	struct file	*file,
> +	loff_t		offset,
> +	int		whence,
> +	bool		nowait)
> +{
> +	if (file->f_op == &xfs_file_operations)
> +		return __xfs_file_llseek(file, offset, whence, nowait);
> +	else
> +		return generic_file_llseek(file, offset, whence);
> +}
> +
>  #ifdef CONFIG_FS_DAX
>  static inline vm_fault_t
>  xfs_dax_fault(
> @@ -1442,6 +1465,7 @@ xfs_file_mmap(
>  
>  const struct file_operations xfs_file_operations = {
>  	.llseek		= xfs_file_llseek,
> +	.llseek_nowait	= xfs_file_llseek_nowait,
>  	.read_iter	= xfs_file_read_iter,
>  	.write_iter	= xfs_file_write_iter,
>  	.splice_read	= xfs_file_splice_read,
> @@ -1467,6 +1491,7 @@ const struct file_operations xfs_dir_file_operations = {
>  	.read		= generic_read_dir,
>  	.iterate_shared	= xfs_file_readdir,
>  	.llseek		= generic_file_llseek,
> +	.llseek_nowait	= xfs_file_llseek_nowait,
>  	.unlocked_ioctl	= xfs_file_ioctl,
>  #ifdef CONFIG_COMPAT
>  	.compat_ioctl	= xfs_file_compat_ioctl,

This is pretty nasty. It would be far better just to change the
.llseek method than to inflict this on every filesystem for the
forseeable future.

Not that I'm a fan of passing "nowait" booleans all through the file
operations methods - that way lies madness. We use a control
structure for the IO path operations (kiocb) to hold per-call
context information, perhaps we need something similar for these
other methods that people are wanting to hook up to io_uring (e.g.
readdir) so taht we don't have to play whack-a-mole with every new
io_uring method that people want and then end up with a different
nowait solution for every method.

-Dave.
-- 
Dave Chinner
[email protected]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/7] xfs: add nowait support for xfs_seek_iomap_begin()
  2023-07-26 21:55   ` Dave Chinner
@ 2023-07-26 22:14     ` Darrick J. Wong
  2023-07-27 12:17       ` Hao Xu
  0 siblings, 1 reply; 19+ messages in thread
From: Darrick J. Wong @ 2023-07-26 22:14 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Hao Xu, io-uring, Jens Axboe, Dominique Martinet, Pavel Begunkov,
	Christian Brauner, Alexander Viro, Stefan Roesch, Clay Harris,
	linux-fsdevel, linux-xfs, linux-ext4, Wanpeng Li

On Thu, Jul 27, 2023 at 07:55:47AM +1000, Dave Chinner wrote:
> On Wed, Jul 26, 2023 at 06:25:58PM +0800, Hao Xu wrote:
> > From: Hao Xu <[email protected]>
> > 
> > To support nowait llseek(), IOMAP_NOWAIT semantics should be respected.
> > In xfs, xfs_seek_iomap_begin() is the only place which may be blocked
> > by ilock and extent loading. Let's turn it into trylock logic just like
> > what we've done in xfs_readdir().
> > 
> > Signed-off-by: Hao Xu <[email protected]>
> > ---
> >  fs/xfs/xfs_iomap.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> > index 18c8f168b153..bbd7c6b27701 100644
> > --- a/fs/xfs/xfs_iomap.c
> > +++ b/fs/xfs/xfs_iomap.c
> > @@ -1294,7 +1294,9 @@ xfs_seek_iomap_begin(
> >  	if (xfs_is_shutdown(mp))
> >  		return -EIO;
> >  
> > -	lockmode = xfs_ilock_data_map_shared(ip);
> > +	lockmode = xfs_ilock_data_map_shared_generic(ip, flags & IOMAP_NOWAIT);
> 
> What does this magic XFS function I can't find anywhere in this
> patch set do?

It's in (iirc) the io_uring getdents patchset that wasn't cc'd to
linux-xfs and that I haven't looked at yet.

--D

> -Dave.
> -- 
> Dave Chinner
> [email protected]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/7] iomap: merge iomap_seek_hole() and iomap_seek_data()
  2023-07-26 21:50   ` Dave Chinner
@ 2023-07-27 12:10     ` Hao Xu
  0 siblings, 0 replies; 19+ messages in thread
From: Hao Xu @ 2023-07-27 12:10 UTC (permalink / raw)
  To: Dave Chinner
  Cc: io-uring, Jens Axboe, Dominique Martinet, Pavel Begunkov,
	Christian Brauner, Alexander Viro, Stefan Roesch, Clay Harris,
	Darrick J . Wong, linux-fsdevel, linux-xfs, linux-ext4,
	Wanpeng Li

On 7/27/23 05:50, Dave Chinner wrote:
> On Wed, Jul 26, 2023 at 06:25:57PM +0800, Hao Xu wrote:
>> From: Hao Xu <[email protected]>
>>
>> The two functions share almost same code, merge them together.
>> No functional change in this patch.
> 
> No, please don't. seek data and seek hole have subtly different
> semantics and return values, and we've explicitly kept them separate
> because it's much easier to maintain them as separate functions with
> separate semantics than combine them into a single function with
> lots of non-obvious conditional behaviour.
> 
> The fact that the new iomap_seek() API requires a boolean "SEEK_DATA
> or SEEK_HOLE" field to indicate that the caller wants makes it clear
> that this isn't actually an improvement over explicit
> iomap_seek_data() and iomap_seek_hole() API calls.
> 
> -Dave.

I'll revert it back in next version. Thanks.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/7] xfs: add nowait support for xfs_seek_iomap_begin()
  2023-07-26 22:14     ` Darrick J. Wong
@ 2023-07-27 12:17       ` Hao Xu
  0 siblings, 0 replies; 19+ messages in thread
From: Hao Xu @ 2023-07-27 12:17 UTC (permalink / raw)
  To: Darrick J. Wong, Dave Chinner
  Cc: io-uring, Jens Axboe, Dominique Martinet, Pavel Begunkov,
	Christian Brauner, Alexander Viro, Stefan Roesch, Clay Harris,
	linux-fsdevel, linux-xfs, linux-ext4, Wanpeng Li

On 7/27/23 06:14, Darrick J. Wong wrote:
> On Thu, Jul 27, 2023 at 07:55:47AM +1000, Dave Chinner wrote:
>> On Wed, Jul 26, 2023 at 06:25:58PM +0800, Hao Xu wrote:
>>> From: Hao Xu <[email protected]>
>>>
>>> To support nowait llseek(), IOMAP_NOWAIT semantics should be respected.
>>> In xfs, xfs_seek_iomap_begin() is the only place which may be blocked
>>> by ilock and extent loading. Let's turn it into trylock logic just like
>>> what we've done in xfs_readdir().
>>>
>>> Signed-off-by: Hao Xu <[email protected]>
>>> ---
>>>   fs/xfs/xfs_iomap.c | 4 +++-
>>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
>>> index 18c8f168b153..bbd7c6b27701 100644
>>> --- a/fs/xfs/xfs_iomap.c
>>> +++ b/fs/xfs/xfs_iomap.c
>>> @@ -1294,7 +1294,9 @@ xfs_seek_iomap_begin(
>>>   	if (xfs_is_shutdown(mp))
>>>   		return -EIO;
>>>   
>>> -	lockmode = xfs_ilock_data_map_shared(ip);
>>> +	lockmode = xfs_ilock_data_map_shared_generic(ip, flags & IOMAP_NOWAIT);
>>
>> What does this magic XFS function I can't find anywhere in this
>> patch set do?

Sorry, forgot to say, It was xfs_ilock_for_readdir() in io_uring
getdents patchset, I changed the name since it is now used for
lseek as well.

> 
> It's in (iirc) the io_uring getdents patchset that wasn't cc'd to
> linux-xfs and that I haven't looked at yet.
> 

Hi Darrick, I forwarded the xfs related patch in that series. Forgot to
cc xfs list at the beginning. I'll make xfs list be Cc-ed when sending
next version. Sorry for inconvenience.

Regards,
Hao

> --D
> 
>> -Dave.
>> -- 
>> Dave Chinner
>> [email protected]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 5/7] add llseek_nowait support for xfs
  2023-07-26 22:14   ` Dave Chinner
@ 2023-07-27 12:26     ` Hao Xu
  0 siblings, 0 replies; 19+ messages in thread
From: Hao Xu @ 2023-07-27 12:26 UTC (permalink / raw)
  To: Dave Chinner
  Cc: io-uring, Jens Axboe, Dominique Martinet, Pavel Begunkov,
	Christian Brauner, Alexander Viro, Stefan Roesch, Clay Harris,
	Darrick J . Wong, linux-fsdevel, linux-xfs, linux-ext4,
	Wanpeng Li

On 7/27/23 06:14, Dave Chinner wrote:
> On Wed, Jul 26, 2023 at 06:26:01PM +0800, Hao Xu wrote:
>> From: Hao Xu <[email protected]>
>>
>> Add llseek_nowait() operation for xfs, it acts just like llseek(). The
>> thing different is it delivers nowait parameter to iomap layer.
>>
>> Signed-off-by: Hao Xu <[email protected]>
>> ---
>>   fs/xfs/xfs_file.c | 29 +++++++++++++++++++++++++++--
>>   1 file changed, 27 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
>> index 73adc0aee2ff..cba82264221d 100644
>> --- a/fs/xfs/xfs_file.c
>> +++ b/fs/xfs/xfs_file.c
>> @@ -1257,10 +1257,11 @@ xfs_file_readdir(
>>   }
>>   
>>   STATIC loff_t
>> -xfs_file_llseek(
>> +__xfs_file_llseek(
>>   	struct file	*file,
>>   	loff_t		offset,
>> -	int		whence)
>> +	int		whence,
>> +	bool		nowait)
>>   {
>>   	struct inode		*inode = file->f_mapping->host;
>>   
>> @@ -1282,6 +1283,28 @@ xfs_file_llseek(
>>   	return vfs_setpos(file, offset, inode->i_sb->s_maxbytes);
>>   }
>>   
>> +STATIC loff_t
>> +xfs_file_llseek(
>> +	struct file	*file,
>> +	loff_t		offset,
>> +	int		whence)
>> +{
>> +	return __xfs_file_llseek(file, offset, whence, false);
>> +}
>> +
>> +STATIC loff_t
>> +xfs_file_llseek_nowait(
>> +	struct file	*file,
>> +	loff_t		offset,
>> +	int		whence,
>> +	bool		nowait)
>> +{
>> +	if (file->f_op == &xfs_file_operations)
>> +		return __xfs_file_llseek(file, offset, whence, nowait);
>> +	else
>> +		return generic_file_llseek(file, offset, whence);
>> +}
>> +
>>   #ifdef CONFIG_FS_DAX
>>   static inline vm_fault_t
>>   xfs_dax_fault(
>> @@ -1442,6 +1465,7 @@ xfs_file_mmap(
>>   
>>   const struct file_operations xfs_file_operations = {
>>   	.llseek		= xfs_file_llseek,
>> +	.llseek_nowait	= xfs_file_llseek_nowait,
>>   	.read_iter	= xfs_file_read_iter,
>>   	.write_iter	= xfs_file_write_iter,
>>   	.splice_read	= xfs_file_splice_read,
>> @@ -1467,6 +1491,7 @@ const struct file_operations xfs_dir_file_operations = {
>>   	.read		= generic_read_dir,
>>   	.iterate_shared	= xfs_file_readdir,
>>   	.llseek		= generic_file_llseek,
>> +	.llseek_nowait	= xfs_file_llseek_nowait,
>>   	.unlocked_ioctl	= xfs_file_ioctl,
>>   #ifdef CONFIG_COMPAT
>>   	.compat_ioctl	= xfs_file_compat_ioctl,
> 
> This is pretty nasty. It would be far better just to change the
> .llseek method than to inflict this on every filesystem for the
> forseeable future.
> 
> Not that I'm a fan of passing "nowait" booleans all through the file
> operations methods - that way lies madness. We use a control
> structure for the IO path operations (kiocb) to hold per-call
> context information, perhaps we need something similar for these
> other methods that people are wanting to hook up to io_uring (e.g.
> readdir) so taht we don't have to play whack-a-mole with every new
> io_uring method that people want and then end up with a different
> nowait solution for every method.
> 
> -Dave.

whack-a-mole is exactly what I thought when I tried to find a place to
hold that...I'll think about it in next version, a generic structure for
io_uring is good I believe. I'll update this patchset after the getdents
stuff are merged.

Thanks,
Hao

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC 0/7] io_uring lseek
  2023-07-26 10:25 [RFC 0/7] io_uring lseek Hao Xu
                   ` (7 preceding siblings ...)
  2023-07-26 13:22 ` [RFC 0/7] io_uring lseek Christian Brauner
@ 2023-07-27 12:30 ` Hao Xu
  8 siblings, 0 replies; 19+ messages in thread
From: Hao Xu @ 2023-07-27 12:30 UTC (permalink / raw)
  To: io-uring, Jens Axboe
  Cc: Dominique Martinet, Pavel Begunkov, Christian Brauner,
	Alexander Viro, Stefan Roesch, Clay Harris, Dave Chinner,
	Darrick J . Wong, linux-fsdevel, linux-xfs, linux-ext4,
	Wanpeng Li

On 7/26/23 18:25, Hao Xu wrote:
> From: Hao Xu <[email protected]>
> 
> This series adds lseek for io_uring, the motivation to import this
> syscall is in previous io_uring getdents patchset, we lack a way to
> rewind the file cursor when it goes to the end of file. Another reason
> is lseek is a common syscall, it's good for coding consistency when
> users use io_uring as their main loop.
> 
> Patch 1 is code clean for iomap
> Patch 2 adds IOMAP_NOWAIT logic for iomap lseek
> Patch 3 adds a nowait parameter to for IOMAP_NOWAIT control
> Patch 4 adds llseek_nowait() for file_operations so that specific
>          filesystem can implement it for nowait lseek
> Patch 5 adds llseek_nowait() implementation for xfs
> Patch 6 adds a new vfs wrapper for io_uring use
> Patch 7 is the main io_uring lseek implementation
> 
> Note, this series depends on the previous io_uring getdents series.
> 
> This is marked RFC since there is (at least) an issue to be discussed:
> The work in this series is mainly to reslove a problem that the current
> llseek() in struct file_operations doesn't have a place to deliver
> nowait info, and adding an argument to it results in update for llseek
> implementation of all filesystems (35 functions), so here I introduce
> a new llseek_nowait() as a workaround.
> 
> For performance, it has about 20%~30% improvement on iops.
> The test program is just like the one for io_uring getdents, here is the
> link to it: https://github.com/HowHsu/liburing/blob/llseek/test/lseek.c
> - Each test runs about 30000 async requests/sync syscalls
> - Each test runs 100 times and get the average value.
> - offset is randomly generated value
> - the file is a 1M all zero file
> 
> [howeyxu@~]$ python3 run_lseek.py
> test args:  seek mode:SEEK_SET, offset: 334772
> Average of  sync :  0.012300650000000002
> Average of  iouring :  0.008528009999999999
> 30.67%
> 
> [howeyxu@~]$ python3 run_lseek.py
> test args:  seek mode:SEEK_CUR, offset: 389292
> Average of  sync :  0.012736129999999995
> Average of  iouring :  0.00928725
> 27.08%
> 
> [howeyxu@~]$ python3 run_lseek.py
> test args:  seek mode:SEEK_END, offset: 281141
> Average of  sync :  0.01221595
> Average of  iouring :  0.008442890000000003
> 30.89%
> 
> [howeyxu@~]$ python3 run_lseek.py
> test args:  seek mode:SEEK_DATA, offset: 931103
> Average of  sync :  0.015496230000000005
> Average of  iouring :  0.012341509999999998
> 20.36%
> 
> [howeyxu@~]$ python3 run_lseek.py
> test args:  seek mode:SEEK_HOLE, offset: 430194
> Average of  sync :  0.01555663000000001
> Average of  iouring :  0.012064940000000003
> 22.45%
>   
> 
> Hao Xu (7):
>    iomap: merge iomap_seek_hole() and iomap_seek_data()
>    xfs: add nowait support for xfs_seek_iomap_begin()
>    add nowait parameter for iomap_seek()
>    add llseek_nowait() for struct file_operations
>    add llseek_nowait support for xfs
>    add vfs_lseek_nowait()
>    add lseek for io_uring
> 
>   fs/ext4/file.c                |  9 ++---
>   fs/gfs2/inode.c               |  4 +--
>   fs/iomap/seek.c               | 42 ++++++-----------------
>   fs/read_write.c               | 18 ++++++++++
>   fs/xfs/xfs_file.c             | 34 ++++++++++++++++---
>   fs/xfs/xfs_iomap.c            |  4 ++-
>   include/linux/fs.h            |  4 +++
>   include/linux/iomap.h         |  6 ++--
>   include/uapi/linux/io_uring.h |  1 +
>   io_uring/fs.c                 | 63 +++++++++++++++++++++++++++++++++++
>   io_uring/fs.h                 |  3 ++
>   io_uring/opdef.c              |  8 +++++
>   12 files changed, 145 insertions(+), 51 deletions(-)
> 
> 
> base-commit: 4a4b046082eca8ae90b654d772fccc30e9f23f4d

Hi folks,
I'll leave this patchset here before the io_uring getdents is merged,
then come back and update this one.

Thanks,
Hao

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/7] add llseek_nowait() for struct file_operations
  2023-07-26 10:26 ` [PATCH 4/7] add llseek_nowait() for struct file_operations Hao Xu
@ 2023-07-27 13:25   ` Matthew Wilcox
  0 siblings, 0 replies; 19+ messages in thread
From: Matthew Wilcox @ 2023-07-27 13:25 UTC (permalink / raw)
  To: Hao Xu
  Cc: io-uring, Jens Axboe, Dominique Martinet, Pavel Begunkov,
	Christian Brauner, Alexander Viro, Stefan Roesch, Clay Harris,
	Dave Chinner, Darrick J . Wong, linux-fsdevel, linux-xfs,
	linux-ext4, Wanpeng Li

On Wed, Jul 26, 2023 at 06:26:00PM +0800, Hao Xu wrote:
> +	loff_t (*llseek_nowait)(struct file *, loff_t, int, bool);

You don't have to name the struct file, but an unnamed int and an
unnamed bool is just not acceptable.

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2023-07-27 13:26 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-07-26 10:25 [RFC 0/7] io_uring lseek Hao Xu
2023-07-26 10:25 ` [PATCH 1/7] iomap: merge iomap_seek_hole() and iomap_seek_data() Hao Xu
2023-07-26 21:50   ` Dave Chinner
2023-07-27 12:10     ` Hao Xu
2023-07-26 10:25 ` [PATCH 2/7] xfs: add nowait support for xfs_seek_iomap_begin() Hao Xu
2023-07-26 21:55   ` Dave Chinner
2023-07-26 22:14     ` Darrick J. Wong
2023-07-27 12:17       ` Hao Xu
2023-07-26 10:25 ` [PATCH 3/7] add nowait parameter for iomap_seek() Hao Xu
2023-07-26 22:01   ` Dave Chinner
2023-07-26 10:26 ` [PATCH 4/7] add llseek_nowait() for struct file_operations Hao Xu
2023-07-27 13:25   ` Matthew Wilcox
2023-07-26 10:26 ` [PATCH 5/7] add llseek_nowait support for xfs Hao Xu
2023-07-26 22:14   ` Dave Chinner
2023-07-27 12:26     ` Hao Xu
2023-07-26 10:26 ` [PATCH 6/7] add vfs_lseek_nowait() Hao Xu
2023-07-26 10:26 ` [PATCH 7/7] add lseek for io_uring Hao Xu
2023-07-26 13:22 ` [RFC 0/7] io_uring lseek Christian Brauner
2023-07-27 12:30 ` Hao Xu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox