public inbox for [email protected]
 help / color / mirror / Atom feed
From: Jan Kara <[email protected]>
To: Christian Brauner <[email protected]>
Cc: [email protected], Jan Kara <[email protected]>,
	Christoph Hellwig <[email protected]>, Jens Axboe <[email protected]>,
	Alexander Viro <[email protected]>,
	Dave Chinner <[email protected]>,
	[email protected]
Subject: Re: [PATCH v2] fs: claw back a few FMODE_* bits
Date: Tue, 2 Apr 2024 12:51:17 +0200	[thread overview]
Message-ID: <20240402105117.bkbbqvzk7eqh23wa@quack3> (raw)
In-Reply-To: <20240328-gewendet-spargel-aa60a030ef74@brauner>

On Thu 28-03-24 13:27:24, Christian Brauner wrote:
> There's a bunch of flags that are purely based on what the file
> operations support while also never being conditionally set or unset.
> IOW, they're not subject to change for individual files. Imho, such
> flags don't need to live in f_mode they might as well live in the fops
> structs itself. And the fops struct already has that lonely
> mmap_supported_flags member. We might as well turn that into a generic
> fop_flags member and move a few flags from FMODE_* space into FOP_*
> space. That gets us four FMODE_* bits back and the ability for new
> static flags that are about file ops to not have to live in FMODE_*
> space but in their own FOP_* space. It's not the most beautiful thing
> ever but it gets the job done. Yes, there'll be an additional pointer
> chase but hopefully that won't matter for these flags.
> 
> I suspect there's a few more we can move into there and that we can also
> redirect a bunch of new flag suggestions that follow this pattern into
> the fop_flags field instead of f_mode.
> 
> (Fwiw, FMODE_NOACCOUNT and FMODE_BACKING could live in fop_flags as
>  well because they're also completely static but they aren't really
>  about file operations so they're better suited for FMODE_* imho.)
> 
> Signed-off-by: Christian Brauner <[email protected]>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <[email protected]>

								Honza

> ---
>  block/bdev.c         |  2 +-
>  block/fops.c         |  1 +
>  drivers/dax/device.c |  2 +-
>  fs/btrfs/file.c      |  4 ++--
>  fs/ext4/file.c       |  6 +++---
>  fs/f2fs/file.c       |  3 ++-
>  fs/read_write.c      |  2 +-
>  fs/xfs/xfs_file.c    |  8 +++++---
>  include/linux/fs.h   | 22 ++++++++++++----------
>  io_uring/io_uring.c  |  2 +-
>  io_uring/rw.c        |  9 +++++----
>  mm/mmap.c            |  4 +++-
>  12 files changed, 37 insertions(+), 28 deletions(-)
> 
> diff --git a/block/bdev.c b/block/bdev.c
> index b8e32d933a63..dd26d37356aa 100644
> --- a/block/bdev.c
> +++ b/block/bdev.c
> @@ -903,7 +903,7 @@ int bdev_open(struct block_device *bdev, blk_mode_t mode, void *holder,
>  		disk_unblock_events(disk);
>  
>  	bdev_file->f_flags |= O_LARGEFILE;
> -	bdev_file->f_mode |= FMODE_BUF_RASYNC | FMODE_CAN_ODIRECT;
> +	bdev_file->f_mode |= FMODE_CAN_ODIRECT;
>  	if (bdev_nowait(bdev))
>  		bdev_file->f_mode |= FMODE_NOWAIT;
>  	if (mode & BLK_OPEN_RESTRICT_WRITES)
> diff --git a/block/fops.c b/block/fops.c
> index 679d9b752fe8..af6c244314af 100644
> --- a/block/fops.c
> +++ b/block/fops.c
> @@ -863,6 +863,7 @@ const struct file_operations def_blk_fops = {
>  	.splice_read	= filemap_splice_read,
>  	.splice_write	= iter_file_splice_write,
>  	.fallocate	= blkdev_fallocate,
> +	.fop_flags	= FOP_BUFFER_RASYNC,
>  };
>  
>  static __init int blkdev_init(void)
> diff --git a/drivers/dax/device.c b/drivers/dax/device.c
> index 93ebedc5ec8c..c24ef4d3cf31 100644
> --- a/drivers/dax/device.c
> +++ b/drivers/dax/device.c
> @@ -377,7 +377,7 @@ static const struct file_operations dax_fops = {
>  	.release = dax_release,
>  	.get_unmapped_area = dax_get_unmapped_area,
>  	.mmap = dax_mmap,
> -	.mmap_supported_flags = MAP_SYNC,
> +	.fop_flags = FOP_MMAP_SYNC,
>  };
>  
>  static void dev_dax_cdev_del(void *cdev)
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index f9d76072398d..1640c46f2153 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -3719,8 +3719,7 @@ static int btrfs_file_open(struct inode *inode, struct file *filp)
>  {
>  	int ret;
>  
> -	filp->f_mode |= FMODE_NOWAIT | FMODE_BUF_RASYNC | FMODE_BUF_WASYNC |
> -		        FMODE_CAN_ODIRECT;
> +	filp->f_mode |= FMODE_NOWAIT | FMODE_CAN_ODIRECT;
>  
>  	ret = fsverity_file_open(inode, filp);
>  	if (ret)
> @@ -3850,6 +3849,7 @@ const struct file_operations btrfs_file_operations = {
>  	.compat_ioctl	= btrfs_compat_ioctl,
>  #endif
>  	.remap_file_range = btrfs_remap_file_range,
> +	.fop_flags	= FOP_BUFFER_RASYNC | FOP_BUFFER_WASYNC,
>  };
>  
>  int btrfs_fdatawrite_range(struct inode *inode, loff_t start, loff_t end)
> diff --git a/fs/ext4/file.c b/fs/ext4/file.c
> index 54d6ff22585c..28c51b0cc4db 100644
> --- a/fs/ext4/file.c
> +++ b/fs/ext4/file.c
> @@ -885,8 +885,7 @@ static int ext4_file_open(struct inode *inode, struct file *filp)
>  			return ret;
>  	}
>  
> -	filp->f_mode |= FMODE_NOWAIT | FMODE_BUF_RASYNC |
> -			FMODE_DIO_PARALLEL_WRITE;
> +	filp->f_mode |= FMODE_NOWAIT;
>  	return dquot_file_open(inode, filp);
>  }
>  
> @@ -938,7 +937,6 @@ const struct file_operations ext4_file_operations = {
>  	.compat_ioctl	= ext4_compat_ioctl,
>  #endif
>  	.mmap		= ext4_file_mmap,
> -	.mmap_supported_flags = MAP_SYNC,
>  	.open		= ext4_file_open,
>  	.release	= ext4_release_file,
>  	.fsync		= ext4_sync_file,
> @@ -946,6 +944,8 @@ const struct file_operations ext4_file_operations = {
>  	.splice_read	= ext4_file_splice_read,
>  	.splice_write	= iter_file_splice_write,
>  	.fallocate	= ext4_fallocate,
> +	.fop_flags	= FOP_MMAP_SYNC | FOP_BUFFER_RASYNC |
> +			  FOP_DIO_PARALLEL_WRITE,
>  };
>  
>  const struct inode_operations ext4_file_inode_operations = {
> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> index 1761ad125f97..2b65e09822d4 100644
> --- a/fs/f2fs/file.c
> +++ b/fs/f2fs/file.c
> @@ -569,7 +569,7 @@ static int f2fs_file_open(struct inode *inode, struct file *filp)
>  	if (err)
>  		return err;
>  
> -	filp->f_mode |= FMODE_NOWAIT | FMODE_BUF_RASYNC;
> +	filp->f_mode |= FMODE_NOWAIT;
>  	filp->f_mode |= FMODE_CAN_ODIRECT;
>  
>  	return dquot_file_open(inode, filp);
> @@ -5045,4 +5045,5 @@ const struct file_operations f2fs_file_operations = {
>  	.splice_read	= f2fs_file_splice_read,
>  	.splice_write	= iter_file_splice_write,
>  	.fadvise	= f2fs_file_fadvise,
> +	.fop_flags	= FOP_BUFFER_RASYNC,
>  };
> diff --git a/fs/read_write.c b/fs/read_write.c
> index d4c036e82b6c..2115d1f40bd5 100644
> --- a/fs/read_write.c
> +++ b/fs/read_write.c
> @@ -1685,7 +1685,7 @@ int generic_write_checks_count(struct kiocb *iocb, loff_t *count)
>  
>  	if ((iocb->ki_flags & IOCB_NOWAIT) &&
>  	    !((iocb->ki_flags & IOCB_DIRECT) ||
> -	      (file->f_mode & FMODE_BUF_WASYNC)))
> +	      (file->f_op->fop_flags & FOP_BUFFER_WASYNC)))
>  		return -EINVAL;
>  
>  	return generic_write_check_limits(iocb->ki_filp, iocb->ki_pos, count);
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 632653e00906..147439ad3581 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -1230,8 +1230,7 @@ xfs_file_open(
>  {
>  	if (xfs_is_shutdown(XFS_M(inode->i_sb)))
>  		return -EIO;
> -	file->f_mode |= FMODE_NOWAIT | FMODE_BUF_RASYNC | FMODE_BUF_WASYNC |
> -			FMODE_DIO_PARALLEL_WRITE | FMODE_CAN_ODIRECT;
> +	file->f_mode |= FMODE_NOWAIT | FMODE_CAN_ODIRECT;
>  	return generic_file_open(inode, file);
>  }
>  
> @@ -1490,7 +1489,6 @@ const struct file_operations xfs_file_operations = {
>  	.compat_ioctl	= xfs_file_compat_ioctl,
>  #endif
>  	.mmap		= xfs_file_mmap,
> -	.mmap_supported_flags = MAP_SYNC,
>  	.open		= xfs_file_open,
>  	.release	= xfs_file_release,
>  	.fsync		= xfs_file_fsync,
> @@ -1498,6 +1496,8 @@ const struct file_operations xfs_file_operations = {
>  	.fallocate	= xfs_file_fallocate,
>  	.fadvise	= xfs_file_fadvise,
>  	.remap_file_range = xfs_file_remap_range,
> +	.fop_flags	= FOP_MMAP_SYNC | FOP_BUFFER_RASYNC | FOP_BUFFER_WASYNC |
> +			  FOP_DIO_PARALLEL_WRITE,
>  };
>  
>  const struct file_operations xfs_dir_file_operations = {
> @@ -1510,4 +1510,6 @@ const struct file_operations xfs_dir_file_operations = {
>  	.compat_ioctl	= xfs_file_compat_ioctl,
>  #endif
>  	.fsync		= xfs_dir_fsync,
> +	.fop_flags	= FOP_MMAP_SYNC | FOP_BUFFER_RASYNC | FOP_BUFFER_WASYNC |
> +			  FOP_DIO_PARALLEL_WRITE,
>  };
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 8dfd53b52744..ece6e681ec77 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -165,9 +165,6 @@ typedef int (dio_iodone_t)(struct kiocb *iocb, loff_t offset,
>  
>  #define	FMODE_NOREUSE		((__force fmode_t)0x800000)
>  
> -/* File supports non-exclusive O_DIRECT writes from multiple threads */
> -#define FMODE_DIO_PARALLEL_WRITE	((__force fmode_t)0x1000000)
> -
>  /* File is embedded in backing_file object */
>  #define FMODE_BACKING		((__force fmode_t)0x2000000)
>  
> @@ -183,12 +180,6 @@ typedef int (dio_iodone_t)(struct kiocb *iocb, loff_t offset,
>  /* File does not contribute to nr_files count */
>  #define FMODE_NOACCOUNT		((__force fmode_t)0x20000000)
>  
> -/* File supports async buffered reads */
> -#define FMODE_BUF_RASYNC	((__force fmode_t)0x40000000)
> -
> -/* File supports async nowait buffered writes */
> -#define FMODE_BUF_WASYNC	((__force fmode_t)0x80000000)
> -
>  /*
>   * Attribute flags.  These should be or-ed together to figure out what
>   * has been changed!
> @@ -2003,8 +1994,11 @@ struct iov_iter;
>  struct io_uring_cmd;
>  struct offset_ctx;
>  
> +typedef unsigned int __bitwise fop_flags_t;
> +
>  struct file_operations {
>  	struct module *owner;
> +	fop_flags_t fop_flags;
>  	loff_t (*llseek) (struct file *, loff_t, int);
>  	ssize_t (*read) (struct file *, char __user *, size_t, loff_t *);
>  	ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *);
> @@ -2017,7 +2011,6 @@ struct file_operations {
>  	long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long);
>  	long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
>  	int (*mmap) (struct file *, struct vm_area_struct *);
> -	unsigned long mmap_supported_flags;
>  	int (*open) (struct inode *, struct file *);
>  	int (*flush) (struct file *, fl_owner_t id);
>  	int (*release) (struct inode *, struct file *);
> @@ -2048,6 +2041,15 @@ struct file_operations {
>  				unsigned int poll_flags);
>  } __randomize_layout;
>  
> +/* Supports async buffered reads */
> +#define FOP_BUFFER_RASYNC	((__force fop_flags_t)(1 << 0))
> +/* Supports async buffered writes */
> +#define FOP_BUFFER_WASYNC	((__force fop_flags_t)(1 << 1))
> +/* Supports synchronous page faults for mappings */
> +#define FOP_MMAP_SYNC		((__force fop_flags_t)(1 << 2))
> +/* Supports non-exclusive O_DIRECT writes from multiple threads */
> +#define FOP_DIO_PARALLEL_WRITE	((__force fop_flags_t)(1 << 3))
> +
>  /* Wrap a directory iterator that needs exclusive inode access */
>  int wrap_directory_iterator(struct file *, struct dir_context *,
>  			    int (*) (struct file *, struct dir_context *));
> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
> index 5d4b448fdc50..d73c9ad2d2f8 100644
> --- a/io_uring/io_uring.c
> +++ b/io_uring/io_uring.c
> @@ -471,7 +471,7 @@ static void io_prep_async_work(struct io_kiocb *req)
>  
>  		/* don't serialize this request if the fs doesn't need it */
>  		if (should_hash && (req->file->f_flags & O_DIRECT) &&
> -		    (req->file->f_mode & FMODE_DIO_PARALLEL_WRITE))
> +		    (req->file->f_op->fop_flags & FOP_DIO_PARALLEL_WRITE))
>  			should_hash = false;
>  		if (should_hash || (ctx->flags & IORING_SETUP_IOPOLL))
>  			io_wq_hash_work(&req->work, file_inode(req->file));
> diff --git a/io_uring/rw.c b/io_uring/rw.c
> index 0585ebcc9773..d9dfde1142a1 100644
> --- a/io_uring/rw.c
> +++ b/io_uring/rw.c
> @@ -683,7 +683,8 @@ static bool io_rw_should_retry(struct io_kiocb *req)
>  	 * just use poll if we can, and don't attempt if the fs doesn't
>  	 * support callback based unlocks
>  	 */
> -	if (io_file_can_poll(req) || !(req->file->f_mode & FMODE_BUF_RASYNC))
> +	if (io_file_can_poll(req) ||
> +	    !(req->file->f_op->fop_flags & FOP_BUFFER_RASYNC))
>  		return false;
>  
>  	wait->wait.func = io_async_buf_func;
> @@ -1022,10 +1023,10 @@ int io_write(struct io_kiocb *req, unsigned int issue_flags)
>  		if (unlikely(!io_file_supports_nowait(req)))
>  			goto copy_iov;
>  
> -		/* File path supports NOWAIT for non-direct_IO only for block devices. */
> +		/* Check if we can support NOWAIT. */
>  		if (!(kiocb->ki_flags & IOCB_DIRECT) &&
> -			!(kiocb->ki_filp->f_mode & FMODE_BUF_WASYNC) &&
> -			(req->flags & REQ_F_ISREG))
> +		    !(req->file->f_op->fop_flags & FOP_BUFFER_WASYNC) &&
> +		    (req->flags & REQ_F_ISREG))
>  			goto copy_iov;
>  
>  		kiocb->ki_flags |= IOCB_NOWAIT;
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 6dbda99a47da..3490af70f259 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1294,7 +1294,9 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
>  		if (!file_mmap_ok(file, inode, pgoff, len))
>  			return -EOVERFLOW;
>  
> -		flags_mask = LEGACY_MAP_MASK | file->f_op->mmap_supported_flags;
> +		flags_mask = LEGACY_MAP_MASK;
> +		if (file->f_op->fop_flags & FOP_MMAP_SYNC)
> +			flags_mask |= MAP_SYNC;
>  
>  		switch (flags & MAP_TYPE) {
>  		case MAP_SHARED:
> -- 
> 2.43.0
> 
-- 
Jan Kara <[email protected]>
SUSE Labs, CR

  parent reply	other threads:[~2024-04-02 10:51 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-28 12:27 [PATCH v2] fs: claw back a few FMODE_* bits Christian Brauner
2024-03-28 13:08 ` Christoph Hellwig
2024-04-02 10:51 ` Jan Kara [this message]
2024-04-02 12:59 ` Jens Axboe
2024-04-03 21:12 ` Mark Brown
2024-04-04  9:12   ` Jan Kara
2024-04-04 11:43     ` Mark Brown
2024-04-05 10:27       ` Christian Brauner
2024-04-05 11:12         ` Mark Brown
2024-04-04  0:18 ` Al Viro
2024-04-05 10:06   ` Christian Brauner
2024-04-06  6:10 ` Al Viro
2024-04-06  6:16   ` Al Viro
2024-04-09  9:12     ` Christian Brauner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240402105117.bkbbqvzk7eqh23wa@quack3 \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    [email protected] \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox