public inbox for io-uring@vger.kernel.org
 help / color / mirror / Atom feed
From: Christian Brauner <brauner@kernel.org>
To: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	 Jan Kara <jack@suse.cz>, Anuj Gupta <anuj20.g@samsung.com>,
	 Kanchan Joshi <joshi.k@samsung.com>,
	linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	 io-uring@vger.kernel.org
Subject: Re: [PATCH 1/2] fs: add a FMODE_ flag to indicate IOCB_HAS_METADATA availability
Date: Wed, 20 Aug 2025 11:40:36 +0200	[thread overview]
Message-ID: <20250820-voruntersuchung-fehlzeiten-4dcf7e45c29f@brauner> (raw)
In-Reply-To: <20250819133447.GA16775@lst.de>

On Tue, Aug 19, 2025 at 03:34:47PM +0200, Christoph Hellwig wrote:
> On Tue, Aug 19, 2025 at 12:14:26PM +0200, Christian Brauner wrote:
> > On Tue, Aug 19, 2025 at 11:22:19AM +0200, Christoph Hellwig wrote:
> > > On Tue, Aug 19, 2025 at 11:14:41AM +0200, Christian Brauner wrote:
> > > > It kind of feels like that f_iocb_flags should be changed so that
> > > > subsystems like block can just raise some internal flags directly
> > > > instead of grabbing a f_mode flag everytime they need to make some
> > > > IOCB_* flag conditional on the file. That would mean changing the
> > > > unconditional assigment to file->f_iocb_flags to a |= to not mask flags
> > > > raised by the kernel itself.
> > > 
> > > This isn't about block.  I will be setting this for a file system
> > > operation as well and use the same io_uring code for that.  That's
> > > how I ran into the issue.
> > 
> > Yes, I get that. That's not what this is about. If IOCB_* flags keep
> > getting added that then need an additional opt-out via an FMODE_* flag
> > it's very annoying because you keep taking FMODE_* bits.
> 
> Agreed.
> 
> > The thing is
> > that it should be possible to keep that information completely contained
> > to f_iocb_flags without polluting f_mode.
> 
> I don't really understand how that would work.  The basic problem is that
> we add optional features/flags to read and write, and we need a way to
> check that they are supported and reject them without each time having
> to update all instances.  For that VFS-level code needs some way to do
> a per-instance check of available features.

I meant something like this which should effectively be the same thing
just that we move the burden of having to use two bits completely into
file->f_iocb_flags instead of wasting a file->f_mode bit:

diff --git a/block/fops.c b/block/fops.c
index ddbc69c0922b..a90f1127d035 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -689,7 +689,7 @@ static int blkdev_open(struct inode *inode, struct file *filp)
        if (bdev_can_atomic_write(bdev))
                filp->f_mode |= FMODE_CAN_ATOMIC_WRITE;
        if (blk_get_integrity(bdev->bd_disk))
-               filp->f_mode |= FMODE_HAS_METADATA;
+               filp->f_iocb_flags |= IOCB_MAY_USE_METADATA;

        ret = bdev_open(bdev, mode, filp->private_data, NULL, filp);
        if (ret)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 601d036a6c78..a40a1bf7bad5 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -149,9 +149,6 @@ typedef int (dio_iodone_t)(struct kiocb *iocb, loff_t offset,
 /* Expect random access pattern */
 #define FMODE_RANDOM           ((__force fmode_t)(1 << 12))

-/* Supports IOCB_HAS_METADATA */
-#define FMODE_HAS_METADATA     ((__force fmode_t)(1 << 13))
-
 /* File is opened with O_PATH; almost nothing can be done with it */
 #define FMODE_PATH             ((__force fmode_t)(1 << 14))

@@ -384,25 +381,27 @@ struct readahead_control;
 /* kiocb is a read or write operation submitted by fs/aio.c. */
 #define IOCB_AIO_RW            (1 << 23)
 #define IOCB_HAS_METADATA      (1 << 24)
+#define IOCB_MAY_USE_METADATA  (1 << 25)

 /* for use in trace events */
 #define TRACE_IOCB_STRINGS \
-       { IOCB_HIPRI,           "HIPRI" }, \
-       { IOCB_DSYNC,           "DSYNC" }, \
-       { IOCB_SYNC,            "SYNC" }, \
-       { IOCB_NOWAIT,          "NOWAIT" }, \
-       { IOCB_APPEND,          "APPEND" }, \
-       { IOCB_ATOMIC,          "ATOMIC" }, \
-       { IOCB_DONTCACHE,       "DONTCACHE" }, \
-       { IOCB_EVENTFD,         "EVENTFD"}, \
-       { IOCB_DIRECT,          "DIRECT" }, \
-       { IOCB_WRITE,           "WRITE" }, \
-       { IOCB_WAITQ,           "WAITQ" }, \
-       { IOCB_NOIO,            "NOIO" }, \
-       { IOCB_ALLOC_CACHE,     "ALLOC_CACHE" }, \
-       { IOCB_DIO_CALLER_COMP, "CALLER_COMP" }, \
-       { IOCB_AIO_RW,          "AIO_RW" }, \
-       { IOCB_HAS_METADATA,    "AIO_HAS_METADATA" }
+       { IOCB_HIPRI,                   "HIPRI" }, \
+       { IOCB_DSYNC,                   "DSYNC" }, \
+       { IOCB_SYNC,                    "SYNC" }, \
+       { IOCB_NOWAIT,                  "NOWAIT" }, \
+       { IOCB_APPEND,                  "APPEND" }, \
+       { IOCB_ATOMIC,                  "ATOMIC" }, \
+       { IOCB_DONTCACHE,               "DONTCACHE" }, \
+       { IOCB_EVENTFD,                 "EVENTFD"}, \
+       { IOCB_DIRECT,                  "DIRECT" }, \
+       { IOCB_WRITE,                   "WRITE" }, \
+       { IOCB_WAITQ,                   "WAITQ" }, \
+       { IOCB_NOIO,                    "NOIO" }, \
+       { IOCB_ALLOC_CACHE,             "ALLOC_CACHE" }, \
+       { IOCB_DIO_CALLER_COMP,         "CALLER_COMP" }, \
+       { IOCB_AIO_RW,                  "AIO_RW" }, \
+       { IOCB_HAS_METADATA,            "AIO_HAS_METADATA" }, \
+       { IOCB_MAY_USE_METADATA,        "AIO_MAY_USE_METADATA" }

 struct kiocb {
        struct file             *ki_filp;
@@ -3786,6 +3785,10 @@ static inline bool vma_is_fsdax(struct vm_area_struct *vma)
 static inline int iocb_flags(struct file *file)
 {
        int res = 0;
+
+       /* Retain flags that the kernel raises internally. */
+       res |= (file->f_iocb_flags & (IOCB_HAS_METADATA | IOCB_MAY_USE_METADATA));
+
        if (file->f_flags & O_APPEND)
                res |= IOCB_APPEND;
        if (file->f_flags & O_DIRECT)
diff --git a/io_uring/rw.c b/io_uring/rw.c
index af5a54b5db12..23e9103c62d4 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -886,7 +886,7 @@ static int io_rw_init_file(struct io_kiocb *req, fmode_t mode, int rw_type)
        if (req->flags & REQ_F_HAS_METADATA) {
                struct io_async_rw *io = req->async_data;

-               if (!(file->f_mode & FMODE_HAS_METADATA))
+               if (!(file->f_iocb_flags & IOCB_MAY_USE_METADATA))
                        return -EINVAL;

                /*


  reply	other threads:[~2025-08-20  9:40 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-19  8:24 io_uring / dio metadata fixes Christoph Hellwig
2025-08-19  8:25 ` [PATCH 1/2] fs: add a FMODE_ flag to indicate IOCB_HAS_METADATA availability Christoph Hellwig
2025-08-19  9:14   ` Christian Brauner
2025-08-19  9:22     ` Christoph Hellwig
2025-08-19 10:14       ` Christian Brauner
2025-08-19 13:34         ` Christoph Hellwig
2025-08-20  9:40           ` Christian Brauner [this message]
2025-08-21  8:42             ` Christoph Hellwig
2025-08-25 12:01               ` Christian Brauner
2025-08-25 13:35                 ` Christoph Hellwig
2025-08-19  8:25 ` [PATCH 2/2] block: don't silently ignore metadata for sync read/write Christoph Hellwig
2025-08-20  3:23   ` Martin K. Petersen
2025-08-20  9:13 ` io_uring / dio metadata fixes Christian Brauner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250820-voruntersuchung-fehlzeiten-4dcf7e45c29f@brauner \
    --to=brauner@kernel.org \
    --cc=anuj20.g@samsung.com \
    --cc=axboe@kernel.dk \
    --cc=hch@lst.de \
    --cc=io-uring@vger.kernel.org \
    --cc=jack@suse.cz \
    --cc=joshi.k@samsung.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox