public inbox for [email protected]
From: Jens Axboe <[email protected]>
To: "Darrick J. Wong" <[email protected]>,
	Christoph Hellwig <[email protected]>
Cc: Miklos Szeredi <[email protected]>,
	Bernd Schubert <[email protected]>,
	[email protected], [email protected],
	[email protected], [email protected],
	[email protected]
Subject: Re: [PATCH 1/2] fs: add FMODE_DIO_PARALLEL_WRITE flag
Date: Sat, 15 Apr 2023 07:15:49 -0600
Message-ID: <[email protected]>
In-Reply-To: <20230414153612.GB360881@frogsfrogsfrogs>

On 4/14/23 9:36 AM, Darrick J. Wong wrote:
> On Thu, Apr 13, 2023 at 10:11:28PM -0700, Christoph Hellwig wrote:
>> On Thu, Apr 13, 2023 at 09:40:29AM +0200, Miklos Szeredi wrote:
>>> fuse_direct_write_iter():
>>>
>>> bool exclusive_lock =
>>>     !(ff->open_flags & FOPEN_PARALLEL_DIRECT_WRITES) ||
>>>     iocb->ki_flags & IOCB_APPEND ||
>>>     fuse_direct_write_extending_i_size(iocb, from);
>>>
>>> If the write is size extending, then it will take the lock exclusive.
>>> OTOH, I guess that it would be unusual for lots of size extending
>>> writes to be done in parallel.
>>>
>>> What would be the effect of giving the FMODE_DIO_PARALLEL_WRITE hint
>>> and then still serializing the writes?
>>
>> I have no idea how this flag works, but XFS also takes i_rwsem
>> exclusively for appends, when the positions and size aren't aligned to
>> the block size, and a few other cases.
> 
> IIUC uring wants to avoid the situation where someone sends 300 writes
> to the same file, all of which end up in background workers, and all of
> which then contend on exclusive i_rwsem.  Hence it has some hashing
> scheme that executes io requests serially if they hash to the same value
> (which iirc is the inode number?) to prevent resource waste.
> 
> This flag turns off that hashing behavior on the assumption that each of
> those 300 writes won't serialize on the other 299 writes, hence it's ok
> to start up 300 workers.
> 
> (apologies for precoffee garbled response)

Yep, that is pretty much it. If all writes to that inode are serialized
by a lock on the fs side, then we'll get a lot of contention on that
mutex. And since, originally, nothing supported async writes, everything
would get punted to the io-wq workers. io_uring added per-inode hashing
for this, so that any punt to io-wq of a write would get serialized.

IOW, it's an efficiency thing, not a correctness thing.
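
To make the trade-off concrete, here is a rough user-space model of the
decision described above. It is only a sketch: struct model_write,
model_should_hash() and model_max_concurrency() are hypothetical names
made up for illustration, not kernel code, and the fs_parallel_dio field
merely stands in for FMODE_DIO_PARALLEL_WRITE. It assumes every entry is
a write that has already been punted to a worker.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of a punted write; not a kernel structure. */
struct model_write {
	unsigned long inode;	/* inode number the write targets */
	bool is_regular_file;	/* hashing only applies to regular files */
	bool o_direct;		/* file was opened with O_DIRECT */
	bool fs_parallel_dio;	/* stands in for FMODE_DIO_PARALLEL_WRITE */
};

/* Return true if the punted write should be serialized per inode. */
static bool model_should_hash(const struct model_write *w)
{
	if (!w->is_regular_file)
		return false;
	/* The fs says O_DIRECT writes need not serialize: skip hashing. */
	if (w->o_direct && w->fs_parallel_dio)
		return false;
	return true;
}

/*
 * Upper bound on how many of these writes could be in workers at once:
 * every unhashed write may get its own worker, while hashed writes
 * share a single slot per distinct inode.
 */
static size_t model_max_concurrency(const struct model_write *w, size_t n)
{
	size_t slots = 0;

	for (size_t i = 0; i < n; i++) {
		bool seen = false;

		if (!model_should_hash(&w[i])) {
			slots++;
			continue;
		}
		/* Count each hashed inode only the first time it appears. */
		for (size_t j = 0; j < i; j++) {
			if (model_should_hash(&w[j]) &&
			    w[j].inode == w[i].inode) {
				seen = true;
				break;
			}
		}
		if (!seen)
			slots++;
	}
	return slots;
}
```

With 300 O_DIRECT writes to one inode, the model allows one worker at a
time when the flag is clear and up to 300 when it is set. If the fs then
takes the lock exclusively anyway (appends, unaligned IO), the result is
still correct, just with more workers contending on the lock.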

-- 
Jens Axboe

