From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 10FF0C77B6E for ; Fri, 14 Apr 2023 15:36:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229853AbjDNPgQ (ORCPT ); Fri, 14 Apr 2023 11:36:16 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55602 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229494AbjDNPgP (ORCPT ); Fri, 14 Apr 2023 11:36:15 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 787E855A1; Fri, 14 Apr 2023 08:36:14 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 14B876160E; Fri, 14 Apr 2023 15:36:14 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 6E141C4339B; Fri, 14 Apr 2023 15:36:13 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1681486573; bh=fQbshHMyenXjbLKMFET/GjpZqotL8SVaYFHwivrmZGc=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=EPOFVu/DuwyuZANHqHcIFm3updxEbeHUXzp/iRet476VBB6uNbMtHE1khP4Sh1gSE Q6Y694QdGbv5cpKGotEUfJRvJFmAmSEOCyaATRoC+Yokm5MNHTdT1IGLIy/xBIH5zi Stm+pAFlxwJkRhantTSMdg4sGEej5MG3L7DrM1oAQsogxP1WRzj6x/rfKNWXFN86I/ JlS0RrvRFAaR3ouKEs+sy9hsqIIaceSojg+zNfCUkEJDRaAO3UdU6vlJzG7Qch+KvH wkp97f4xUGTIbmDQdTdWmg0VGlJsaciHwLK93XPh4PBN6K/LDYrQlWfmFnmq4/fqu+ 4Q2Wnj0bxfFeA== Date: Fri, 14 Apr 2023 08:36:12 -0700 From: "Darrick J. Wong" To: Christoph Hellwig Cc: Miklos Szeredi , Bernd Schubert , axboe@kernel.dk, io-uring@vger.kernel.org, linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, dsingh@ddn.com Subject: Re: [PATCH 1/2] fs: add FMODE_DIO_PARALLEL_WRITE flag Message-ID: <20230414153612.GB360881@frogsfrogsfrogs> References: <20230307172015.54911-2-axboe@kernel.dk> <20230412134057.381941-1-bschubert@ddn.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org On Thu, Apr 13, 2023 at 10:11:28PM -0700, Christoph Hellwig wrote: > On Thu, Apr 13, 2023 at 09:40:29AM +0200, Miklos Szeredi wrote: > > fuse_direct_write_iter(): > > > > bool exclusive_lock = > > !(ff->open_flags & FOPEN_PARALLEL_DIRECT_WRITES) || > > iocb->ki_flags & IOCB_APPEND || > > fuse_direct_write_extending_i_size(iocb, from); > > > > If the write is size extending, then it will take the lock exclusive. > > OTOH, I guess that it would be unusual for lots of size extending > > writes to be done in parallel. > > > > What would be the effect of giving the FMODE_DIO_PARALLEL_WRITE hint > > and then still serializing the writes? > > I have no idea how this flags work, but XFS also takes i_rwsem > exclusively for appends, when the positions and size aren't aligned to > the block size, and a few other cases. IIUC uring wants to avoid the situation where someone sends 300 writes to the same file, all of which end up in background workers, and all of which then contend on exclusive i_rwsem. Hence it has some hashing scheme that executes io requests serially if they hash to the same value (which iirc is the inode number?) to prevent resource waste. This flag turns off that hashing behavior on the assumption that each of those 300 writes won't serialize on the other 299 writes, hence it's ok to start up 300 workers. (apologies for precoffee garbled response) --D