From: Qu Wenruo <[email protected]>
To: Jens Axboe <[email protected]>,
	"[email protected]" <[email protected]>,
	Linux FS Devel <[email protected]>,
	[email protected]
Subject: Re: Possible io_uring related race leads to btrfs data csum mismatch
Date: Thu, 17 Aug 2023 09:19:21 +0800
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>



On 2023/8/17 09:12, Jens Axboe wrote:
> On 8/16/23 7:05 PM, Qu Wenruo wrote:
>>
>>
>> On 2023/8/17 06:28, Jens Axboe wrote:
>> [...]
>>>
>>>>> 2) What's the .config you are using?
>>>>
>>>> Pretty common config, no heavy debug options (KASAN etc).
>>>
>>> Please just send the .config, I'd rather not have to guess. Things like
>>> preempt etc may make a difference in reproducing this.
>>
>> Sure, please see the attached config.gz
>
> Thanks
>
>>> And just to be sure, this is not mixing dio and buffered, right?
>>
>> I'd say it's mixing: there are dwrite() and writev() calls on the same
>> file, but at least they are not overlapping with this particular seed,
>> nor are they concurrent (all inside the same process, sequentially).
>>
>> But considering that the problem no longer reproduces if only
>> uring_write is disabled, there must be some untested btrfs path
>> triggered by uring_write.
>
> That would be one conclusion, another would be that timing is just
> different and that triggers an issue. Or it could of course be a bug in
> io_uring, perhaps a short write that gets retried or something like
> that. I've run the tests for hours here and don't hit anything. I've
> pulled in the for-next branch for btrfs and will see if that makes a
> difference. I'll check your .config too.

Just to mention, the problem itself was pretty hard to hit before, when
using any debug kernel configs.

Not sure why, but after I both switched CPUs (from a desktop i7-13700K
power-limited to 160W, to a laptop 7940HS) and dropped all heavy debug
kernel configs, it became 100% reproducible here.

So I guess a faster CPU is also one factor?

>
> Might not be a bad idea to have the writes contain known data, and when
> you hit the failure to verify the csum, dump the data where the csum
> says it's wrong and figure out at what offset, what content, etc it is?
> If that can get correlated to the log of what happened, that might shed
> some light on this.
>
Thanks for the advice, I will definitely try this method and keep you
updated when I find something valuable.
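
A rough sketch of what such self-describing data could look like (purely
illustrative: the helper names, the 8-byte word layout and the per-write
tag are assumptions, not anything taken from the actual test tool). Each
8-byte word encodes the tag of the write that produced it and its own
file offset, so a bad block read back from disk shows which write, and
which offset, the unexpected content belongs to:

```python
import struct

OFFSET_MASK = (1 << 48) - 1  # low 48 bits hold the file offset (assumed layout)


def make_pattern(offset: int, length: int, tag: int) -> bytes:
    """Build data where every 8-byte word is (tag << 48) | file offset of that word."""
    assert offset % 8 == 0 and length % 8 == 0
    return b"".join(
        struct.pack("<Q", (tag << 48) | (offset + i)) for i in range(0, length, 8)
    )


def find_mismatches(data: bytes, offset: int, tag: int):
    """Compare data read back from `offset` against the expected pattern.

    Returns a list of (file_offset, found_tag, found_offset) for each wrong word.
    """
    expected = make_pattern(offset, len(data), tag)
    bad = []
    for i in range(0, len(data), 8):
        word = data[i:i + 8]
        if word != expected[i:i + 8]:
            found = struct.unpack("<Q", word)[0]
            bad.append((offset + i, found >> 48, found & OFFSET_MASK))
    return bad


# Example: a block written by write #3 at offset 4096, where one word
# was replaced by content from write #2 at offset 0.
good = make_pattern(4096, 64, tag=3)
stale = make_pattern(0, 8, tag=2)
corrupted = good[:16] + stale + good[24:]
report = find_mismatches(corrupted, 4096, tag=3)
```

Here `report` lists the file offset of each wrong word together with the
tag and offset decoded from the content actually found, which could then
be matched against the operation log.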

Thanks,
Qu
