From: Martin Raiber <[email protected]>
To: Pavel Begunkov <[email protected]>,
	Jens Axboe <[email protected]>,
	[email protected]
Subject: Re: Fixed buffers have out-dated content
Date: Thu, 14 Jan 2021 21:50:05 +0000
Message-ID: <0102017702e086ca-cdb34993-86ad-4ec6-bea5-b6a5ad055a62-000000@eu-west-1.amazonses.com>
In-Reply-To: <01020176ed350725-cc3c8fa7-7771-46c9-8fa9-af433acb2453-000000@eu-west-1.amazonses.com>

On 10.01.2021 17:50 Martin Raiber wrote:
> On 09.01.2021 21:32 Pavel Begunkov wrote:
>> On 09/01/2021 16:58, Martin Raiber wrote:
>>> On 09.01.2021 17:23 Jens Axboe wrote:
>>>> On 1/8/21 4:39 PM, Martin Raiber wrote:
>>>>> Hi,
>>>>>
>>>>> I have a gnarly issue with io_uring and fixed buffers (fixed
>>>>> read/write). It seems the contents of those buffers contain old
>>>>> data in some rare cases under memory pressure after a read/during
>>>>> a write.
>>>>>
>>>>> Specifically I use io_uring with fuse and to confirm this is not some
>>>>> user space issue let fuse print the unique id it adds to each
>>>>> request. Fuse adds this request data to a pipe, and when the pipe
>>>>> buffer is later copied to the io_uring fixed buffer it has the id of
>>>>> a fuse request returned earlier using the same buffer while returning
>>>>> the size of the new request. Or I set the unique id in the buffer,
>>>>> write it to fuse (via writing to a pipe, then splicing) and then fuse
>>>>> returns with e.g. ENOENT, because the unique id is not correct
>>>>> because in kernel it reads the id of the previous, already completed,
>>>>> request using this buffer.
>>>>>
>>>>> To make reproducing this faster running memtester (which mlocks a
>>>>> configurable amount of memory) with a large amount of user memory
>>>>> every 30s helps. So it has something to do with swapping? It seems to
>>>>> not occur if no swap space is active. Problem occurs without warning
>>>>> when the kernel is built with KASAN and slab debugging.
>>>>>
>>>>> If I don't use the _FIXED opcodes (which is easy to do), the problem
>>>>> does not occur.
>>>>>
>>>>> Problem occurs with 5.9.16 and 5.10.5.
>>>> Can you mention more about what kind of IO you are doing, I'm assuming
>>>> it's O_DIRECT? I'll see if I can reproduce this.
>>> It's writing to/reading from pipes (nonblocking, no O_DIRECT).
>> A blind guess, does it handle short reads and writes? If not, can you
>> check whether they happen or not?
>
> Something like this was what I suspected at first as well. It does 
> check for short read/writes and I added (unnecessary -- because the 
> fuse request structure is 40 bytes and it does io in page sizes) code 
> for retrying short reads at some point. I also checked for the pipes 
> to be empty before they are used at some point and let the kernel log 
> allocation failures (idea was that it was short pipe read/writes 
> because of allocation failure or that something doesn't get rewound 
> properly in this case). Beyond that three things that make a user 
> space problem unlikely:
>
>  - occurs only when using fixed buffers and does not occur when 
> running same code without fixed buffer opcodes
>  - doesn't occur when there is no memory pressure
>  - I added print(k/f) logging that pointed me in this direction as well
>
>>> I can reproduce it with https://github.com/uroni/fuseuring on e.g. a 
>>> 2GB VPS. Modify bench.sh so that fio loops. Add swap, then run 1400M 
>>> memtester while it runs (so it swaps, I guess). I can try further 
>>> reducing the reproducer, but I wanted to avoid that work in case it 
>>> is something obvious. The next step would be to remove fuse from the 
>>> equation -- it does try to move the pages from the pipe when 
>>> splicing to it, for example.

When I use 5.10.7 with the following three commits reverted, the issue
doesn't seem to occur:

 - 09854ba94c6aad7886996bfbee2530b3d8a7f4f4
   ("mm: do_wp_page() simplification")
 - 1a0cf26323c80e2f1c58fc04f15686de61bfab0c
   ("mm/ksm: Remove reuse_ksm_page()")
 - be068f29034fb00530a053d18b8cf140c32b12b3
   ("mm: fix misplaced unlock_page in do_wp_page()")
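
For reference, the kind of short-read retry discussed in the quoted text
above can be sketched roughly as below. This is only an illustration and
not the code from fuseuring: the helper name read_full is hypothetical,
and it uses plain read(2) on a pipe rather than the io_uring _FIXED
opcodes, so it shows the retry logic only and not the fixed-buffer path
where the problem occurs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from fuseuring): keep reading until `len`
 * bytes have arrived, EOF is hit, or a real error occurs.
 * Returns the number of bytes read (less than `len` only at EOF),
 * or -1 on error with errno set.
 */
static ssize_t read_full(int fd, void *buf, size_t len)
{
    size_t done = 0;

    assert(buf != NULL || len == 0);

    while (done < len) {
        ssize_t n = read(fd, (char *)buf + done, len - done);

        if (n > 0) {
            done += (size_t)n;   /* short read: loop for the rest */
            continue;
        }
        if (n == 0)
            break;               /* EOF: writer closed the pipe */
        if (errno == EINTR)
            continue;            /* interrupted, just retry */
        return -1;               /* includes EAGAIN on a nonblocking fd;
                                    the caller has to poll and retry */
    }
    return (ssize_t)done;
}
```

With a nonblocking pipe the caller would still have to wait for
readability on EAGAIN before calling the helper again; that part is left
out here.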


Thread overview: 13+ messages
2021-01-08 23:39 Fixed buffer have out-dated content Martin Raiber
2021-01-09 16:23 ` Jens Axboe
2021-01-09 16:58   ` Martin Raiber
2021-01-09 20:32     ` Pavel Begunkov
2021-01-10 16:50       ` Martin Raiber
2021-01-14 21:50         ` Martin Raiber [this message]
2021-01-16 19:30           ` Fixed buffers " Pavel Begunkov
2021-01-16 19:39             ` Jens Axboe
2021-01-16 22:12           ` Jens Axboe
2021-01-16 23:05             ` Linus Torvalds
2021-01-16 23:34               ` Linus Torvalds
2021-01-17 20:07                 ` Martin Raiber
2021-01-17 20:14                   ` Linus Torvalds
