From: Stefan Roesch <[email protected]>
To: Matthew Wilcox <[email protected]>
Cc: [email protected], [email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected], [email protected],
[email protected]
Subject: Re: [PATCH v8 06/14] iomap: Return -EAGAIN from iomap_write_iter()
Date: Thu, 9 Jun 2022 11:49:34 -0700
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
On 6/8/22 12:02 PM, Matthew Wilcox wrote:
> On Wed, Jun 08, 2022 at 10:17:33AM -0700, Stefan Roesch wrote:
>> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
>> index b06a5c24a4db..f701dcb7c26a 100644
>> --- a/fs/iomap/buffered-io.c
>> +++ b/fs/iomap/buffered-io.c
>> @@ -829,7 +829,13 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
>>  		length -= status;
>>  	} while (iov_iter_count(i) && length);
>>  
>> -	return written ? written : status;
>> +	if (status == -EAGAIN) {
>> +		iov_iter_revert(i, written);
>> +		return -EAGAIN;
>> +	}
>> +	if (written)
>> +		return written;
>> +	return status;
>>  }
>
> I still don't understand how this can possibly work. Walk me through it.
>
> Let's imagine we have a file laid out such that extent 1 is bytes
> 0-4095 of the file and extent 2 is extent 4096-16385 of the file.
> We do a write of 5000 bytes starting at offset 4000 of the file.
>
> iomap_iter() tells us about the first extent and we write the first
> 96 bytes of our data to the first extent, returning 96. iomap_iter()
> tells us about the second extent, and we write the next 4000 bytes to
> the second extent. Then we get a page fault and get to the -EAGAIN case.
> We rewind the iter 4000 bytes.
>

We have two data structures, the iomap_iter and the iov_iter. After the first
96 bytes are written, the iov_iter offset gets updated in iomap_write_iter(),
and then the iomap_iter pos gets updated in
iomap_iter()->iomap_iter_advance().

We then get the second extent from iomap_iter(). In iomap_write_iter() the
first page is obtained and written successfully, then the second page
faults. At this point the iov_offset of the iov_iter has advanced. To restore
it to the state it had when iomap_write_iter() was entered, the iov_iter is
reverted by the number of bytes written in this call (iov_offset - written).

iomap_write_iter() then exits and returns -EAGAIN. Because iomap_write_iter()
returns an error, the iomap_iter pos is not updated in iomap_iter(). Only the
number of bytes written to the first extent is returned from
iomap_file_buffered_write().

In xfs_file_buffered_write() we update iocb->ki_pos with the number of
bytes written. In io-uring, the io_write() call receives the short write
result. It copies the iov_iter struct into the work context for the io worker.
The io worker uses that information to complete the rest of the write.

The above reset is required to keep the pos in iomap_iter and the offset in
iov_iter in sync.
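
To make the bookkeeping concrete, here is a small userspace model of the
sequence described above. It is only a sketch and not kernel code: every
model_* name is hypothetical, the structs carry just the counters discussed
here, and a fault is simulated by a per-extent fault_after byte count. It
replays your example (5000 bytes at offset 4000, fault after 4000 bytes in
the second extent) and then the retry.

```c
#include <assert.h>

#define MODEL_EAGAIN (-11L)	/* stand-in for -EAGAIN */

/* Hypothetical, stripped-down stand-ins for iov_iter and iomap_iter. */
struct model_iov_iter { long consumed; long count; };
struct model_iomap_iter { long pos; long len; };
/* fault_after: bytes copied into this extent before a fault, -1 = none. */
struct model_extent { long start; long len; long fault_after; };

/* Models iomap_write_iter() for one extent, including the v8 revert. */
static long model_write_iter(struct model_iov_iter *i, long length,
			     long fault_after)
{
	long written = length < i->count ? length : i->count;
	long status = 0;

	if (fault_after >= 0 && fault_after < written) {
		written = fault_after;
		status = MODEL_EAGAIN;
	}
	/* Copying user data advances the iov_iter. */
	i->consumed += written;
	i->count -= written;

	if (status == MODEL_EAGAIN) {
		/* Models iov_iter_revert(i, written). */
		i->consumed -= written;
		i->count += written;
		return MODEL_EAGAIN;
	}
	return written;
}

/* Models the extent loop: pos only advances when an extent succeeds. */
static long model_buffered_write(struct model_iomap_iter *iter,
				 struct model_iov_iter *i,
				 const struct model_extent *ext, int nr)
{
	long start = iter->pos;
	long ret = 0;

	for (int e = 0; e < nr && iter->len > 0; e++) {
		long end = ext[e].start + ext[e].len;
		long length;

		if (iter->pos < ext[e].start || iter->pos >= end)
			continue;	/* extent does not cover pos */
		length = end - iter->pos;
		if (length > iter->len)
			length = iter->len;
		ret = model_write_iter(i, length, ext[e].fault_after);
		if (ret <= 0)
			break;		/* error: pos is left untouched */
		iter->pos += ret;
		iter->len -= ret;
	}
	if (iter->pos == start)
		return ret < 0 ? ret : 0;
	return iter->pos - start;	/* short write if we stopped early */
}

/* Replays the two-extent example, then the retry by the io worker. */
long model_run_scenario(void)
{
	struct model_extent ext[2] = { { 0, 4096, -1 }, { 4096, 12290, 4000 } };
	struct model_iov_iter i = { 0, 5000 };
	struct model_iomap_iter iter = { 4000, 5000 };
	long first = model_buffered_write(&iter, &i, ext, 2);

	/* Short write of 96; pos and iov_iter agree on where we stopped. */
	assert(first == 96 && iter.pos == 4096 && i.consumed == 96);

	/* Retry: ki_pos has moved by 96 and the page is now resident. */
	struct model_iomap_iter retry = { 4000 + first, i.count };
	ext[1].fault_after = -1;
	long second = model_buffered_write(&retry, &i, ext, 2);

	assert(i.count == 0);
	return first + second;
}
```

In this model the first pass returns a short write of 96 with pos at 4096
and 96 bytes of the iov_iter consumed, so the retry resumes from matching
positions and the two passes together cover all 5000 bytes. Without the
revert, the iov_iter would sit 4000 bytes ahead of pos at the retry.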

Side note:
I had an earlier version of the patch that changed the signature of
iomap_write_iter(). It returned a return code and updated the processed
value of the iomap_iter (which in turn updates the pos value of the
iomap_iter). That version (version 7 of the patch) did not need to reset
the offset of the iov_iter, because it could update the pos in iomap_iter
even when -EAGAIN was returned.

> How do we not end up writing garbage when the kworker does the retry?
> I'd understand if we rewound the iter all the way to the start. Or if
> we didn't rewind the iter at all and were able to pick up partway through
> the write. But rewinding to the start of the extent feels like it can't
> possibly work.