public inbox for [email protected]
 help / color / mirror / Atom feed
* [PATCH 0/2] Close a hole where IOCB_NOWAIT reads could sleep
@ 2021-07-11 15:09 Matthew Wilcox (Oracle)
  2021-07-11 15:09 ` [PATCH 1/2] mm/readahead: Add gfp_flags to ractl Matthew Wilcox (Oracle)
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-07-11 15:09 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Matthew Wilcox (Oracle), io-uring, linux-mm, linux-fsdevel,
	linux-aio

I noticed a theoretical case where an IOCB_NOWAIT read could sleep:

filemap_get_pages
  filemap_get_read_batch
  page_cache_sync_readahead
    page_cache_sync_ra
      ondemand_readahead
        do_page_cache_ra
        page_cache_ra_unbounded
          gfp_t gfp_mask = readahead_gfp_mask(mapping);
          memalloc_nofs_save()
          __page_cache_alloc(gfp_mask);

We're in a nofs context, so we're not going to start new IO, but we might
wait for writeback to complete.  We generally don't want to sleep for IO,
particularly not for IO that isn't related to us.

Jens, can you run this through your test rig and see if it makes any
practical difference?

Matthew Wilcox (Oracle) (2):
  mm/readahead: Add gfp_flags to ractl
  mm/filemap: Prevent waiting for memory for NOWAIT reads

 include/linux/pagemap.h |  3 +++
 mm/filemap.c            | 31 +++++++++++++++++++------------
 mm/readahead.c          | 16 ++++++++--------
 3 files changed, 30 insertions(+), 20 deletions(-)

-- 
2.30.2


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/2] mm/readahead: Add gfp_flags to ractl
  2021-07-11 15:09 [PATCH 0/2] Close a hole where IOCB_NOWAIT reads could sleep Matthew Wilcox (Oracle)
@ 2021-07-11 15:09 ` Matthew Wilcox (Oracle)
  2021-07-12 11:34   ` Christoph Hellwig
  2021-07-11 15:09 ` [PATCH 2/2] mm/filemap: Prevent waiting for memory for NOWAIT reads Matthew Wilcox (Oracle)
  2021-07-12  1:44 ` [PATCH 0/2] Close a hole where IOCB_NOWAIT reads could sleep Jens Axboe
  2 siblings, 1 reply; 8+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-07-11 15:09 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Matthew Wilcox (Oracle), io-uring, linux-mm, linux-fsdevel,
	linux-aio

It is currently possible for an I/O request that specifies IOCB_NOWAIT
to sleep waiting for I/O to complete in order to allocate pages for
readahead.  In order to fix that, we need the caller to be able to
specify the GFP flags to use for memory allocation in the rest of the
readahead path.

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
---
 include/linux/pagemap.h |  3 +++
 mm/readahead.c          | 16 ++++++++--------
 2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index ed02aa522263..00abb2b2f158 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -833,11 +833,13 @@ static inline int add_to_page_cache(struct page *page,
  *	  May be NULL if invoked internally by the filesystem.
  * @mapping: Readahead this filesystem object.
  * @ra: File readahead state.  May be NULL.
+ * @gfp_flags: Memory allocation flags to use.
  */
 struct readahead_control {
 	struct file *file;
 	struct address_space *mapping;
 	struct file_ra_state *ra;
+	gfp_t gfp_flags;
 /* private: use the readahead_* accessors instead */
 	pgoff_t _index;
 	unsigned int _nr_pages;
@@ -849,6 +851,7 @@ struct readahead_control {
 		.file = f,						\
 		.mapping = m,						\
 		.ra = r,						\
+		.gfp_flags = readahead_gfp_mask(m),			\
 		._index = i,						\
 	}
 
diff --git a/mm/readahead.c b/mm/readahead.c
index d589f147f4c2..58937d1fe3f7 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -177,7 +177,6 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
 	struct address_space *mapping = ractl->mapping;
 	unsigned long index = readahead_index(ractl);
 	LIST_HEAD(page_pool);
-	gfp_t gfp_mask = readahead_gfp_mask(mapping);
 	unsigned long i;
 
 	/*
@@ -212,14 +211,14 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
 			continue;
 		}
 
-		page = __page_cache_alloc(gfp_mask);
+		page = __page_cache_alloc(ractl->gfp_flags);
 		if (!page)
 			break;
 		if (mapping->a_ops->readpages) {
 			page->index = index + i;
 			list_add(&page->lru, &page_pool);
 		} else if (add_to_page_cache_lru(page, mapping, index + i,
-					gfp_mask) < 0) {
+					ractl->gfp_flags) < 0) {
 			put_page(page);
 			read_pages(ractl, &page_pool, true);
 			i = ractl->_index + ractl->_nr_pages - index - 1;
@@ -663,7 +662,6 @@ void readahead_expand(struct readahead_control *ractl,
 	struct address_space *mapping = ractl->mapping;
 	struct file_ra_state *ra = ractl->ra;
 	pgoff_t new_index, new_nr_pages;
-	gfp_t gfp_mask = readahead_gfp_mask(mapping);
 
 	new_index = new_start / PAGE_SIZE;
 
@@ -675,10 +673,11 @@ void readahead_expand(struct readahead_control *ractl,
 		if (page && !xa_is_value(page))
 			return; /* Page apparently present */
 
-		page = __page_cache_alloc(gfp_mask);
+		page = __page_cache_alloc(ractl->gfp_flags);
 		if (!page)
 			return;
-		if (add_to_page_cache_lru(page, mapping, index, gfp_mask) < 0) {
+		if (add_to_page_cache_lru(page, mapping, index,
+					  ractl->gfp_flags) < 0) {
 			put_page(page);
 			return;
 		}
@@ -698,10 +697,11 @@ void readahead_expand(struct readahead_control *ractl,
 		if (page && !xa_is_value(page))
 			return; /* Page apparently present */
 
-		page = __page_cache_alloc(gfp_mask);
+		page = __page_cache_alloc(ractl->gfp_flags);
 		if (!page)
 			return;
-		if (add_to_page_cache_lru(page, mapping, index, gfp_mask) < 0) {
+		if (add_to_page_cache_lru(page, mapping, index,
+					  ractl->gfp_flags) < 0) {
 			put_page(page);
 			return;
 		}
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/2] mm/filemap: Prevent waiting for memory for NOWAIT reads
  2021-07-11 15:09 [PATCH 0/2] Close a hole where IOCB_NOWAIT reads could sleep Matthew Wilcox (Oracle)
  2021-07-11 15:09 ` [PATCH 1/2] mm/readahead: Add gfp_flags to ractl Matthew Wilcox (Oracle)
@ 2021-07-11 15:09 ` Matthew Wilcox (Oracle)
  2021-07-12  1:44 ` [PATCH 0/2] Close a hole where IOCB_NOWAIT reads could sleep Jens Axboe
  2 siblings, 0 replies; 8+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-07-11 15:09 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Matthew Wilcox (Oracle), io-uring, linux-mm, linux-fsdevel,
	linux-aio

Readahead memory allocations won't block for much today, as they're
already marked as NOFS and NORETRY, but they can still sleep, and
they shouldn't if the read is marked as IOCB_NOWAIT.  Clearing the
DIRECT_RECLAIM flag will prevent sleeping.

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
---
 mm/filemap.c | 31 +++++++++++++++++++------------
 1 file changed, 19 insertions(+), 12 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index d1458ecf2f51..2be27b686518 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2435,21 +2435,27 @@ static int filemap_create_page(struct file *file,
 
 static int filemap_readahead(struct kiocb *iocb, struct file *file,
 		struct address_space *mapping, struct page *page,
-		pgoff_t last_index)
+		pgoff_t index, pgoff_t last_index)
 {
+	DEFINE_READAHEAD(ractl, file, &file->f_ra, mapping, index);
+
 	if (iocb->ki_flags & IOCB_NOIO)
 		return -EAGAIN;
-	page_cache_async_readahead(mapping, &file->f_ra, file, page,
-			page->index, last_index - page->index);
+	if (iocb->ki_flags & IOCB_NOWAIT)
+		ractl.gfp_flags &= ~__GFP_DIRECT_RECLAIM;
+
+	if (page)
+		page_cache_async_ra(&ractl, page, last_index - index);
+	else
+		page_cache_sync_ra(&ractl, last_index - index);
 	return 0;
 }
 
 static int filemap_get_pages(struct kiocb *iocb, struct iov_iter *iter,
 		struct pagevec *pvec)
 {
-	struct file *filp = iocb->ki_filp;
-	struct address_space *mapping = filp->f_mapping;
-	struct file_ra_state *ra = &filp->f_ra;
+	struct file *file = iocb->ki_filp;
+	struct address_space *mapping = file->f_mapping;
 	pgoff_t index = iocb->ki_pos >> PAGE_SHIFT;
 	pgoff_t last_index;
 	struct page *page;
@@ -2462,16 +2468,16 @@ static int filemap_get_pages(struct kiocb *iocb, struct iov_iter *iter,
 
 	filemap_get_read_batch(mapping, index, last_index, pvec);
 	if (!pagevec_count(pvec)) {
-		if (iocb->ki_flags & IOCB_NOIO)
-			return -EAGAIN;
-		page_cache_sync_readahead(mapping, ra, filp, index,
-				last_index - index);
+		err = filemap_readahead(iocb, file, mapping, NULL, index,
+					last_index);
+		if (err)
+			return err;
 		filemap_get_read_batch(mapping, index, last_index, pvec);
 	}
 	if (!pagevec_count(pvec)) {
 		if (iocb->ki_flags & (IOCB_NOWAIT | IOCB_WAITQ))
 			return -EAGAIN;
-		err = filemap_create_page(filp, mapping,
+		err = filemap_create_page(file, mapping,
 				iocb->ki_pos >> PAGE_SHIFT, pvec);
 		if (err == AOP_TRUNCATED_PAGE)
 			goto retry;
@@ -2480,7 +2486,8 @@ static int filemap_get_pages(struct kiocb *iocb, struct iov_iter *iter,
 
 	page = pvec->pages[pagevec_count(pvec) - 1];
 	if (PageReadahead(page)) {
-		err = filemap_readahead(iocb, filp, mapping, page, last_index);
+		err = filemap_readahead(iocb, file, mapping, page, page->index,
+					last_index);
 		if (err)
 			goto err;
 	}
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/2] Close a hole where IOCB_NOWAIT reads could sleep
  2021-07-11 15:09 [PATCH 0/2] Close a hole where IOCB_NOWAIT reads could sleep Matthew Wilcox (Oracle)
  2021-07-11 15:09 ` [PATCH 1/2] mm/readahead: Add gfp_flags to ractl Matthew Wilcox (Oracle)
  2021-07-11 15:09 ` [PATCH 2/2] mm/filemap: Prevent waiting for memory for NOWAIT reads Matthew Wilcox (Oracle)
@ 2021-07-12  1:44 ` Jens Axboe
  2021-07-24 18:22   ` Matthew Wilcox
  2 siblings, 1 reply; 8+ messages in thread
From: Jens Axboe @ 2021-07-12  1:44 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: io-uring, linux-mm, linux-fsdevel, linux-aio

On 7/11/21 9:09 AM, Matthew Wilcox (Oracle) wrote:
> I noticed a theoretical case where an IOCB_NOWAIT read could sleep:
> 
> filemap_get_pages
>   filemap_get_read_batch
>   page_cache_sync_readahead
>     page_cache_sync_ra
>       ondemand_readahead
>         do_page_cache_ra
>         page_cache_ra_unbounded
>           gfp_t gfp_mask = readahead_gfp_mask(mapping);
>           memalloc_nofs_save()
>           __page_cache_alloc(gfp_mask);
> 
> We're in a nofs context, so we're not going to start new IO, but we might
> wait for writeback to complete.  We generally don't want to sleep for IO,
> particularly not for IO that isn't related to us.
> 
> Jens, can you run this through your test rig and see if it makes any
> practical difference?

You bet, I'll see if I can trigger this condition and verify we're no
longer blocking on writeback. Thanks for hacking this up.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] mm/readahead: Add gfp_flags to ractl
  2021-07-11 15:09 ` [PATCH 1/2] mm/readahead: Add gfp_flags to ractl Matthew Wilcox (Oracle)
@ 2021-07-12 11:34   ` Christoph Hellwig
  2021-07-12 14:37     ` Matthew Wilcox
  0 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2021-07-12 11:34 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: Jens Axboe, io-uring, linux-mm, linux-fsdevel, linux-aio

On Sun, Jul 11, 2021 at 04:09:26PM +0100, Matthew Wilcox (Oracle) wrote:
> It is currently possible for an I/O request that specifies IOCB_NOWAIT
> to sleep waiting for I/O to complete in order to allocate pages for
> readahead.  In order to fix that, we need the caller to be able to
> specify the GFP flags to use for memory allocation in the rest of the
> readahead path.

The file systems also need to respect it for their bio or private
data allocation.  And be able to cope with failure, which they currently
don't have to for sleeping bio allocations.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] mm/readahead: Add gfp_flags to ractl
  2021-07-12 11:34   ` Christoph Hellwig
@ 2021-07-12 14:37     ` Matthew Wilcox
  0 siblings, 0 replies; 8+ messages in thread
From: Matthew Wilcox @ 2021-07-12 14:37 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, io-uring, linux-mm, linux-fsdevel, linux-aio

On Mon, Jul 12, 2021 at 12:34:23PM +0100, Christoph Hellwig wrote:
> On Sun, Jul 11, 2021 at 04:09:26PM +0100, Matthew Wilcox (Oracle) wrote:
> > It is currently possible for an I/O request that specifies IOCB_NOWAIT
> > to sleep waiting for I/O to complete in order to allocate pages for
> > readahead.  In order to fix that, we need the caller to be able to
> > specify the GFP flags to use for memory allocation in the rest of the
> > readahead path.
> 
> The file systems also need to respect it for their bio or private
> data allocation.  And be able to cope with failure, which they currently
> don't have to for sleeping bio allocations.

Yes, they should.  This patch doesn't make that problem worse than it is
today, and gets the desired GFP flags down to the file systems, which is
needed for the full fix.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/2] Close a hole where IOCB_NOWAIT reads could sleep
  2021-07-12  1:44 ` [PATCH 0/2] Close a hole where IOCB_NOWAIT reads could sleep Jens Axboe
@ 2021-07-24 18:22   ` Matthew Wilcox
  2021-07-24 19:46     ` Jens Axboe
  0 siblings, 1 reply; 8+ messages in thread
From: Matthew Wilcox @ 2021-07-24 18:22 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring, linux-mm, linux-fsdevel, linux-aio

On Sun, Jul 11, 2021 at 07:44:07PM -0600, Jens Axboe wrote:
> On 7/11/21 9:09 AM, Matthew Wilcox (Oracle) wrote:
> > I noticed a theoretical case where an IOCB_NOWAIT read could sleep:
> > 
> > filemap_get_pages
> >   filemap_get_read_batch
> >   page_cache_sync_readahead
> >     page_cache_sync_ra
> >       ondemand_readahead
> >         do_page_cache_ra
> >         page_cache_ra_unbounded
> >           gfp_t gfp_mask = readahead_gfp_mask(mapping);
> >           memalloc_nofs_save()
> >           __page_cache_alloc(gfp_mask);
> > 
> > We're in a nofs context, so we're not going to start new IO, but we might
> > wait for writeback to complete.  We generally don't want to sleep for IO,
> > particularly not for IO that isn't related to us.
> > 
> > Jens, can you run this through your test rig and see if it makes any
> > practical difference?
> 
> You bet, I'll see if I can trigger this condition and verify we're no
> longer blocking on writeback. Thanks for hacking this up.

Did you have any success yet?

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/2] Close a hole where IOCB_NOWAIT reads could sleep
  2021-07-24 18:22   ` Matthew Wilcox
@ 2021-07-24 19:46     ` Jens Axboe
  0 siblings, 0 replies; 8+ messages in thread
From: Jens Axboe @ 2021-07-24 19:46 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: io-uring, linux-mm, linux-fsdevel, linux-aio

On 7/24/21 12:22 PM, Matthew Wilcox wrote:
> On Sun, Jul 11, 2021 at 07:44:07PM -0600, Jens Axboe wrote:
>> On 7/11/21 9:09 AM, Matthew Wilcox (Oracle) wrote:
>>> I noticed a theoretical case where an IOCB_NOWAIT read could sleep:
>>>
>>> filemap_get_pages
>>>   filemap_get_read_batch
>>>   page_cache_sync_readahead
>>>     page_cache_sync_ra
>>>       ondemand_readahead
>>>         do_page_cache_ra
>>>         page_cache_ra_unbounded
>>>           gfp_t gfp_mask = readahead_gfp_mask(mapping);
>>>           memalloc_nofs_save()
>>>           __page_cache_alloc(gfp_mask);
>>>
>>> We're in a nofs context, so we're not going to start new IO, but we might
>>> wait for writeback to complete.  We generally don't want to sleep for IO,
>>> particularly not for IO that isn't related to us.
>>>
>>> Jens, can you run this through your test rig and see if it makes any
>>> practical difference?
>>
>> You bet, I'll see if I can trigger this condition and verify we're no
>> longer blocking on writeback. Thanks for hacking this up.
> 
> Did you have any success yet?

Sorry forgot to report back - I did run some testing last week, and
didn't manage to make it hit the blocking condition. Did various
read/write mix on the same file, made sure there was memory pressure,
etc. I'll give it another go, please let me know if you have an idea on
how to make this easier to hit... I know it's one of those things that
you hit all the time in certain workloads (hence why I would love to see
it get fixed), but I just didn't manage to provoke it when I tried.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-07-24 19:46 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-07-11 15:09 [PATCH 0/2] Close a hole where IOCB_NOWAIT reads could sleep Matthew Wilcox (Oracle)
2021-07-11 15:09 ` [PATCH 1/2] mm/readahead: Add gfp_flags to ractl Matthew Wilcox (Oracle)
2021-07-12 11:34   ` Christoph Hellwig
2021-07-12 14:37     ` Matthew Wilcox
2021-07-11 15:09 ` [PATCH 2/2] mm/filemap: Prevent waiting for memory for NOWAIT reads Matthew Wilcox (Oracle)
2021-07-12  1:44 ` [PATCH 0/2] Close a hole where IOCB_NOWAIT reads could sleep Jens Axboe
2021-07-24 18:22   ` Matthew Wilcox
2021-07-24 19:46     ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox