[patch] mm, migrate: increment fail count on ENOMEM

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

[patch] mm, migrate: increment fail count on ENOMEM

David Rientjes
If page migration fails due to -ENOMEM, nr_failed should still be
incremented for proper statistics.

This was encountered recently when all page migration vmstats showed 0,
and inferred that migrate_pages() was never called, although in reality
the first page migration failed because compaction_alloc() failed to find
a migration target.

This patch increments nr_failed so the vmstat is properly accounted on
ENOMEM.

Signed-off-by: David Rientjes <[hidden email]>
---
 mm/migrate.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mm/migrate.c b/mm/migrate.c
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1171,6 +1171,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 
  switch(rc) {
  case -ENOMEM:
+ nr_failed++;
  goto out;
  case -EAGAIN:
  retry++;
Reply | Threaded
Open this post in threaded view
|

Re: [patch] mm, migrate: increment fail count on ENOMEM

Michal Hocko-4
On Thu 19-05-16 15:11:23, David Rientjes wrote:

> If page migration fails due to -ENOMEM, nr_failed should still be
> incremented for proper statistics.
>
> This was encountered recently when all page migration vmstats showed 0,
> and inferred that migrate_pages() was never called, although in reality
> the first page migration failed because compaction_alloc() failed to find
> a migration target.
>
> This patch increments nr_failed so the vmstat is properly accounted on
> ENOMEM.
>
> Signed-off-by: David Rientjes <[hidden email]>

Acked-by: Michal Hocko <[hidden email]>

One question though

> ---
>  mm/migrate.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1171,6 +1171,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>  
>   switch(rc) {
>   case -ENOMEM:
> + nr_failed++;
>   goto out;
>   case -EAGAIN:
>   retry++;

Why don't we need also to count also retries?
---
diff --git a/mm/migrate.c b/mm/migrate.c
index 53ab6398e7a2..ef9c5211ae3c 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1190,9 +1190,9 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
  }
  }
  }
+out:
  nr_failed += retry;
  rc = nr_failed;
-out:
  if (nr_succeeded)
  count_vm_events(PGMIGRATE_SUCCESS, nr_succeeded);
  if (nr_failed)
--
Michal Hocko
SUSE Labs
Reply | Threaded
Open this post in threaded view
|

Re: [patch] mm, migrate: increment fail count on ENOMEM

Vlastimil Babka
On 05/20/2016 03:06 PM, Michal Hocko wrote:

> On Thu 19-05-16 15:11:23, David Rientjes wrote:
>> If page migration fails due to -ENOMEM, nr_failed should still be
>> incremented for proper statistics.
>>
>> This was encountered recently when all page migration vmstats showed 0,
>> and inferred that migrate_pages() was never called, although in reality
>> the first page migration failed because compaction_alloc() failed to find
>> a migration target.
>>
>> This patch increments nr_failed so the vmstat is properly accounted on
>> ENOMEM.
>>
>> Signed-off-by: David Rientjes <[hidden email]>
>
> Acked-by: Michal Hocko <[hidden email]>

Acked-by: Vlastimil Babka <[hidden email]>

> One question though
>
>> ---
>>   mm/migrate.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/mm/migrate.c b/mm/migrate.c
>> --- a/mm/migrate.c
>> +++ b/mm/migrate.c
>> @@ -1171,6 +1171,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>>
>>   switch(rc) {
>>   case -ENOMEM:
>> + nr_failed++;
>>   goto out;
>>   case -EAGAIN:
>>   retry++;
>
> Why don't we need also to count also retries?

We could, but not like you suggest.

> ---
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 53ab6398e7a2..ef9c5211ae3c 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1190,9 +1190,9 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>   }
>   }
>   }
> +out:
>   nr_failed += retry;
>   rc = nr_failed;

This overwrites rc == -ENOMEM, which at least compaction needs to
recognize. But we could duplicate "nr_failed += retry" in the case -ENOMEM.

> -out:
>   if (nr_succeeded)
>   count_vm_events(PGMIGRATE_SUCCESS, nr_succeeded);
>   if (nr_failed)
>

Reply | Threaded
Open this post in threaded view
|

Re: [patch] mm, migrate: increment fail count on ENOMEM

Michal Hocko-4
On Fri 20-05-16 15:19:12, Vlastimil Babka wrote:
> On 05/20/2016 03:06 PM, Michal Hocko wrote:
[...]

> > Why don't we need also to count also retries?
>
> We could, but not like you suggest.
>
> > ---
> > diff --git a/mm/migrate.c b/mm/migrate.c
> > index 53ab6398e7a2..ef9c5211ae3c 100644
> > --- a/mm/migrate.c
> > +++ b/mm/migrate.c
> > @@ -1190,9 +1190,9 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
> >   }
> >   }
> >   }
> > +out:
> >   nr_failed += retry;
> >   rc = nr_failed;
>
> This overwrites rc == -ENOMEM, which at least compaction needs to recognize.
> But we could duplicate "nr_failed += retry" in the case -ENOMEM.

Right you are. So we should do
---
diff --git a/mm/migrate.c b/mm/migrate.c
index 53ab6398e7a2..123fed94022b 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1171,6 +1171,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 
  switch(rc) {
  case -ENOMEM:
+ nr_failed += retry + 1;
  goto out;
  case -EAGAIN:
  retry++;
       

--
Michal Hocko
SUSE Labs
Reply | Threaded
Open this post in threaded view
|

Re: [patch] mm, migrate: increment fail count on ENOMEM

akpm
On Fri, 20 May 2016 15:31:21 +0200 Michal Hocko <[hidden email]> wrote:

> On Fri 20-05-16 15:19:12, Vlastimil Babka wrote:
> > On 05/20/2016 03:06 PM, Michal Hocko wrote:
> [...]
> > > Why don't we need also to count also retries?
> >
> > We could, but not like you suggest.
> >
> > > ---
> > > diff --git a/mm/migrate.c b/mm/migrate.c
> > > index 53ab6398e7a2..ef9c5211ae3c 100644
> > > --- a/mm/migrate.c
> > > +++ b/mm/migrate.c
> > > @@ -1190,9 +1190,9 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
> > >   }
> > >   }
> > >   }
> > > +out:
> > >   nr_failed += retry;
> > >   rc = nr_failed;
> >
> > This overwrites rc == -ENOMEM, which at least compaction needs to recognize.
> > But we could duplicate "nr_failed += retry" in the case -ENOMEM.
>
> Right you are. So we should do
> ---
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 53ab6398e7a2..123fed94022b 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1171,6 +1171,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>  
>   switch(rc) {
>   case -ENOMEM:
> + nr_failed += retry + 1;
>   goto out;
>   case -EAGAIN:
>   retry++;
>
>

argh, this was lost.  Please resend as a real patch sometime?

Thanks.
Reply | Threaded
Open this post in threaded view
|

Re: [patch] mm, migrate: increment fail count on ENOMEM

Hugh Dickins-2
On Mon, 23 May 2016, Andrew Morton wrote:

> On Fri, 20 May 2016 15:31:21 +0200 Michal Hocko <[hidden email]> wrote:
> > On Fri 20-05-16 15:19:12, Vlastimil Babka wrote:
> > > On 05/20/2016 03:06 PM, Michal Hocko wrote:
> > [...]
> > > > Why don't we need also to count also retries?
> > >
> > > We could, but not like you suggest.
> > >
> > > > ---
> > > > diff --git a/mm/migrate.c b/mm/migrate.c
> > > > index 53ab6398e7a2..ef9c5211ae3c 100644
> > > > --- a/mm/migrate.c
> > > > +++ b/mm/migrate.c
> > > > @@ -1190,9 +1190,9 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
> > > >   }
> > > >   }
> > > >   }
> > > > +out:
> > > >   nr_failed += retry;
> > > >   rc = nr_failed;
> > >
> > > This overwrites rc == -ENOMEM, which at least compaction needs to recognize.
> > > But we could duplicate "nr_failed += retry" in the case -ENOMEM.
> >
> > Right you are. So we should do
> > ---
> > diff --git a/mm/migrate.c b/mm/migrate.c
> > index 53ab6398e7a2..123fed94022b 100644
> > --- a/mm/migrate.c
> > +++ b/mm/migrate.c
> > @@ -1171,6 +1171,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
> >  
> >   switch(rc) {
> >   case -ENOMEM:
> > + nr_failed += retry + 1;
> >   goto out;
> >   case -EAGAIN:
> >   retry++;
> >
> >
>
> argh, this was lost.  Please resend as a real patch sometime?

It's not correct.  "retry" is reset to 0 each time around the
loop, and it's only a meaningful number to add on to nr_failed, in
the case when we've gone through the whole list: not in this "goto
out" case.  We could add another loop to count how many are left
when we hit -ENOMEM, and add that on to nr_failed; but I'm not
convinced that it's worth the bother.

Hugh
Reply | Threaded
Open this post in threaded view
|

Re: [patch] mm, migrate: increment fail count on ENOMEM

Michal Hocko-4
On Mon 23-05-16 16:32:56, Hugh Dickins wrote:

> On Mon, 23 May 2016, Andrew Morton wrote:
> > On Fri, 20 May 2016 15:31:21 +0200 Michal Hocko <[hidden email]> wrote:
> > > On Fri 20-05-16 15:19:12, Vlastimil Babka wrote:
> > > > On 05/20/2016 03:06 PM, Michal Hocko wrote:
> > > [...]
> > > > > Why don't we need also to count also retries?
> > > >
> > > > We could, but not like you suggest.
> > > >
> > > > > ---
> > > > > diff --git a/mm/migrate.c b/mm/migrate.c
> > > > > index 53ab6398e7a2..ef9c5211ae3c 100644
> > > > > --- a/mm/migrate.c
> > > > > +++ b/mm/migrate.c
> > > > > @@ -1190,9 +1190,9 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
> > > > >   }
> > > > >   }
> > > > >   }
> > > > > +out:
> > > > >   nr_failed += retry;
> > > > >   rc = nr_failed;
> > > >
> > > > This overwrites rc == -ENOMEM, which at least compaction needs to recognize.
> > > > But we could duplicate "nr_failed += retry" in the case -ENOMEM.
> > >
> > > Right you are. So we should do
> > > ---
> > > diff --git a/mm/migrate.c b/mm/migrate.c
> > > index 53ab6398e7a2..123fed94022b 100644
> > > --- a/mm/migrate.c
> > > +++ b/mm/migrate.c
> > > @@ -1171,6 +1171,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
> > >  
> > >   switch(rc) {
> > >   case -ENOMEM:
> > > + nr_failed += retry + 1;
> > >   goto out;
> > >   case -EAGAIN:
> > >   retry++;
> > >
> > >
> >
> > argh, this was lost.  Please resend as a real patch sometime?
>
> It's not correct.  "retry" is reset to 0 each time around the
> loop, and it's only a meaningful number to add on to nr_failed, in
> the case when we've gone through the whole list: not in this "goto
> out" case.

You are right! I've missed that, my bad.

> We could add another loop to count how many are left
> when we hit -ENOMEM, and add that on to nr_failed; but I'm not
> convinced that it's worth the bother.

Agreed!

Sorry about the noise!
--
Michal Hocko
SUSE Labs