mm/page_alloc: introduce vm.percpu_pagelist_high_fraction

This introduces a new sysctl vm.percpu_pagelist_high_fraction.  It is
similar to the old vm.percpu_pagelist_fraction.  The old sysctl increased
both pcp->batch and pcp->high with the higher pcp->high potentially
reducing zone->lock contention.  However, the higher pcp->batch value also
potentially increased allocation latency while the PCP was refilled.  This
sysctl only adjusts pcp->high so that zone->lock contention is potentially
reduced but allocation latency during a PCP refill remains the same.

  # grep -E "high:|batch" /proc/zoneinfo | tail -2
              high:  649
              batch: 63

  # sysctl vm.percpu_pagelist_high_fraction=8
  # grep -E "high:|batch" /proc/zoneinfo | tail -2
              high:  35071
              batch: 63

  # sysctl vm.percpu_pagelist_high_fraction=64
              high:  4383
              batch: 63

  # sysctl vm.percpu_pagelist_high_fraction=0
              high:  649
              batch: 63

[mgorman@techsingularity.net: fix documentation]
  Link: https://lkml.kernel.org/r/20210528151010.GQ30378@techsingularity.net

Link: https://lkml.kernel.org/r/20210525080119.5455-7-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This commit is contained in:
Mel Gorman
2021-06-28 19:42:24 -07:00
committed by Linus Torvalds
parent c49c2c47da
commit 74f4482209
4 changed files with 94 additions and 7 deletions

View File

@@ -64,6 +64,7 @@ Currently, these files are in /proc/sys/vm:
- overcommit_ratio
- page-cluster
- panic_on_oom
- percpu_pagelist_high_fraction
- stat_interval
- stat_refresh
- numa_stat
@@ -789,6 +790,26 @@ panic_on_oom=2+kdump gives you very strong tool to investigate
why oom happens. You can get snapshot.
percpu_pagelist_high_fraction
=============================
This is the fraction of pages in each zone that are can be stored to
per-cpu page lists. It is an upper boundary that is divided depending
on the number of online CPUs. The min value for this is 8 which means
that we do not allow more than 1/8th of pages in each zone to be stored
on per-cpu page lists. This entry only changes the value of hot per-cpu
page lists. A user can specify a number like 100 to allocate 1/100th of
each zone between per-cpu lists.
The batch value of each per-cpu page list remains the same regardless of
the value of the high fraction so allocation latencies are unaffected.
The initial value is zero. Kernel uses this value to set the high pcp->high
mark based on the low watermark for the zone and the number of local
online CPUs. If the user writes '0' to this sysctl, it will revert to
this default behavior.
stat_interval
=============