[LTP] [PATCH v3] lib: memutils: respect minimum memory watermark when polluting memory

Krzysztof Kozlowski krzysztof.kozlowski@canonical.com
Thu Oct 21 09:55:20 CEST 2021


On 21/10/2021 09:36, Krzysztof Kozlowski wrote:
> On 21/10/2021 09:21, Li Wang wrote:
>>
>>
>> On Wed, Oct 20, 2021 at 5:14 PM Krzysztof Kozlowski
>> <krzysztof.kozlowski@canonical.com> wrote:
>>
>>     Previous fix for an out-of-memory killer killing ioctl_sg01 process
>>     in commit 4d2e3d44fad5 ("lib: memutils: don't pollute
>>     entire system memory to avoid OoM") was not fully effective.  While it
>>     covers most of the cases, an ARM64 machine with 128 GB of memory, 64 kB
>>     page size and v5.11 kernel hit it again.  Polluting the memory fails
>>     with OoM:
>>
>>       LTP: starting ioctl_sg01
>>       ioctl_sg01 invoked oom-killer: gfp_mask=0x100dca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
>>       ...
>>       Mem-Info:
>>       active_anon:309 inactive_anon:1964781 isolated_anon:0
>>                       active_file:94 inactive_file:0 isolated_file:0
>>                       unevictable:305 dirty:0 writeback:0
>>                       slab_reclaimable:1510 slab_unreclaimable:5012
>>                       mapped:115 shmem:339 pagetables:463 bounce:0
>>                       free:112043 free_pcp:1 free_cma:3159
>>       Node 0 active_anon:19776kB inactive_anon:125745984kB active_file:6016kB inactive_file:0kB unevictable:19520kB ...
>>       Node 0 DMA free:710656kB min:205120kB low:256384kB high:307648kB reserved_highatomic:0KB active_anon:0kB inactive_anon:3332032kB ...
>>       lowmem_reserve[]: 0 0 7908 7908 7908
>>       Node 0 Normal free:6460096kB min:6463168kB low:8078912kB high:9694656kB reserved_highatomic:0KB active_anon:19776kB inactive_anon:122413952kB ...
>>       lowmem_reserve[]: 0 0 0 0 0
>>
>>     The important parts are the details of memory on Node 0 in the Normal zone:
>>     1. free memory: 6460096 kB
>>     2. min (minimum watermark): 6463168 kB
>>
>>     Parse /proc/sys/vm/min_free_kbytes, which contains the amount of free
>>     memory used as the minimum watermark (below which the OoM killer is
>>     triggered), and use it as a lower bound for the safety reserve.
>>
>>     Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
>>
>>     ---
>>
>>     Changes since v2:
>>     1. Use /proc/sys/vm/min_free_kbytes instead of parsing zoneinfo, thanks
>>        to Liu Xinpeng.
>>
>>     Changes since v1:
>>     1. Add static and rename to count_min_pages().
>>     ---
>>      lib/tst_memutils.c | 8 +++++++-
>>      1 file changed, 7 insertions(+), 1 deletion(-)
>>
>>     diff --git a/lib/tst_memutils.c b/lib/tst_memutils.c
>>     index af132bcc6c24..df53c542d239 100644
>>     --- a/lib/tst_memutils.c
>>     +++ b/lib/tst_memutils.c
>>     @@ -16,12 +16,18 @@
>>      void tst_pollute_memory(size_t maxsize, int fillchar)
>>      {
>>             size_t i, map_count = 0, safety = 0, blocksize = BLOCKSIZE;
>>     +       unsigned long min_free;
>>             void **map_blocks;
>>             struct sysinfo info;
>>
>>     +       SAFE_FILE_SCANF("/proc/sys/vm/min_free_kbytes", "%lu", &min_free);
>>     +       min_free *= 1024;
>>     +       /* Apply a margin because we cannot get below "min" watermark */
>>     +       min_free += min_free / 10;
>>     +
>>             SAFE_SYSINFO(&info);
>>             safety = MAX(4096 * SAFE_SYSCONF(_SC_PAGESIZE), 128 * 1024 * 1024);
>>     -       safety = MAX(safety, (info.freeram / 64));
>>     +       safety = MAX(safety, min_free);
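
[As a rough standalone illustration of what the patch computes, here is a sketch using plain libc in place of LTP's SAFE_FILE_SCANF/SAFE_SYSINFO helpers; this is editorial illustration only, not part of the patch:]

    /*
     * Illustration of the patch's calculation: read vm.min_free_kbytes,
     * convert to bytes, add a 10% margin, and use it as a floor for the
     * safety reserve kept free while polluting memory.
     */
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/sysinfo.h>

    #define MAX(a, b) ((a) > (b) ? (a) : (b))

    int main(void)
    {
        unsigned long min_free = 0;
        size_t safety;
        struct sysinfo info;
        FILE *f = fopen("/proc/sys/vm/min_free_kbytes", "r");

        if (!f || fscanf(f, "%lu", &min_free) != 1) {
            perror("min_free_kbytes");
            return 1;
        }
        fclose(f);

        min_free *= 1024;              /* kB -> bytes */
        min_free += min_free / 10;     /* stay 10% above the "min" watermark */

        sysinfo(&info);
        safety = MAX((size_t)4096 * sysconf(_SC_PAGESIZE), 128 * 1024 * 1024);
        safety = MAX(safety, min_free);

        printf("free: %lu bytes, safety reserve: %zu bytes\n",
               info.freeram * info.mem_unit, safety);
        return 0;
    }
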
>>
>>
>> Theoretically this is correct, and I believe it will work on the machine
>> you tested.
>>
>> But my concern is that ioctl_sg01 may still fail on a special system where
>> MemAvailable < MemFree.
>>
>> Just like the one Xinpeng mentioned before:
>> https://lists.linux.it/pipermail/ltp/2021-January/020817.html
>>
>> [root@test-env-nm05-compute-14e5e72e38 ~]# cat /proc/meminfo
>>
>> MemTotal:       526997420 kB
>> MemFree:        520224908 kB
>> MemAvailable:   519936744 kB
>> ...
>>
>> [root@test-env-nm05-compute-14e5e72e38 ~]# cat /proc/sys/vm/min_free_kbytes
>> 90112
>>
>>
>> There, even reserving 128 MB as the safety margin is still less than the
>> gap between MemFree and MemAvailable:
>>
>> MemFree (520224908 kB) - MemAvailable (519936744 kB) = 288164 kB > safety (131072 kB)
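
[Plugging in the numbers above, a small sketch (values hard-coded from the /proc/meminfo dump quoted earlier) of why the 128 MB floor is not enough on that system:]

    /* The MemFree - MemAvailable gap exceeds the 128 MB safety floor, so
     * polluting memory down to "safety" bytes can still dip into reserves. */
    #include <stdio.h>

    int main(void)
    {
        unsigned long memfree_kb = 520224908;  /* MemFree from /proc/meminfo */
        unsigned long memavail_kb = 519936744; /* MemAvailable */
        unsigned long safety_kb = 128 * 1024;  /* 128 MB floor, in kB */
        unsigned long gap_kb = memfree_kb - memavail_kb;

        printf("gap = %lu kB, safety = %lu kB -> %s\n", gap_kb, safety_kb,
               gap_kb > safety_kb ? "OoM still possible" : "covered");
        return 0;
    }
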
> 
> I don't have such a case and I am not sure it is a reasonable setup.
> 
> As mentioned in that thread, it looks unusual to have less available
> memory than free memory. Maybe the system has some weird memory accounting,
> because MemAvailable is counted from MemFree by adding memory which can
> be reclaimed. When adding a non-negative number, you should not end up
> with a lower MemAvailable than MemFree. :)
> 
> Maybe that's the reason why that patch was not accepted - the system is
> not vanilla, not common?

OK, I found a possible explanation (on a vanilla kernel) - the
totalreserve_pages. This is the only value subtracted from free memory when
counting available memory. This could happen if someone was setting the
sysctl lowmem_reserve_ratio or min_free_kbytes.

When setting min_free_kbytes, this will be reflected in
/proc/sys/vm/min_free_kbytes, so we are good.

When setting vm.lowmem_reserve_ratio, this will be missed by my patch
and MemAvailable could be lower than MemFree.
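
[One hypothetical direction (a sketch only, not necessarily what v4 will do) is to size the reserve from the MemFree - MemAvailable gap itself, which on a mostly idle system roughly equals totalreserve_pages and therefore also covers a tuned vm.lowmem_reserve_ratio:]

    /* Hypothetical: derive the reserve from the MemFree - MemAvailable gap,
     * which reflects totalreserve_pages (min watermarks plus lowmem
     * reserves), so tuned vm.lowmem_reserve_ratio is covered as well. */
    #include <stdio.h>
    #include <string.h>

    static unsigned long meminfo_kb(const char *field)
    {
        char line[256];
        unsigned long val = 0;
        FILE *f = fopen("/proc/meminfo", "r");

        if (!f)
            return 0;
        while (fgets(line, sizeof(line), f)) {
            if (!strncmp(line, field, strlen(field))) {
                sscanf(line + strlen(field), "%lu", &val);
                break;
            }
        }
        fclose(f);
        return val;
    }

    int main(void)
    {
        unsigned long memfree = meminfo_kb("MemFree:");
        unsigned long memavail = meminfo_kb("MemAvailable:");
        unsigned long reserve_kb = memfree > memavail ? memfree - memavail : 0;

        printf("keep at least %lu kB free before polluting memory\n",
               reserve_kb);
        return 0;
    }

[The downside of such an approach is that MemAvailable is itself an estimate, so some extra margin would still be needed on top of the gap.]
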

I'll send a v4.

Best regards,
Krzysztof

