[LTP] [PATCH v3] lib: memutils: respect minimum memory watermark when polluting memory
Krzysztof Kozlowski
krzysztof.kozlowski@canonical.com
Thu Oct 21 09:55:20 CEST 2021
On 21/10/2021 09:36, Krzysztof Kozlowski wrote:
> On 21/10/2021 09:21, Li Wang wrote:
>>
>>
>> On Wed, Oct 20, 2021 at 5:14 PM Krzysztof Kozlowski
>> <krzysztof.kozlowski@canonical.com> wrote:
>>
>> Previous fix for an out-of-memory killer killing ioctl_sg01 process
>> in commit 4d2e3d44fad5 ("lib: memutils: don't pollute
>> entire system memory to avoid OoM") was not fully effective. While it
>> covers most of the cases, an ARM64 machine with 128 GB of memory, 64 kB
>> page size and v5.11 kernel hit it again. Polluting the memory fails
>> with OoM:
>>
>> LTP: starting ioctl_sg01
>> ioctl_sg01 invoked oom-killer:
>> gfp_mask=0x100dca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0,
>> oom_score_adj=0
>> ...
>> Mem-Info:
>> active_anon:309 inactive_anon:1964781 isolated_anon:0
>> active_file:94 inactive_file:0 isolated_file:0
>> unevictable:305 dirty:0 writeback:0
>> slab_reclaimable:1510 slab_unreclaimable:5012
>> mapped:115 shmem:339 pagetables:463 bounce:0
>> free:112043 free_pcp:1 free_cma:3159
>> Node 0 active_anon:19776kB inactive_anon:125745984kB
>> active_file:6016kB inactive_file:0kB unevictable:19520kB ...
>> Node 0 DMA free:710656kB min:205120kB low:256384kB high:307648kB
>> reserved_highatomic:0KB active_anon:0kB inactive_anon:3332032kB ...
>> lowmem_reserve[]: 0 0 7908 7908 7908
>> Node 0 Normal free:6460096kB min:6463168kB low:8078912kB
>> high:9694656kB reserved_highatomic:0KB active_anon:19776kB
>> inactive_anon:122413952kB ...
>> lowmem_reserve[]: 0 0 0 0 0
>>
>> The important details are the memory figures for Node 0 in the Normal zone:
>> 1. free memory: 6460096 kB
>> 2. min (minimum watermark): 6463168 kB
>>
>> Parse the /proc/sys/vm/min_free_kbytes which contains the free
>> memory level used as minimum watermark (triggering OoM killer).
>>
>> Signed-off-by: Krzysztof Kozlowski
>> <krzysztof.kozlowski@canonical.com>
>>
>> ---
>>
>> Changes since v2:
>> 1. Use /proc/sys/vm/min_free_kbytes instead of parsing zoneinfo, thanks
>> to Liu Xinpeng.
>>
>> Changes since v1:
>> 1. Add static and rename to count_min_pages().
>> ---
>> lib/tst_memutils.c | 8 +++++++-
>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/lib/tst_memutils.c b/lib/tst_memutils.c
>> index af132bcc6c24..df53c542d239 100644
>> --- a/lib/tst_memutils.c
>> +++ b/lib/tst_memutils.c
>> @@ -16,12 +16,18 @@
>> void tst_pollute_memory(size_t maxsize, int fillchar)
>> {
>> size_t i, map_count = 0, safety = 0, blocksize = BLOCKSIZE;
>> + unsigned long min_free;
>> void **map_blocks;
>> struct sysinfo info;
>>
>> + SAFE_FILE_SCANF("/proc/sys/vm/min_free_kbytes", "%lu",
>> &min_free);
>> + min_free *= 1024;
>> + /* Apply a margin because we cannot get below "min" watermark */
>> + min_free += min_free / 10;
>> +
>> SAFE_SYSINFO(&info);
>> safety = MAX(4096 * SAFE_SYSCONF(_SC_PAGESIZE), 128 * 1024 *
>> 1024);
>> - safety = MAX(safety, (info.freeram / 64));
>> + safety = MAX(safety, min_free);
>>
>>
>> Theoretically this is correct, and I believe it will work on the machine
>> you tested.
>>
>> But my concern is that ioctl_sg01 will still fail on unusual systems where
>> MemAvailable < MemFree.
>>
>> Just like the one Xinpeng mentioned before:
>> https://lists.linux.it/pipermail/ltp/2021-January/020817.html
>>
>> [root@test-env-nm05-compute-14e5e72e38 ~]# cat /proc/meminfo
>>
>> MemTotal: 526997420 kB
>> MemFree: 520224908 kB
>> MemAvailable: 519936744 kB
>> ...
>>
>> [root@test-env-nm05-compute-14e5e72e38 ~]# cat /proc/sys/vm/min_free_kbytes
>> 90112
>>
>>
>> Even with the safety margin raised to 128 MB, it is still smaller than the
>> gap between MemFree and MemAvailable:
>>
>> MemFree (520224908 kB) - MemAvailable (519936744 kB) = 288164 kB > safety
>
> I don't have such case and I am not sure it is reasonable.
>
> As mentioned in the thread there it looks unusual to have less available
> memory than free. Maybe the system has some weird memory accounting
> because MemAvailable is counted from MemFree by adding memory which can
> be reclaimed. When adding a non-negative number, you should not end up
> with lower MemAvailable than MemFree. :)
>
> Maybe that's the reason why that patch was not accepted - the system is
> not vanilla, not common?

OK, I found a possible explanation (on a vanilla kernel): totalreserve_pages.
This is the only value subtracted from free memory when computing
MemAvailable, so MemAvailable can end up below MemFree if someone sets the
lowmem_reserve_ratio or min_free_kbytes sysctl.

A change to min_free_kbytes is reflected in
/proc/sys/vm/min_free_kbytes, so that case is covered.

A change to vm.lowmem_reserve_ratio is missed by my patch, and
MemAvailable could then be lower than MemFree.

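For illustration only (this is not the v4 patch), here is a minimal C sketch of one way to cover the lowmem_reserve_ratio case: widen the safety margin to at least the MemFree - MemAvailable gap reported in /proc/meminfo. The helpers meminfo_field_kb() and safety_kb() are hypothetical names, not LTP API. The 10% margin and the 128 MB floor are taken from the v3 diff quoted above; the page-size based floor from that diff is left out for brevity.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: find "<key>: <value> kB" in /proc/meminfo-style
 * text and store the value (in kB) in *out.
 * Returns 0 on success, -1 if the key is missing or malformed.
 */
static int meminfo_field_kb(const char *text, const char *key,
			    unsigned long *out)
{
	size_t klen = strlen(key);
	const char *line = text;

	while (line && *line) {
		/* Require the colon so "MemFree" does not match a longer key */
		if (!strncmp(line, key, klen) && line[klen] == ':')
			return sscanf(line + klen + 1, "%lu", out) == 1 ? 0 : -1;

		/* Advance to the start of the next line, if any */
		line = strchr(line, '\n');
		if (line)
			line++;
	}

	return -1;
}

/*
 * Hypothetical helper: safety margin in kB that should be left free.
 * Takes the largest of:
 *   - a fixed 128 MB floor (as in the v3 diff),
 *   - min_free_kbytes plus a 10% margin (as in the v3 diff),
 *   - the MemFree - MemAvailable gap (assumption of this sketch, meant
 *     to cover reserves such as lowmem_reserve_ratio).
 */
static unsigned long safety_kb(unsigned long mem_free_kb,
			       unsigned long mem_avail_kb,
			       unsigned long min_free_kb)
{
	unsigned long safety = 128UL * 1024;	/* 128 MB expressed in kB */
	unsigned long min_margin = min_free_kb + min_free_kb / 10;

	if (min_margin > safety)
		safety = min_margin;

	/* Only relevant on systems where MemAvailable < MemFree */
	if (mem_free_kb > mem_avail_kb &&
	    mem_free_kb - mem_avail_kb > safety)
		safety = mem_free_kb - mem_avail_kb;

	return safety;
}
```

With the values Li Wang quoted (MemFree 520224908 kB, MemAvailable 519936744 kB, min_free_kbytes 90112), safety_kb() returns 288164 kB, that is, the gap itself rather than the 128 MB floor.
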
I'll send a v4.
Best regards,
Krzysztof