[LTP] [PATCH v4 4/5] controllers/memcg: increase memory limit in subgroup charge
Krzysztof Kozlowski
krzysztof.kozlowski@canonical.com
Tue Jul 13 11:48:09 CEST 2021
On 13/07/2021 11:40, Richard Palethorpe wrote:
> Hello Krzysztof,
>
> Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> writes:
>
>> The memcg_subgroup_charge test was failing on kernel v5.8 in around 10%
>> of cases with:
>>
>> memcg_subgroup_charge 1 TINFO: Running memcg_process --mmap-anon -s 135168
>> memcg_subgroup_charge 1 TINFO: Warming up pid: 19289
>> memcg_subgroup_charge 1 TINFO: Process is still here after warm up: 19289
>> memcg_subgroup_charge 1 TFAIL: rss is 0, 135168 expected
>> memcg_subgroup_charge 1 TPASS: rss is 0 as expected
>>
>> In dmesg one could see that the OOM killer killed the process even
>> though the group memory limit matched the usage:
>>
>> memcg_process invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0
>> CPU: 4 PID: 19289 Comm: memcg_process Not tainted 5.8.0-1031-oracle #32~20.04.2-Ubuntu
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.4.1 12/03/2020
>> ...
>> memory: usage 132kB, limit 132kB, failcnt 9
>> memory+swap: usage 132kB, limit 9007199254740988kB, failcnt 0
>> kmem: usage 4kB, limit 9007199254740988kB, failcnt 0
>> ...
>> Tasks state (memory values in pages):
>> [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
>> [ 19289] 0 19289 669 389 40960 0 0 memcg_process
>> oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=/,mems_allowed=0,oom_memcg=/ltp_19257,task_memcg=/ltp_19257,task=memcg_process,pid=19289,uid=0
>> Memory cgroup out of memory: Killed process 19289 (memcg_process) total-vm:2676kB, anon-rss:84kB, file-rss:1468kB, shmem-rss:4kB, UID:0 pgtables:40kB oom_score_adj:0
>> oom_reaper: reaped process 19289 (memcg_process), now anon-rss:0kB, file-rss:0kB, shmem-rss:4kB
>>
>> It seems the actual kernel memory usage of a given group depends on the
>> number of CPUs: at least one page is allocated per CPU in addition to
>> the regular (expected) allocation. Fix the test on machines with more
>> CPUs by including this per-CPU memory in the group limits, plus a slack
>> of 4 pages. Also increase the memory allocation from 32 to 64 pages to
>> make it more distinct from the kernel per-CPU memory.
>
> Actually I think it is 66 pages? Because PAGESIZES=pagesize*33.
>
Yes, right, it is 66 pages. Maybe this could be fixed when applying the
patch, so there is no need to resend.
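
For reference, a rough sketch of the arithmetic discussed here. This is
not the code from the patch: memcg_limit_bytes is a hypothetical helper,
and the formula (allocation plus one page per CPU plus slack) is only an
assumption based on the commit message.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the LTP patch.
# Usage: memcg_limit_bytes PAGESIZE ALLOC_PAGES NCPUS SLACK_PAGES
# Assumes the kernel charges roughly one extra page per CPU to the group.
memcg_limit_bytes()
{
    pagesize=$1
    alloc_pages=$2
    ncpus=$3
    slack_pages=$4

    # Limit = expected allocation + per-CPU pages + slack, in bytes.
    echo $(( (alloc_pages + ncpus + slack_pages) * pagesize ))
}

# Example: 4 KiB pages, 66-page allocation, 4 CPUs, 4 pages of slack.
memcg_limit_bytes 4096 66 4 4
```

With 4 KiB pages, the 135168 bytes from the failing log correspond to 33
pages and no per-CPU allowance, which matches the "limit 132kB" line in
dmesg.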

Best regards,
Krzysztof