[LTP] [RFC] enable OOM protection for the library and test process?

Li Wang liwang@redhat.com
Mon Dec 13 09:03:50 CET 2021


Hi All,

We have observed that the OOM tests occasionally end with TBROK (Test killed)
on systems with little RAM. The reason seems to be that the test process
(test_pid) gets killed earlier than the expected victim process, so the
status cannot be reported correctly.

I'm thinking that maybe we could purposely make the OOM killer ignore the
test process (test_pid) and the main process? (We would do this only in the
mem library, for the OOM tests.)

e.g.

Set oom_score_adj to -1000 for the test process (pid 305071) and the main process:

oom03:
main ---> tst_run_tcases --> ... --> fork_testrun
   (pid 305071)    testrun  --> run_tests --> ... --> testoom --> oom()
            (pid 305072)    child_alloc --> child_alloc_thread --> alloc_mem
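
A minimal sketch of how the protection could be applied, assuming it is done
by writing to /proc/<pid>/oom_score_adj. The helper names oom_adj_path() and
set_oom_score_adj() are hypothetical and are not existing LTP library
functions.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: build the procfs path of a task's oom_score_adj file.
 * pid == 0 means the calling process.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int oom_adj_path(pid_t pid, char *buf, size_t size)
{
	int n;

	if (pid == 0)
		n = snprintf(buf, size, "/proc/self/oom_score_adj");
	else
		n = snprintf(buf, size, "/proc/%ld/oom_score_adj", (long)pid);

	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/*
 * Hypothetical helper: write 'value' to the task's oom_score_adj.
 * The kernel accepts -1000..1000; -1000 exempts the task from OOM kills.
 * Returns 0 on success, -1 on failure with errno set by the failing call.
 */
static int set_oom_score_adj(pid_t pid, int value)
{
	char path[64];
	FILE *f;
	int ret = 0;

	if (value < -1000 || value > 1000) {
		errno = EINVAL;
		return -1;
	}

	if (oom_adj_path(pid, path, sizeof(path)))
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;

	if (fprintf(f, "%d", value) < 0)
		ret = -1;

	/* The write is buffered, so a rejected value shows up at fclose(). */
	if (fclose(f) != 0)
		ret = -1;

	return ret;
}
```

In this sketch the main process and test_pid would each call
set_oom_score_adj(0, -1000). Since oom_score_adj is inherited across fork(),
the allocating child (pid 305072 above) would have to call
set_oom_score_adj(0, 0) before it starts allocating, otherwise it would no
longer be a valid victim. Lowering the value needs CAP_SYS_RESOURCE, so this
assumes the test runs as root.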


=============

3 cmdline="oom03"
...
10 mem.c:218: TINFO: start normal OOM testing.
11 mem.c:140: TINFO: expected victim is 305072.

12 mem.c:39: TINFO: thread (7fe173d1a700), allocating 3221225472 bytes.
13 mem.c:39: TINFO: thread (7fe173d1a700), allocating 3221225472 bytes.

14 tst_test.c:1410: TINFO: If you are running on slow machine, try exporting LTP_TIMEOUT_MUL > 1
15 tst_test.c:1411: TBROK: Test killed! (timeout?)

==========

[ 1117.558867] Tasks state (memory values in pages):
[ 1117.559373] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[ 1117.560167] [ 305071]     0 305071     2215       31    61440        4             0 oom03
[ 1117.560889] [ 305072]     0 305072  1577128   259389 10326016  1019452             0 oom03
...

[ 1117.596510] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=/,mems_allowed=0,oom_memcg=/ltp/test-305071,task_memcg=/ltp/test-305071,task=oom03,pid=305071,uid=0

[ 1117.597963] Memory cgroup out of memory: Killed process 305071 (oom03) total-vm:8860kB, anon-rss:124kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:60kB oom_score_adj:0

=============

# free -h
              total        used        free      shared  buff/cache   available
Mem:          3.6Gi       270Mi       2.3Gi        18Mi       1.1Gi       3.3Gi
Swap:         4.0Gi          0B       4.0Gi

# lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              2
On-line CPU(s) list: 0,1
Thread(s) per core:  1
Core(s) per socket:  1
Socket(s):           2
NUMA node(s):        1


-- 
Regards,
Li Wang