[LTP] Issue faced in memcg_stat_rss while running mainline kernels between 6.7 and 6.8
Harshvardhan Jha
harshvardhan.j.jha@oracle.com
Tue Nov 19 09:25:30 CET 2024
Hi there,
I've been getting test failures on the memcg_stat_rss testcase for
mainline 6.12 kernels with 3 tests failing and one being broken.
Running tests.......
<<<test_start>>>
tag=memcg_stat_rss stime=1732003500
cmdline="memcg_stat_rss.sh"
contacts=""
analysis=exit
<<<test_output>>>
incrementing stop
memcg_stat_rss 1 TINFO: Running: memcg_stat_rss.sh
memcg_stat_rss 1 TINFO: Tested kernel: Linux harjha-ol9kdevltp
6.12.0-master.20241021.el9.v1.x86_64 #1 SMP PREEMPT_DYNAMIC Mon Oct 21
06:24:22 PDT 2024 x86_64 x86_64 x86_64 GNU/Linux
memcg_stat_rss 1 TINFO: Using
/tempdir/ltp-Y4AEUmKVIE/LTP_memcg_stat_rss.kEhD0QvvMw as tmpdir (xfs
filesystem)
memcg_stat_rss 1 TINFO: timeout per run is 0h 5m 0s
memcg_stat_rss 1 TINFO: set /sys/fs/cgroup/memory/memory.use_hierarchy
to 0 failed
memcg_stat_rss 1 TINFO: Setting shmmax
memcg_stat_rss 1 TINFO: Running memcg_process --mmap-anon -s 266240
memcg_stat_rss 1 TINFO: Warming up pid: 9367
memcg_stat_rss 1 TINFO: Process is still here after warm up: 9367
memcg_stat_rss 1 TFAIL: rss is 0, 266240 expected
memcg_stat_rss 2 TINFO: Running memcg_process --mmap-file -s 4096
memcg_stat_rss 2 TINFO: Warming up pid: 9383
memcg_stat_rss 2 TINFO: Process is still here after warm up: 9383
memcg_stat_rss 2 TPASS: rss is 0 as expected
memcg_stat_rss 3 TINFO: Running memcg_process --shm -k 3 -s 4096
memcg_stat_rss 3 TINFO: Warming up pid: 9446
memcg_stat_rss 3 TINFO: Process is still here after warm up: 9446
memcg_stat_rss 3 TPASS: rss is 0 as expected
memcg_stat_rss 4 TINFO: Running memcg_process --mmap-anon --mmap-file
--shm -s 266240
memcg_stat_rss 4 TINFO: Warming up pid: 9462
memcg_stat_rss 4 TINFO: Process is still here after warm up: 9462
memcg_stat_rss 4 TPASS: rss is 266240 as expected
memcg_stat_rss 5 TINFO: Running memcg_process --mmap-lock1 -s 266240
memcg_stat_rss 5 TINFO: Warming up pid: 9479
memcg_stat_rss 5 TINFO: Process is still here after warm up: 9479
memcg_stat_rss 5 TFAIL: rss is 0, 266240 expected
memcg_stat_rss 6 TINFO: Running memcg_process --mmap-anon -s 266240
memcg_stat_rss 6 TINFO: Warming up pid: 9495
memcg_stat_rss 6 TINFO: Process is still here after warm up: 9495
memcg_stat_rss 6 TFAIL: rss is 0, 266240 expected
memcg_stat_rss 6 TBROK: timed out on memory.usage_in_bytes 4096 266240
266240
/opt/ltp-20240930/testcases/bin/tst_test.sh: line 158: 9495
Killed memcg_process "$@" (wd:
/sys/fs/cgroup/memory/ltp/test-9308/ltp_9308)
Summary:
passed 3
failed 3
broken 1
skipped 0
warnings 0
<<<execution_status>>>
initiation_status="ok"
duration=17 termination_type=exited termination_id=3 corefile=no
cutime=13 cstime=58
<<<test_end>>>
INFO: ltp-pan reported some tests FAIL
LTP Version: 20240930
I'm not sure whether this failure is caused by the kernel or by the
testcase being outdated. I know that cgroup v2 is the default upstream
and cgroup v1 is now a legacy option, so this testcase is probably not
high on the priority list, but I wanted to verify this with you to be
sure. Please let me know whether the testcase is outdated or whether
this is in fact a valid kernel regression.
I bisected the memcg_stat_rss failure on mainline kernels. The range
narrowed down to between 6.7 and 6.8, and the bisect then isolated
this commit:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=7d7ef0a4686abe43cd76a141b340a348f45ecdf2
This commit is part of a 5-patch series, and I wasn't able to revert it
on 6.12 without hitting a series of conflicts.
So instead I checked out the SHA before the patch series,
4a3bfbd1699e2306731809d50d480634012ed4de, and the SHA after the patch
series, 7d7ef0a4686abe43cd76a141b340a348f45ecdf2, and ran the test on
each.
The machine had 32 GB of RAM and 4 CPUs.
The steps to reproduce this are:
#!/bin/bash
# Run after setting the default kernel to the desired one.
# Boot with cgroup v1 and SELinux disabled if that is not already the case.
if ! grep -q "unified_cgroup_hierarchy=0" /proc/cmdline; then
    sudo grubby --update-kernel DEFAULT --args="systemd.unified_cgroup_hierarchy=0"
    sudo grubby --update-kernel DEFAULT --args="systemd.legacy_systemd_cgroup_controller"
    sudo grubby --update-kernel DEFAULT --args selinux=0
    sudo sed -i "/^SELINUX=/s/=.*/=disabled/" /etc/selinux/config
    sudo reboot
fi
cd /opt/ltp
# Start from a clean temporary directory for LTP.
rm -rf /tempdir
mkdir /tempdir
./runltp -d /tempdir -s memcg_stat_rss
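For a manual cross-check outside LTP, the rss counter can be read
directly from the memory.stat file of the test cgroup. This is only a
minimal sketch: memcg_rss is a hypothetical helper, not part of LTP, and
it assumes the cgroup v1 memory.stat layout of one "name value" pair per
line.

```shell
#!/bin/bash
# memcg_rss: hypothetical helper, not part of LTP.
# Prints the value of the "rss" field from a cgroup v1 memory.stat file.
# $1: path to a memory.stat file (assumed format: "name value" per line)
memcg_rss() {
    # Match the field name exactly so that "rss_huge" or "total_rss"
    # are not picked up by mistake.
    awk '$1 == "rss" { print $2 }' "$1"
}

# Example (the cgroup path is taken from the log above and changes per run):
# memcg_rss /sys/fs/cgroup/memory/ltp/test-9308/ltp_9308/memory.stat
```

The value the test expects, 266240 bytes, is 65 pages of 4096 bytes,
which is the size passed to memcg_process with -s.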
The results obtained were:
Pre bisect culprit (4a3bfbd1699e2306731809d50d480634012ed4de):
<<<test_start>>>
tag=memcg_stat_rss stime=1731754078
cmdline="memcg_stat_rss.sh"
contacts=""
analysis=exit
<<<test_output>>>
incrementing stop
memcg_stat_rss 1 TINFO: Running: memcg_stat_rss.sh
memcg_stat_rss 1 TINFO: Tested kernel: Linux harjha-ol9kdevltp
6.7.0-masterpre.2024111.el9.rc1.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Nov 15
11:56:10 PST 2024 x86_64 x86_64 x86_64 GNU/Linux
memcg_stat_rss 1 TINFO: Using
/tempdir/ltp-SzE9ADK6MM/LTP_memcg_stat_rss.6op28sMXO2 as tmpdir (xfs
filesystem)
memcg_stat_rss 1 TINFO: timeout per run is 0h 5m 0s
memcg_stat_rss 1 TINFO: set /sys/fs/cgroup/memory/memory.use_hierarchy
to 0 failed
memcg_stat_rss 1 TINFO: Setting shmmax
memcg_stat_rss 1 TINFO: Running memcg_process --mmap-anon -s 266240
memcg_stat_rss 1 TINFO: Warming up pid: 34237
memcg_stat_rss 1 TINFO: Process is still here after warm up: 34237
memcg_stat_rss 1 TPASS: rss is 266240 as expected
memcg_stat_rss 1 TBROK: timed out on memory.usage_in_bytes 4096 266240
266240
/opt/ltp-20240930/testcases/bin/tst_test.sh: line 158: 34237
Killed memcg_process "$@" (wd:
/sys/fs/cgroup/memory/ltp/test-34180/ltp_34180)
Summary:
passed 1
failed 0
broken 1
skipped 0
warnings 0
<<<execution_status>>>
Post bisect culprit (7d7ef0a4686abe43cd76a141b340a348f45ecdf2):
<<<test_start>>>
tag=memcg_stat_rss stime=1731755339
cmdline="memcg_stat_rss.sh"
contacts=""
analysis=exit
<<<test_output>>>
incrementing stop
memcg_stat_rss 1 TINFO: Running: memcg_stat_rss.sh
memcg_stat_rss 1 TINFO: Tested kernel: Linux harjha-ol9kdevltp
6.7.0-masterpost.2024111.el9.rc1.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Nov
15 11:55:41 PST 2024 x86_64 x86_64 x86_64 GNU/Linux
memcg_stat_rss 1 TINFO: Using
/tempdir/ltp-G6cge4CkrR/LTP_memcg_stat_rss.1zrm6X02CO as tmpdir (xfs
filesystem)
memcg_stat_rss 1 TINFO: timeout per run is 0h 5m 0s
memcg_stat_rss 1 TINFO: set /sys/fs/cgroup/memory/memory.use_hierarchy
to 0 failed
memcg_stat_rss 1 TINFO: Setting shmmax
memcg_stat_rss 1 TINFO: Running memcg_process --mmap-anon -s 266240
memcg_stat_rss 1 TINFO: Warming up pid: 9083
memcg_stat_rss 1 TINFO: Process is still here after warm up: 9083
memcg_stat_rss 1 TFAIL: rss is 0, 266240 expected
memcg_stat_rss 1 TBROK: timed out on memory.usage_in_bytes 4096 266240
266240
/opt/ltp-20240930/testcases/bin/tst_test.sh: line 158: 9083
Killed memcg_process "$@" (wd:
/sys/fs/cgroup/memory/ltp/test-9024/ltp_9024)
Summary:
passed 0
failed 1
broken 1
skipped 0
warnings 0
<<<execution_status>>>
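Note that both runs end with the same TBROK, so neither exits cleanly;
what separates them is the TPASS or TFAIL verdict on rss. A minimal
sketch of a helper that classifies a saved log on that basis follows.
classify_rss_log is a hypothetical name, saving the runltp output to a
file is an assumption, and "skip" means that no rss verdict was found in
the log.

```shell
#!/bin/bash
# classify_rss_log: hypothetical helper, not part of LTP.
# Prints "bad" if any rss check failed, "good" if rss checks only passed,
# and "skip" if the log contains no rss verdict at all.
# $1: path to a file holding the saved test output (assumed to exist)
classify_rss_log() {
    if grep -q 'TFAIL: rss is' "$1"; then
        echo bad
    elif grep -q 'TPASS: rss is' "$1"; then
        echo good
    else
        echo skip
    fi
}

# Example: classify_rss_log /tmp/memcg_stat_rss.log
```

The TBROK line is deliberately ignored here, since it appears both
before and after the commit in question.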
Thanks & Regards,
Harshvardhan