[LTP] [PATCH] Fix memcontrol04 test failures on PowerPC64 architecture.

Sachin Sant sachinp@linux.ibm.com
Thu Apr 23 16:24:45 CEST 2026



On 10/04/26 5:32 pm, Andrea Cervesato via ltp wrote:
> Hi Pavithra,
>
> please take a look at this review before proceeding. I think this
> patch is not needed, but at least you can try and see if other
> people will spot something useful.
>
> https://github.com/acerv/ltp-agent/actions/runs/24189485745
>
> memcontrol0[34] have a long story of sporadic failures related to
> the kernel async nature and memory management implementation, so
> I believe this is not gonna fix the test in the long run.
>
> But maybe I'm wrong. It's better if you send a v2 first, fixing
> the issues, then other people can properly review it.

Hi Pavithra,

The fix doesn't seem to address the real problem. This test does seem flaky.
Depending on how many iterations are executed, the fail count can vary.

I did some debugging on one of the systems I had access to.

The test fails with memory.current values much lower than expected:
   TFAIL: (A/B/C memory.current=6684672) ~= 34603008
   TFAIL: (A/B/D memory.current=5373952) ~= 17825792

It seems the child processes allocating pagecache were exiting
immediately after allocation (the parent reaps them right away via
tst_reap_children()), so the pagecache was freed before the test could
read the memory.current values.

A potential fix could look like this.

Modify alloc_pagecache_in_child() to keep the children alive during the test:
- Add a TEST_DONE checkpoint for child lifecycle coordination
- Child signals CHILD_IDLE after allocation and fsync
- Parent waits for the CHILD_IDLE checkpoint before proceeding
- Child then waits for TEST_DONE, keeping the memory allocated during the test

Related changes:
- Modify cleanup_sub_groups() to wake the waiting children before cleanup
- Change alloc_anon_in_child() to use SAFE_WAITPID() for its specific child
  (tst_reap_children() would block on the still-waiting pagecache children)
This should keep the pagecache allocated during the memory pressure
testing, allowing correct memory.current measurements.

Untested patch:

diff --git a/testcases/kernel/controllers/memcg/memcontrol04.c b/testcases/kernel/controllers/memcg/memcontrol04.c
index 715cc5bcd..d0188a1da 100644
--- a/testcases/kernel/controllers/memcg/memcontrol04.c
+++ b/testcases/kernel/controllers/memcg/memcontrol04.c
@@ -47,7 +47,8 @@ static struct tst_cg_group *leaf_cg[4];
  static int fd = -1;

  enum checkpoints {
-       CHILD_IDLE
+       CHILD_IDLE,
+       TEST_DONE,
  };

  enum trunk_cg {
@@ -67,6 +68,16 @@ static void cleanup_sub_groups(void)
  {
         size_t i;

+       for (i = ARRAY_SIZE(leaf_cg); i > 0; i--) {
+               if (!leaf_cg[i - 1])
+                       continue;
+
+               TST_CHECKPOINT_WAKE2(TEST_DONE,
+                                    ARRAY_SIZE(leaf_cg) - 1);
+               tst_reap_children();
+               break;
+       }
+
         for (i = ARRAY_SIZE(leaf_cg); i > 0; i--) {
                 if (!leaf_cg[i - 1])
                         continue;
@@ -88,7 +99,7 @@ static void alloc_anon_in_child(const struct tst_cg_group *const cg,
         const pid_t pid = SAFE_FORK();

         if (pid) {
-               tst_reap_children();
+               SAFE_WAITPID(pid, NULL, 0);
                 return;
         }

@@ -107,7 +118,7 @@ static void alloc_pagecache_in_child(const struct tst_cg_group *const cg,
         const pid_t pid = SAFE_FORK();

         if (pid) {
-               tst_reap_children();
+               TST_CHECKPOINT_WAIT(CHILD_IDLE);
                 return;
         }

@@ -117,6 +128,11 @@ static void alloc_pagecache_in_child(const struct tst_cg_group *const cg,
                 getpid(), tst_cg_group_name(cg), size);
         alloc_pagecache(fd, size);

+       SAFE_FSYNC(fd);
+
+       TST_CHECKPOINT_WAKE(CHILD_IDLE);
+       TST_CHECKPOINT_WAIT(TEST_DONE);
+
         exit(0);
  }


-- 
Thanks
- Sachin



