[LTP] [PATCH v4] min_free_kbytes: Handle transient memory drops in check_monitor

Wei Gao wegao@suse.com
Tue Jun 2 03:00:42 CEST 2026


High memory pressure can cause MemFree to temporarily drop below the
min_free_kbytes threshold before the kernel reclaimer can catch up.
This results in intermittent test failures, observed on openQA aarch64
virtual machines.

Implement a 2-second grace period in check_monitor() that polls MemFree
at a fixed 10ms interval, giving the kernel time to reclaim memory.
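
For readers who want to try the grace-period logic outside LTP, the
sketch below mirrors the loop shape described above. It is not the patch
code: memfree_reader, struct recovery, struct sample_feed, feed_read()
and demo_recovery() are hypothetical names introduced only for
illustration, and the canned samples stand in for the
SAFE_READ_MEMINFO("MemFree:") calls the real test makes. The reported
time counts nominal sleep steps only, as the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical reader callback; the patch reads MemFree via SAFE_READ_MEMINFO(). */
typedef unsigned long (*memfree_reader)(void *ctx);

struct recovery {
	int elapsed_ms;            /* nominal time until MemFree >= tune, -1 if never */
	unsigned long memfree;     /* last sample taken */
	unsigned long min_memfree; /* lowest sample seen, including the initial one */
};

/* Poll every step_ms for at most grace_ms, stopping once MemFree >= tune. */
static struct recovery wait_for_recovery(memfree_reader read_memfree, void *ctx,
					 unsigned long initial, unsigned long tune,
					 unsigned int step_ms, unsigned int grace_ms)
{
	struct recovery r = { -1, initial, initial };
	unsigned int elapsed;

	assert(step_ms > 0); /* a zero step would never advance the loop */

	for (elapsed = step_ms; elapsed <= grace_ms; elapsed += step_ms) {
		usleep(step_ms * 1000);
		r.memfree = read_memfree(ctx);
		if (r.memfree < r.min_memfree)
			r.min_memfree = r.memfree;
		if (r.memfree >= tune) {
			r.elapsed_ms = (int)elapsed;
			break;
		}
	}
	return r;
}

/* Canned samples standing in for /proc/meminfo reads (illustration only). */
struct sample_feed {
	const unsigned long *samples;
	size_t count; /* must be at least 1 */
	size_t next;
};

static unsigned long feed_read(void *ctx)
{
	struct sample_feed *f = ctx;
	/* Repeat the last sample once the feed is exhausted. */
	size_t idx = f->next < f->count ? f->next++ : f->count - 1;

	return f->samples[idx];
}

/* Dip from 600 to 400 kB, then recover to a 1000 kB target, 1ms steps. */
static struct recovery demo_recovery(void)
{
	static const unsigned long samples[] = { 500, 400, 950, 1000 };
	struct sample_feed feed = { samples, 4, 0 };

	return wait_for_recovery(feed_read, &feed, 600, 1000, 1, 20);
}
```

With the patch's parameters the call would use step_ms = 10 and
grace_ms = 2000; the demo uses 1ms steps only to keep it quick.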

Introduce a 10% tolerance (90% threshold) for the MemFree check. My
measurements showed that under extreme pressure, MemFree can dip to
roughly 50-70% of the target. While it typically recovers above 90%
within one second, reaching the full 100% watermark can sometimes take
significantly longer. This tolerance prevents false positives during the
slow recovery tail while still ensuring memory is maintained near the
required level.
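
To make the arithmetic concrete, here is a minimal standalone sketch of
the three outcomes after the grace period. The integer expression
tune * 9 / 10 is taken from the patch; the enum and the helper names
(tolerance_threshold, percent_of, classify) are hypothetical and exist
only for this illustration.

```c
#include <assert.h>

enum verdict { RECOVERED, WITHIN_TOLERANCE, BELOW_TOLERANCE };

/* 90% of the tunable, using the same integer arithmetic as the patch. */
static unsigned long tolerance_threshold(unsigned long tune)
{
	return tune * 9 / 10;
}

/* Express a MemFree value as a truncated percentage of the tunable. */
static unsigned long percent_of(unsigned long value, unsigned long tune)
{
	assert(tune > 0); /* avoid dividing by zero */
	return value * 100 / tune;
}

/* Map the final MemFree sample to fail / tolerated / recovered. */
static enum verdict classify(unsigned long memfree, unsigned long tune)
{
	if (memfree < tolerance_threshold(tune))
		return BELOW_TOLERANCE;  /* reported as TFAIL in the patch */
	if (memfree < tune)
		return WITHIN_TOLERANCE; /* reported as TINFO in the patch */
	return RECOVERED;
}
```

For example, with min_free_kbytes at 67584 kB the threshold truncates to
60825 kB, so a final MemFree of 60825 kB is tolerated while 60824 kB
fails.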

Also, shorten the monitor's idle polling interval from 2s to 100ms
to improve responsiveness during the test run.

Enhanced diagnostics are added to report the minimum memory level seen
during the pressure period to aid in future calibration.

Signed-off-by: Wei Gao <wegao@suse.com>
---
v3->v4:
- Remove redundant 'retry_count' variable.
- Remove 'MemAvailable' from diagnostics.
- Provide quantitative justification for the 10% tolerance (measured dips to ~50%).

 .../kernel/mem/tunable/min_free_kbytes.c      | 43 ++++++++++++++++---
 1 file changed, 36 insertions(+), 7 deletions(-)

diff --git a/testcases/kernel/mem/tunable/min_free_kbytes.c b/testcases/kernel/mem/tunable/min_free_kbytes.c
index 7882c6072..6b6d009dd 100644
--- a/testcases/kernel/mem/tunable/min_free_kbytes.c
+++ b/testcases/kernel/mem/tunable/min_free_kbytes.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
 /*
- * Copyright (c) Linux Test Project, 2012-2025
+ * Copyright (c) Linux Test Project, 2012-2026
  * Copyright (C) 2012-2017  Red Hat, Inc.
  */
 
@@ -177,20 +177,49 @@ static int eatup_mem(unsigned long overcommit_policy)
 
 static void check_monitor(void)
 {
-	unsigned long tune;
-	unsigned long memfree;
+	unsigned long tune, threshold;
+	unsigned long memfree, min_memfree;
+	int i;
 
 	while (!end) {
 		memfree = SAFE_READ_MEMINFO("MemFree:");
 		tune = TST_SYS_CONF_LONG_GET(MIN_FREE_KBYTES);
+		/*
+		 * Allow 10% tolerance to account for transient states.
+		 */
+		threshold = tune * 9 / 10;
 
 		if (memfree < tune) {
-			tst_res(TINFO, "MemFree is %lu kB, "
-				 "min_free_kbytes is %lu kB", memfree, tune);
-			tst_res(TFAIL, "MemFree < min_free_kbytes");
+			min_memfree = memfree;
+			/*
+			 * Give it some time to reclaim. The kernel should keep
+			 * MemFree above min_free_kbytes, but transient drops
+			 * are possible under high pressure.
+			 * Check every 10ms for up to 2 seconds for high accuracy.
+			 */
+			for (i = 10; i <= 2000; i += 10) {
+				usleep(10000);
+				memfree = SAFE_READ_MEMINFO("MemFree:");
+				if (memfree < min_memfree)
+					min_memfree = memfree;
+
+				if (memfree >= tune)
+					break;
+			}
+
+			if (memfree < threshold) {
+				tst_res(TFAIL, "MemFree %lu < 90%% of min_free_kbytes %lu (MinSeen: %lu%%) after 2s",
+					memfree, tune, (min_memfree * 100 / tune));
+			} else if (memfree < tune) {
+				tst_res(TINFO, "MemFree (%lu) stayed within 10%% tolerance (min %lu%%) after ~2s",
+					memfree, (min_memfree * 100 / tune));
+			} else {
+				tst_res(TINFO, "MemFree recovered to %lu (min %lu%%) after %d ms",
+					memfree, (min_memfree * 100 / tune), i);
+			}
 		}
 
-		sleep(2);
+		usleep(100000);
 	}
 }
 
-- 
2.54.0


