[LTP] [PATCH v3] min_free_kbytes: Handle transient memory drops in check_monitor
Wei Gao
wegao@suse.com
Sun May 31 15:40:47 CEST 2026
High memory pressure can cause MemFree to temporarily drop below the
min_free_kbytes threshold before the kernel reclaimer can catch up.
This results in intermittent test failures, particularly observed on
openQA aarch64 machines where swap is exhausted.
Implement a grace period of up to 2 seconds in check_monitor(), polling
MemFree every 10 ms, to give the kernel time to reclaim memory. Also
shorten the interval between monitor checks from 2 s to 100 ms.
Introduce a 10% tolerance for the MemFree check, so the failure
threshold becomes 90% of min_free_kbytes. Our measurements showed that
under extreme pressure MemFree can take a long time to recover to the
full min_free_kbytes value, or may stay slightly below it. The tolerance
prevents false positives and avoids excessive wait times while still
ensuring that free memory stays near the required level.
Add diagnostics that report MemAvailable and the lowest MemFree value
seen during the grace period, to aid future calibration.
Signed-off-by: Wei Gao <wegao@suse.com>
---
v2->v3:
- Switched from an exponential backoff retry loop to a fixed 10 ms polling
  interval for up to 2 seconds. This gives finer resolution and a more
  predictable grace period for the kernel to reclaim memory.
- Introduced a 10% tolerance threshold (90% of min_free_kbytes). MemFree
  values that stay within this range after the grace period are now
  logged as info instead of failing the test, avoiding false positives
  on systems under extreme pressure.
- Added diagnostics: the test now reports MemAvailable, the minimum
  MemFree level seen during the grace period, and that minimum as a
  percentage of min_free_kbytes.
- Refined logging to clearly distinguish between recovery, staying
  within tolerance, and actual failures.
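
For reviewers, the grace-period and tolerance logic described above can be
sketched outside LTP roughly as follows. This is only an illustrative
sketch, not the patch code: classify_after_grace(), enum grace_result,
struct replay and replay_next() are hypothetical names, the reader callback
stands in for SAFE_READ_MEMINFO("MemFree:"), and the 10 ms sleep between
polls is left out so the logic can be exercised without waiting.

```c
#include <assert.h>
#include <stddef.h>

/* Outcome of one grace-period check (illustrative names, not LTP API). */
enum grace_result {
	GRACE_RECOVERED,	/* last reading reached tune or more */
	GRACE_TOLERATED,	/* last reading in [90% of tune, tune) */
	GRACE_FAILED,		/* last reading below 90% of tune */
};

/* Hypothetical callback standing in for a MemFree read, in kB. */
typedef unsigned long (*read_kb_fn)(void *ctx);

/* Test helper: replays canned readings and repeats the last one. */
struct replay {
	const unsigned long *vals;
	size_t len;
	size_t pos;
};

static unsigned long replay_next(void *ctx)
{
	struct replay *r = ctx;
	unsigned long v = r->vals[r->pos];

	if (r->pos + 1 < r->len)
		r->pos++;
	return v;
}

/*
 * Poll up to max_polls times, stopping early once a reading reaches tune.
 * A real caller would sleep 10 ms before each poll (e.g. usleep(10000)).
 * *min_seen receives the lowest reading, including first_reading.
 */
static enum grace_result classify_after_grace(unsigned long first_reading,
					      unsigned long tune,
					      int max_polls,
					      read_kb_fn read_kb, void *ctx,
					      unsigned long *min_seen)
{
	unsigned long memfree = first_reading;
	unsigned long threshold = tune * 9 / 10;	/* 10% tolerance */
	int i;

	assert(read_kb != NULL && min_seen != NULL);

	*min_seen = memfree;
	for (i = 0; i < max_polls && memfree < tune; i++) {
		memfree = read_kb(ctx);
		if (memfree < *min_seen)
			*min_seen = memfree;
	}

	if (memfree >= tune)
		return GRACE_RECOVERED;
	if (memfree >= threshold)
		return GRACE_TOLERATED;
	return GRACE_FAILED;
}
```

For example, with tune = 1000 kB, a first reading of 800 and later readings
of 850 and then 920, the helper returns GRACE_TOLERATED: 920 is at least
900 (90% of 1000) but still below 1000, and the minimum seen stays at 800.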
.../kernel/mem/tunable/min_free_kbytes.c | 49 ++++++++++++++++---
1 file changed, 42 insertions(+), 7 deletions(-)
diff --git a/testcases/kernel/mem/tunable/min_free_kbytes.c b/testcases/kernel/mem/tunable/min_free_kbytes.c
index 7882c6072..bd3821cf1 100644
--- a/testcases/kernel/mem/tunable/min_free_kbytes.c
+++ b/testcases/kernel/mem/tunable/min_free_kbytes.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
 /*
- * Copyright (c) Linux Test Project, 2012-2025
+ * Copyright (c) Linux Test Project, 2012-2026
  * Copyright (C) 2012-2017 Red Hat, Inc.
  */
@@ -177,20 +177,55 @@ static int eatup_mem(unsigned long overcommit_policy)
 static void check_monitor(void)
 {
-	unsigned long tune;
-	unsigned long memfree;
+	unsigned long tune, threshold;
+	unsigned long memfree, memavail, min_memfree;
+	int i, retry_count;
 	while (!end) {
 		memfree = SAFE_READ_MEMINFO("MemFree:");
 		tune = TST_SYS_CONF_LONG_GET(MIN_FREE_KBYTES);
+		/*
+		 * Allow 10% tolerance to account for transient states.
+		 */
+		threshold = tune * 9 / 10;
 		if (memfree < tune) {
-			tst_res(TINFO, "MemFree is %lu kB, "
-				"min_free_kbytes is %lu kB", memfree, tune);
-			tst_res(TFAIL, "MemFree < min_free_kbytes");
+			min_memfree = memfree;
+			retry_count = 0;
+			/*
+			 * Give it some time to reclaim. The kernel should keep
+			 * MemFree above min_free_kbytes, but transient drops
+			 * are possible under high pressure.
+			 * Check every 10ms for up to 2 seconds for high accuracy.
+			 */
+			for (i = 10; i <= 2000; i += 10) {
+				retry_count++;
+				usleep(10000);
+				memfree = SAFE_READ_MEMINFO("MemFree:");
+				if (memfree < min_memfree)
+					min_memfree = memfree;
+
+				if (memfree >= tune)
+					break;
+			}
+
+			memavail = SAFE_READ_MEMINFO("MemAvailable:");
+
+			if (memfree < threshold) {
+				tst_res(TINFO, "tune=%lu, threshold=%lu", tune, threshold);
+				tst_res(TINFO, "MemFree=%lu, MemAvailable=%lu, MinSeen=%lu (%lu%%)",
+					memfree, memavail, min_memfree, (min_memfree * 100 / tune));
+				tst_res(TFAIL, "MemFree < 90%% of min_free_kbytes after ~2s");
+			} else if (memfree < tune) {
+				tst_res(TINFO, "MemFree (%lu) stayed within 10%% tolerance (min %lu%%, avail %lu) after ~2s",
+					memfree, (min_memfree * 100 / tune), memavail);
+			} else {
+				tst_res(TINFO, "MemFree recovered to %lu (min %lu%%, avail %lu) after %d retries (~%d ms)",
+					memfree, (min_memfree * 100 / tune), memavail, retry_count, i);
+			}
 		}
-		sleep(2);
+		usleep(100000);
 	}
 }
--
2.54.0