[LTP] [RFC PATCH] madvise06: shrink to 1 MADV_WILLNEED page to stabilize the test
Richard Palethorpe
rpalethorpe@suse.de
Thu Jun 16 09:21:11 CEST 2022
Hello Li,
Li Wang <liwang@redhat.com> writes:
> Paul Bunyan reports that the madvise06 test fails intermittently on many
> LTS kernels. After checking with the mm developers, we think this is
> more likely a test issue than a kernel bug:
>
> madvise06.c:231: TFAIL: 4 pages were faulted out of 2 max
>
> This patch aims to reduce these false positives in three ways:
>
> 1. Add a while-loop to give the asynchronous read-ahead started by
> madvise_willneed() more time to complete
> 2. Raise the value of `loops` so the test waits longer if the swapcache
> has not yet reached the expected size
> 3. Shrink the MADV_WILLNEED verification to only 1 page so that the
> hint takes effect more easily
>
> From Rafael Aquini:
>
> The problem here is that MADV_WILLNEED is an asynchronous non-blocking
> hint, which tells the kernel to start doing read-ahead work for the
> hinted memory chunk, but does not wait for the read-ahead to finish.
> So it is possible that when the dirty_pages() call starts re-dirtying
> the pages in the target area, it is racing against a scheduled swap-in
> read-ahead that hasn't finished yet. Expecting only 2 faulted pages
> out of 102400 also seems too strict for a PASS threshold.
>
> Note:
> As Rafael suggested, another possible approach to tackle this failure
> is to tally up the faults and loosen the threshold to more than 2 major
> faults after a call to madvise() with MADV_WILLNEED.
> But in my testing the number of faulted pages varies significantly
> across platforms, so I didn't take this approach.
>
> Btw, this patch passed more than 1000 times on my two systems where
> the failure is easily reproducible.
>
> Signed-off-by: Li Wang <liwang@redhat.com>
> Cc: Rafael Aquini <aquini@redhat.com>
> Cc: Paul Bunyan <pbunyan@redhat.com>
> Cc: Richard Palethorpe <rpalethorpe@suse.com>
> ---
> testcases/kernel/syscalls/madvise/madvise06.c | 21 +++++++++++++------
> 1 file changed, 15 insertions(+), 6 deletions(-)
>
> diff --git a/testcases/kernel/syscalls/madvise/madvise06.c b/testcases/kernel/syscalls/madvise/madvise06.c
> index 6d218801c..bfca894f4 100644
> --- a/testcases/kernel/syscalls/madvise/madvise06.c
> +++ b/testcases/kernel/syscalls/madvise/madvise06.c
> @@ -164,7 +164,7 @@ static int get_page_fault_num(void)
>
> static void test_advice_willneed(void)
> {
> - int loops = 50, res;
> + int loops = 100, res;
> char *target;
> long swapcached_start, swapcached;
> int page_fault_num_1, page_fault_num_2;
> @@ -202,23 +202,32 @@ static void test_advice_willneed(void)
> "%s than %ld Kb were moved to the swap cache",
> res ? "more" : "less", PASS_THRESHOLD_KB);
>
> -
> - TEST(madvise(target, PASS_THRESHOLD, MADV_WILLNEED));
> + loops = 100;
> + SAFE_FILE_LINES_SCANF("/proc/meminfo", "SwapCached: %ld", &swapcached_start);
> + TEST(madvise(target, pg_sz, MADV_WILLNEED));
> if (TST_RET == -1)
> tst_brk(TBROK | TTERRNO, "madvise failed");
> + do {
> + loops--;
> + usleep(100000);
> + if (stat_refresh_sup)
> + SAFE_FILE_PRINTF("/proc/sys/vm/stat_refresh", "1");
> + SAFE_FILE_LINES_SCANF("/proc/meminfo", "SwapCached: %ld",
> + &swapcached);
> + } while (swapcached < swapcached_start + pg_sz/1024 && loops > 0);
>
> page_fault_num_1 = get_page_fault_num();
> tst_res(TINFO, "PageFault(madvice / no mem access): %d",
> page_fault_num_1);
> - dirty_pages(target, PASS_THRESHOLD);
> + dirty_pages(target, pg_sz);

Adding the loop makes sense to me. However, I don't understand why you
have also switched from PASS_THRESHOLD to only a single page.

I guess calling MADV_WILLNEED on a single page is the least realistic
scenario.

If there is an issue with PASS_THRESHOLD, perhaps we could scale it based
on page size?

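
To make the page-size idea concrete, here is a rough, untested sketch.
WILLNEED_PAGES and willneed_len() are hypothetical names that do not
exist in madvise06.c, and the value 64 is an arbitrary placeholder, not
a tuned number. The hint covers a fixed number of pages instead of a
fixed number of bytes, and is capped at the size of the mapping:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical value: how many pages to hint, whatever the page size. */
#define WILLNEED_PAGES 64

/*
 * Return the byte length to pass to madvise(..., MADV_WILLNEED):
 * WILLNEED_PAGES pages, capped at the size of the mapping.
 */
static size_t willneed_len(size_t pg_sz, size_t chunk_sz)
{
	size_t len;

	assert(pg_sz > 0 && chunk_sz >= pg_sz);

	len = (size_t)WILLNEED_PAGES * pg_sz;
	return len < chunk_sz ? len : chunk_sz;
}
```

The test would then pass willneed_len(pg_sz, CHUNK_SZ) to madvise() and
dirty_pages() in place of PASS_THRESHOLD or pg_sz, and wait for the swap
cache to grow by that length divided by 1024.
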
> page_fault_num_2 = get_page_fault_num();
> tst_res(TINFO, "PageFault(madvice / mem access): %d",
> page_fault_num_2);
> meminfo_diag("After page access");
>
> res = page_fault_num_2 - page_fault_num_1;
> - tst_res(res < 3 ? TPASS : TFAIL,
> - "%d pages were faulted out of 2 max", res);
> + tst_res(res == 0 ? TPASS : TFAIL,
> + "%d pages were faulted out of 1 max", res);
>
> SAFE_MUNMAP(target, CHUNK_SZ);
> }
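
For reference, the exit condition of the new wait loop could be written
as a pair of small helpers, which would make it easier to reuse with a
length other than pg_sz. This is only a sketch: parse_swapcached_kb()
and swapcache_grew_by() are hypothetical names, not existing LTP
functions, and the parser works on a text buffer instead of reading
/proc/meminfo directly so that it can be checked in isolation:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: return the SwapCached value (in kB) found in
 * meminfo-style text, or -1 if there is no such line.
 */
static long parse_swapcached_kb(const char *meminfo)
{
	const char *line = meminfo;
	long kb;

	assert(meminfo != NULL);

	while (line && *line) {
		if (sscanf(line, "SwapCached: %ld", &kb) == 1)
			return kb;
		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}

	return -1;
}

/*
 * Hypothetical helper: true once the swap cache has grown by at least
 * `len` bytes compared to `start_kb`. This is the negation of the
 * "keep waiting" condition in the patch's do/while loop.
 */
static int swapcache_grew_by(long start_kb, long now_kb, long len)
{
	return now_kb >= start_kb + len / 1024;
}
```
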
--
Thank you,
Richard.