[LTP] [RFC PATCH] madvise06: shrink to 1 MADV_WILLNEED page to stabilize the test

Mon Jun 20 09:44:35 CEST 2022

Hello Li,

Li Wang <liwang@redhat.com> writes:

> Hi Richard,
>
> Richard Palethorpe <rpalethorpe@suse.de> wrote:
>  
>  > --- a/testcases/kernel/syscalls/madvise/madvise06.c
>  > +++ b/testcases/kernel/syscalls/madvise/madvise06.c
>  > @@ -164,7 +164,7 @@ static int get_page_fault_num(void)
>  >  
>  >  static void test_advice_willneed(void)
>  >  {
>  > -     int loops = 50, res;
>  > +     int loops = 100, res;
>  >       char *target;
>  >       long swapcached_start, swapcached;
>  >       int page_fault_num_1, page_fault_num_2;
>  > @@ -202,23 +202,32 @@ static void test_advice_willneed(void)
>  >               "%s than %ld Kb were moved to the swap cache",
>  >               res ? "more" : "less", PASS_THRESHOLD_KB);
>  >  
>  > -
>  > -     TEST(madvise(target, PASS_THRESHOLD, MADV_WILLNEED));
>  > +     loops = 100;
>  > +     SAFE_FILE_LINES_SCANF("/proc/meminfo", "SwapCached: %ld", &swapcached_start);
>  > +     TEST(madvise(target, pg_sz, MADV_WILLNEED));
>  >       if (TST_RET == -1)
>  >               tst_brk(TBROK | TTERRNO, "madvise failed");
>  > +     do {
>  > +             loops--;
>  > +             usleep(100000);
>  > +             if (stat_refresh_sup)
>  > +                     SAFE_FILE_PRINTF("/proc/sys/vm/stat_refresh", "1");
>  > +             SAFE_FILE_LINES_SCANF("/proc/meminfo", "SwapCached: %ld",
>  > +                             &swapcached);
>  > +     } while (swapcached < swapcached_start + pg_sz/1024 && loops > 0);
>  >  
>  >       page_fault_num_1 = get_page_fault_num();
>  >       tst_res(TINFO, "PageFault(madvice / no mem access): %d",
>  >                       page_fault_num_1);
>  > -     dirty_pages(target, PASS_THRESHOLD);
>  > +     dirty_pages(target, pg_sz);
>
>  Adding the loop makes sense to me. However I don't understand why you
>  have also switched from PASS_THRESHOLD to only a single page?
>
> In the test, we combine two checks to confirm that the bug reproduces:
>
>   1. SwapCached increases by less than PASS_THRESHOLD_KB
>   2. the page fault count is larger than expected
>
> Check 2 fails more easily on some platforms, and it is hard to work
> out an average value to tolerate. So maybe we just reduce the region
> to one page, which would not affect the final result, because we only
> assume a bug when both checks fail at the same time.
>
>  
>  
>  I guess calling MADV_WILLNEED on a single page is the least realistic
>  scenario.
>
> Okay, perhaps it's a step backward:).
>
> I was just thinking that it is a regression test, and if 1 page is enough to
> reproduce the bug (while larger chunks of memory easily cause false
> positives), why not.

That makes sense, but this test has also found other bugs. I'm not sure
if they are reproducible with only one page.

>
>  
>  
>  If there is an issue with PASS_THRESHOLD perhaps we could scale it based
>  on page size?
>
> This sounds acceptable too.
>
> How many pages do you think is appropriate, 100 or more?
> And should we loosen the allowed number of faulted pages to 1/10 of the pages?

I suppose that 100 pages would be too much memory on some systems. I
guess at least 2 or 3 pages are needed so there is some
traversal. Beyond that I don't know what would make a difference.

If there are at most 3 pages and we have a loop, I would not expect any
of them to fault. Although maybe we could allow 1 in 3 to fault, because
MADV_WILLNEED is only advisory and a lot of time has already been spent
discussing this test.
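
To make the idea concrete, here is a rough sketch of sizing the region in
pages and tolerating one fault in three. This is not a proposed patch:
WILLNEED_PAGES and all of the helper names below are hypothetical and do
not exist in madvise06.c. The only things taken from the patch are the
page-size scaling and the pg_sz/1024 conversion to kB.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical value: the thread only suggests "at least 2 or 3 pages". */
#define WILLNEED_PAGES 3

/* Runtime page size in bytes (pg_sz in the patch). */
static long page_size(void)
{
	return sysconf(_SC_PAGESIZE);
}

/* Number of bytes to pass to madvise(), scaled by the page size. */
static long willneed_bytes(long pg_sz)
{
	assert(pg_sz > 0);
	return WILLNEED_PAGES * pg_sz;
}

/* Expected SwapCached growth in kB, the unit /proc/meminfo reports. */
static long willneed_kb(long pg_sz)
{
	return willneed_bytes(pg_sz) / 1024;
}

/* Tolerated page faults after MADV_WILLNEED: one third of the pages. */
static int max_faults(void)
{
	return WILLNEED_PAGES / 3;
}

/* Returns 1 if the fault count delta is within the tolerance. */
static int faults_ok(int faults_before, int faults_after)
{
	return (faults_after - faults_before) <= max_faults();
}
```

In the test, willneed_bytes(page_size()) would replace pg_sz in the
madvise() and dirty_pages() calls, willneed_kb() would replace pg_sz/1024
in the loop condition, and faults_ok() would be applied to
page_fault_num_1 and page_fault_num_2.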

-- 
Thank you,
Richard.