[LTP] [PATCH] madvise06: Raise the bar for judging failure
Li Wang
liwang@redhat.com
Tue Feb 28 06:45:03 CET 2023
Hi Richard,
On Mon, Feb 27, 2023 at 8:27 PM Richard Palethorpe <rpalethorpe@suse.de> wrote:
> Hello Li,
>
> Li Wang <liwang@redhat.com> writes:
>
> > There is an intermittent failure which we have observed many times whether
> > on rhel or mainline kernel. But we're unable to stable reproduce it:
> >
> > 43 madvise06.c:201: TFAIL: less than 102400 Kb were moved to the swap cache
> > ...
> >
> > However it does not look like a kernel issue, because SwapCached change is
> > not strictly abiding by the principle of MADV_WILLNEED advice. That means it
> > all depends on the kernel's specific circumstances. The value of the threshold
> > is debatable at least from my point of view, its use 1/4 is not guaranteed
> > 100% safe.
> >
> > As MADV_WILLNEED is just advice to the kernel, not a guarantee. The kernel may
> > choose to ignore the advice, or may prioritize other memory management tasks
> > over pre-loading the advised pages.
> >
> > So this patch is aimed at improving the accuracy and clarity of the test results.
> > Specifically, the use of two separate variables to track the results of different
> > comparisons will make it easier to understand what the test is doing.
> >
> > Additionally, the change to report a test result of "TINFO" instead of "TFAIL"
> > when the swap cache size is less than expected would be intended to indicate
> > that this is an acceptable outcome.
> >
> > Finally, the change to the second tst_res call is intended to make the test more
> > lenient, as it now passes if either no page faults occur or the swap cache size
> > is larger than expected.
>
> Why not skip to making them all TINFO?
>
> It's undefined what action will result from MADV_WILLNEED. If it were
> better for performance *not* to read in pages, then it would be valid
> for the kernel to ignore it.
>
Yes, but I didn't do that because the madvise06 test checks the free_mem/free_swap
size at the beginning. That guarantees the system has at least 2 * CHUNK_SZ
(800MB + 800MB) of memory for the test to run, so unless something else is
happening in parallel, the kernel will handle the MADV_WILLNEED request
correctly in most scenarios.

And indeed we have not seen any unexpected page-fault failures
since commit 00e769e63515e51, so I just combined the
two judgments in this patch. I believe that is enough, and it also
gives the kernel some leeway.

I hope there can still be a lenient test for MADV_WILLNEED.
I will definitely take your suggestion if the failure appears again.
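
For reference, here is a minimal sketch of the combined judgment, not the code
from the patch itself. The helper name willneed_ok and its parameters are
hypothetical, and the 400MB CHUNK_SZ is an assumption inferred from the
2 * CHUNK_SZ = 800MB figure and the 102400 Kb threshold in the failure message.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdbool.h>

/* Assumed sizes, inferred from this thread: CHUNK_SZ is 400MB and the
 * pass threshold is 1/4 of it, both expressed in Kb here. */
#define CHUNK_SZ_KB       (400L * 1024)
#define PASS_THRESHOLD_KB (CHUNK_SZ_KB / 4)

/* 1/4 of 400MB is the 102400 Kb printed in the TFAIL message. */
static_assert(PASS_THRESHOLD_KB == 102400L, "threshold matches the log");

/*
 * Hypothetical helper: the result is acceptable if either no page faults
 * occurred, or the SwapCached growth exceeded the threshold.
 */
static bool willneed_ok(long swapcached_gain_kb, long page_faults)
{
	bool enough_cached = swapcached_gain_kb > PASS_THRESHOLD_KB;
	bool no_faults = (page_faults == 0);

	return no_faults || enough_cached;
}
```

With this shape, a run that moved too little into the swap cache but caused no
page faults still passes, and only a run that misses both conditions fails.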
>
> Yang Xu added a tag for a perf regression that it could
> reproduce. However looking at the kernel commit this was first found by
> stress-ng.
>
> commit 66383800df9cbdbf3b0c34d5a51bf35bcdb72fd2
> Author: Matthew Wilcox (Oracle) <willy@infradead.org>
> Date: Sat Nov 21 22:17:22 2020 -0800
>
> mm: fix madvise WILLNEED performance problem
>
> The calculation of the end page index was incorrect, leading to a
> regression of 70% when running stress-ng.
>
> With this fix, we instead see a performance improvement of 3%
>
> I found a bug with this test, but it was causing an Oops. It wouldn't
> matter if the test printed pass or fail.
>
> So I think we are wasting our time by constantly tweaking this test.
>
> --
> Thank you,
> Richard.
>
>
--
Regards,
Li Wang