[LTP] [LTP PATCH v2 0/1] Add some memory page soft-offlining control

Fri Jan 27 11:05:52 CET 2023

From: William Roche <william.roche@oracle.com>

After a long delay (since August) and many days of work on this topic,
I am back with a new version of this test proposal.
This version still uses a set of threads that run the same code and
compete with each other. In a loop, each thread allocates a set of
memory pages, writes a sentinel value into each page, soft-offlines the
pages, verifies the sentinel values and unmaps the pages.
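
For readers who want to see the shape of that loop, here is a minimal
standalone sketch. It is not the code of madvise11.c: the names
(struct worker, worker_main, run_workers) and the choice of the page
address as the sentinel are assumptions made for illustration, and it
uses plain libc calls instead of the LTP SAFE_* helpers.
MADV_SOFT_OFFLINE needs CAP_SYS_ADMIN and a kernel built with
CONFIG_MEMORY_FAILURE; without them madvise() fails and the sketch only
counts the error.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_SOFT_OFFLINE
#define MADV_SOFT_OFFLINE 101	/* value from the Linux uapi headers */
#endif

/* Hypothetical per-thread parameters and results (illustration only). */
struct worker {
	int npages;
	int nloops;
	long mismatches;	/* pages whose sentinel changed */
	long mmap_errors;	/* mmap() calls that failed */
	long madvise_errors;	/* madvise() calls that failed */
};

static void *worker_main(void *arg)
{
	struct worker *w = arg;
	long page_size = sysconf(_SC_PAGESIZE);

	assert(page_size > 0);
	size_t len = (size_t)w->npages * (size_t)page_size;

	for (int loop = 0; loop < w->nloops; loop++) {
		char *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
				 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		if (mem == MAP_FAILED) {
			w->mmap_errors++;
			continue;
		}

		/* Sentinel: each page stores its own virtual address. */
		for (int i = 0; i < w->npages; i++) {
			char *page = mem + (size_t)i * page_size;
			*(uintptr_t *)page = (uintptr_t)page;
		}

		/* Ask the kernel to migrate the data and retire the frames. */
		if (madvise(mem, len, MADV_SOFT_OFFLINE) != 0)
			w->madvise_errors++;

		/* The contents must survive, whether or not offlining worked. */
		for (int i = 0; i < w->npages; i++) {
			char *page = mem + (size_t)i * page_size;
			if (*(uintptr_t *)page != (uintptr_t)page)
				w->mismatches++;
		}

		munmap(mem, len);
	}
	return NULL;
}

/* Run nthreads competing workers; return total mismatches, or -1. */
static long run_workers(int nthreads, int npages, int nloops)
{
	pthread_t *tids = calloc(nthreads, sizeof(*tids));
	struct worker *ws = calloc(nthreads, sizeof(*ws));
	long mismatches = 0;
	int started = 0;

	if (!tids || !ws) {
		free(tids);
		free(ws);
		return -1;
	}

	for (int i = 0; i < nthreads; i++) {
		ws[i].npages = npages;
		ws[i].nloops = nloops;
		if (pthread_create(&tids[i], NULL, worker_main, &ws[i]) != 0)
			break;
		started++;
	}
	for (int i = 0; i < started; i++) {
		pthread_join(tids[i], NULL);
		mismatches += ws[i].mismatches;
	}

	free(tids);
	free(ws);
	return started == nthreads ? mismatches : -1;
}
```

Calling run_workers(2, 4, 2) from a small main() built with -pthread
exercises the loop. A return value of 0 means every sentinel was read
back intact.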

I've tried to address all the feedback I received:

- added madvise11 to the runtest/syscalls file [Petr]
- more complete and compliant Description comment [Petr]
- removed no longer used header files
- removed inline comments [Petr + Richard]
- removed unnecessary comments [Petr]
- number of threads dynamically tuned (with limits) [Richard]
- warn about unexpected mmap errors [Richard]
- lower case (not camel) variable names [Petr + Richard]
- removal of an unneeded temporary "copy" variable [Richard]
- removed unnecessary additional checks of SAFE_* functions [Petr]
- removed the min_kver=2.6.33 [Petr]
- added the commit id into the tst_test structure [Richard]
- "make check-madvise11" is now clean [Petr + Richard]

But also:

- separate functions for mmap and madvise (dealing with error cases)
- simplified the page sentinel value setting and verification
- report the number of threads and the amount of memory used by one
  iteration of the test
- count the iterations to unpoison the right number of pages in case of
  multiple successful iterations
- moved sigaction setting to setup()
- SAFE_MALLOC() used
- significantly reduced the number of threads used
- significantly reduced the runtime timeout
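
Two of the sizing points in the lists above (a thread count tuned
within limits, and counting iterations so that the right number of
pages gets unpoisoned) can be sketched as below. This is only an
illustration under stated assumptions: the bounds MIN_THREADS and
MAX_THREADS, the use of the online CPU count as the starting value and
all function names are hypothetical, and the count assumes that every
page of every successful iteration was actually soft-offlined.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical limits; the patch's real bounds are not given here. */
#define MIN_THREADS 2
#define MAX_THREADS 32

/* Clamp a candidate thread count into [MIN_THREADS, MAX_THREADS]. */
static int clamp_threads(long candidate)
{
	if (candidate < MIN_THREADS)
		return MIN_THREADS;
	if (candidate > MAX_THREADS)
		return MAX_THREADS;
	return (int)candidate;
}

/* Assumed heuristic: start from the number of online CPUs. */
static int tune_thread_count(void)
{
	/* sysconf() returns -1 on error, which clamps to MIN_THREADS. */
	return clamp_threads(sysconf(_SC_NPROCESSORS_ONLN));
}

/* Pages soft-offlined so far, hence pages to unpoison at cleanup. */
static long pages_to_unpoison(long iterations, int nthreads,
			      int pages_per_thread)
{
	return iterations * nthreads * pages_per_thread;
}
```

For example, 3 successful iterations of 4 threads with 32 pages each
leave 384 pages to unpoison.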

Note about the tst_fuzzy_sync framework use:
This aspect, raised by Richard, took the largest part of my work, as I
agree with him that the emphasis should be on the competing critical
sections of code (mmap and madvise). I was eventually able to create a
version of this test using the tst_fuzzy_sync mechanism that could
reproduce the race condition.
But I chose not to use it for the following reasons:
- my fuzzy version was not as reliable as the multithreaded version at
  identifying our race condition. On a kernel where the race fixed by
  d4ae9916ea29 is still present, the fuzzy version of the test gave
  false positive results (a pass despite the race being present) in
  about 10% of the runs, whereas this multithreaded version hasn't
  shown a false positive in my tests.
- the multithreaded version is generally (in about 80% of the cases)
  much faster at hitting the race condition than the fuzzy version.

So I hope you'll find this multithreaded test useful.
Tested on ARM and x86.

William Roche (1):
  madvise11: Add test for memory allocation / Soft-offlining possible
    race

 runtest/syscalls                              |   1 +
 testcases/kernel/syscalls/madvise/.gitignore  |   1 +
 testcases/kernel/syscalls/madvise/Makefile    |   3 +
 testcases/kernel/syscalls/madvise/madvise11.c | 405 ++++++++++++++++++
 4 files changed, 410 insertions(+)
 create mode 100644 testcases/kernel/syscalls/madvise/madvise11.c

-- 
2.31.1