<div dir="ltr"><div dir="ltr"><div dir="ltr"><div class="gmail_default" style="font-size:small">Hi Jan,</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Mar 13, 2019 at 7:48 PM Jan Stancek <<a href="mailto:jstancek@redhat.com">jstancek@redhat.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">From: Li Wang <<a href="mailto:liwang@redhat.com" target="_blank">liwang@redhat.com</a>><br>
<br>
Test issue:<br>
mtest01 starts many children that allocate chunks of memory and write to<br>
the pages (with the -w option), but occasionally some children are killed by<br>
the oom-killer, which causes SIGCHLD to be sent to the parent. When the parent<br>
receives this SIGCHLD, it reports FAIL as the test result.<br>
<br>
This does not look like a real kernel bug: the test is<br>
trying to use 80% of memory and swap. Once it uses most of the memory, the<br>
system starts swapping, but the test is likely consuming memory at a<br>
greater rate than kswapd can provide, which eventually triggers OOM.<br>
<br>
---- FAIL LOG ----<br>
mtest01 0 TINFO : Total memory already used on system = 1027392 kbytes<br>
mtest01 0 TINFO : Total memory used needed to reach maximum = 12715520 kbytes<br>
mtest01 0 TINFO : Filling up 80% of ram which is 11688128 kbytes<br>
mtest01 1 TFAIL : mtest01.c:314: child process exited unexpectedly<br>
-------------------<br>
<br>
Rewrite changes:<br>
To make mtest01 easier to understand, I rewrote it using the<br>
new LTP API and made a few changes to the children's behavior.<br>
<br>
* decrease the pressure to 80% of free memory for testing<br>
* drop the SIGCHLD signal handler because the new API helps check child status (check_child_status)<br>
* make each child pause itself after it finishes allocating/writing memory<br>
* parent sends SIGCONT to make children continue and exit<br>
* use TST_PROCESS_STATE_WAIT to wait for each child to change to the 'T' state<br>
* introduce ALLOC_THRESHOLD to replace duplicated code in the defines<br>
* make mtest01 support running with -i N > 1<br>
<br>
Signed-off-by: Li Wang <<a href="mailto:liwang@redhat.com" target="_blank">liwang@redhat.com</a>><br>
Signed-off-by: Jan Stancek <<a href="mailto:jstancek@redhat.com" target="_blank">jstancek@redhat.com</a>><br>
---<br>
Li,<br>
<br>
I'm posting v4 because I'm also proposing a couple of other small changes.<br>
Changes in v4:<br>
- fix the -b parameter; it was ignored because maxpercent is always non-zero.<br>
Now, if -b is set, maxpercent is ignored.<br>
- remove casts to unsigned long long, use literal with ULL suffix<br>
- pid_count renamed to children_done, because pid_count vs. pid_cntr was confusing<br>
- original_maxbytes dropped, instead alloc_maxbytes remains unchanged in mem_test()<br>
- do_write_page renamed to do_write_mem, since we write more than a single page<br>
- do_write_mem loop now increases offset by page size to make it slightly faster<br>
- bytecount initialized to 0 in child_loop_alloc()<br>
- info messages expanded to show some timestamp (remaining time)<br>
- PASS/FAIL messages for do_write=0 and do_write=1 consolidated into a single one<br>
- child creation loop in mem_test() tweaked:<br>
- child_loop_alloc() called directly from loop<br>
- pid_cntr incremented and used as index to pid_list array, 'i' variable not used<br>
- while condition "((pid != 0)" removed, it's always true<br>
- sleep reduced to 100ms<br>
- "while (pid_list[i] > 0)" replaced with for loop, since we know exactly how many<br>
children we spawned. memset in setup dropped.<br></blockquote><div><br></div><div><div class="gmail_default" style="font-size:small">I'm OK with these improvements, patch v4 looks quite good to me:).</div></div><div><br></div></div>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><div dir="ltr">Regards,<br>Li Wang<br></div></div></div></div></div></div>