[LTP] [PATCH v2 1/4] read_all: Add worker timeout and rewrite scheduling
Li Wang
liwang@redhat.com
Wed Aug 3 09:49:47 CEST 2022
Hi Richard, All,
On Mon, Jul 25, 2022 at 6:06 PM Richard Palethorpe <rpalethorpe@suse.com>
wrote:
> Kill and restart workers that take too long to read a file. The
> default is 10% of max_runtime. A custom timeout can be set with the
> new -t option.
>
> This is to prevent a worker from blocking forever in a read. Currently
> when this happens the whole test times out and any remaining files in
> the worker's queue are not tested.
>
> As a side effect we can now also set the timeout very low to cause
> partial reads. However, setting it to less than the time it takes to
> start (fork) a new worker is treated as an error. Forking takes much
> longer than most reads.
>
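If I read this right, the check amounts to something like the sketch
below (the identifiers are my own, not the patch's, and the real code
would presumably measure the fork latency rather than hard-code it):

/* Rough sketch of the timeout sanity check; hypothetical names. */
#include <stdio.h>
#include <stdlib.h>

static void validate_worker_timeout(long long timeout_us,
                                    long long fork_latency_us)
{
        /*
         * A timeout shorter than the time needed to fork a
         * replacement worker can never be met, so reject it
         * up front as a usage error.
         */
        if (timeout_us < fork_latency_us) {
                fprintf(stderr, "timeout %lldus < fork latency %lldus\n",
                        timeout_us, fork_latency_us);
                exit(1);
        }
}

int main(void)
{
        validate_worker_timeout(10000000, 2000); /* 10s vs 2ms: OK */
        return 0;
}
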
> A number of other possible issues were fixed as well as a side effect
> of changing the scheduling:
>
> + The worker queues are now filled in a
> "round-robin" way. Previously this only happened when -r was large
> enough.
> + When testing is finished the main process waits on the child procs before
> destroying the semaphores and worker queues.
> + max_runtime is set to 100 secs. This is so that the worker timeout
> is a round number.
> + Files which don't implement O_NONBLOCK and may block no longer
> need to be avoided, even if they refuse to die immediately, though
> this may result in TBROK due to zombie processes.
>
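The round-robin filling I picture is roughly this (a toy sketch with
made-up names, just to check my understanding):

#include <stdio.h>

#define NR_WORKERS 4

/*
 * Hand each path to the next worker in turn so the queues fill
 * evenly instead of the first worker getting everything.
 */
static void sched_work(int nr_workers, const char *const paths[], int n)
{
        int i;

        for (i = 0; i < n; i++)
                printf("worker %d <- %s\n", i % nr_workers, paths[i]);
}

int main(void)
{
        const char *const paths[] = {
                "/proc/stat", "/proc/uptime", "/proc/loadavg",
                "/proc/meminfo", "/proc/version",
        };

        sched_work(NR_WORKERS, paths, 5);
        return 0;
}
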
> Note that with a worker timeout of 1s, 2 to 3 files will usually time out on
> my workstation. With 2s, none will. In any case testing completes in
> under 3s for proc, sys or dev.
>
> This is much faster than many other machines. It's quite likely the
> worker timeout and max_runtime need to be increased on small and very
> large machines. This can be done manually by setting LTP_RUNTIME_MUL.
>
Yes, apart from the detailed behavior of this scheduler being a bit
hard (at least for me) to comprehend :).
Thanks for your improvements! Just one tiny query below.
Reviewed-by: Li Wang <liwang@redhat.com>
> --- a/testcases/kernel/fs/read_all/read_all.c
> +++ b/testcases/kernel/fs/read_all/read_all.c
> ...
> +#include <signal.h>
> #include <sys/types.h>
> #include <sys/stat.h>
> +#include <sys/wait.h>
> #include <lapi/fnmatch.h>
> #include <stdlib.h>
> #include <stdio.h>
> @@ -43,7 +45,10 @@
> #include <pwd.h>
> #include <grp.h>
>
> +#include "tst_atomic.h"
> +#include "tst_safe_clocks.h"
> #include "tst_test.h"
> +#include "tst_timer.h"
>
> #define QUEUE_SIZE 16384
> #define BUFFER_SIZE 1024
> @@ -55,11 +60,15 @@ struct queue {
> int front;
> int back;
> char data[QUEUE_SIZE];
>
I doubt whether we need to maintain a queue to store the paths.
During the test it seems only one path at a time is pushed into
q->data[], so the rest of the space is wasted, right?
After shrinking QUEUE_SIZE to equal BUFFER_SIZE, the test still
works normally.
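Something like the following single slot is what I have in mind (only
a sketch with my own names; the semaphore signalling of the real queue
is left out):

#include <stdio.h>
#include <string.h>

#define BUFFER_SIZE 1024

struct slot {
        int full;               /* 0 = empty, 1 = holds a path */
        char data[BUFFER_SIZE]; /* one NUL-terminated path */
};

static int slot_push(struct slot *s, const char *path)
{
        if (s->full || strlen(path) >= BUFFER_SIZE)
                return 0;

        strcpy(s->data, path);
        s->full = 1;
        return 1;
}

static int slot_pop(struct slot *s, char *buf)
{
        if (!s->full)
                return 0;

        strcpy(buf, s->data);
        s->full = 0;
        return 1;
}

int main(void)
{
        struct slot s = { 0 };
        char buf[BUFFER_SIZE];

        slot_push(&s, "/proc/stat");
        if (slot_pop(&s, buf))
                puts(buf);
        return 0;
}
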
> + char popped[BUFFER_SIZE];
> };
>
> struct worker {
> + int i;
> pid_t pid;
> struct queue *q;
> + int last_seen;
> + unsigned int kill_sent:1;
> };
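For what it's worth, this is how I understand last_seen and kill_sent
being used (a simplified sketch with my own helper names; the patch
itself goes through tst_atomic and tst_timer):

#include <signal.h>
#include <sys/types.h>
#include <time.h>

struct worker_sketch {
        pid_t pid;
        long long last_seen;      /* ms timestamp of last activity */
        unsigned int kill_sent:1; /* so SIGKILL is only sent once */
};

static long long now_ms(void)
{
        struct timespec ts;

        clock_gettime(CLOCK_MONOTONIC, &ts);
        return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/*
 * Called periodically by the main process for each worker: the
 * worker stamps last_seen before each read, and anyone who hasn't
 * checked in within the timeout gets killed (and restarted elsewhere).
 */
static void check_worker(struct worker_sketch *w, long long timeout_ms)
{
        if (now_ms() - w->last_seen < timeout_ms)
                return;

        if (!w->kill_sent) {
                kill(w->pid, SIGKILL);
                w->kill_sent = 1;
        }
}
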
>
>
--
Regards,
Li Wang