[LTP] [PATCH 1/2] read_all: Add worker timeout
Richard Palethorpe
rpalethorpe@suse.de
Mon Jul 18 12:37:59 CEST 2022
Hello,
Petr Vorel <pvorel@suse.cz> writes:
> Hi all,
>
> Reviewed-by: Petr Vorel <pvorel@suse.cz>
>
>> > +static void restart_worker(struct worker *const worker)
>> > +{
>> > + int wstatus, ret, i, q_len;
>> > + struct timespec now;
>> > +
>> > + kill(worker->pid, SIGKILL);
>> > + ret = waitpid(worker->pid, &wstatus, 0);
>
>> Is there a chance we could get stuck in uninterruptible read? I think I saw some
>> in past, but those may be blacklisted already, so this may only be something
>> to watch for if we still get test timeouts in future.
>
> +1
>
> ...
>> > + if (ret != worker->pid) {
>> > + tst_brk(TBROK | TERRNO, "waitpid(%d, ...) = %d",
>> > + worker->pid, ret);
>> > + }
>> > +
>> > + /* Make sure the queue length and semaphore match. Threre is a
>> > + * race in qeuue_pop where the semaphore can be decremented
>> ^^ typo in queue_pop above
>
> ...
>> > + tst_clock_gettime(CLOCK_MONOTONIC_RAW, &now);
>> > + elapsed =
>> > + tst_timespec_to_ms(now) - tst_atomic_load(&w->last_seen);
>> > +
>> > + if (elapsed > worker_timeout) {
>> > + if (!quiet) {
>> > + tst_res(TINFO,
>> > + "Worker %d (%d) stuck for %dms, restarting it",
>> > + i, w->pid, elapsed);
>
>> Can we also print file worker is stuck on?
>> And as Li pointed out, I'd also extend max_runtime to 60 seconds.
>
> +1, all these additional changes make sense to me.
OK I'll make these changes, thanks!
>
> Kind regards,
> Petr
--
Thank you,
Richard.
More information about the ltp
mailing list