[LTP] [PATCH 1/2] read_all: Add worker timeout

Petr Vorel pvorel@suse.cz
Thu Jul 14 19:56:01 CEST 2022


Hi all,

Reviewed-by: Petr Vorel <pvorel@suse.cz>

> > +static void restart_worker(struct worker *const worker)
> > +{
> > +       int wstatus, ret, i, q_len;
> > +       struct timespec now;
> > +
> > +       kill(worker->pid, SIGKILL);
> > +       ret = waitpid(worker->pid, &wstatus, 0);

> Is there a chance we could get stuck in uninterruptible read? I think I saw some
> in past, but those may be blacklisted already, so this may only be something
> to watch for if we still get test timeouts in future.

+1

...
> > +       if (ret != worker->pid) {
> > +               tst_brk(TBROK | TERRNO, "waitpid(%d, ...) = %d",
> > +                       worker->pid, ret);
> > +       }
> > +
> > +       /* Make sure the queue length and semaphore match. Threre is a
> > +        * race in qeuue_pop where the semaphore can be decremented
> ^^ typo in queue_pop above

...
> > +               tst_clock_gettime(CLOCK_MONOTONIC_RAW, &now);
> > +               elapsed =
> > +                       tst_timespec_to_ms(now) - tst_atomic_load(&w->last_seen);
> > +
> > +               if (elapsed > worker_timeout) {
> > +                       if (!quiet) {
> > +                               tst_res(TINFO,
> > +                                       "Worker %d (%d) stuck for %dms, restarting it",
> > +                                       i, w->pid, elapsed);

> Can we also print file worker is stuck on?
> And as Li pointed out, I'd also extend max_runtime to 60 seconds.

+1, all these additional changes make sense to me.

Kind regards,
Petr


More information about the ltp mailing list