[LTP] [PATCH 1/2] lib: multiply the timeout if detect slow kconfigs

Petr Vorel pvorel@suse.cz
Fri Jan 3 08:06:34 CET 2025


> On Thu, Jan 2, 2025 at 8:43 PM Petr Vorel <pvorel@suse.cz> wrote:

> > Hi Li, all,

> > [ Cc others to get broader feedback ]

> > > This refines the handling of timeouts for tests running on
> > > systems with slow kernel configurations (kconfigs).

> > > Previously, the max_runtime was multiplied globally when
> > > slow kconfigs were detected, which inadvertently prolonged
> > > the runtime of all tests using max_runtime for control.

> > > This patch corrects that behavior by applying the multiplication
> > > specifically to timeouts, ensuring it only affects the intended
> > > operations without impacting other tests.

> > > Fixes: 2da30df24 ("lib: multiply the max_runtime if detect slow kconfigs")

> > Thanks for handling this. I overlooked it on the 27th, so I'm reviewing it
> > now.

> > Multiplying the whole timeout instead of max_runtime helps to hide the
> > longer timeout from tests which use detection via tst_remaining_runtime().
> > I.e. previously a slow config behaved like LTP_RUNTIME_MUL=4, now it
> > behaves like LTP_TIMEOUT_MUL=4.


> Yes, the benefit of multiplying TIMEOUT (on a slow system) is not only
> to avoid increasing the actual execution time of the test, but also to give
> the system more time to wait for the test to complete the final work.

> Original:
>   |  -- timeout -- | -- max_runtime -- |

> Previous:
>   |  -- timeout -- | -------- max_runtime * 4 -------- |

> Now:
>   |  -------- timeout * 4 -------- | -- max_runtime --  |

Later it'd be nice to document this simple timeline (also with LTP_RUNTIME_MUL
and LTP_TIMEOUT_MUL) in the sphinx docs (/** */). Or it could go into
lib/README.md, but I would like to convert that page to sphinx as well.
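
As a rough illustration of the timeline quoted above, the three variants can
be modelled as below. This is only a sketch, not LTP library code: the struct,
the macro and the function names are all hypothetical, and the multiplier of 4
is the one discussed in this thread.

```c
#include <stdio.h>

/* Hypothetical model of the quoted timeline; not LTP library code. */
struct time_budget {
	unsigned int timeout;     /* seconds, the "timeout" segment */
	unsigned int max_runtime; /* seconds, the "max_runtime" segment */
};

/* Multiplier assumed for a detected slow kconfig (4 in this thread). */
#define SLOW_KCONFIG_MUL 4

/* "Previous" (2da30df24): only max_runtime is scaled. */
static struct time_budget scale_previous(struct time_budget b)
{
	b.max_runtime *= SLOW_KCONFIG_MUL;
	return b;
}

/* "Now" (this patch): only timeout is scaled. */
static struct time_budget scale_now(struct time_budget b)
{
	b.timeout *= SLOW_KCONFIG_MUL;
	return b;
}

/* Upper bound for the whole run: both segments added together. */
static unsigned int total_limit(struct time_budget b)
{
	return b.timeout + b.max_runtime;
}

/* Print the overall limit of the three variants for one budget. */
static void print_timeline(struct time_budget b)
{
	printf("original: %us\n", total_limit(b));
	printf("previous: %us\n", total_limit(scale_previous(b)));
	printf("now:      %us\n", total_limit(scale_now(b)));
}
```

Assuming a 30 s timeout and a 600 s max_runtime, the model yields 630 s,
2430 s and 720 s, which lines up with the swapping01.c measurements quoted
further down (10m 30s, 40m 30s, 12m 00s).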

> > Good idea. IMHO good enough (Martin previously suggested [1] to add a new
> > tst_test flag to identify tests which exit when runtime expires).


> Introducing a new tst_test flag to split max_runtime into two parts could
> resolve it as well, but the disadvantage is that it might make LTP time
> control hard to understand if we end up with timeout, max_runtime and
> max_exetime. I think 'simple + useful' is beautiful unless we need to make
> it more complex in the future.

+1, I fully agree.

> > Reviewed-by: Petr Vorel <pvorel@suse.cz>

> > Some measurements on my Tumbleweed VM, which is detected as slow due to
> > CONFIG_LATENCYTOP:

> > TEST                                               | 2da30df24~ | 2da30df24  | this patch
> > ---------------------------------------------------|------------|------------|-----------
> > swapping01.c (calls tst_remaining_runtime())       | 0h 10m 30s | 0h 40m 30s | 0h 12m 00s
> > tst_fuzzy_sync01.c (calls tst_remaining_runtime()) | 0h 03m 00s | 0h 10m 30s | 0h 04m 30s
> > tst_cgroup02.c (default timeout 0h 00m 30s)        | 0h 00m 30s | 0h 00m 30s | 0h 02m 00s
> > test_runtime01.c (.max_runtime = 4, calls          | 0h 00m 34s | 0h 00m 46s | 0h 02m 04s
> > tst_remaining_runtime())                           |            |            |
> > starvation.c (calls tst_remaining_runtime() only   | 0h 01m 05s | 0h 02m 50s | 0h 02m 34s
> > to detect failure)                                 |            |            |

> > => Tests which call tst_remaining_runtime() run slightly longer, but IMHO
> > that's OK. Other tests (regardless of whether they use the default runtime
> > or set .max_runtime) run 4x longer, as expected.

> > Tested-by: Petr Vorel <pvorel@suse.cz>


> The longer time is not caused by calling tst_remaining_runtime(); it comes
> from the 'timeout *= 4' applied when slow configs are detected. As you can
> see, the original default timeout is 30s, and multiplied by 4 it becomes
> 120s (2 minutes). Every result with this patch includes that 2-minute
> timeout.

> But that does not mean the test's execution time really grows by 2 minutes;
> it just means the test has that timeout value. We need to use
> `time ./swapping01` to evaluate the real test time, and I didn't find any
> additional delay with this method :).

Yes, I noticed that (measuring just test_runtime01.c, where it's nicely
visible).
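
To make that arithmetic concrete, here is a small sketch. The helper names are
hypothetical and not part of the LTP library; it assumes the 30 s default
timeout and the multiplier of 4 mentioned above. The point is that the limit
grows by timeout * (4 - 1), independent of max_runtime.

```c
#include <stdio.h>

/* Hypothetical helpers for illustration; not part of the LTP library. */

/* Growth of the overall limit when only the timeout is multiplied by mul. */
static unsigned int limit_increase(unsigned int timeout, unsigned int mul)
{
	return timeout * mul - timeout;
}

/* Format seconds in the "0h 02m 04s" style used in the measurements. */
static void format_hms(unsigned int secs, char *buf, size_t len)
{
	snprintf(buf, len, "%uh %02um %02us",
		 secs / 3600, (secs / 60) % 60, secs % 60);
}
```

For test_runtime01.c with .max_runtime = 4, this gives 30 + 4 = 34 s
(0h 00m 34s) before and 120 + 4 = 124 s (0h 02m 04s) with the patch, a
difference of 90 s. These are limits, not measured wall-clock time.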

> Thanks for the comparison. I actually ran some tests on RHEL and got good
> results.

I'm OK with the overall result. I'd be happier if we could avoid the TCONF in
starvation, but let's discuss this on that patch.

Kind regards,
Petr

