[LTP] [PATCH] set_mempolicy01: cancel the limit of maximum runtime

Li Wang <liwang@redhat.com>
Thu Dec 29 04:04:12 CET 2022


On Wed, Dec 28, 2022 at 6:44 PM Richard Palethorpe <rpalethorpe@suse.de>
wrote:

> Hello,
>
> Li Wang <liwang@redhat.com> writes:
>
> > On Tue, Dec 20, 2022 at 10:15 PM Cyril Hrubis <chrubis@suse.cz> wrote:
> >
> >> Hi!
> >> > The test needs more time to run on a system with multiple NUMA nodes.
> >> > Here we propose to cancel the max_runtime limit.
> >> >
> >> >   ========= test log on 16 nodes system =========
> >> >   ...
> >> >   set_mempolicy01.c:80: TPASS: child: Node 15 allocated 16
> >> >   tst_numa.c:25: TINFO: Node 0 allocated 0 pages
> >> >   tst_numa.c:25: TINFO: Node 1 allocated 0 pages
> >> >   tst_numa.c:25: TINFO: Node 2 allocated 0 pages
> >> >   tst_numa.c:25: TINFO: Node 3 allocated 0 pages
> >> >   tst_numa.c:25: TINFO: Node 4 allocated 0 pages
> >> >   tst_numa.c:25: TINFO: Node 5 allocated 0 pages
> >> >   tst_numa.c:25: TINFO: Node 6 allocated 0 pages
> >> >   tst_numa.c:25: TINFO: Node 7 allocated 0 pages
> >> >   tst_numa.c:25: TINFO: Node 8 allocated 0 pages
> >> >   tst_numa.c:25: TINFO: Node 9 allocated 0 pages
> >> >   tst_numa.c:25: TINFO: Node 10 allocated 0 pages
> >> >   tst_numa.c:25: TINFO: Node 11 allocated 0 pages
> >> >   tst_numa.c:25: TINFO: Node 12 allocated 0 pages
> >> >   tst_numa.c:25: TINFO: Node 13 allocated 0 pages
> >> >   tst_numa.c:25: TINFO: Node 14 allocated 0 pages
> >> >   tst_numa.c:25: TINFO: Node 15 allocated 16 pages
> >> >   set_mempolicy01.c:80: TPASS: parent: Node 15 allocated 16
> >> >
> >> >   Summary:
> >> >   passed   393210
> >> >   failed   0
> >> >   broken   0
> >> >   skipped  0
> >> >   warnings 0
> >> >
> >> >   real        6m15.147s
> >> >   user        0m33.641s
> >> >   sys 0m44.553s
> >>
> >> Can't we just set the default to 30 minutes or something large enough?
> >>
> >
> > Yes, I thought about a fixed larger value before, but the test time
> > seems to grow very quickly as the test matrix doubles.
> >
> > I don't have a system with more than 32 nodes to check whether 30
> > minutes is enough, so I guess canceling the limitation, as we did
> > for the oom tests, would make sense; the right timeout value depends
> > on the real system configuration.
>
> IMO, this is what the timeout multiplier is for. So if you have a
> computer with 512 CPUs or a tiny embedded device, you can adjust the
> timeouts upwards.
>

Well, exporting LTP_RUNTIME_MUL with a large value is useful for
extending the maximum test runtime, but the side effect is that it
changes the runtime of many other tests as well, especially those
that use tst_remaining_runtime() in their main loop
(e.g. pty06/7, swapping01, mmap1, fork13), which makes the whole
LTP suite take longer to complete.

That's why we love LTP_RUNTIME_MUL but dare not set it too high.
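
For illustration, a minimal sketch of that looping pattern (the budget
value and the workload are made up, not taken from any particular test):

  #include "tst_test.h"

  static void run(void)
  {
          /* Loop until the runtime budget, scaled by LTP_RUNTIME_MUL,
           * is used up. */
          while (tst_remaining_runtime()) {
                  /* one iteration of the stress workload */
          }

          tst_res(TPASS, "workload survived until the runtime expired");
  }

  static struct tst_test test = {
          .test_all = run,
          .max_runtime = 60, /* seconds, scaled by LTP_RUNTIME_MUL */
  };

So with e.g. LTP_RUNTIME_MUL=10, every test written in this style runs
for ten times its budget, not only the NUMA one.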



>
> The default timeouts are for workstations, commodity servers and
> VMs. I suppose that, as this is a NUMA test, the average machine will
> be bigger, but 32 nodes on a physical machine would be 128-512 CPUs?
>

I guess yes; after checking, a 16-node physical machine here has 128 CPUs.
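
For reference, what the patch boils down to is the same change the oom
tests already carry; a sketch only, the surrounding fields here are
illustrative:

  static struct tst_test test = {
          .test_all = run,
          /* No fixed budget: the effective runtime then follows the
           * node count of the machine under test. */
          .max_runtime = TST_UNLIMITED_RUNTIME,
  };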

-- 
Regards,
Li Wang

