[LTP] sched_football: Validity of testcase

Jorg Vehlow lkml@jv-coder.de
Fri Sep 6 09:40:58 CEST 2019


Hi,

I have been looking closely at the realtime testcase sched_football,
because it sometimes fails, and I would like to know your opinion on the
test case.

A short introduction to how the test works:
It creates nThreads threads called offense and nThreads threads called
defense (all FIFO scheduled). The offense threads run at a lower priority
than the defense threads, and the main thread has the highest priority.
After all threads are created (validated using an atomic counter), the test
verifies that the offense threads are never executed: the offense threads
increment a counter that is zeroed in the main thread. During the test the
main thread sleeps regularly.
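
To make the setup above concrete, here is a minimal sketch of the priority
layout and the counter check. It is not the LTP source: the priority values
and the helper names fifo_attr_init, offense_step and referee_check are
assumptions made for illustration; only the_ball and the referee role are
names taken from the test.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <unistd.h>

/* Illustrative priorities (assumed values): offense < defense < referee. */
enum { PRIO_OFFENSE = 15, PRIO_DEFENSE = 30, PRIO_REFEREE = 80 };

/* Shared counter: only offense threads ever increment it. */
static atomic_int the_ball;

/*
 * Prepare attr so that a thread created with it runs SCHED_FIFO at prio.
 * Returns 0 on success or an errno value. Actually creating such a thread
 * usually needs CAP_SYS_NICE or a suitable RLIMIT_RTPRIO.
 */
int fifo_attr_init(pthread_attr_t *attr, int prio)
{
	struct sched_param param = { .sched_priority = prio };
	int ret;

	assert(prio >= sched_get_priority_min(SCHED_FIFO));
	assert(prio <= sched_get_priority_max(SCHED_FIFO));

	ret = pthread_attr_init(attr);
	if (ret)
		return ret;

	/* Without EXPLICIT_SCHED the new thread inherits the creator's policy. */
	ret = pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED);
	if (!ret)
		ret = pthread_attr_setschedpolicy(attr, SCHED_FIFO);
	if (!ret)
		ret = pthread_attr_setschedparam(attr, &param);
	if (ret)
		pthread_attr_destroy(attr);
	return ret;
}

/* One iteration of an offense thread's loop body. */
void offense_step(void)
{
	atomic_fetch_add(&the_ball, 1);
}

/*
 * Referee check: zero the ball, sleep, then report how far it moved.
 * A non-zero result means an offense thread ran during the sleep.
 */
int referee_check(useconds_t sleep_us)
{
	atomic_store(&the_ball, 0);
	usleep(sleep_us);
	return atomic_load(&the_ball);
}
```

A defense thread would be created with an attribute prepared by
fifo_attr_init(&attr, PRIO_DEFENSE) and would simply spin; the test passes
as long as referee_check() keeps returning 0.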

While the test is totally fine on a single-core system, you can
immediately see that it will fail on a system with nCores > nThreads,
because there will be a core where only an offense thread and no defense
thread is scheduled. In its default setup nThreads = nCores. This should
theoretically work, because for every core there is a defense thread with
a higher priority than the offense threads, and the defense threads should
be scheduled onto every core. This is indeed what happens. The problem seems
to be the initialization phase. When the threads are created, they are not
evenly distributed. After pthread_create is called, the threads are scheduled
to cores where nothing is running. If there is no idle core anymore, they
are scheduled to any core (the first? the one with the shortest wait queue?).
At some point after all threads are created, they are rescheduled across
all cores.
It looks like the test fails when there is initially a core with only an
offense thread scheduled onto it. In perf sched traces I saw that a defense
thread was migrated to this core, but the offense thread was still executed
for a short time, until the defense thread ran. From this point onwards only
defense threads are running.

I tested adding a sleep to the main function, after all threads are created,
to give the system some time for rescheduling. A sleep of around 50ms works
quite well and supports my theory about the migration time being the problem.
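
The delay experiment can be sketched roughly as follows. The helper name
settle_delay_ms is hypothetical and this is not the exact change that was
tested; it only shows one way to wait a fixed time after thread creation
before the first check.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/*
 * Sleep for at least ms milliseconds so the scheduler has time to
 * migrate the freshly created threads. Restarts the sleep with the
 * remaining time if a signal interrupts it.
 * Returns 0 on success, -1 on any error other than EINTR.
 */
int settle_delay_ms(long ms)
{
	struct timespec req;

	assert(ms >= 0);
	req.tv_sec = ms / 1000;
	req.tv_nsec = (ms % 1000) * 1000000L;

	/* POSIX allows the request and remainder to be the same object. */
	while (nanosleep(&req, &req) == -1) {
		if (errno != EINTR)
			return -1;
	}
	return 0;
}
```

In the experiment this would be called as settle_delay_ms(50) in the main
thread once the atomic counter shows that all threads have started. A fixed
delay only hides the startup race; it does not guarantee that migration has
finished.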

Now I am not sure whether the test case is even valid or whether the
scheduler is not working as it is supposed to. Looking at the commits of
sched_football, it looks like it was running stably at some point; at least
it is reported to have run 15k iterations in e6432e45.
What do you think about the test case? Is it even valid?
Should the CPU affinity be fixed?
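
For reference, fixing the affinity could look roughly like the sketch
below. The helper names first_allowed_cpu and pin_thread_to_cpu are
hypothetical; pthread_getaffinity_np and pthread_setaffinity_np are the
glibc extensions, so this assumes Linux with _GNU_SOURCE.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Return the lowest CPU the calling thread may run on, or -1 on error. */
int first_allowed_cpu(void)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	if (pthread_getaffinity_np(pthread_self(), sizeof(set), &set) != 0)
		return -1;

	for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
		if (CPU_ISSET(cpu, &set))
			return cpu;
	}
	return -1;
}

/*
 * Restrict thread to exactly one CPU.
 * Returns 0 on success or an errno value (for example EINVAL if the
 * CPU is not available to this process).
 */
int pin_thread_to_cpu(pthread_t thread, int cpu)
{
	cpu_set_t set;

	assert(cpu >= 0 && cpu < CPU_SETSIZE);
	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	return pthread_setaffinity_np(thread, sizeof(set), &set);
}
```

With one offense and one defense thread pinned to each core, no core could
start out with only an offense thread on it. Whether pinning still tests
what sched_football is meant to test is exactly the open question.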

A note about my testing methodology:
After I realized that the execution often failed due to the offense thread
running after referee set the_ball to 0, I replaced the loop with just
usleep(10000), for faster iteration.
I tested on Ubuntu 19.04 with Linux 5.0.0-27 running in VMware and on
a custom Yocto distribution running Linux 4.19.59 (with and without RT
patches).

Jörg


