[LTP] [PATCH] sched_football: fix false failures on many-CPU systems

John Stultz jstultz@google.com
Wed Apr 15 22:21:02 CEST 2026


On Wed, Apr 15, 2026 at 8:20 AM Jan Polensky <japo@linux.ibm.com> wrote:
> On Wed, Apr 15, 2026 at 05:52:11PM +0800, Li Wang wrote:
> > Hi Soma and Jan,
> >
> > > > 1. RT throttling freezes all SCHED_FIFO threads simultaneously. On
> > > > release, the kernel does not always reschedule the highest-priority
> > > > thread first on every CPU, so offense briefly runs and increments
> > > > the_ball before defense is rescheduled. Fix by saving and disabling
> > > > sched_rt_runtime_us in setup and restoring it in a new cleanup
> > > > callback.
> >
> > Makes sense, and as the AI-reviewer points out, LTP provides an option
> > to save_restore it automatically.

Throttling shouldn't break the test. The fact that SCHED_NORMAL tasks
ran shouldn't change the ordering when we go back to running RT tasks.
This is likely a kernel bug.

> > > > 2. Offense and defense threads were unpinned, allowing the scheduler
> > > > to migrate them freely. An offense thread could land on a CPU with
> > > > no defense thread present and run unchecked. Fix by passing a CPU
> > > > index as the thread arg and calling sched_setaffinity() at thread
> > > > start. Pairs are distributed round-robin (i % ncpus) so each
> > > > offense thread shares its CPU with a defense thread.
> >
> > This is a good thought, as for SCHED_FIFO it manages the corresponding
> > runqueue for each CPU and simply picks the higher priority task to run.
> > So pinning the threads to each CPU makes sense, but maybe we could
> > only pin the defense because:
> >
> > With N defense threads pinned one per CPU, every CPU has a defense
> > thread at priority 30 permanently runnable. The offense threads at priority
> > 15, regardless of which CPU the scheduler places them on, will always find
> > a higher-priority defense thread on the same CPU's runqueue. Since
> > SCHED_FIFO strictly favors the higher-priority runnable task, offense can
> > never be picked.
> >
> > Pinning offense as well would be redundant, it doesn't matter where offense
> > lands, because defense already covers every CPU. This also has the advantage
> > of letting the scheduler freely migrate offense threads without affecting
> > the test outcome, which avoids interfering with the kernel's load balancing
> > logic during the test.
> >
> > And, I'd suggest using tst_ncpus_available() instead of get_numcpus()
> > when distributing defense threads across CPUs, in case some CPUs are
> > offline. Pinning a defense thread to an offline CPU would leave that
> > CPU uncovered and allow offense to run unchecked. See:

I didn't see the original patch here, but the whole point of
sched_football is to ensure the top <num cpu> (unaffined) priority
tasks are always run and no lower priority rt tasks are run instead.

So none of the tasks should be pinned to any cpus. The scheduler is
supposed to ensure the RT invariant holds.
There are some known bugs at the moment that will cause sched_football
to fail (the RT_PUSH_IPI feature, for instance). That's a problem with
the kernel, not the test.
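
To make the invariant concrete, here is a rough sketch of the property
being checked. This is not code from sched_football or the kernel; the
struct rt_task type and rt_invariant_holds() are hypothetical names made
up for illustration, and it assumes a single point-in-time snapshot of
unpinned SCHED_FIFO tasks where a higher prio value means a more
important task.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of one runnable, unpinned SCHED_FIFO task. */
struct rt_task {
	int prio;	/* RT priority, higher value wins */
	bool running;	/* true if the task is currently on a CPU */
};

/*
 * Returns true if the snapshot is consistent with the RT invariant:
 * no CPU is idle while an RT task waits, and no running task has a
 * lower priority than a task that is runnable but waiting.
 */
bool rt_invariant_holds(const struct rt_task *tasks, size_t n, size_t ncpus)
{
	size_t nr_running = 0;
	int min_running = 0, max_waiting = 0;
	bool have_running = false, have_waiting = false;

	assert(ncpus > 0);

	for (size_t i = 0; i < n; i++) {
		if (tasks[i].running) {
			nr_running++;
			if (!have_running || tasks[i].prio < min_running)
				min_running = tasks[i].prio;
			have_running = true;
		} else {
			if (!have_waiting || tasks[i].prio > max_waiting)
				max_waiting = tasks[i].prio;
			have_waiting = true;
		}
	}

	/* More running tasks than CPUs is not a valid snapshot. */
	if (nr_running > ncpus)
		return false;
	/* An RT task is waiting while a CPU has nothing to run. */
	if (have_waiting && nr_running < ncpus)
		return false;
	/* A lower priority task runs while a higher priority one waits. */
	if (have_running && have_waiting && max_waiting > min_running)
		return false;

	return true;
}
```

With two CPUs, two defense tasks at priority 30 and two offense tasks at
priority 15, the check passes only when both defense tasks are the ones
running. If an offense task is on a CPU while a defense task waits, it
fails, which is the condition the test is meant to catch without any
pinning.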

thanks
-john

