[LTP] [PATCH v1 1/1] sched_football: harden kickoff synchronization on high-CPU systems

John Stultz jstultz@google.com
Wed Feb 25 20:11:53 CET 2026


On Wed, Feb 25, 2026 at 1:23 AM Andrea Cervesato
<andrea.cervesato@suse.com> wrote:
> On Tue Feb 24, 2026 at 10:03 PM CET, John Stultz via ltp wrote:
> > On Tue, Feb 24, 2026 at 2:45 AM Jan Polensky <japo@linux.ibm.com> wrote:
> > >
> > > The sched_football test has been intermittently failing, most noticeably
> > > on systems with many CPUs and/or under load, due to a startup ordering
> > > hole around kickoff.
> > >
> >
> > I've not had time to closely review your suggestion here, but it
> > sounds reasonable.
> >
> > That said, I want to warn you and ensure you are aware: the
> > RT_PUSH_IPI feature in the scheduler breaks the RT invariant
> > sched_football is testing.
> >
> > I see this as a bug with that feature, but the scalability RT_PUSH_IPI
> > allows for seems more important to folks who are doing RT work than
> > the misbehavior.  Steven and I talked a while back about some ideas on
> > how we might be able to do the pull in a more efficient way while
> > still holding the invariant true, and I have a bug to track it, but
> > it's not been high enough priority to get bandwidth yet.
> >
> > So you might want to make sure you disable that feature before testing via:
> > # echo NO_RT_PUSH_IPI > /sys/kernel/debug/sched/features
> >
> > thanks
> > -john
>
> Thanks for your deep analysis of the possible issue. I'm not an RT expert,
> but I trust your expertise here :-) I will leave this patch review to
> someone who's more skilled than me in this topic.
>
> I have a suggestion, though.
>
> About NO_RT_PUSH_IPI, we are lucky: LTP provides a safe mechanism to
> set system configuration values and to restore them to their original
> values after the test. You can find this in `struct tst_test`; it's
> called `.save_restore` [1]
>
> I think it's worth forcing this option according to the underlying
> variant (and properly documenting this in the code with a comment).
>
> WDYT?

That seems reasonable, as long as it's clearly labeled as a workaround
and hopefully can be dropped (or kernel version limited) when the
issue is finally addressed.

thanks
-john
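
A workaround that disables RT_PUSH_IPI for the test and puts it back
afterwards first needs to know the feature's original state. The
features file lists every scheduler feature by name, with disabled ones
prefixed by "NO_". Below is a minimal sketch of that state check,
assuming the file contents have already been read into a string. The
helper name sched_feat_state() is hypothetical and is not part of LTP
or of the patch under discussion.

```c
#include <string.h>

/*
 * Hypothetical helper (not an LTP API): report the state of one
 * scheduler feature in a whitespace-separated feature list, as read
 * from /sys/kernel/debug/sched/features.
 *
 * Returns  1 if the feature is listed as enabled,
 *          0 if it is listed as disabled ("NO_" prefix),
 *         -1 if it is not listed at all (e.g. not built into the kernel).
 */
static int sched_feat_state(const char *features, const char *feature)
{
	const char *p = features;
	size_t flen = strlen(feature);

	while (*p) {
		size_t len;

		/* Skip separators, then measure the next token. */
		p += strspn(p, " \t\n");
		len = strcspn(p, " \t\n");
		if (!len)
			break;

		/* Exact match: the feature is enabled. */
		if (len == flen && !strncmp(p, feature, flen))
			return 1;

		/* "NO_" followed by an exact match: the feature is disabled. */
		if (len == flen + 3 && !strncmp(p, "NO_", 3) &&
		    !strncmp(p + 3, feature, flen))
			return 0;

		p += len;
	}

	return -1;
}
```

With this, a test could skip the workaround when the result is -1 or 0,
and otherwise write "NO_RT_PUSH_IPI" in setup and "RT_PUSH_IPI" in
cleanup. Whether `.save_restore` can handle this file directly, given
that it lists all features rather than a single value, is something to
verify before relying on it.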

