[LTP] [PATCH v3] thermal: add new test group

Petr Vorel pvorel@suse.cz
Thu Nov 20 17:30:16 CET 2025


Hi Piotr,

> This is a new test for checking thermal interrupt events.
> stress-ng is used because genload doesn't seem to generate enough load.

Thanks for info.  Maybe we should try to fix it later, because we really prefer
to use internal tools instead of external dependencies.

> This version adds min_runtime, switches echo to tst_res TINFO and TDEBUG and
> removes usage of bc. It also adds --quiet to stress-ng's arguments.

...
> diff --git a/runtest/thermal b/runtest/thermal
> new file mode 100644
> index 000000000..804ef7d79
> --- /dev/null
> +++ b/runtest/thermal
> @@ -0,0 +1,2 @@
> +#THERMAL
nit: '#THERMAL' does not bring much information value (the same name as the
file). Maybe:
# Thermal driver API
# https://docs.kernel.org/driver-api/thermal/

...
> +++ b/testcases/kernel/thermal/Makefile
> @@ -0,0 +1,11 @@
> +# SPDX-License-Identifier: GPL-2.0-or-later
> +# Copyright (c) 2025, Intel Corporation. All rights reserved.
> +# Author:Piotr Kubaj <piotr.kubaj@intel.com>
> +
> +top_srcdir             ?= ../../..
> +
> +include $(top_srcdir)/include/mk/env_pre.mk
> +
> +INSTALL_TARGETS                := thermal01.sh
nit: *.sh will help not update makefile in case more tests is added.

> +include $(top_srcdir)/include/mk/generic_leaf_target.mk
> diff --git a/testcases/kernel/thermal/thermal01.sh b/testcases/kernel/thermal/thermal01.sh
> new file mode 100755
> index 000000000..91e4c02c5
> --- /dev/null
> +++ b/testcases/kernel/thermal/thermal01.sh
> @@ -0,0 +1,102 @@
> +#!/usr/bin/env bash

Please, don't use env, don't expect bash.
#!/bin/sh

Please, test at least on the system where dash is /bin/sh

https://linux-test-project.readthedocs.io/en/latest/developers/writing_tests.html#shell-coding-style
(Doc does not mention busybox sh, but ideally it should be tested also on busybox sh.)

> +# SPDX-License-Identifier: GPL-2.0-or-later
> +# Copyright (C) 2025 Intel - http://www.intel.com/
> +#
> +# ---
> +# doc
> +# Tests the CPU package thermal sensor interface for Intel platforms.
> +#
> +# It works by checking the initial count of thermal interrupts. Then it
> +# decreases the threshold for sending a thermal interrupt to just above
> +# the current temperature and runs a workload on the CPU. Finally, it restores
> +# the original thermal threshold and checks whether the number of thermal
> +# interrupts increased.
> +# ---
> +#
> +# ---
> +# env
> +# {
> +#  "needs_root": true,
> +#  "supported_archs": ["x86", "x86_64"],
> +#  "needs_cmds": ["stress-ng"],
> +#  "min_runtime": 180
> +# }
> +# ---
> +
> +. tst_loader.sh
> +
> +thermal_zone_numbers=""
> +temp=""
> +temp_high=""
> +status=0

These variables do not need to be global, because you use them only in a single
function.  Could you please move them to tst_test()? And please use local?

tst_test()
{
	local thermal_zone_numbers temp temp_high
	local status=0

> +tst_test()
> +{
> +	line=$(grep "Thermal event interrupts" /proc/interrupts)
Actually any variable should use local.

> +	if [ $? -eq 0 ]; then
> +		interrupt_array_init=$(echo "$line" | tr -d "a-zA-Z:" | awk '{$1=$1;print}')
> +		tst_res TDEBUG "Initial values of thermal interrupt counters: $interrupt_array_init"
> +		num=$(nproc)

Could you please use: tst_getconf _NPROCESSORS_ONLN
Again, we try to avoid external dependencies (tst_getconf is LTP minimal getconf implementation).

> +		tst_res TDEBUG "Number of logical cores: $num"
> +	else
> +		tst_brk TCONF "Thermal event interrupts is not found."
nit: unneeded dot at the end.
> +	fi
> +
> +	# Below we check for the thermal_zone which uses x86_pkg_temp driver
> +	thermal_zone_numbers=$(grep -l x86_pkg_temp /sys/class/thermal/thermal_zone*/type | sed 's/[^0-9]//g' | tr -t '\n' ' ')

This will not work on busybox tr implementation:

tr: unrecognized option: t
BusyBox v1.36.1 (2023-11-07 18:53:09 UTC) multi-call binary.

Usage: tr [-cds] STRING1 [STRING2]

Yes, the time and pain trying to write portable shell is why I asked first to
rewrite into C which is more portable and reliable. (The other is to allow test
working without coreutils/busybox tools as external dependencies.) And using
more shell pipes increases the risk the code will fail.

First, "tr -t '\n' ' '" part is not needed at all, it will work without it:
thermal_zone_numbers=$(grep -l x86_pkg_temp /sys/class/thermal/thermal_zone*/type | sed 's/[^0-9]//g')

Do we need really to know the number? Can't we just store whole file and later
use dirname to get directory?

setup()
{
	...
	THERMAL_ZONE_NUMBERS="$(grep -l x86_pkg_temp /sys/class/thermal/thermal_zone*/type)"
}

tst_test()
{
	local i file tmp

	for i in $THERMAL_ZONE_NUMBERS; do
		file="$(dirname $i)/temp"
		temp=$(cat "$file")
	...
}

I checked on 2 x86_64 machines, with 8 and 2 thermal_zone files. Both had only
one x86_pkg_temp type, always the highest number. I wonder if we need loop at
all.

FYI C API has .save_restore and tst_run_shell.c supports it (see
testcases/kernel/mem/vma/vma05_vdso.c). Unfortunately we cannot use it, because
we are searching for one particular file from possible several.

NOTE: We usually try to do any preparation in test setup function (declared in
TST_SETUP=...  see testcases/lib/tests/shell_loader_setup_cleanup.sh) and do
cleanup in a a test cleanup function (TST_CLEANUP=...). Why? At least cleanup is
useful to have separated in case test quit during failure (e.g. tst_brk quits
testing earlier).

TST_SETUP="setup"
TST_CLEANUP="cleanup"

setup()
{
	local line

	line=$(grep "Thermal event interrupts" /proc/interrupts)

	if [ $? -ne 0 ]; then
		tst_brk TCONF "Thermal event interrupts is not found."
	fi

	tst_res TDEBUG "Number of logical cores: $(tst_getconf _NPROCESSORS_ONLN)"
	THERMAL_ZONE_NUMBERS="$(grep -l x86_pkg_temp /sys/class/thermal/thermal_zone*/type)"
}

cleanup()
{
	local i

	for i in $THERMAL_ZONE_NUMBERS; do
		file="$(dirname $i)/trip_point_1_temp"
		temp=$(cat "$file")
		echo "..." > $file
		=> but here we would need to reuse values previously stored in setup().
		Maybe not worth to complicate with setup+cleanup unless we test only
		single sysfs file.
	...
}

> +	tst_res TINFO "x86_pkg_temp thermal zones: $thermal_zone_numbers"
> +
> +	if [ -z $thermal_zone_numbers ]; then
> +		tst_brk TCONF "No x86_pkg_temp thermal zones found"
> +	fi
> +	for i in $thermal_zone_numbers; do
> +		tst_res TINFO "Currently testing x86_pkg_temp thermal_zone$i"
> +		TEMP=/sys/class/thermal/thermal_zone$i/temp
> +		temp=$(cat "$TEMP")
> +		tst_res TDEBUG "thermal_zone$i current temperature is $temp"
> +		case $temp in

> +		[0-9]*) ;;
> +		*)
> +			tst_brk TBROK "Unexpected zone temperature value $temp";;
> +		esac
> +		trip=$(cat /sys/class/thermal/thermal_zone$i/trip_point_1_temp)
> +		# Setting trip_point_1_temp for termal_zone$i to $temp + 10 (0.001°C)
> +		temp_high=$(( temp + 10 ))
> +		echo $temp_high > /sys/class/thermal/thermal_zone$i/trip_point_1_temp
> +		run_time=30
> +		sleep_time=10
> +		while [ $sleep_time -gt 0 ]; do
> +			stress-ng --matrix 0 --timeout $run_time --quiet

You happily ignore stress-ng exit code. IMHO it should be checked.

I sent a patch which adds ROD() and other functions
https://lore.kernel.org/ltp/20251120161957.331580-1-pvorel@suse.cz/

ROD stress-ng --matrix 0 --timeout $run_time --quiet

> +			temp_cur=$(cat "$TEMP")
> +			tst_res TDEBUG "temp_cur: $temp_cur"
> +			[ $temp_cur -gt $temp_high ] && break
> +			sleep $sleep_time
We have tst_sleep binary, please use it.

> +			run_time=$(( run_time - 3 ))
> +			sleep_time=$(( sleep_time - 1 ))
> +		done
> +		[ $temp_cur -gt $temp_high ] || tst_res TFAIL "Zone temperature is not rising as expected"
> +
> +		# Restore the original trip_point_1_temp value
> +		echo $trip > /sys/class/thermal/thermal_zone$i/trip_point_1_temp
nit: quoting echo is always safe

echo "$trip" > ...

In my case echo fails.

thermal01.sh:53: TINFO: Currently testing x86_pkg_temp thermal_zone8
/opt/ltp/testcases/bin/thermal01.sh: line 80: echo: write error: Invalid argument

> +
> +		# Check whether thermal interrupts count actually increased
> +		interrupt_array_later=$(grep "Thermal event interrupts" /proc/interrupts | \
> +			tr -d "a-zA-Z:" | awk '{$1=$1;print}')
Maybe, when we need to use awk, avoid tr? advantage: remove one pipe,
disadvantage: awk code less readable.

		interrupt_array_later=$(grep "Thermal event interrupts" /proc/interrupts | \
			awk '{ gsub(/[A-Za-z:]/, ""); $1 = $1 }1'

The rest LGTM.

Kind regards,
Petr

> +		tst_res TDEBUG "Current values of thermal interrupt counters: $interrupt_array_later"
> +		for j in $(seq 1 "$num"); do
> +			interrupt_later=$(echo "$interrupt_array_later" | cut -d " " -f  "$j")
> +			interrupt_init=$(echo "$interrupt_array_init" | cut -d " " -f  "$j")
> +			if [ $interrupt_later -le $interrupt_init ]; then
> +				status=1
> +			fi
> +		done
> +	done
> +
> +	if [ $status -eq 0 ]; then
> +		tst_res TPASS "x86 package thermal interrupt triggered"
> +	else
> +		tst_res TFAIL "x86 package thermal interrupt did not trigger"
> +	fi
> +}
> +
> +. tst_run.sh


More information about the ltp mailing list