[LTP] [PATCH v3] thermal: add new test group
Petr Vorel
pvorel@suse.cz
Thu Jan 22 15:07:06 CET 2026
Hi Piotr,
not a complete review, just few thoughts.
> diff --git a/testcases/kernel/thermal/thermal_interrupt_events.c b/testcases/kernel/thermal/thermal_interrupt_events.c
...
> +/*\
> + * Tests the CPU package thermal sensor interface for Intel platforms.
> +
> + * Works by checking the initial count of thermal interrupts.
IMHO this part should be in setup function.
> Then it
> + * decreases the threshold for sending a thermal interrupt to just above
> + * the current temperature and runs a workload on the CPU.
First, why test needs to run for 30 sec and then sleep for 10 sec?
Is it possible to check the expected value from sysfs and quit testing before?
Also we have TST_RETRY_FUNC() and TST_RETRY_FN_EXP_BACKOFF() which could be used
for the loop (you create a function which does the check to quit before the
timeout.
> Finally, it restores
> + * the original thermal threshold and checks whether the number of thermal
> + * interrupts increased.
NOTE: the restore will not work if test exits before for some reason
(i.e. any of SAFE_*() macros fail). Therefore restore should be in a cleanup
function.
> + */
> +
> +#include "tst_safe_stdio.h"
> +#include "tst_test.h"
> +#include <ctype.h>
> +#include <pthread.h>
> +#define PATH_LEN 69
> +#define STRING_LEN 23
> +
> +static void *cpu_workload(void *arg)
> +{
> + time_t start_time = time(NULL);
> + int num = 2;
> +
> + while (difftime(time(NULL), start_time) < *(double *)arg) {
difftime() is sufficient, but FYI we have include/tst_timer.h for
measuring time difference (which uses more precise struct timeval).
> + for (int i = 2; i * i <= num; i++) {
> + if (num % i == 0)
> + break;
> + }
> + num++;
This is supposed to do some real workload on CPU? Is that really enough?
> + }
> + return NULL;
> +}
...
> +static void run(void)
> +{
> + bool status = 1;
> + char line[8192];
> + int nproc = tst_ncpus();
> + uint64_t interrupt_init[nproc], interrupt_later[nproc];
> +
> + tst_res(TDEBUG, "Number of logical cores: %d", nproc);
> + read_interrupts(interrupt_init, nproc);
> +
> + DIR *dir = SAFE_OPENDIR("/sys/class/thermal/");
> + struct dirent *entry;
> + uint8_t tz_counter = 0;
I'm not sure if AI advices you to use it, but really we don't optimize that much
that we could not afford just to use plain int. Using exact-width types is
usually used only when needed, e.g. when mapping some struct from kernel.
> + while ((entry = SAFE_READDIR(dir))) {
> + if ((strncmp(entry->d_name, "thermal_zone", sizeof("thermal_zone"))) > 0)
> + tz_counter++;
> + }
> + SAFE_CLOSEDIR(dir);
> + tst_res(TDEBUG, "Found %d thermal zone(s)", tz_counter);
I still think this part can be done in setup and needed values stored in static
variables (outside function), i.e. static int tz_counter, nproc; Why? Because some
people can run test 1000x via "-i 1000", therefore we cache various preparation
results.
> +
> + bool x86_pkg_temp_tz[tz_counter], x86_pkg_temp_tz_found = 0;
> +
> + memset(x86_pkg_temp_tz, 0, sizeof(x86_pkg_temp_tz));
> +
> + for (uint8_t i = 0; i < tz_counter; i++) {
> + char path[PATH_LEN];
> +
> + snprintf(path, PATH_LEN, "/sys/class/thermal/thermal_zone%d/type", i);
> + tst_res(TDEBUG, "Checking whether %s is x86_pkg_temp", path);
> +
> + SAFE_FILE_SCANF(path, "%s", line);
> + if (strstr(line, "x86_pkg_temp")) {
> + tst_res(TDEBUG, "Thermal zone %d uses x86_pkg_temp", i);
> + x86_pkg_temp_tz[i] = 1;
> + x86_pkg_temp_tz_found = 1;
> + }
> + }
> + if (!x86_pkg_temp_tz_found) {
> + tst_res(TINFO, "No thermal zone uses x86_pkg_temp");
> + status = 0;
> + }
> +
> + for (uint8_t i = 0; i < tz_counter; i++) {
Again, please just use int.
> + if (x86_pkg_temp_tz[i]) {
> + char path[PATH_LEN], temp_path[PATH_LEN], trip_path[PATH_LEN], temp_high[12], trip[12];
You read and write integer values. Using char array is overcomplicated:
int trip, temp_high = temp + 10;
(But maybe have 10 defined as a constant with meaningful name. That is kind of
documentation (instead reader having to read the code to understand the meaning
of the number).
#define TEMP_INCREMENT 10
> + double run_time = 30;
#define RUNTIME 30
> + uint8_t sleep_time = 10;
#define SLEEP 10
> + int temp;
> +
> + snprintf(path, PATH_LEN, "/sys/class/thermal/thermal_zone%d/", i);
> + strncpy(temp_path, path, PATH_LEN);
> + strncat(temp_path, "temp", 4);
> + tst_res(TDEBUG, "Testing %s", temp_path);
nit: I'd put this as TINFO to get at least some debug info without -D.
> + SAFE_FILE_SCANF(temp_path, "%d", &temp);
All SAFE_*() macros quit testing, therefore the following check is not needed.
> + if (temp < 0) {
> + tst_brk(TBROK, "Unexpected zone temperature value %d", temp);
> + status = 0;
> + }
> + tst_res(TDEBUG, "Current temperature for %s: %d", path, temp);
> +
This would not be needed:
> + snprintf(temp_high, sizeof(temp_high), "%d", temp + 10);
> +
> + strncpy(trip_path, path, PATH_LEN);
> + strncat(trip_path, "trip_point_1_temp", 17);
> +
> + tst_res(TDEBUG, "Setting new trip_point_1_temp value: %s", temp_high);
> + SAFE_FILE_SCANF(trip_path, "%s", trip);
SAFE_FILE_SCANF(trip_path, "%d", &trip);
> + SAFE_FILE_PRINTF(trip_path, "%s", temp_high);
SAFE_FILE_PRINTF(trip_path, "%d", temp_high);
> +
> + while (sleep_time > 0) {
> + tst_res(TDEBUG, "Running for %f seconds, then sleeping for %d seconds", run_time, sleep_time);
> + pthread_t threads[nproc];
> +
> + for (uint16_t j = 0; j < nproc; j++)
> + pthread_create(&threads[j], NULL, cpu_workload, &run_time);
IMHO using pthread (and therefore -lpthread) is overkill.
Could you please use simple fork()?
testcases/cve/cve-2017-2618.c
for (i = 0; i < nproc; i++) {
if (!SAFE_FORK()) {
cpu_workload();
exit(0);
}
tst_reap_children();
}
> + for (uint16_t j = 0; j < nproc; j++)
> + pthread_join(threads[j], NULL);
> +
> + SAFE_FILE_SCANF(temp_path, "%d", &temp);
> + tst_res(TDEBUG, "Temperature for %s after a test: %d", path, temp);
> +
> + if (temp > atol(temp_high))
> + break;
> + sleep(sleep_time--);
> + run_time -= 3;
> + }
> + if (temp <= atol(temp_high)) {
> + tst_res(TINFO, "Zone temperature is not rising as expected");
> + status = 0;
> + }
> +
> + tst_res(TDEBUG, "Restoring original trip_point_1_temp value: %s", trip);
> + SAFE_FILE_PRINTF(trip_path, "%s", trip);
> + }
> + }
> + read_interrupts(interrupt_later, nproc);
> +
> + for (uint16_t i = 0; i < nproc; i++) {
> + if (interrupt_later[i] < interrupt_init[i]) {
> + tst_res(TINFO, "For CPU %d interrupt counter is currently %ld, while it was %ld before the test", i, interrupt_later[i], interrupt_init[i]);
> + status = 0;
> + }
> + }
> +
> + if (status)
> + tst_res(TPASS, "x86 package thermal interrupt triggered");
> + else
> + tst_res(TFAIL, "x86 package thermal interrupt did not trigger");
> +}
> +
> +static struct tst_test test = {
> + .min_runtime = 180,
> + .needs_root = true,
nit: We always set it as 1.
Kind regards,
Petr
> + .supported_archs = (const char *const []) {
> + "x86",
> + "x86_64",
> + NULL
> + },
> + .test_all = run
> +};
More information about the ltp
mailing list