[LTP] [PATCH v3 2/3] lib/tst_test.c: Update result counters when calling tst_brk()
Jan Stancek
jstancek@redhat.com
Mon Jan 7 18:39:35 CET 2019
----- Original Message -----
> Hi!
> > 1) Catch and report the TFAIL exit status of child process.
>
> Looking at the codebase we do have a few usages of tst_brk(TFAIL, "...")
> to exit the child process, which sort of works but is incorrect.
> tst_brk() has always meant "an unrecoverable failure has happened, exit
> the current process as fast as possible". Most of the tst_brk(TFAIL, "...")
> calls in our codebase should not actually cause the main test process to
> exit; they were only meant to exit the child and report the result in
> one call. For instance, such a call breaks a test run with the -i option
> on the first failure, which is incorrect.
Nice example, would you care to add that to docs?
>
> So if we ever want to have a function to exit child process with a result we
> should implement tst_ret() that would be equivalent to tst_res() followed by
> exit(0).
>
> It could be even implemented as:
>
> #define tst_ret(ttype, fmt, ...) \
> 	do { \
> 		tst_res_(__FILE__, __LINE__, (ttype), (fmt), ##__VA_ARGS__); \
> 		exit(0); \
> 	} while (0)
>
> This function has one big advantage: it increments the results counters
> before the child process exits.
If all call sites switch to tst_ret(), we could add TFAIL to the tst_brk()
compile-time check.
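
A rough sketch of what such a check could look like, only to illustrate the
idea. Everything here (the demo_ names, the enum values, the negative array
size trick) is an assumption made for the example and not the library code.
The helper returns instead of exiting so the demo can be exercised:

```c
#include <stdio.h>

/* Hypothetical result types; the values are for this demo only. */
enum demo_ttype {
	DEMO_TFAIL = 1,
	DEMO_TBROK = 2,
	DEMO_TCONF = 4,
};

/* A true condition yields a negative array size, which fails the build. */
#define DEMO_BUILD_BUG_ON(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

/* Stand-in for the real reporting function; returns instead of exiting. */
static int demo_brk_(const char *file, int line, int ttype, const char *msg)
{
	fprintf(stderr, "%s:%d: type %d: %s\n", file, line, ttype, msg);
	return ttype;
}

/* Accepts only DEMO_TBROK and DEMO_TCONF when ttype is a constant. */
#define demo_brk(ttype, msg) \
	(DEMO_BUILD_BUG_ON((ttype) & ~(DEMO_TBROK | DEMO_TCONF)), \
	 demo_brk_(__FILE__, __LINE__, (ttype), (msg)))
```

With this, demo_brk(DEMO_TBROK, "...") compiles, while
demo_brk(DEMO_TFAIL, "...") is rejected by the compiler. The check only
works when ttype is a compile-time constant.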
>
> Actually one of the big points of the new test library was that the
> results counters are atomically increased, because passing the results
> in exit values is a nightmare that cannot be done correctly.
>
> > 2) Only update result counters in library process and main test
> > process because the exit status of child can be reported by
> > main test process.
>
> Actually after spending some time on it I think that the best solution is
> to update the results in the piece of shared memory as fast as possible;
> anything else is prone to various races and corner cases.
I was thinking this too.
If your parent process happens to wait for the child itself,
then the library will never get to see the return code.
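
To illustrate the point, here is a minimal standalone sketch, assuming an
anonymous shared mapping and C11 atomics. The struct and function names are
made up for the example and are not the library's. The child records its
failure in shared memory and exits with 0, the parent reaps the child on its
own, and the result is still not lost:

```c
#define _GNU_SOURCE
#include <stdatomic.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the library's shared results structure. */
struct demo_results {
	atomic_int passed;
	atomic_int failed;
};

/*
 * Forks one child that records a failure in shared memory and exits 0.
 * Returns the number of failures the parent sees, or -1 on error.
 */
static int demo_count_child_failure(void)
{
	struct demo_results *res;
	pid_t pid;
	int failed;

	/* A MAP_SHARED mapping stays shared between parent and child. */
	res = mmap(NULL, sizeof(*res), PROT_READ | PROT_WRITE,
		   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (res == MAP_FAILED)
		return -1;

	atomic_init(&res->passed, 0);
	atomic_init(&res->failed, 0);

	pid = fork();
	if (pid < 0) {
		munmap(res, sizeof(*res));
		return -1;
	}

	if (pid == 0) {
		/* Child: bump the counter first, then exit with status 0. */
		atomic_fetch_add(&res->failed, 1);
		_exit(0);
	}

	/* The parent waits for the child itself and ignores the status. */
	if (waitpid(pid, NULL, 0) != pid) {
		munmap(res, sizeof(*res));
		return -1;
	}

	failed = atomic_load(&res->failed);
	munmap(res, sizeof(*res));

	return failed;
}
```

Since the counter is updated before the child exits, it does not matter who
calls waitpid() or what the exit status was.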
Regards,
Jan
>
> > 3) Print TCONF message and increase skipped when calling tst_brk(TCONF).
> > Print TBROK message and increase broken when calling tst_brk(TBROK).
> > Print TFAIL message and increase failed when calling tst_brk(TFAIL).
> > 4) Remove duplicate update_results() in run_tcases_per_fs().
>
> I've been thinking about this and the problem is more complex, and I'm
> not even sure that it's possible to write the library so that the
> counters are consistent at the time we exit the test if something
> unexpected happened and we called tst_brk().
>
> Consider for instance this example:
>
> #include "tst_test.h"
>
> static void do_test(void)
> {
> 	if (!SAFE_FORK())
> 		tst_brk(TBROK, "child");
> 	tst_brk(TBROK, "parent");
> }
>
> static struct tst_test test = {
> 	.test_all = do_test,
> 	.forks_child = 1,
> };
>
> When tst_brk() is called both in parent and child the counter would be
> incremented only once because the child is not waited for by the main
> test.
>
> We can close this special case by changing the main test process to wait
> for the children before it calls exit() in tst_brk(), but that may cause
> the main process to get stuck indefinitely if the child processes get
> stuck, so we would have to be careful.
>
> Also from the very definition of the TBROK return status the test
> results would be incomplete at best, since TBROK really means
> "unrecoverable error happened during the test", which would mostly mean
> that something as low level as the filesystem got corrupted, and there is
> no point in presenting the results in that case. So I guess that the best
> we could do in the case of TBROK is to print a big message that says
> "things went horribly wrong!" or something similar.
>
> All in all I would like to avoid applying patches to the test library
> before we finalize the release, since there is not much time for
> testing now.
>
> --
> Cyril Hrubis
> chrubis@suse.cz
>