[LTP] [PATCH v1] fanotify22.c: handle multiple asynchronous error events

Fri Mar 6 16:19:18 CET 2026

On Fri 06-03-26 04:50:31, Wei Gao wrote:
> On Thu, Mar 05, 2026 at 04:50:57PM +0100, Jan Kara wrote:
> > On Thu 05-03-26 14:36:31, Wei Gao wrote:
> > > On Thu, Mar 05, 2026 at 10:36:04AM +0100, Jan Kara wrote:
> > > > > 2) tcase4 failure: The kernel may deliver errors as independent events
> > > > >    instead of a single merged event, The test logic is updated to
> > > > >    validate the expected error_count by either a single merged event
> > > > >    or the accumulation of multiple independent events in the buffer.
> > > > 
> > > > Did you investigate why the events didn't get merged in the kernel? If they
> > > > are against the same filesystem they should get merged AFAICS.
> > > > 
> > > Sorry i have no idea why this happen, I just add debug code into LTP case and 
> > > found the event not lost but deliver independent, this leads me to believe that
> > > the LTP should handle both scenarios—merged and independent events.
> > > I also not sure my patch is correct or not, that's also the reason i
> > > CC the patch to you :)
> > 
> > OK :). How easily is this reproducible? Because in principle event merging
> > is not *guaranteed* so it isn't wrong for the LTP test to handle split
> > events but before we complicate the test too much it would be good to
> > figure out why the kernel behaves in unexpected way and doesn't merge the
> > events...
> > 
> > 								Honza
> 
> Can not reproduce in my local env, but i can reproduce this issue in the OpenQA worker environment, 
> with a failure rate of approximately 30%.
> 
> By comparing the logs of successful and failed merge cases, I try to identified the root cause: 
> Event merging is bypassed when errors are handled by different kworker threads.
> 
> The core of the issue lies in how the event hash is calculated. As seen in the kernel source:
> hash ^= hash_long((unsigned long)pid | ondir, FANOTIFY_EVENT_HASH_BITS);
> 
> In the failed case, different threads (e.g., T1618 and T1550) processed different events errors. 
> This places the events into different hash buckets (e.g., Bucket 89 vs. 118), finally 
> fanotify_should_merge() is never even invoked.

Aha! That was the bit I was missing. Indeed with the new fserror code the
errors are generated from work items and as you say different worker
kthread can end up generating each event so they won't be merged.

So fixing this in LTP test definitely makes sense! As I wrote in earlier
email what I'd do is that I'd create a helper function in the test reading
fserror events from the kernel and merging them and returning final array
of merged events.

The side note is that for fserror events the pid is mostly useless because
it will be pid of some random kworker but I don't think it's serious enough
to try to deal with that in the kernel somehow.

								Honza

> 
> 
> Enable pr-debug info with following command:
> echo 'func fanotify_should_merge +p' > /sys/kernel/debug/dynamic_debug/control
> 
> merge failed log:
> [ 5005.318867] [   T2675] loop0: detected capacity change from 0 to 614400
> [ 5005.324451] [   T2675] /dev/zero: Can't lookup blockdev
> [ 5005.399749] [      C0] operation not supported error, dev loop0, sector 614272 op 0x9:(WRITE_ZEROES) flags 0x10800800 phys_seg 0 prio class 2
> [ 5005.431946] [   T2675] EXT4-fs (loop0): mounted filesystem 66a18a14-c058-4356-bf1d-f34b4139ad3f r/w with ordered data mode. Quota mode: none.
> [ 5005.450493] [   T2691] EXT4-fs (loop0): unmounting filesystem 66a18a14-c058-4356-bf1d-f34b4139ad3f.
> [ 5005.472350] [   T2691] EXT4-fs (loop0): mounted filesystem 66a18a14-c058-4356-bf1d-f34b4139ad3f r/w with ordered data mode. Quota mode: none.
> [ 5005.476698] [   T2691] EXT4-fs error (device loop0): ext4_lookup:1785: inode #32386: comm fanotify22: iget: bogus i_mode (377)
> [ 5005.479819] [   T1618] fanotify_group_event_mask: report_mask=4 mask=8000 data=00000000a5a60545 data_type=6
> [ 5005.484575] [   T1618] fanotify_handle_event: group=000000005be5930d mask=8000 report_mask=4
> [ 5005.485525] [   T2691] EXT4-fs error (device loop0): __ext4_remount:6804: comm fanotify22: Abort forced by user
> [ 5005.486874] [   T1618] fanotify_insert_event: group=000000005be5930d event=00000000f1581cb4 bucket=118  <<<<<<
> [ 5005.488983] [   T2691] Aborting journal on device loop0-8.
> [ 5005.492336] [   T1550] fanotify_group_event_mask: report_mask=4 mask=8000 data=000000007f826d40 data_type=6
> [ 5005.494414] [   T2691] EXT4-fs (loop0): Remounting filesystem read-only
> [ 5005.494505] [   T1550] fanotify_handle_event: group=000000005be5930d mask=8000 report_mask=4
> [ 5005.494746] [   T2691] EXT4-fs (loop0): re-mounted 66a18a14-c058-4356-bf1d-f34b4139ad3f ro.
> [ 5005.495052] [   T1550] fanotify_merge: group=000000005be5930d event=00000000c92d9500 bucket=89
> [ 5005.501280] [   T1550] fanotify_insert_event: group=000000005be5930d event=00000000c92d9500 bucket=89  <<<<<<
> [ 5006.500649] [   T2691] EXT4-fs (loop0): unmounting filesystem 66a18a14-c058-4356-bf1d-f34b4139ad3f.
> [ 5006.505916] [   T2691] EXT4-fs (loop0): warning: mounting fs with errors, running e2fsck is recommended
> [ 5006.509571] [   T2691] EXT4-fs (loop0): mounted filesystem 66a18a14-c058-4356-bf1d-f34b4139ad3f r/w with ordered data mode. Quota mode: none.
> [ 5006.512851] [   T2675] EXT4-fs (loop0): unmounting filesystem 66a18a14-c058-4356-bf1d-f34b4139ad3f.
> 
> merge ok log:
> [ 5115.542758] [   T2732] loop0: detected capacity change from 0 to 614400
> [ 5115.549082] [   T2732] /dev/zero: Can't lookup blockdev
> [ 5115.617432] [      C0] operation not supported error, dev loop0, sector 614272 op 0x9:(WRITE_ZEROES) flags 0x10800800 phys_seg 0 prio class 2
> [ 5115.637765] [   T2732] EXT4-fs (loop0): mounted filesystem f2264636-366f-4ba7-a6c1-3f7963ce677a r/w with ordered data mode. Quota mode: none.
> [ 5115.661989] [   T2744] EXT4-fs (loop0): unmounting filesystem f2264636-366f-4ba7-a6c1-3f7963ce677a.
> [ 5115.690910] [   T2744] EXT4-fs (loop0): mounted filesystem f2264636-366f-4ba7-a6c1-3f7963ce677a r/w with ordered data mode. Quota mode: none.
> [ 5115.695573] [   T2744] EXT4-fs error (device loop0): ext4_lookup:1785: inode #32386: comm fanotify22: iget: bogus i_mode (377)
> [ 5115.697835] [   T1618] fanotify_group_event_mask: report_mask=4 mask=8000 data=00000000a5a60545 data_type=6
> [ 5115.700048] [   T1618] fanotify_handle_event: group=000000004dc01100 mask=8000 report_mask=4
> [ 5115.702552] [   T1618] fanotify_insert_event: group=000000004dc01100 event=00000000c92d9500 bucket=48  <<<<<<
> [ 5115.704872] [   T2744] EXT4-fs error (device loop0): __ext4_remount:6804: comm fanotify22: Abort forced by user
> [ 5115.707250] [   T2744] Aborting journal on device loop0-8.
> [ 5115.708638] [   T1618] fanotify_group_event_mask: report_mask=4 mask=8000 data=00000000a5a60545 data_type=6
> [ 5115.712418] [   T1618] fanotify_handle_event: group=000000004dc01100 mask=8000 report_mask=4
> [ 5115.715217] [   T1618] fanotify_merge: group=000000004dc01100 event=00000000f1581cb4 bucket=48   <<<<<<
> [ 5115.716032] [   T2744] EXT4-fs (loop0): Remounting filesystem read-only
> [ 5115.717883] [   T1618] fanotify_should_merge: old=00000000c92d9500 new=00000000f1581cb4
> [ 5115.718698] [   T2744] EXT4-fs (loop0): re-mounted f2264636-366f-4ba7-a6c1-3f7963ce677a ro.
> [ 5116.725817] [   T2744] EXT4-fs (loop0): unmounting filesystem f2264636-366f-4ba7-a6c1-3f7963ce677a.
> [ 5116.731447] [   T2744] EXT4-fs (loop0): warning: mounting fs with errors, running e2fsck is recommended
> [ 5116.734565] [   T2744] EXT4-fs (loop0): mounted filesystem f2264636-366f-4ba7-a6c1-3f7963ce677a r/w with ordered data mode. Quota mode: none.
> [ 5116.738947] [   T2732] EXT4-fs (loop0): unmounting filesystem f2264636-366f-4ba7-a6c1-3f7963ce677a.
> 
> > 
> > > > > Reported-by: kernel test robot <oliver.sang@intel.com>
> > > > > Closes: https://lore.kernel.org/oe-lkp/202602042124.87bd00e3-lkp@intel.com
> > > > > Signed-off-by: Wei Gao <wegao@suse.com>
> > > > > ---
> > > > >  .../kernel/syscalls/fanotify/fanotify22.c     | 32 ++++++++++++++++---
> > > > >  1 file changed, 28 insertions(+), 4 deletions(-)
> > > > > 
> > > > > diff --git a/testcases/kernel/syscalls/fanotify/fanotify22.c b/testcases/kernel/syscalls/fanotify/fanotify22.c
> > > > > index 6578474a7..82eed7ba9 100644
> > > > > --- a/testcases/kernel/syscalls/fanotify/fanotify22.c
> > > > > +++ b/testcases/kernel/syscalls/fanotify/fanotify22.c
> > > > > @@ -53,6 +53,8 @@ static struct fanotify_fid_t null_fid;
> > > > >  static struct fanotify_fid_t bad_file_fid;
> > > > >  static struct fanotify_fid_t bad_link_fid;
> > > > >  
> > > > > +static int event_count;
> > > > > +
> > > > >  static void trigger_fs_abort(void)
> > > > >  {
> > > > >  	SAFE_MOUNT(tst_device->dev, MOUNT_PATH, tst_device->fs_type,
> > > > > @@ -88,7 +90,6 @@ static void trigger_bad_link_lookup(void)
> > > > >  			ret, BAD_LINK, errno, EUCLEAN);
> > > > >  }
> > > > >  
> > > > > -
> > > > >  static void tcase3_trigger(void)
> > > > >  {
> > > > >  	trigger_bad_link_lookup();
> > > > > @@ -176,9 +177,10 @@ static int check_error_event_info_error(struct fanotify_event_info_error *info_e
> > > > >  {
> > > > >  	int fail = 0;
> > > > >  
> > > > > -	if (info_error->error_count != ex->error_count) {
> > > > > -		tst_res(TFAIL, "%s: Unexpected error_count (%d!=%d)",
> > > > > -			ex->name, info_error->error_count, ex->error_count);
> > > > > +	if (info_error->error_count != ex->error_count && event_count != ex->error_count) {
> > > > > +		tst_res(TFAIL, "%s: Unexpected error_count (%d!=%d && %d!=%d)",
> > > > > +			ex->name, info_error->error_count, ex->error_count,
> > > > > +			event_count, ex->error_count);
> > > > >  		fail++;
> > > > >  	}
> > > > >  
> > > > > @@ -255,8 +257,30 @@ static void do_test(unsigned int i)
> > > > >  
> > > > >  	tcase->trigger_error();
> > > > >  
> > > > > +	usleep(100000);
> > > > > +
> > > > 
> > > > OK, but can you please add a comment why the sleep is here.
> > > > 
> > > Sure!
> > > > >  	read_len = SAFE_READ(0, fd_notify, event_buf, BUF_SIZE);
> > > > >  
> > > > > +	struct fanotify_event_metadata *metadata;
> > > > > +	size_t len = read_len;
> > > > > +
> > > > > +	event_count = 0;
> > > > > +
> > > > > +	for (metadata = (struct fanotify_event_metadata *)event_buf;
> > > > > +			FAN_EVENT_OK(metadata, len);
> > > > > +			metadata = FAN_EVENT_NEXT(metadata, len)) {
> > > > > +		event_count++;
> > > > > +		struct fanotify_event_info_error *info_error = get_event_info_error(metadata);
> > > > > +
> > > > > +		if (info_error) {
> > > > > +			tst_res(TINFO, "Event [%d]: errno=%d (expected %d), error_count=%d (expected total %d)",
> > > > > +					event_count, info_error->error, tcase->error,
> > > > > +					info_error->error_count, tcase->error_count);
> > > > > +		} else {
> > > > > +			tst_res(TINFO, "Event [%d]: No error info record found", event_count);
> > > > > +		}
> > > > > +	}
> > > > > +
> > > > 
> > > > This looks too lax to me. I think
> > > > 
> > > I guess your mean this is workaround for the issue? Are you suggesting that I should reconstruct the test case 
> > > to properly handle independent events rather than relying on the existing logic? 
> > > If so, I’d be happy to explore that and implement a more robust solution.
> > > 
> > > > -- 
> > > > Jan Kara <jack@suse.com>
> > > > SUSE Labs, CR
> > -- 
> > Jan Kara <jack@suse.com>
> > SUSE Labs, CR
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR