[LTP] [PATCH] fanotify10: Calling drop_cache three times to ensure the inode is evicted

Jan Kara jack@suse.cz
Fri Oct 18 13:52:24 CEST 2024


On Fri 18-10-24 15:13:53, Zizhi Wo wrote:
> In commmit 6df425bb7040 ("fanotify10: Calling drop_cache twice to ensure
> the inode is evicted"), the number of drop_cache operations was increased
> to two in order to prevent inodes from being left uncleared due to inodes
> and dentries being on different NUMA nodes, which would cause the testcase
> to fail.
> 
> However, during our local testing, I found that this testcase still
> occasionally fails:
> 
> fanotify10.c:771: TINFO: Test #27: don't ignore fs events created inside a parent with evicted ignore mark
> fanotify10.c:510: TPASS: Found 0 ignore marks which is in expected range 0-0
> fanotify10.c:510: TPASS: Found 0 ignore marks which is in expected range 0-0
> fanotify10.c:510: TPASS: Found 0 ignore marks which is in expected range 0-0
> fanotify10.c:510: TPASS: Found 1 ignore marks which is in expected range 0-8
> fanotify10.c:510: TPASS: Found 1 ignore marks which is in expected range 0-8
> fanotify10.c:510: TPASS: Found 1 ignore marks which is in expected range 0-8
> ......
> fanotify10.c:841: TPASS: group 0 (8) got 16 events: mask 20 pid=22962
> fanotify10.c:841: TPASS: group 1 (8) got 16 events: mask 20 pid=22962
> fanotify10.c:841: TPASS: group 2 (8) got 16 events: mask 20 pid=22962
> fanotify10.c:836: TFAIL: group 0 (4) with FAN_MARK_FILESYSTEM got unexpected number of events (15 != 16)
> fanotify10.c:836: TFAIL: group 1 (4) with FAN_MARK_FILESYSTEM got unexpected number of events (15 != 16)
> fanotify10.c:836: TFAIL: group 2 (4) with FAN_MARK_FILESYSTEM got unexpected number of events (15 != 16)
> ......

Sigh, I'm not sure we can ever make this work reliably.
 
> In this testcase(Test #27), ignore_path is "fs_mnt/testdir", and event_path
> is "fs_mnt/testdir/testfile". The purpose of ignore_path is to verify that
> the ignore_mark does not pin the corresponding inode. If any inodes remain
> after drop_cache, the test case will fail. Here, the ignore_path is located
> in a two-level directory structure, and two drop_cache operations might
> still not be enough.
> 
> Consider the scenario where the parent inode is on NUMA0, the parent dentry
> is on NUMA1, the child inode is on NUMA2, and the child dentry is on NUMA3.
> After the first drop_cache, the child dentry is cleared, its inode and
> parent dentry are placed in the *corresponding* lru link list; after the
> second drop_cache, the parent dentry and the child inode are cleared; only
> after the third drop_cache would the parent inode be fully released. The
> corresponding kernel flow is as follows:
> 
> drop_caches_sysctl_handler
>   drop_slab
>     // Traverse NUMA in order.
>     drop_slab_node
>     ...
>       // Free dentry, child dentry pin parent dentry and its inode.
>       prune_dcache_sb
>       ...
>         dentry_unlink_inode
>         ...
>           // Place the inode into its corresponding NUMA lru link list.
>           list_lru_add
>       /*
>        * Free inode. If the NUMA where the inode resides is different from
>        * its dentry, the inode may not be released this time.
>        */
>       prune_icache_sb
> 
> 			drop_cache1	drop_cache2	drop_cache3
> NUMA0: parent inode	exist		exist		free
> NUMA1: parent dentry	exist		free		free
> NUMA2: child inode	exist		free		free
> NUMA3: child dentry	free		free		free

Well, this is right but there's also the while ((freed >> shift++) > 1)
loop in drop_slab() which should generally make us loop as long as there's
something to reclaim. But yes, if in theory the only thing we can reclaim
is the child dentry in the first round, then what you suggest may happen.

> Due to the release of the dependency chain, the drop_cache cleanup also
> takes several times. Therefore, to be safe, three drop_cache operations are
> needed to handle the two-level directory structure.

OK, I'm willing to give this one last try. If it doesn't work out, I'd just
drop these tests until we can find a more reliable way of testing this.

Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> Signed-off-by: Zizhi Wo <wozizhi@huawei.com>
> ---
>  testcases/kernel/syscalls/fanotify/fanotify10.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/testcases/kernel/syscalls/fanotify/fanotify10.c b/testcases/kernel/syscalls/fanotify/fanotify10.c
> index ce5469580..eedd1442f 100644
> --- a/testcases/kernel/syscalls/fanotify/fanotify10.c
> +++ b/testcases/kernel/syscalls/fanotify/fanotify10.c
> @@ -515,7 +515,11 @@ static void drop_caches(void)
>  	if (syncfs(fd_syncfs) < 0)
>  		tst_brk(TBROK | TERRNO, "Unexpected error when syncing filesystem");
>  
> -	/* Need to drop twice to ensure the inode is evicted. */
> +	/*
> +	 * In order to ensure that the inode can be released in the two-tier
> +	 * directory structure, drop_cache is required three times.
> +	 */
> +	SAFE_FILE_PRINTF(DROP_CACHES_FILE, "3");
>  	SAFE_FILE_PRINTF(DROP_CACHES_FILE, "3");
>  	SAFE_FILE_PRINTF(DROP_CACHES_FILE, "3");
>  }
> -- 
> 2.39.2
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


More information about the ltp mailing list