[LTP] [linux-next:master] [block/bdev] 3c20917120: BUG:sleeping_function_called_from_invalid_context_at_mm/util.c

Tue Apr 8 20:51:03 CEST 2025

On Tue, Apr 08, 2025 at 11:02:40AM -0700, Darrick J. Wong wrote:
> On Tue, Apr 08, 2025 at 06:51:14PM +0100, Matthew Wilcox wrote:
> > On Tue, Apr 08, 2025 at 10:48:55AM -0700, Darrick J. Wong wrote:
> > > On Tue, Apr 08, 2025 at 10:24:40AM -0700, Luis Chamberlain wrote:
> > > > On Tue, Apr 8, 2025 at 10:06 AM Luis Chamberlain <mcgrof@kernel.org> wrote:
> > > > > Fun puzzle for the community is figuring out *why* oh why did a large folio
> > > > > end up being used on buffer-heads for your use case *without* an LBS
> > > > > device (logical block size) being present, as I assume you didn't have
> > > > > one, ie say a nvme or virtio block device with logical block size >
> > > > > PAGE_SIZE. The area in question would trigger on folio migration *only*
> > > > > if you are migrating large buffer-head folios. We only create those
> > > > 
> > > > To be clear, large folios for buffer-heads.
> > > > > if you have an LBS device and are leveraging the block device cache or a
> > > > > filesystem with buffer-heads with LBS (they don't exist yet other than
> > > > > the block device cache).
> > > 
> > > My guess is that udev or something tries to read the disk label in
> > > response to some uevent (mkfs, mount, unmount, etc), which creates a
> > > large folio because min_order > 0, and attaches a buffer head.  There's
> > > a separate crash report that I'll cc you on.
> > 
> > But you said:
> > 
> > > the machine is arm64 with 64k basepages and 4k fsblock size:
> > 
> > so that shouldn't be using large folios because you should have set the
> > order to 0.  Right?  Or did you mis-speak and use a 4K PAGE_SIZE kernel
> > with a 64k fsblocksize?
> 
> This particular kernel warning is arm64 with 64k base pages and a 4k
> fsblock size, and my suspicion is that udev/libblkid are creating the
> buffer heads or something weird like that.
> 
> On x64 with 4k base pages, xfs/032 creates a filesystem with 64k sector
> size and there's an actual kernel crash resulting from a udev worker:
> https://lore.kernel.org/linux-fsdevel/20250408175125.GL6266@frogsfrogsfrogs/T/#u
> 
> So I didn't misspeak, I just have two problems.  I actually have four
> problems, but the others are loop device behavior changes.

Right, but this warning only triggers for large folios.  So somehow
we've got a multi-page folio in the bdev's page cache.

Ah.  I see.

block/bdev.c:   mapping_set_folio_min_order(BD_INODE(bdev)->i_mapping,

so we're telling the bdev that it can go up to MAX_PAGECACHE_ORDER.
And then we call readahead, which will happily put order-2 folios
in the pagecache because of my bug that we've never bothered fixing.

We should probably fix that now, but as a temporary measure if
you'd like to put:

mapping_set_folio_order_range(BD_INODE(bdev)->i_mapping, min, min)

instead of the mapping_set_folio_min_order(), that would make the bug
no longer appear for you.
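
To illustrate why the two calls differ, here is a rough userspace model,
not kernel code.  It assumes mapping_set_folio_min_order(mapping, min)
behaves like mapping_set_folio_order_range(mapping, min,
MAX_PAGECACHE_ORDER), which is how I read include/linux/pagemap.h.
Every toy_* name is hypothetical, and TOY_MAX_PAGECACHE_ORDER is an
arbitrary stand-in since the real value depends on the kernel config.

```c
/* Toy userspace model of the folio order range; not kernel code. */
#include <assert.h>	/* only needed for the assert() checks described below */
#include <stdbool.h>

/* Arbitrary stand-in; the real MAX_PAGECACHE_ORDER depends on the config. */
#define TOY_MAX_PAGECACHE_ORDER 8u

struct toy_mapping {
	unsigned int min_order;	/* smallest folio order the mapping accepts */
	unsigned int max_order;	/* largest folio order the mapping accepts */
};

/* Models mapping_set_folio_order_range(): clamp, then store [min, max]. */
static void toy_set_folio_order_range(struct toy_mapping *m,
				      unsigned int min, unsigned int max)
{
	if (min > TOY_MAX_PAGECACHE_ORDER)
		min = TOY_MAX_PAGECACHE_ORDER;
	if (max > TOY_MAX_PAGECACHE_ORDER)
		max = TOY_MAX_PAGECACHE_ORDER;
	if (max < min)
		max = min;
	m->min_order = min;
	m->max_order = max;
}

/* Models mapping_set_folio_min_order(): the caller picks only the floor. */
static void toy_set_folio_min_order(struct toy_mapping *m, unsigned int min)
{
	toy_set_folio_order_range(m, min, TOY_MAX_PAGECACHE_ORDER);
}

/* Would a folio of this order be acceptable in the mapping? */
static bool toy_order_allowed(const struct toy_mapping *m, unsigned int order)
{
	return order >= m->min_order && order <= m->max_order;
}
```

With min = 0, toy_set_folio_min_order() leaves order 2 allowed, so
something like readahead is free to pick it.  With
toy_set_folio_order_range(m, 0, 0) only order 0 is allowed, which is the
effect of the temporary measure suggested above.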