[LTP] hugemmap24 failure on aarch64 with 512MB hugepages

Li Wang liwang@redhat.com
Fri Mar 10 14:54:46 CET 2023


Hi Jan,

On Thu, Mar 9, 2023 at 10:01 PM Li Wang <liwang@redhat.com> wrote:

>
>
> On Thu, Mar 9, 2023 at 6:01 PM Jan Stancek <jstancek@redhat.com> wrote:
>
>> On Thu, Mar 9, 2023 at 9:29 AM Li Wang <liwang@redhat.com> wrote:
>> >
>> > [Cc'ing Jan Stancek]
>> >
>> > On Wed, Mar 8, 2023 at 5:51 PM Cyril Hrubis <chrubis@suse.cz> wrote:
>> >>
>> >> Hi!
>> >> Looks like the hugemmap24 test fails on aarch64 with 512MB hugepages
>> >> since it attempts to MAP_FIXED at NULL address, any idea why aarch64 is
>> >> limited to 0x10000000 as slice boundary?
>> >
>> >
>> > It looks like a generic/randomly chosen slice_boundary, used as a
>> > basic gap between two available free neighboring slices.
>> >
>> >
>> https://github.com/libhugetlbfs/libhugetlbfs/commit/8ee2462f3f6eea72067641a197214610443576b8
>> >
>> https://github.com/libhugetlbfs/libhugetlbfs/commit/399cda578564bcd52553ab88827a82481b4034d1
>> >
>> > I guess it wouldn't hurt just to increase the size of the boundary,
>> > or we could skip the test on a big-page-size system like aarch64
>> > (with 512MB hugepages) if no free slices can be found.
>> >
>> > Test passed from my side with patch:
>> >
>> > --- a/testcases/kernel/mem/hugetlb/hugemmap/hugemmap24.c
>> > +++ b/testcases/kernel/mem/hugetlb/hugemmap/hugemmap24.c
>> > @@ -37,7 +37,7 @@ static int init_slice_boundary(int fd)
>> >  #else
>> >         /* powerpc: 256MB slices up to 4GB */
>> >         slice_boundary = 0x00000000;
>> > -       slice_size = 0x10000000;
>> > +       slice_size = 0x100000000;
>>
>> This would likely negatively impact 32-bit, as it makes slice size 4GB.
>>
>> With such large hugepages it underflows the mmap address, so I'd
>> increase it until we start with one larger than zero:
>>
>> diff --git a/testcases/kernel/mem/hugetlb/hugemmap/hugemmap24.c
>> b/testcases/kernel/mem/hugetlb/hugemmap/hugemmap24.c
>> index a465aad..9523067 100644
>> --- a/testcases/kernel/mem/hugetlb/hugemmap/hugemmap24.c
>> +++ b/testcases/kernel/mem/hugetlb/hugemmap/hugemmap24.c
>> @@ -23,7 +23,7 @@
>>
>>  static int  fd = -1;
>>  static unsigned long slice_boundary;
>> -static long hpage_size, page_size;
>> +static unsigned long hpage_size, page_size;
>>
>>  static int init_slice_boundary(int fd)
>>  {
>> @@ -40,6 +40,10 @@ static int init_slice_boundary(int fd)
>>         slice_size = 0x10000000;
>>  #endif
>>
>> +       /* avoid underflow on systems with large huge pages */
>> +       while (slice_boundary + slice_size < 2 * hpage_size)
>> +               slice_boundary += slice_size;
>> +
>>         /* dummy malloc so we know where is heap */
>>         heap = malloc(1);
>>         free(heap);
>>
>>
>> Another issue however is the use of MAP_FIXED, which can stomp over
>> existing mappings:
>>
>
> If we make the slice_size larger than 2*hpage_size, this situation
> will be avoided, I guess, because the gap between the two neighboring
> slices guarantees there is no chance of overlap.
>

Please ignore this comment; I misunderstood it last night.

The root cause is very likely what you analyzed: the randomly
picked 'slice_boundary' easily remaps over an address already in
use, and that collision triggers a SIGSEGV.

I haven't figured out which part overlaps, but from my debugging the
broken process can't even read the complete contents of '/proc/pid/maps'.

This seems quite tricky to avoid, but what I can think of now is to
choose a 'slice_boundary' above the heap address:

+       while (slice_boundary + slice_size < (unsigned long)heap + 2*hpage_size)
+               slice_boundary += slice_size;

With this change, the test no longer fails on that aarch64 (512MB)
machine.

At least it chooses a boundary above the heap, which avoids some of
the collisions, I guess.


# ./hugemmap24
tst_hugepage.c:83: TINFO: 4 hugepage(s) reserved
tst_test.c:1560: TINFO: Timeout per run is 0h 00m 30s
hugemmap24.c:47: TINFO: heap is 0x2d533030
hugemmap24.c:63: TINFO: can't use slice_boundary: 0x70000000: EINVAL (22)
hugemmap24.c:75: TINFO: using slice_boundary: 0x80000000
hugemmap24.c:92: TINFO: Testing with hpage above & normal below the slice_boundary
hugemmap24.c:114: TINFO: mremap(0x7fff0000, 65536, 131072, 0) disallowed
hugemmap24.c:130: TINFO: Testing with normal above & hpage below the slice_boundary
hugemmap24.c:152: TINFO: mremap(0x60000000, 536870912, 1073741824, 0) disallowed
hugemmap24.c:165: TPASS: Successful

hugemmap24.c:166: TINFO: p = 0xa0000000, q = 0x60000000

=====================
00400000-00430000 r-xp 00000000 fd:00 1918877
 /root/ltp.upstream/testcases/kernel/mem/hugetlb/hugemmap/hugemmap24
00430000-00440000 r--p 00020000 fd:00 1918877
 /root/ltp.upstream/testcases/kernel/mem/hugetlb/hugemmap/hugemmap24
00440000-00450000 rw-p 00030000 fd:00 1918877
 /root/ltp.upstream/testcases/kernel/mem/hugetlb/hugemmap/hugemmap24
00450000-00460000 rw-p 00000000 00:00 0

2d530000-2d560000 rw-p 00000000 00:00 0
 [heap]
60000000-80000000 rw-s 00000000 00:2e 41091
 /tmp/LTP_hug8eJ8A5/hugetlbfs/ltp_hugD1qB5w (deleted)
a0000000-a0010000 rw-s 00000000 00:01 41093
 /dev/zero (deleted)

ffff947f0000-ffff94950000 r-xp 00000000 fd:00 33555860
/usr/lib64/libc-2.28.so
ffff94950000-ffff94960000 r--p 00150000 fd:00 33555860
/usr/lib64/libc-2.28.so
ffff94960000-ffff94970000 rw-p 00160000 fd:00 33555860
/usr/lib64/libc-2.28.so
ffff94970000-ffff94990000 r-xp 00000000 fd:00 33555872
/usr/lib64/libpthread-2.28.so
ffff94990000-ffff949a0000 r--p 00010000 fd:00 33555872
/usr/lib64/libpthread-2.28.so
ffff949a0000-ffff949b0000 rw-p 00020000 fd:00 33555872
/usr/lib64/libpthread-2.28.so
ffff949b0000-ffff949c0000 r-xp 00000000 fd:00 33556231
/usr/lib64/libnuma.so.1.0.0
ffff949c0000-ffff949d0000 r--p 00000000 fd:00 33556231
/usr/lib64/libnuma.so.1.0.0
ffff949d0000-ffff949e0000 rw-p 00010000 fd:00 33556231
/usr/lib64/libnuma.so.1.0.0
ffff949e0000-ffff949f0000 rw-s 00000000 00:17 42319
 /dev/shm/ltp_hugemmap24_5572 (deleted)
ffff949f0000-ffff94a10000 r--p 00000000 00:00 0
 [vvar]
ffff94a10000-ffff94a20000 r-xp 00000000 00:00 0
 [vdso]
ffff94a20000-ffff94a50000 r-xp 00000000 fd:00 33555853
/usr/lib64/ld-2.28.so
ffff94a50000-ffff94a60000 r--p 00020000 fd:00 33555853
/usr/lib64/ld-2.28.so
ffff94a60000-ffff94a70000 rw-p 00030000 fd:00 33555853
/usr/lib64/ld-2.28.so
fffffca00000-fffffca30000 rw-p 00000000 00:00 0
 [stack]
=====================

Summary:
passed   1
failed   0
broken   0
skipped  0
warnings 0



>> [pid 48607] 04:50:51 mmap(0x20000000, 2147483648, PROT_READ,
>> MAP_SHARED|MAP_FIXED, 3, 0) = 0x20000000
>> [pid 48607] 04:50:51 munmap(0x20000000, 2147483648) = 0
>>
>> test may PASS, but at the end you get:
>>
>> Program terminated with signal SIGSEGV, Segmentation fault.
>> #0  numa_bitmask_free (bmp=0x39ae09a0) at libnuma.c:228
>> 228             free(bmp->maskp);
>> (gdb) bt
>> #0  numa_bitmask_free (bmp=0x39ae09a0) at libnuma.c:228
>> #1  numa_bitmask_free (bmp=0x39ae09a0) at libnuma.c:224
>> #2  0x0000ffff89263360 in numa_fini () at libnuma.c:114
>> #3  0x0000ffff892d4cac in _dl_fini () at dl-fini.c:141
>> #4  0x0000ffff890d899c in __run_exit_handlers (status=status@entry=0,
>> listp=0xffff892105e0 <__exit_funcs>,
>> run_list_atexit=run_list_atexit@entry=true,
>> run_dtors=run_dtors@entry=true) at exit.c:108
>> #5  0x0000ffff890d8b1c in __GI_exit (status=status@entry=0) at exit.c:139
>> #6  0x000000000040ecd8 in testrun () at tst_test.c:1468
>> #7  fork_testrun () at tst_test.c:1592
>> #8  0x0000000000410728 in tst_run_tcases (argc=<optimized out>,
>> argv=<optimized out>, self=self@entry=0x440650 <test>) at
>> tst_test.c:1686
>> #9  0x0000000000403ef8 in main (argc=<optimized out>, argv=<optimized
>> out>) at ../../../../../include/tst_test.h:394
>>
>> (gdb) info proc map
>> Mapped address spaces:
>>
>>           Start Addr           End Addr       Size     Offset objfile
>>             0x400000           0x430000    0x30000        0x0
>> /root/ltp.upstream/testcases/kernel/mem/hugetlb/hugemmap/hugemmap24
>>             0x430000           0x440000    0x10000    0x20000
>> /root/ltp.upstream/testcases/kernel/mem/hugetlb/hugemmap/hugemmap24
>>             0x440000           0x450000    0x10000    0x30000
>> /root/ltp.upstream/testcases/kernel/mem/hugetlb/hugemmap/hugemmap24
>>       0xffff890a0000     0xffff89200000   0x160000        0x0
>> /usr/lib64/libc-2.28.so
>>       0xffff89200000     0xffff89210000    0x10000   0x150000
>> /usr/lib64/libc-2.28.so
>>       0xffff89210000     0xffff89220000    0x10000   0x160000
>> /usr/lib64/libc-2.28.so
>>       0xffff89220000     0xffff89240000    0x20000        0x0
>> /usr/lib64/libpthread-2.28.so
>>       0xffff89240000     0xffff89250000    0x10000    0x10000
>> /usr/lib64/libpthread-2.28.so
>>       0xffff89250000     0xffff89260000    0x10000    0x20000
>> /usr/lib64/libpthread-2.28.so
>>       0xffff89260000     0xffff89270000    0x10000        0x0
>> /usr/lib64/libnuma.so.1.0.0
>>       0xffff89270000     0xffff89280000    0x10000        0x0
>> /usr/lib64/libnuma.so.1.0.0
>>       0xffff89280000     0xffff89290000    0x10000    0x10000
>> /usr/lib64/libnuma.so.1.0.0
>>       0xffff89290000     0xffff892a0000    0x10000        0x0
>> /dev/shm/ltp_hugemmap24_48644 (deleted)
>>       0xffff892d0000     0xffff89300000    0x30000        0x0
>> /usr/lib64/ld-2.28.so
>>       0xffff89300000     0xffff89310000    0x10000    0x20000
>> /usr/lib64/ld-2.28.so
>>       0xffff89310000     0xffff89320000    0x10000    0x30000
>> /usr/lib64/ld-2.28.so
>>
>>
>
> --
> Regards,
> Li Wang
>


-- 
Regards,
Li Wang

