[LTP] [mcgrof:20240502-large-block-minorder-ignore-debugfs] [mm] 98bf6a5549: dmesg.BUG:Bad_rss-counter_state_mm:#type:MM_FILEPAGES_val

kernel test robot yujie.liu@intel.com
Sat May 11 10:51:21 CEST 2024


Hello,

kernel test robot noticed "dmesg.BUG:Bad_rss-counter_state_mm:#type:MM_FILEPAGES_val" on:

commit: 98bf6a554986e068b9c94dfe8d8004cbe22cee96 ("mm: split a folio in minimum folio order chunks")
https://git.kernel.org/cgit/linux/kernel/git/mcgrof/linux.git 20240502-large-block-minorder-ignore-debugfs

in testcase: ltp
version: ltp-x86_64-14c1f76-1_20240508
with the following parameters:

	disk: 1HDD
	test: mm-00

compiler: gcc-13
test machine: 8 threads, 1 socket, Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz (Kaby Lake) with 32G memory

(please refer to the attached dmesg/kmsg for the entire log/backtrace)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <yujie.liu@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202405111644.40d50350-lkp@intel.com


[  612.787878][  T361] <<<test_start>>>
[  612.794124][  T361] tag=ksm01 stime=1715338753
[  612.801061][  T361] cmdline="ksm01"
[  612.806948][  T361] contacts=""
[  612.812510][  T361] analysis=exit
[  612.818328][  T361] <<<test_output>>>
[  612.825279][  T361] tst_kconfig.c:88: TINFO: Parsing kernel config '/proc/config.gz'
[  612.836222][  T361] tst_test.c:1730: TINFO: LTP version: 20240129-247-g6052dca5d
...
[  617.229304][  T361] mem.c:320: TFAIL: child 2 has c at 2,1,672162.
[  617.239393][  T361] mem.c:320: TFAIL: child 2 has c at 2,1,672163.
[  617.245639][ T3313] BUG: Bad rss-counter state mm:000000005a9ef575 type:MM_FILEPAGES val:-1
[  617.256353][ T3313] BUG: Bad rss-counter state mm:000000005a9ef575 type:MM_ANONPAGES val:1
[  617.262328][  T360] BUG: Bad rss-counter state mm:0000000093b60f35 type:MM_FILEPAGES val:-1
[  617.262332][  T360] BUG: Bad rss-counter state mm:0000000093b60f35 type:MM_ANONPAGES val:1
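
For triage, the rss-counter reports above can be grouped per mm. The following is a minimal sketch, not part of the lkp tooling: the function name `parse_bad_rss` and the regular expression are illustrative assumptions, and the sample input is copied from the log lines above.

```python
import re
from collections import defaultdict

# Matches the "Bad rss-counter state" report as it appears in the dmesg above.
RSS_RE = re.compile(
    r"BUG: Bad rss-counter state mm:(?P<mm>[0-9a-f]+) "
    r"type:(?P<type>MM_[A-Z]+) val:(?P<val>-?\d+)"
)


def parse_bad_rss(log_text):
    """Return {mm: {counter_type: value}} for every report in log_text."""
    per_mm = defaultdict(dict)
    for line in log_text.splitlines():
        m = RSS_RE.search(line)
        if m:
            per_mm[m["mm"]][m["type"]] = int(m["val"])
    return dict(per_mm)


# The four report lines from the ksm01 run above.
SAMPLE = """\
[  617.245639][ T3313] BUG: Bad rss-counter state mm:000000005a9ef575 type:MM_FILEPAGES val:-1
[  617.256353][ T3313] BUG: Bad rss-counter state mm:000000005a9ef575 type:MM_ANONPAGES val:1
[  617.262328][  T360] BUG: Bad rss-counter state mm:0000000093b60f35 type:MM_FILEPAGES val:-1
[  617.262332][  T360] BUG: Bad rss-counter state mm:0000000093b60f35 type:MM_ANONPAGES val:1
"""

if __name__ == "__main__":
    for mm, counters in parse_bad_rss(SAMPLE).items():
        print(mm, counters, "sum =", sum(counters.values()))
```

In both mms the MM_FILEPAGES and MM_ANONPAGES deltas cancel out. That is at least consistent with a single page being accounted under one counter type and unaccounted under the other, but the report itself does not state a root cause.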


[  607.433779][  T361] <<<test_start>>>
[  607.440371][  T361] tag=mmap10_4 stime=1715338751
[  607.447741][  T361] cmdline="mmap10 -a -s -i 60"
[  607.454734][  T361] contacts=""
[  607.460279][  T361] analysis=exit
[  607.466110][  T361] <<<test_output>>>
...
[  610.420345][  T361] mmap10      0  TINFO  :  start tests.
[  610.428698][  T361] mmap10      0  TINFO  :  add to KSM regions.
[  610.437665][  T361] mmap10      0  TINFO  :  use anonymous pages.
[  610.446601][  T361] mmap10      0  TINFO  :  start tests.
[  610.452887][   T79] ------------[ cut here ]------------
[  610.454940][  T361] mmap10      0  TINFO  :  add to KSM regions.
[ 610.459504][ T79] WARNING: CPU: 7 PID: 79 at mm/gup.c:229 try_grab_page (mm/gup.c:229 (discriminator 1))
[  610.473307][   T79] Modules linked in: btrfs blake2b_generic xor intel_rapl_msr intel_rapl_common zstd_compress x86_pkg_temp_thermal intel_powerclamp coretemp raid6_pq kvm_intel libcrc32c i915 kvm sd_mod t10_pi crc64_rocksoft_generic crc64_rocksoft crc64 crct10dif_pclmul drm_buddy crc32_pclmul intel_gtt crc32c_intel sg ghash_clmulni_intel drm_display_helper sha512_ssse3 ttm rapl drm_kms_helper ahci wmi_bmof mei_wdt libahci video intel_cstate ipmi_devintf ipmi_msghandler intel_uncore i2c_designware_platform i2c_i801 mei_me libata idma64 i2c_designware_core i2c_smbus mei wmi pinctrl_sunrisepoint acpi_pad binfmt_misc fuse loop drm dm_mod ip_tables
[  610.532426][   T79] CPU: 7 PID: 79 Comm: ksmd Tainted: G S                 6.9.0-rc6-00008-g98bf6a554986 #1
[  610.532430][   T79] Hardware name: Dell Inc. OptiPlex 7050/062KRH, BIOS 1.2.0 12/22/2016
[ 610.548196][ T79] RIP: 0010:try_grab_page (mm/gup.c:229 (discriminator 1))
[ 610.548201][ T79] Code: 40 f6 c5 01 0f 84 1a fe ff ff 48 83 ed 01 e9 14 fe ff ff be 04 00 00 00 4c 89 e7 e8 ad 01 14 00 f0 41 ff 04 24 e9 67 ff ff ff <0f> 0b b8 f4 ff ff ff 5b 5d 41 5c 41 5d c3 cc cc cc cc e8 8c 01 14
All code
========
   0:   40 f6 c5 01             test   $0x1,%bpl
   4:   0f 84 1a fe ff ff       je     0xfffffffffffffe24
   a:   48 83 ed 01             sub    $0x1,%rbp
   e:   e9 14 fe ff ff          jmp    0xfffffffffffffe27
  13:   be 04 00 00 00          mov    $0x4,%esi
  18:   4c 89 e7                mov    %r12,%rdi
  1b:   e8 ad 01 14 00          call   0x1401cd
  20:   f0 41 ff 04 24          lock incl (%r12)
  25:   e9 67 ff ff ff          jmp    0xffffffffffffff91
  2a:*  0f 0b                   ud2             <-- trapping instruction
  2c:   b8 f4 ff ff ff          mov    $0xfffffff4,%eax
  31:   5b                      pop    %rbx
  32:   5d                      pop    %rbp
  33:   41 5c                   pop    %r12
  35:   41 5d                   pop    %r13
  37:   c3                      ret
  38:   cc                      int3
  39:   cc                      int3
  3a:   cc                      int3
  3b:   cc                      int3
  3c:   e8                      .byte 0xe8
  3d:   8c 01                   mov    %es,(%rcx)
  3f:   14                      .byte 0x14

Code starting with the faulting instruction
===========================================
   0:   0f 0b                   ud2
   2:   b8 f4 ff ff ff          mov    $0xfffffff4,%eax
   7:   5b                      pop    %rbx
   8:   5d                      pop    %rbp
   9:   41 5c                   pop    %r12
   b:   41 5d                   pop    %r13
   d:   c3                      ret
   e:   cc                      int3
   f:   cc                      int3
  10:   cc                      int3
  11:   cc                      int3
  12:   e8                      .byte 0xe8
  13:   8c 01                   mov    %es,(%rcx)
  15:   14                      .byte 0x14
[  610.558435][   T79] RSP: 0018:ffffc900005f7a98 EFLAGS: 00010246
[  610.558438][   T79] RAX: 0000000000000000 RBX: ffffea0005300740 RCX: ffffffff8193b8eb
[  610.558440][   T79] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffea0005300774
[  610.583027][   T79] RBP: ffffea0005300740 R08: 0000000000000000 R09: fffff94000a600ee
[  610.583029][   T79] R10: ffffea0005300777 R11: ffffc900005f7d60 R12: ffffea0005300774
[  610.583046][   T79] R13: 0000000000000002 R14: 0000000000000002 R15: ffffea0005300740
[  610.583048][   T79] FS:  0000000000000000(0000) GS:ffff888769b80000(0000) knlGS:0000000000000000
[  610.583049][   T79] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  610.583051][   T79] CR2: 000055e715c96510 CR3: 000000081a85a001 CR4: 00000000003706f0
[  610.594321][   T79] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  610.594323][   T79] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  610.594324][   T79] Call Trace:
[  610.594326][   T79]  <TASK>
[ 610.609918][ T79] ? __warn (kernel/panic.c:694)
[ 610.609922][ T79] ? try_grab_page (mm/gup.c:229 (discriminator 1))
[ 610.619897][ T79] ? report_bug (lib/bug.c:180 lib/bug.c:219)
[ 610.619901][ T79] ? handle_bug (arch/x86/kernel/traps.c:239 (discriminator 1))
[ 610.635497][ T79] ? exc_invalid_op (arch/x86/kernel/traps.c:260 (discriminator 1))
[ 610.635500][ T79] ? asm_exc_invalid_op (arch/x86/include/asm/idtentry.h:621)
[ 610.635504][ T79] ? try_grab_page (arch/x86/include/asm/atomic.h:23 include/linux/atomic/atomic-arch-fallback.h:457 include/linux/atomic/atomic-instrumented.h:33 include/linux/page_ref.h:67 include/linux/page_ref.h:89 mm/gup.c:229)
[ 610.650665][ T79] ? try_grab_page (mm/gup.c:229 (discriminator 1))
[ 610.650669][ T79] ? try_grab_page (arch/x86/include/asm/atomic.h:23 include/linux/atomic/atomic-arch-fallback.h:457 include/linux/atomic/atomic-instrumented.h:33 include/linux/page_ref.h:67 include/linux/page_ref.h:89 mm/gup.c:229)
[ 610.664447][ T79] follow_page_pte (mm/gup.c:652 (discriminator 1))
[ 610.664451][ T79] ? __pfx_follow_page_pte (mm/gup.c:582)
[ 610.680047][ T79] ? replace_page (include/linux/mmu_notifier.h:486 (discriminator 1) mm/ksm.c:1461 (discriminator 1))
[ 610.680050][ T79] follow_pmd_mask+0x1cb/0xa20
[ 610.685963][ T79] ? __pfx_follow_pmd_mask+0x10/0x10
[ 610.685967][ T79] ? __pfx___might_resched (kernel/sched/core.c:10152)
[ 610.692052][ T79] follow_page (mm/gup.c:854)
[ 610.692055][ T79] ? __pfx_follow_page (mm/gup.c:839)
[ 610.692058][ T79] scan_get_next_rmap_item (mm/ksm.c:2651)
[ 610.702738][ T79] ? __pfx_scan_get_next_rmap_item (mm/ksm.c:2563)
[ 610.702742][ T79] ? __pfx___might_resched (kernel/sched/core.c:10152)
[ 610.702744][ T79] ksm_scan_thread (mm/ksm.c:2754 mm/ksm.c:2780)
[  610.707751][  T361] mmap10      0  TINFO  :  start tests.
[ 610.711264][ T79] ? __pfx_ksm_scan_thread (mm/ksm.c:2770)
[ 610.711266][ T79] ? _raw_spin_lock_irqsave (arch/x86/include/asm/atomic.h:115 (discriminator 4) include/linux/atomic/atomic-arch-fallback.h:2170 (discriminator 4) include/linux/atomic/atomic-instrumented.h:1302 (discriminator 4) include/asm-generic/qspinlock.h:111 (discriminator 4) include/linux/spinlock.h:187 (discriminator 4) include/linux/spinlock_api_smp.h:111 (discriminator 4) kernel/locking/spinlock.c:162 (discriminator 4))
[ 610.711270][ T79] ? __pfx__raw_spin_lock_irqsave (kernel/locking/spinlock.c:161)
[ 610.717969][ T79] ? __pfx_autoremove_wake_function (kernel/sched/wait.c:383)
[ 610.717972][ T79] ? __kthread_parkme (arch/x86/include/asm/bitops.h:206 (discriminator 1) arch/x86/include/asm/bitops.h:238 (discriminator 1) include/asm-generic/bitops/instrumented-non-atomic.h:142 (discriminator 1) kernel/kthread.c:280 (discriminator 1))
[ 610.727352][ T79] ? __pfx_ksm_scan_thread (mm/ksm.c:2770)
[ 610.727354][ T79] kthread (kernel/kthread.c:388)
[ 610.727357][ T79] ? __pfx_kthread (kernel/kthread.c:341)
[ 610.737340][ T79] ret_from_fork (arch/x86/kernel/process.c:153)
[ 610.744033][ T79] ? __pfx_kthread (kernel/kthread.c:341)
[ 610.753753][ T79] ret_from_fork_asm (arch/x86/entry/entry_64.S:257)
[  610.764252][   T79]  </TASK>
[  610.771633][   T79] ---[ end trace 0000000000000000 ]---
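
Two details of the splat above are easy to miss when reading it by eye. Frames prefixed with `?` are addresses found on the stack that the unwinder could not confirm, and output from another task (T361) is interleaved with the trace. Also, the `mov $0xfffffff4,%eax` right after the trapping `ud2` loads -12 as a 32-bit value, which would correspond to returning -ENOMEM after the warning. The sketch below illustrates both points; the helper names `reliable_frames` and `as_s32` are hypothetical, and the sample is an abbreviated excerpt of the trace above.

```python
import errno
import re

# "[ timestamp][ Tnnn] rest-of-line", as printed in the dmesg above.
LINE_RE = re.compile(r"^\[\s*[\d.]+\]\[\s*T(?P<tid>\d+)\]\s+(?P<body>.*)$")
SYM_RE = re.compile(r"[A-Za-z_][\w.]*")


def reliable_frames(trace_text, tid):
    """Return symbols of non-'?' frames printed by task `tid` inside <TASK>."""
    frames, inside = [], False
    for line in trace_text.splitlines():
        m = LINE_RE.match(line)
        if not m or int(m["tid"]) != tid:
            continue  # skip interleaved output from other tasks
        body = m["body"].strip()
        if body == "<TASK>":
            inside = True
        elif body == "</TASK>":
            inside = False
        elif inside and not body.startswith("?"):
            sym = SYM_RE.match(body)
            if sym:
                frames.append(sym.group(0))
    return frames


def as_s32(value):
    """Interpret a 32-bit immediate as a signed integer."""
    return value - (1 << 32) if value & 0x80000000 else value


# Abbreviated excerpt of the call trace above.
SAMPLE = """\
[  610.594326][   T79]  <TASK>
[ 610.609922][ T79] ? try_grab_page (mm/gup.c:229 (discriminator 1))
[ 610.664447][ T79] follow_page_pte (mm/gup.c:652 (discriminator 1))
[ 610.680050][ T79] follow_pmd_mask+0x1cb/0xa20
[ 610.692052][ T79] follow_page (mm/gup.c:854)
[ 610.692058][ T79] scan_get_next_rmap_item (mm/ksm.c:2651)
[ 610.702744][ T79] ksm_scan_thread (mm/ksm.c:2754 mm/ksm.c:2780)
[  610.707751][  T361] mmap10      0  TINFO  :  start tests.
[ 610.727354][ T79] kthread (kernel/kthread.c:388)
[ 610.737340][ T79] ret_from_fork (arch/x86/kernel/process.c:153)
[ 610.753753][ T79] ret_from_fork_asm (arch/x86/entry/entry_64.S:257)
[  610.764252][   T79]  </TASK>
"""

if __name__ == "__main__":
    print(" <- ".join(reliable_frames(SAMPLE, 79)))
    code = as_s32(0xFFFFFFF4)
    print(code, errno.errorcode[-code])
```

Read this way, the confirmed path runs from ksmd's scan loop (`ksm_scan_thread`, `scan_get_next_rmap_item`) through `follow_page` down to `follow_page_pte`, where the warning in `try_grab_page` fires.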


The kernel config is available at:
https://download.01.org/0day-ci/archive/20240511/202405111644.40d50350-lkp@intel.com

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

