[LTP] [PATCH RFC] move_pages12: handle errno EBUSY for madvise(..., MADV_SOFT_OFFLINE)
Li Wang
liwang@redhat.com
Thu Jul 4 05:29:09 CEST 2019
Hi Xu,
On Thu, Jun 27, 2019 at 10:50 AM Yang Xu <xuyang2018.jy@cn.fujitsu.com>
wrote:
> ...
>> Hi Li
>>
>> Your patch handles the EBUSY errno correctly for soft offline.
>> But move_pages may be killed by SIGBUS because of an MCE when we
>> soft offline concurrently.
>> That leads to move_pages failing with ESRCH. Also, move_pages may
>> fail with ENOMEM.
>> Did you notice it?
>>
>
> I didn't get this failure; it seems unrelated to this patch. Two
> questions:
>
> 1. Which kernel version did you test?
> 2. Can you reproduce this without my patch?
>
> Hi Li
>
> I tested it on 3.10.0-957.el7.x86_64 under KVM (my machine does not
> support NUMA, so I enabled it in KVM as below):
> <cpu mode='custom' match='exact' check='full'>
>   <model fallback='forbid'>Penryn</model>
>   <feature policy='require' name='x2apic'/>
>   <feature policy='require' name='hypervisor'/>
>   <numa>
>     <cell id='0' cpus='0' memory='1048576' unit='KiB'/>
>     <cell id='1' cpus='1' memory='1048576' unit='KiB'/>
>   </numa>
> </cpu>
>
> Does it only happen on KVM and not on a physical machine? I don't
> have a physical machine that supports NUMA.
>
I can reproduce your problem on bare metal too. It seems you hit the
bug described in commit 6bc9b56433b (mm: fix race on soft-offlining free
huge pages), which Naoya pointed out before.
See:
+       /*
+        * We set PG_hwpoison only when the migration source hugepage
+        * was successfully dissolved, because otherwise hwpoisoned
+        * hugepage remains on free hugepage list, then userspace will
+        * find it as SIGBUS by allocation failure. That's not expected
+        * in soft-offlining.
+        */
+       ret = dissolve_free_huge_page(page);
+       if (!ret) {
+               if (set_hwpoison_free_buddy_page(page))
+                       num_poisoned_pages_inc();
+       }
This bug still exists in the latest RHEL7 kernel, so I will open a bug
against the RHEL7 product.
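
For readers following the thread, here is a minimal sketch of the idea in
the patch subject: treating EBUSY from madvise(..., MADV_SOFT_OFFLINE) as
a tolerable result instead of a test failure. This is not the code from
the patch. The helper names soft_offline_errno_tolerable() and
try_soft_offline() are hypothetical, and the fallback value 101 for
MADV_SOFT_OFFLINE is assumed from the asm-generic kernel headers for libc
versions that do not define it.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Assumption: fallback taken from asm-generic/mman-common.h. */
#ifndef MADV_SOFT_OFFLINE
#define MADV_SOFT_OFFLINE 101
#endif

/*
 * Hypothetical helper: returns 1 if an errno value from
 * madvise(MADV_SOFT_OFFLINE) may be ignored by a stress loop,
 * 0 otherwise. Only EBUSY is tolerated in this sketch.
 */
int soft_offline_errno_tolerable(int err)
{
	return err == EBUSY;
}

/*
 * Hypothetical wrapper: returns 0 if the soft offline succeeded or
 * failed with a tolerated errno, -1 otherwise (errno is left set).
 */
int try_soft_offline(void *addr, size_t len)
{
	if (madvise(addr, len, MADV_SOFT_OFFLINE) == 0)
		return 0;

	if (soft_offline_errno_tolerable(errno))
		return 0;

	return -1;
}
```

Keeping the errno check in its own function makes it easy to extend if
other values turn out to be expected as well. Note that
MADV_SOFT_OFFLINE needs CAP_SYS_ADMIN and a kernel built with
CONFIG_MEMORY_FAILURE, so the wrapper will return -1 on other setups.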
--
Regards,
Li Wang