<div dir="ltr"><div class="gmail_default" style="">Hi Johannes, Roman, and MM experts,</div><div class="gmail_default" style=""><br></div><div class="gmail_default" style="">Both Xinpeng and PaulB report that LTP/ioctl_sg01 is always OOM-killed on aarch64</div><div class="gmail_default" style="">(also confirmed on x86_64 with kernel v5.12-rc6) when the system's MemAvailable is</div><div class="gmail_default" style="">less than MemFree. With help from Eirik and Chunyu, we found that the problem has only</div><div class="gmail_default" style="">occurred since the kernel commit below:</div><div class="gmail_default" style=""><br></div><div class="gmail_default" style="">    commit 8c7829b04c523cdc732cb77f59f03320e09f3386<br>    Author: Johannes Weiner <<a href="mailto:hannes@cmpxchg.org">hannes@cmpxchg.org</a>><br>    Date:   Mon, 13 May 2019 17:21:50 -0700<br><br>        mm: fix false-positive OVERCOMMIT_GUESS failures</div><div class="gmail_default" style=""><br></div><div class="gmail_default" style="">Since that commit, mmap() behaves differently in GUESS mode: userspace no longer</div><div class="gmail_default" style="">receives MAP_FAILED with ENOMEM unless a single request explicitly asks for</div><div class="gmail_default" style="">more than "total_ram + total_swap". Hence, it no longer looks like</div><div class="gmail_default" style="">a heuristic approach to memory allocation. 
</div><div class="gmail_default" style=""><br></div><div class="gmail_default" style="">Chunyu and I are concerned that this might cause more trouble for users allocating memory.</div><div class="gmail_default" style=""><br></div><div><div>mmap2<br> ksys_mmap_pgoff<br>  vm_mmap_pgoff<br>   do_mmap<br>    mmap_region<br>     // Private writable mapping: check memory availability<br>     security_vm_enough_memory_mm<br><span class="gmail_default" style="font-size:small">     </span>__vm_enough_memory<br></div><div><br></div><div><div class="gmail_default" style="font-size:small">"</div></div><div><div class="gmail_default" style="font-size:small">int __vm_enough_memory(struct mm_struct *mm, long pages, int cap_sys_admin)</div><div class="gmail_default">    ...</div><div class="gmail_default">    if (sysctl_overcommit_memory == OVERCOMMIT_GUESS) {<br>        if (pages > totalram_pages() + total_swap_pages)<br>            goto error;<br>        return 0;<br></div><div class="gmail_default" style="font-size:small">    }</div><div class="gmail_default" style="font-size:small">"</div><br></div><div><div class="gmail_default" style="">__vm_enough_memory() now uses a fixed upper bound for returning ENOMEM, which only</div><div class="gmail_default" style="">triggers when a single request is larger than "total_ram + total_swap",<br>so userspace processes hit the OOM killer more easily in OVERCOMMIT_GUESS mode.</div><div class="gmail_default" style=""><br>Maybe a more acceptable way would be to check the available/free memory dynamically,</div><div class="gmail_default" style="">based on the running system's "free_pages + free_swap_pages", as before.</div></div><div class="gmail_default" style=""><br></div><div class="gmail_default" style="">Any thoughts or suggestions?</div><div><br></div><div class="gmail_default" style="font-size:small"></div></div><div><div class="gmail_default" 
style="font-size:small">=================</div><div class="gmail_default" style="">To demonstrate the issue simply, I extracted a C reproducer:</div><div class="gmail_default" style=""><br></div><div class="gmail_default" style="">Without the kernel commit:</div><div class="gmail_default" style=""># ./mmap_failed</div><div class="gmail_default" style="">...</div>map_blocks[1493] = 0xffc525c60000<br><div>PASS: MAP_FAILED as expected</div><div><br></div><div><div class="gmail_default" style="font-size:small">After the kernel commit:</div><div class="gmail_default" style=""># ./mmap_failed</div><div class="gmail_default" style="">...<br>map_blocks[1617] = 0x3c0836b0000<br>map_blocks[1618] = 0x3c0796b0000<br>Killed                <===== always killed by the OOM killer<br></div><div class="gmail_default" style=""><br></div><div class="gmail_default" style="">-------------------------</div><div class="gmail_default" style=""># cat mmap_failed.c<br></div><div class="gmail_default" style=""><br></div><div class="gmail_default" style="">#include <stdio.h><br>#include <stdlib.h><br>#include <string.h><br>#include <sys/mman.h><br>#include <sys/sysinfo.h><br><br>#define BLOCKSIZE (160 * 1024 * 1024)<br><br>int main(void)<br>{<br>    size_t i, maxsize, map_count = 0, blocksize = BLOCKSIZE;<br>    void **map_blocks;<br>    struct sysinfo info;<br><br>    // try to map as many blocks as fit into free RAM + free swap<br>    sysinfo(&info);<br>    maxsize = (info.freeram + info.freeswap) * info.mem_unit;<br><br>    map_count = maxsize / blocksize;<br>    map_blocks = malloc(map_count * sizeof(void *));<br>    if (!map_blocks)<br>        return 1;<br><br>    for (i = 0; i < map_count; i++) {<br>        map_blocks[i] = mmap(NULL, blocksize, PROT_READ | PROT_WRITE,<br>                             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);<br><br>        // ideally mmap() returns MAP_FAILED here instead of the process being OOM-killed<br>        if (map_blocks[i] == MAP_FAILED) {<br>            map_count = i;<br>            printf("PASS: MAP_FAILED as expected\n");<br>            break;<br>        }<br><br>        printf("map_blocks[%zu] = %p\n", i, map_blocks[i]);<br>        // touch every page so the mapping is actually backed by memory<br>        memset(map_blocks[i], 1, blocksize);<br>    }<br><br>    for (i = 0; i < map_count; i++)<br>        munmap(map_blocks[i], blocksize);<br><br>    free(map_blocks);<br>    return 0;<br>}<br></div><div><br></div><div><br></div><div><div><div class="gmail_default" style="font-size:small">--</div>P.S. There is another issue, MemAvailable < MemFree, caused by khugepaged reserving</div><div>memory for allocating transparent hugepages, but I don't want to mix the two</div><div>in this thread and complicate things. @Chunyu, if you could start a new email</div><div>thread, that'd be appreciated.</div></div><div><br></div>-- <br><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div>Regards,<br></div><div>Li Wang<br><br><br></div></div></div></div></div></div>