[LTP] controllers/memcg_stress: Does not work correctly?

Michal Hocko mhocko@suse.cz
Thu Nov 19 11:30:43 CET 2015


On Wed 18-11-15 18:55:55, Cyril Hrubis wrote:
> Hi!
> > I am not sure I understand what the actual problem is. A system with
> > swap off will not have any swap available. Or did you mean that the swap off will
> > happen during the test? If yes then the test case is clearly not
> > prepared for that and it would get killed by the OOM killer.
> 
> As far as I understand this the problem is that the test may cause OOM
> unexpectedly on systems without swap.

OK, I got the point now. I thought that the OOM killer was _expected_
and handled properly wrt. memcg.

> > I am not entirely sure I understand the purpose of the test case but it
> > seems it just wants to generate the global memory pressure via memcg
> > loads. Whether that is a useful test is hard to judge. What is the
> > pass/fail metric? If the failure is the OOM killer then this is
> > extremely fragile because of what Cyril mentions below:
> 
> These testcases were added to LTP way back in 2009 and there is no
> reasonable description what the stress test is trying to achieve. All it
> does, as you said, is to create global memory pressure via processes in
> memcg subgroups, exercises the memory for a while by writing to it and then
> if the main process manages to kill all the worker processes it's a
> pass. So it looks like the only pass/fail metric is whether the system
> (i.e. the main script) outlives the memory pressure for one hour.

Hmm, checking that you are able to shut down a memcg under heavy memory
pressure might be useful when I think about it some more. I do not
remember any specific bugs offhand but I certainly encountered some
during testing of my experimental patches. I wouldn't be afraid of the
OOM killer in such a case though. It is simply a test that we do not get
stuck anywhere while teardown races with high memory pressure or OOM.
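
To make the pass/fail idea concrete, here is a rough sketch of the
"kill the workers and make sure nothing gets stuck" check. It is not the
LTP test itself: the function name, worker count, buffer size and
timeouts are all made up for illustration, and the memcg setup (creating
the groups and moving the workers into them) is left out because it
needs root.

```python
import subprocess
import sys
import time

# Hypothetical worker: allocates a buffer and keeps writing to it,
# one byte per 4096-byte page, until it is killed.
WORKER_SRC = """
import sys
buf = bytearray(int(sys.argv[1]))
while True:
    for i in range(0, len(buf), 4096):
        buf[i] = (buf[i] + 1) % 256
"""


def run_teardown_check(n_workers=4, bytes_per_worker=8 << 20,
                       run_secs=1.0, reap_timeout=10.0):
    """Return True if every worker is reaped within reap_timeout
    seconds after being killed, False if any of them gets stuck."""
    workers = [
        subprocess.Popen(
            [sys.executable, "-c", WORKER_SRC, str(bytes_per_worker)])
        for _ in range(n_workers)
    ]
    time.sleep(run_secs)          # let the workers dirty memory for a while
    for w in workers:
        w.kill()                  # SIGKILL on POSIX systems
    deadline = time.monotonic() + reap_timeout
    for w in workers:
        remaining = max(0.0, deadline - time.monotonic())
        try:
            w.wait(timeout=remaining)
        except subprocess.TimeoutExpired:
            return False          # a worker did not go away: FAIL
    return True


if __name__ == "__main__":
    print("PASS" if run_teardown_check() else "FAIL")
```

A real version would place each worker in its own memcg and remove the
groups after the kill, so that the teardown itself is part of what is
being timed.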
 
> So this test doesn't really do anything smart enough to stress the memcg
> cgroups. If you have better ideas how to stress memcg I would love to
> hear them.

Well, I can think of some but the PASS/FAIL decision is not easy. I am
testing parallel reclaim generated by multiple memcgs under a common
ancestor and evaluating the reclaim decisions. This, however, requires
analysis of reclaim statistics and that is hard to do automagically...
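
As an illustration of the easy half of that analysis, the sketch below
parses two snapshots in the "name value" format used by a memcg's
memory.stat file and computes per-counter deltas. The helper names and
the sample numbers are invented for the example. Which counters matter
and what delta counts as a good reclaim decision is exactly the part
that is not solved here.

```python
def parse_memory_stat(text):
    """Parse 'name value' lines (memory.stat style) into a dict of ints.
    Lines that do not match that shape are skipped."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def stat_delta(before, after):
    """Per-counter difference for counters present in both snapshots."""
    return {k: after[k] - before[k] for k in after if k in before}


# Made-up sample snapshots, not real measurements.
BEFORE = "cache 4096\nrss 8192\npgpgin 10\npgpgout 4\n"
AFTER = "cache 2048\nrss 8192\npgpgin 25\npgpgout 19\n"

delta = stat_delta(parse_memory_stat(BEFORE), parse_memory_stat(AFTER))
for name, change in sorted(delta.items()):
    print(name, change)
```

In practice the snapshots would be read from each group's memory.stat
before and after the load, and the deltas of sibling groups compared
against each other.
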
-- 
Michal Hocko
SUSE Labs

