[infinispan-dev] Testsuite: memory usage?

Sanne Grinovero sanne at infinispan.org
Mon Feb 19 07:45:02 EST 2018


Thanks Dan,

that solved the main issue; I no longer have OOMs on the core module.
I'll merge your PR as soon as I've completed the full build.

Disabling TieredCompilation is an interesting idea; I'll try that on
other projects too.

If someone is up for some additional follow-up love:
 - raising the heap from 1G to ~1300M gives it quite a bit more
breathing room; I believe it should still work on a 2GB testing
machine.
 - I still see quite a few MBeans in JConsole at the end of the
build; something is leaking them, and they keep references to
CacheManagers.
 - I'm also still seeing an unreasonable number of threads, varying
from ~200 to ~2000. Possibly related to the previous point? (A small
diagnostic sketch for both points follows right after this list.)
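
For whoever picks this up, a minimal sketch of the kind of check I
have in mind, e.g. dropped into a TestNG @AfterSuite hook in the test
JVM (the "org.infinispan" JMX domain and the class/method names here
are just illustrative assumptions on my side):

    import java.lang.management.ManagementFactory;
    import javax.management.MBeanServer;
    import javax.management.ObjectName;

    public final class SuiteLeakReport {
        // Report how many Infinispan MBeans are still registered and how
        // many threads are still alive at the end of the suite; ideally
        // both numbers stay small and stable across runs.
        public static void print() throws Exception {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            int mbeans = server.queryNames(new ObjectName("org.infinispan:*"), null).size();
            int threads = ManagementFactory.getThreadMXBean().getThreadCount();
            System.out.printf("Infinispan MBeans still registered: %d, live threads: %d%n",
                              mbeans, threads);
        }
    }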

Cheers,
Sanne




On 19 February 2018 at 11:57, Dan Berindei <dan.berindei at gmail.com> wrote:
> Ok, so the biggest problem is that TestNG keeps test instances around until
> the end of the test suite, and many of our tests are quite heavyweight
> because they keep references to caches/managers even after they finish. I've
> opened a PR to set those fields to null, fix some smaller leaks, and use
> -XX:+UseG1GC -XX:-TieredCompilation, and I'm getting ~ 11 mins on my laptop.
>
> https://github.com/infinispan/infinispan/pull/5768
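>
> For illustration, a rough sketch of the kind of per-test cleanup this
> involves (the class and field names below are made up, not necessarily
> what the PR does):
>
>     import org.infinispan.manager.EmbeddedCacheManager;
>     import org.testng.annotations.AfterClass;
>
>     public abstract class SomeClusteredTest {      // hypothetical example
>         protected EmbeddedCacheManager[] cacheManagers;
>
>         @AfterClass(alwaysRun = true)
>         protected void destroyAfterClass() {
>             if (cacheManagers != null) {
>                 for (EmbeddedCacheManager cm : cacheManagers) {
>                     cm.stop();
>                 }
>             }
>             // TestNG keeps the test instance alive until the end of the
>             // suite, so null the field to let the stopped managers be
>             // garbage collected.
>             cacheManagers = null;
>         }
>     }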
>
> It's still a lot, especially since not long ago it took half that, but
> making it shorter would probably mean looking more deeply into the
> (many) tests that we've added in the last year or so.
>
> Cheers
> Dan
>
>
> On Fri, Feb 16, 2018 at 8:05 AM, Dan Berindei <dan.berindei at gmail.com>
> wrote:
>>
>> Yeah, I got a much slower run with the default collector (parallel):
>>
>> [INFO] Total time: 17:45 min
>> GC Time: 2m 43s
>> Compile time: 18m 20s
>>
>> I'm not sure whether it's really the GC affecting the compile time or
>> whether another factor is hiding there. But I did get a heap dump, and I'm
>> analyzing it now.
>>
>> Cheers
>> Dan
>>
>>
>> On Thu, Feb 15, 2018 at 1:59 PM, Dan Berindei <dan.berindei at gmail.com>
>> wrote:
>>>
>>> Hmmm, I hadn't noticed that I was running with -XX:+UseG1GC, so perhaps
>>> our test suite is a pathological case for the default collector?
>>>
>>> [INFO] Total time: 12:45 min
>>> GC Time: 52.593s
>>> Class Loader Time: 1m 26.007s
>>> Compile Time: 10m 10.216s
>>>
>>> I'll try without -XX:+UseG1GC later.
>>>
>>> Cheers
>>> Dan
>>>
>>>
>>> On Thu, Feb 15, 2018 at 1:39 PM, Dan Berindei <dan.berindei at gmail.com>
>>> wrote:
>>>>
>>>> And here I was thinking that by adding -XX:+HeapDumpOnOutOfMemoryError
>>>> anyone would be able to look into OOMEs and I wouldn't have to reproduce the
>>>> failures myself :)
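>>>>
>>>> If it helps, a dump can also be taken at an arbitrary point of the run
>>>> rather than only on OOME; a minimal sketch (the class name and file path
>>>> are just examples):
>>>>
>>>>     import java.lang.management.ManagementFactory;
>>>>     import com.sun.management.HotSpotDiagnosticMXBean;
>>>>
>>>>     public final class HeapDumper {
>>>>         // Write a heap dump of live (reachable) objects to the given
>>>>         // file, e.g. "target/core-suite.hprof", for offline analysis.
>>>>         public static void dump(String file) throws Exception {
>>>>             HotSpotDiagnosticMXBean bean =
>>>>                     ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
>>>>             bean.dumpHeap(file, true);   // true = only live objects
>>>>         }
>>>>     }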
>>>>
>>>> Dan
>>>>
>>>>
>>>> On Thu, Feb 15, 2018 at 1:32 PM, William Burns <mudokonman at gmail.com>
>>>> wrote:
>>>>>
>>>>> I must admit I had noticed a while back that I was having some
>>>>> issues running the core test suite. Unfortunately, at the time neither
>>>>> CI nor anyone else seemed to have any issues, and since I didn't need to
>>>>> run the core tests I just ignored it. But now that Sanne has pointed
>>>>> this out, after increasing the heap setting in the pom.xml I was able to
>>>>> run the test suite to completion for the first time. It would normally
>>>>> hang for an extremely long time around the 9k-10k completed-tests mark
>>>>> and never finish for me (or at least I never waited long enough).
>>>>>
>>>>> So it definitely seems something is leaking in the test suite,
>>>>> causing the GC to burn a ton of CPU time.
>>>>>
>>>>>  - Will
>>>>>
>>>>> On Thu, Feb 15, 2018 at 5:40 AM Sanne Grinovero <sanne at infinispan.org>
>>>>> wrote:
>>>>>>
>>>>>> Thanks Dan.
>>>>>>
>>>>>> Do you happen to have observed the memory trend during a build?
>>>>>>
>>>>>> After a couple more attempts the build passed once, so it is possible
>>>>>> to pass... but it's a small sample so far: 1 pass vs. 3 OOMs on my
>>>>>> machine.
>>>>>>
>>>>>> Even the one time the tests completed successfully, ~80% of the total
>>>>>> build time was wasted on GC runs... it was likely very close to falling
>>>>>> over, and it's definitely not an efficient setting for regular builds.
>>>>>> Observing the trends on my machine, I'd guess a reasonable value would
>>>>>> be around 5GB to keep builds fast, with a minimum of 1.3GB to complete
>>>>>> successfully without failing too often.
>>>>>>
>>>>>> The memory pressure grows steadily and is worst towards the end of the
>>>>>> test suite.
>>>>>>
>>>>>> I won't be able to investigate further as I urgently need to work on
>>>>>> modules, but I noticed quite a few MBeans in JConsole. It would be good
>>>>>> to check whether we're leaking the MBean registrations, and therefore
>>>>>> leaking (stopped?) CacheManagers through them.
>>>>>>
>>>>>> Even near the beginning of the tests, forcing a full GC still leaves
>>>>>> about 400MB of heap in use. That's quite a lot for some simple tests,
>>>>>> no?
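>>>>>>
>>>>>> (For whoever digs in, a quick way to reproduce that kind of measurement
>>>>>> programmatically, just a sketch:)
>>>>>>
>>>>>>     public final class UsedHeap {
>>>>>>         // Request a full GC (best effort) and report how much heap
>>>>>>         // is still in use afterwards.
>>>>>>         public static void report() {
>>>>>>             System.gc();
>>>>>>             Runtime rt = Runtime.getRuntime();
>>>>>>             long usedMb = (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
>>>>>>             System.out.println("Heap in use after full GC: ~" + usedMb + " MB");
>>>>>>         }
>>>>>>     }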
>>>>>>
>>>>>> Thanks,
>>>>>> Sanne
>>>>>>
>>>>>>
>>>>>> On 15 February 2018 at 06:51, Dan Berindei <dan.berindei at gmail.com>
>>>>>> wrote:
>>>>>> > forkJvmArgs used to be "-Xmx2G" before ISPN-8478. I reduced the heap
>>>>>> > to 1G
>>>>>> > because we were trying to run the build on agent VMs with only 4GB
>>>>>> > of RAM,
>>>>>> > and the 2GB heap was making the build run out of native memory.
>>>>>> >
>>>>>> > I've yet to see an OOME in the core tests, locally or in CI. But I
>>>>>> > also
>>>>>> > included -XX:+HeapDumpOnOutOfMemoryError in forkJvmArgs, so assuming
>>>>>> > there's
>>>>>> > a new leak it should be easy to track down in the heap dump.
>>>>>> >
>>>>>> > Cheers
>>>>>> > Dan
>>>>>> >
>>>>>> >
>>>>>> > On Wed, Feb 14, 2018 at 11:46 PM, Sanne Grinovero
>>>>>> > <sanne at infinispan.org>
>>>>>> > wrote:
>>>>>> >>
>>>>>> >> Hey all,
>>>>>> >>
>>>>>> >> I'm having OOMs running the tests of infinispan-core.
>>>>>> >>
>>>>>> >> Initially I thought it was related to limits and security, as those
>>>>>> >> are the usual suspects, but no, it's really just not enough memory :)
>>>>>> >>
>>>>>> >> I found that the root pom.xml sets a <forkJvmArgs> property of -Xmx1G
>>>>>> >> for surefire; I've been watching heap usage grow in JConsole, and
>>>>>> >> it's clearly not enough.
>>>>>> >>
>>>>>> >> What surprises me is that, as an occasional tester, I shouldn't be
>>>>>> >> the one to notice such a new requirement first. A leak which only
>>>>>> >> manifests under certain conditions?
>>>>>> >>
>>>>>> >> What do others observe?
>>>>>> >>
>>>>>> >> FWIW, I'm running it with an 8G heap now and it's working much better;
>>>>>> >> there are still a couple of failures, but at least they're not OOM-related.
>>>>>> >>
>>>>>> >> Thanks,
>>>>>> >> Sanne
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

