<html>
  <head>
    <meta content="text/html; charset=windows-1252"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <p>Hi,</p>
    <p>There are ongoing discussions and more data coming from Heiko, but
      since not everyone might be online, I'll fill in here as well.<br>
    </p>
    <p>Testing with Hawkular Metrics (not Hawkular Services), I can't
      reproduce this. I started by testing just the compression library,
      and after compressing some 9 billion datapoints I was still seeing
      a constant ~10MB of heap usage.</p>
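    <p>For reference, the standalone test had roughly the following shape
      (a sketch, not the actual test code; compressBlock() is a
      placeholder for the compression-library call, not a real Hawkular
      API):</p>
    <pre>import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

// Feeds synthetic datapoints through the compression call in blocks and
// periodically prints used heap, so a leak in the library would show up
// as steadily growing numbers instead of a constant ~10MB.
public class CompressionHeapCheck {

    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        long ts = System.currentTimeMillis();

        // 9_000_000 blocks of 1000 points = ~9 billion datapoints
        for (long block = 0; block &lt; 9_000_000; block++) {
            long[] timestamps = new long[1000];
            double[] values = new double[1000];
            for (int i = 0; i &lt; 1000; i++) {
                timestamps[i] = ts + i * 1000L;  // one datapoint per second
                values[i] = Math.sin(i / 10.0);  // synthetic values
            }
            ts += 1000 * 1000L;

            compressBlock(timestamps, values);   // placeholder for the library call

            if (block % 10_000 == 0) {
                long usedMb = memory.getHeapMemoryUsage().getUsed() / (1024 * 1024);
                System.out.printf("block %d, used heap %d MB%n", block, usedMb);
            }
        }
    }

    private static void compressBlock(long[] timestamps, double[] values) {
        // Call the compression library here and discard the result, so only
        // state retained by the library itself can grow the heap.
    }
}</pre>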
    <p>So I started HWKMETRICS in WildFly 10.0.0.Final and ran our
      Gatling load test with clients=20, metrics=1000, and enough loops
      to keep it running indefinitely. The instance was modified to run
      compression at a 20-minute interval.</p>
    <p>Monitoring with jconsole, I can't see any difference in memory
      usage between the first 5 minutes and the last 5 minutes (it has
      been running for two hours now). Heap usage stays between 147MB
      and 400MB.</p>
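    <p>To make that before/after comparison easier to archive than
      eyeballing jconsole, a sampler like the following could be used (a
      sketch of my own, nothing Hawkular-specific; it logs the heap and
      non-heap usage of the JVM it runs in once a minute):</p>
    <pre>import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Periodically logs heap and non-heap usage, similar to watching the
// jconsole memory tab, but greppable for first-5-minutes vs.
// last-5-minutes comparisons.
public class MemorySampler {

    public static void main(String[] args) throws InterruptedException {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        while (true) {
            MemoryUsage heap = memory.getHeapMemoryUsage();
            MemoryUsage nonHeap = memory.getNonHeapMemoryUsage();
            System.out.printf("%tT heap used=%dMB committed=%dMB non-heap used=%dMB%n",
                    System.currentTimeMillis(),
                    heap.getUsed() / (1024 * 1024),
                    heap.getCommitted() / (1024 * 1024),
                    nonHeap.getUsed() / (1024 * 1024));
            Thread.sleep(60_000);
        }
    }
}</pre>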
    <p>This requires more investigation, then.<br>
    </p>
    <p>  - Micke<br>
    </p>
    <br>
    <div class="moz-cite-prefix">On 10/11/2016 09:42 AM, Heiko W.Rupp
      wrote:<br>
    </div>
    <blockquote
      cite="mid:A76E22FB-A275-4AD9-A1EF-5C3900BE0DE1@redhat.com"
      type="cite">
      <meta http-equiv="Content-Type" content="text/html;
        charset=windows-1252">
      <div style="font-family:sans-serif">
        <div style="white-space:normal">
          <p dir="auto">Hey,</p>
          <p dir="auto">tl;dr: we need to investigate heap usage -
            especially in the case of compression<br>
            kicking in - it looks like there could be a memory leak.
            Compression timing seems<br>
            mostly ok.</p>
          <p dir="auto">I originally wanted to see how more feeds
            influence the metrics compression timing.<br>
            So I started the server with -Xmx512m as I did in all the
            weeks before and pointed a<br>
            few feeds at the server - to see it crash with OOME shortly
            after the compression started.</p>
          <p dir="auto">Now I restarted the server with -Xmx1024m and
            also -XX:MaxMetaspaceSize=512m (up from 256m before) and am
            running the server now with 1 feed for a day.</p>
          <p dir="auto">To be continued below ...</p>
          <p dir="auto">( Time is in GMT, which is 2h off of my local
            time)<br>
            hawkular_1 | 15:00:44,764 INFO
            org.hawkular.metrics.core.jobs.CompressData Starting
            execution<br>
            hawkular_1 | 15:00:45,452 INFO
            org.hawkular.metrics.core.jobs.CompressData Finished
            compressing data in 687 ms<br>
            hawkular_1 | 17:00:44,757 INFO
            org.hawkular.metrics.core.jobs.CompressData Starting
            execution<br>
            hawkular_1 | 17:00:46,796 INFO
            org.hawkular.metrics.core.jobs.CompressData Finished
            compressing data in 2039 ms<br>
            hawkular_1 | 19:00:44,751 INFO
            org.hawkular.metrics.core.jobs.CompressData Starting
            execution<br>
            hawkular_1 | 19:00:47,293 INFO
            org.hawkular.metrics.core.jobs.CompressData Finished
            compressing data in 2541 ms<br>
            hawkular_1 | 21:00:44,749 INFO
            org.hawkular.metrics.core.jobs.CompressData Starting
            execution<br>
            hawkular_1 | 21:00:46,267 INFO
            org.hawkular.metrics.core.jobs.CompressData Finished
            compressing data in 1517 ms<br>
            hawkular_1 | 23:00:44,749 INFO
            org.hawkular.metrics.core.jobs.CompressData Starting
            execution<br>
            hawkular_1 | 23:00:45,472 INFO
            org.hawkular.metrics.core.jobs.CompressData Finished
            compressing data in 722 ms<br>
            hawkular_1 | 01:00:44,749 INFO
            org.hawkular.metrics.core.jobs.CompressData Starting
            execution<br>
            hawkular_1 | 01:00:46,241 INFO
            org.hawkular.metrics.core.jobs.CompressData Finished
            compressing data in 1492 ms<br>
            hawkular_1 | 03:00:44,747 INFO
            org.hawkular.metrics.core.jobs.CompressData Starting
            execution<br>
            hawkular_1 | 03:00:45,780 INFO
            org.hawkular.metrics.core.jobs.CompressData Finished
            compressing data in 1033 ms<br>
            hawkular_1 | 05:00:44,746 INFO
            org.hawkular.metrics.core.jobs.CompressData Starting
            execution<br>
            hawkular_1 | 05:00:45,781 INFO
            org.hawkular.metrics.core.jobs.CompressData Finished
            compressing data in 1034 ms<br>
            hawkular_1 | 07:00:44,749 INFO
            org.hawkular.metrics.core.jobs.CompressData Starting
            execution<br>
            hawkular_1 | 07:00:46,386 INFO
            org.hawkular.metrics.core.jobs.CompressData Finished
            compressing data in 1636 ms<br>
            hawkular_1 | 09:00:44,748 INFO
            org.hawkular.metrics.core.jobs.CompressData Starting
            execution<br>
            hawkular_1 | 09:00:45,682 INFO
            org.hawkular.metrics.core.jobs.CompressData Finished
            compressing data in 934 ms<br>
            hawkular_1 | 11:00:44,750 INFO
            org.hawkular.metrics.core.jobs.CompressData Starting
            execution<br>
            hawkular_1 | 11:00:46,339 INFO
            org.hawkular.metrics.core.jobs.CompressData Finished
            compressing data in 1589 ms<br>
            hawkular_1 | 13:00:44,748 INFO
            org.hawkular.metrics.core.jobs.CompressData Starting
            execution<br>
            hawkular_1 | 13:00:45,880 INFO
            org.hawkular.metrics.core.jobs.CompressData Finished
            compressing data in 1132 ms</p>
          <p dir="auto">Looking at the memory usage I see the memory
            usage there is often a peak in heap usage around and after
            the compression - looking at the accumulated GC-time also
            shows heavy GC activity. I guess the peak in heap usage (and
            thus committed heap) comes from promoting objects from young
            generation into old during the compression and then later
            after compression is over, they are garbage collected, so
            that heap usage goes down and the system is able to free
            some memory.</p>
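          <p dir="auto">One way to back that hypothesis up with numbers
            (just a sketch on my side, not something we run today) would
            be to log the accumulated GC counts and times per collector,
            so a jump in the old-generation collector around the
            compression runs becomes visible:</p>
          <pre>import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Logs per-collector accumulated GC counts and times once a minute, so a
// spike in the old-generation collector around compression runs stands out.
public class GcActivityLogger {

    public static void main(String[] args) throws InterruptedException {
        while (true) {
            for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
                System.out.printf("%tT %s: collections=%d, accumulated time=%d ms%n",
                        System.currentTimeMillis(),
                        gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
            }
            Thread.sleep(60_000);
        }
    }
}</pre>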
          <p dir="auto">Starting at around 4:10pm (2pm in the above
            output) I am running with 10 extra feeds (hawkfly on an
            external box)</p>
          <p dir="auto">While it looks like there is a slowly growing
            non-heap, it was growing a lot when the 10 extra feeds
            connected.<br>
            Also it looks like it is growing a bit more each time
            compression kicks in ("non-heap")</p>
          <p dir="auto">The first compression run with 11 feeds did not
            take too long, the next was</p>
          <p dir="auto">hawkular_1 | 17:00:44,749 INFO
            org.hawkular.metrics.core.jobs.CompressData Starting
            execution<br>
            hawkular_1 | 17:00:50,277 INFO
            org.hawkular.metrics.core.jobs.CompressData Finished
            compressing data in 5528 ms</p>
          <p dir="auto">Which used a lot more memory than before.
            Non-heap was able to reclaim some memory though.</p>
          <p dir="auto">hawkular_1 | 19:00:44,751 INFO
            org.hawkular.metrics.core.jobs.CompressData Starting
            execution<br>
            hawkular_1 | 19:00:50,093 INFO
            org.hawkular.metrics.core.jobs.CompressData Finished
            compressing data in 5341 ms<br>
            hawkular_1 | 21:00:44,751 INFO
            org.hawkular.metrics.core.jobs.CompressData Starting
            execution<br>
            hawkular_1 | 21:00:49,465 INFO
            org.hawkular.metrics.core.jobs.CompressData Finished
            compressing data in 4714 ms<br>
            hawkular_1 | 23:00:44,753 INFO
            org.hawkular.metrics.core.jobs.CompressData Starting
            execution<br>
            hawkular_1 | 23:00:48,925 INFO
            org.hawkular.metrics.core.jobs.CompressData Finished
            compressing data in 4171 ms<br>
            hawkular_1 | 01:00:44,750 INFO
            org.hawkular.metrics.core.jobs.CompressData Starting
            execution<br>
            hawkular_1 | 01:00:48,554 INFO
            org.hawkular.metrics.core.jobs.CompressData Finished
            compressing data in 3803 ms<br>
            hawkular_1 | 03:00:44,761 INFO
            org.hawkular.metrics.core.jobs.CompressData Starting
            execution<br>
            hawkular_1 | 03:00:48,659 INFO
            org.hawkular.metrics.core.jobs.CompressData Finished
            compressing data in 3898 ms<br>
            hawkular_1 | 05:00:44,748 INFO
            org.hawkular.metrics.core.jobs.CompressData Starting
            execution<br>
            hawkular_1 | 05:00:49,134 INFO
            org.hawkular.metrics.core.jobs.CompressData Finished
            compressing data in 4385 ms<br>
            hawkular_1 | 07:00:44,755 INFO
            org.hawkular.metrics.core.jobs.CompressData Starting
            execution<br>
            hawkular_1 | 07:00:49,831 INFO
            org.hawkular.metrics.core.jobs.CompressData Finished
            compressing data in 5076 ms<br>
            hawkular_1 | 09:00:44,751 INFO
            org.hawkular.metrics.core.jobs.CompressData Starting
            execution<br>
            hawkular_1 | 09:00:49,508 INFO
            org.hawkular.metrics.core.jobs.CompressData Finished
            compressing data in 4757 ms</p>
          <p dir="auto">Now at 11:20 (9:20 for above logs) I started 5
            more feeds (with a 2min sleep between starts)</p>
          <p dir="auto">hawkular_1 | 11:00:44,749 INFO
            org.hawkular.metrics.core.jobs.CompressData Starting
            execution<br>
            hawkular_1 | 11:00:49,751 INFO
            org.hawkular.metrics.core.jobs.CompressData Finished
            compressing data in 5002 ms<br>
            hawkular_1 | 13:00:44,749 INFO
            org.hawkular.metrics.core.jobs.CompressData Starting
            execution<br>
            hawkular_1 | 13:00:56,594 INFO
            org.hawkular.metrics.core.jobs.CompressData Finished
            compressing data in 11845 ms<br>
            hawkular_1 | 15:00:44,754 INFO
            org.hawkular.metrics.core.jobs.CompressData Starting
            execution<br>
            hawkular_1 | 15:00:53,985 INFO
            org.hawkular.metrics.core.jobs.CompressData Finished
            compressing data in 9231 ms</p>
          <p dir="auto">And another 5 starting at 17:12 (15:00 in above
            logs timezone)</p>
          <p dir="auto">hawkular_1 | 17:00:44,768 INFO
            org.hawkular.metrics.core.jobs.CompressData Starting
            execution<br>
            hawkular_1 | 17:00:57,824 INFO
            org.hawkular.metrics.core.jobs.CompressData Finished
            compressing data in 13056 ms</p>
          <p dir="auto">And another 5 starting at 19:57 (17:57 in above
            log timezone )</p>
          <p dir="auto">hawkular_1 | 19:00:44,751 INFO
            org.hawkular.metrics.core.jobs.CompressData Starting
            execution<br>
            hawkular_1 | 19:01:24,401 WARN
            org.apache.activemq.artemis.ra AMQ152007: Thread
            Thread[TearDown/ActiveMQActivation,5,] could not be finished<br>
            hawkular_1 | 19:01:40,918 WARN
            org.apache.activemq.artemis.ra AMQ152007: Thread
            Thread[TearDown/ActiveMQActivation,5,default-threads] could
            not be finished<br>
            hawkular_1 | 19:02:04,619 WARN
            org.apache.activemq.artemis.ra AMQ152007: Thread
            Thread[TearDown/ActiveMQActivation,5,default-threads] could
            not be finished<br>
            hawkular_1 | 19:02:22,423 WARN
            org.apache.activemq.artemis.ra AMQ152007: Thread
            Thread[TearDown/ActiveMQActivation,5,default-threads] could
            not be finished<br>
            hawkular_1 | 19:02:30,247 INFO
            org.hawkular.metrics.core.jobs.CompressData Finished
            compressing data in 105495 ms</p>
          <p dir="auto">This took almost 2 minutes, where the server was
            non-responsive</p>
          <p dir="auto">21:00:44,753 INFO
            org.hawkular.metrics.core.jobs.CompressData Starting
            execution<br>
            21:01:06,520 INFO
            org.hawkular.metrics.core.jobs.CompressData Finished
            compressing data in 21767 ms<br>
            23:00:44,751 INFO
            org.hawkular.metrics.core.jobs.CompressData Starting
            execution</p>
          <p dir="auto">And here it went south and the server more or
            less died with OOME exceptions (it is still responding to
            queries, and potentially even ingesting new data, but the
            scheduler seems not to run anymore.<br>
            I can imagine that once the OOME happened, the scheduler
            thread died, freeing up memory that the compression job held
            and which then allowed the server itself to continue. But it
            certainly is in an unusable state.</p>
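          <p dir="auto">To confirm which thread actually died (just an
            idea; ThreadDeathLogger below is a hypothetical helper, not
            something we register today), an uncaught-exception handler
            installed early during startup would make an OOME on the
            scheduler thread explicit in the log, and running with
            -XX:+HeapDumpOnOutOfMemoryError would show what the
            compression job was holding on to:</p>
          <pre>// Hypothetical helper: logs which thread died and with what Throwable, so
// an OOME killing the scheduler thread shows up explicitly in the log
// instead of the thread silently vanishing.
public final class ThreadDeathLogger {

    public static void install() {
        Thread.setDefaultUncaughtExceptionHandler((thread, throwable) ->
                System.err.printf("Thread %s died: %s%n", thread.getName(), throwable));
    }
}</pre>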
          <p dir="auto">This is the final "one day" with "max" values
            plotted:<br>
            <img src="cid:part1.F91CA1EE.E86D0719@redhat.com" alt=""
              title="Bildschirmfoto 2016-10-11 um 08.21.53.png"><br>
            The peaks at compression time for heap-max (green line) are
            clearly visible.</p>
        </div>
      </div>
      <br>
      <fieldset class="mimeAttachmentHeader"></fieldset>
      <br>
      <pre wrap="">_______________________________________________
hawkular-dev mailing list
<a class="moz-txt-link-abbreviated" href="mailto:hawkular-dev@lists.jboss.org">hawkular-dev@lists.jboss.org</a>
<a class="moz-txt-link-freetext" href="https://lists.jboss.org/mailman/listinfo/hawkular-dev">https://lists.jboss.org/mailman/listinfo/hawkular-dev</a>
</pre>
    </blockquote>
    <br>
  </body>
</html>