Awesome Dan! May I ask for a Pull Request: https://github.com/slaskawi/jboss-logging-perf-test?



On Mon, Oct 3, 2016 at 2:40 PM, Dan Berindei <dan.berindei@gmail.com> wrote:
Hi Sebastian

I modified your benchmark so that the logger and the trace field are
static final and looked at the generated assembly with JITWatch [1].
Based on the attached assembly listings, caching isTraceEnabled() in a
constant field is "infinitely faster", because there are no assembly
instructions that can be traced back to the test method.

JBossLogManagerLogger::translate() is inlined in this listing, but it
still goes through the switch machinery, I'm guessing because the
ordinal of an Enum is not considered a constant.

Cheers
Dan


[1]: https://github.com/AdoptOpenJDK/jitwatch

On Mon, Oct 3, 2016 at 11:28 AM, Sebastian Laskawiec
<slaskawi@redhat.com> wrote:
> Hey!
>
> Please have a look at the latest perf test results [1][2]:
>
> Benchmark                             Mode  Cnt           Score
> Error  Units
> MyBenchmark.noVariable               thrpt   20   681131269.875 ±
> 3961932.923  ops/s
> MyBenchmark.withIsTraceEnabledCheck  thrpt   20   676307984.921 ±
> 14305970.393  ops/s
> MyBenchmark.withVariable             thrpt   20  2411000894.582 ±
> 17382438.635  ops/s
>
> I think there is no surprise here.. using a field, which stores the result
> of `logger.isTraceEnabled()` evaluation is 3 times faster than other
> options.
>
> If anyone is interested in printing out JIT stuff, I also ran it with
> "-XX:+PrintCompilation", "-XX:+PrintCompilation2" and "-XX:+PrintInlining"
> here [3].
>
> I'm not a performance expert but it seems that JIT could not inline the
> "translate" method because of its size (see line 1861). However it tried
> several times with different optimizations (and some of them were thrown
> away - "made not entrant" messages [4]).
>
> Let's wait for James' opinion on this, but I think we should address this
> issue on JBoss Logging/LogManager side (so I agree with David here) and make
> those parts inlinable (wow, does this word even exist? :D). Once this is
> done, we could experiment further in Infinispan codebase and see how this
> relates to some real world benchmarks...
>
> Thanks
> Sebastian
>
> [1] https://gist.github.com/slaskawi/6766b6e17c7a28ac8d8962293c48a53c
> [2] Test repository: https://github.com/slaskawi/jboss-logging-perf-test
> [3] https://gist.github.com/slaskawi/6f317bb05539611434bc91d66924bae0
> [4]
> https://www.safaribooksonline.com/library/view/java-performance-the/9781449363512/ch04.html
>
> On Mon, Oct 3, 2016 at 7:32 AM, Sebastian Laskawiec <slaskawi@redhat.com>
> wrote:
>>
>> Hey!
>>
>> Adding James to this thread.
>>
>> @Dennis - I think Dan has a point here. The trick with checking a field in
>> a class is 3 times faster. Most of the checks are done in core so they are
>> executed multiple times per operation. Changing all those places is probably
>> not an option.
>>
>> @David - Let me run a test with JBoss Log Manager and get back to you with
>> some results. But if Dan is right, and the problem is with enum mapping, I
>> will get similar results.
>>
>> Thanks
>> Sebastian
>>
>> On Sat, Oct 1, 2016 at 11:53 AM, Dan Berindei <dan.berindei@gmail.com>
>> wrote:
>>>
>>> On Fri, Sep 30, 2016 at 10:15 PM, David M. Lloyd <david.lloyd@redhat.com>
>>> wrote:
>>> > The performance problem that this trick is meant to resolve is really a
>>> > problem in the logging backend.  It *should* be faster inside of
>>> > WildFly, where JBoss LogManager is used, because that project just
>>> > checks a single volatile field for the level check... and the path to
>>> > that code *should* be inline-friendly.
>>> >
>>>
>>> Indeed, we started using this trick because of log4j 1.2, which needs
>>> to walk the logger hierarchy in order to check the level, and it made
>>> a significant difference there.
>>>
>>> Nowadays I think it's pretty close to optimal in all logging
>>> frameworks. The only nitpick is that they all use enums for the
>>> levels, and the JIT can't inline Level.TRACE.value as it would with a
>>> Level.TRACE_VALUE int constant. If JDK9 fixes that, then it's going to
>>> be more or less equivalent to using a volatile "trace" field in each
>>> class, so it should be suitable even for local read operations that
>>> take < 200ns.
>>>
>>> We'd probably still need to weed out some of the trace messages, as we
>>> probably have more than 10 of them during such a read operation. I
>>> confess that I added way too many trace logs myself, precisely because
>>> I knew we are using a static final field and the JIT compiler doesn't
>>> even generate code for that branch.
>>>
>>> Cheers
>>> Dan
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> infinispan-dev@lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>
>>
>
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev@lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

_______________________________________________
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev