On Mon, Oct 3, 2016 at 3:58 PM, Sanne Grinovero <sanne(a)infinispan.org> wrote:
I've not looked at any such details but I always assumed that
since the
logger frameworks (any of them) need to be able to change the logging level,
there must be some way of switching state - at least a volatile read?
And while that could be very cheap, I doubt it would be as easily optimised
as a constant.
True, at some point you need a volatile read -- you can only avoid
that by caching isTraceEnabled(), like we do in Infinispan.
My point was related to this assembly instruction, which reads the
ordinal of Level.TRACE:
0x00007f93951e9fdf: mov rdi,0x76d4e6d48 ; {oop(a
'org/jboss/logging/Logger$Level')}
0x00007f93951e9fe9: mov ebp << missing an operand here >>; -
java.lang.Enum::ordinal@1 (line 103)
; -
org.jboss.logging.JBossLogManagerLogger::translate@8 (line 58)
; -
org.jboss.logging.JBossLogManagerLogger::isEnabled@5 (line 35)
; - org.jboss.logging.Logger::tracef@4 (line 287)
; - org.infinispan.MyBenchmark::tracef@7 (line 26)
; -
org.infinispan.generated.MyBenchmark_tracef_jmhTest::tracef_thrpt_jmhStub@15
(line 113)
In an ideal world, the JIT would replace translate(Level.TRACE) with a
constant. But here, even though `0x76d4e6d48` is clearly a constant,
the ordinal is still loaded and checked every time.
Since we might not be interested only in the logger method to be
inlined, we
also want most of the client methods to be considered for inlining.. some of
these can be extremely sensitive so such a decision requires a full
Infinispan performance test rather than a microbenchmark. (which is useful
too but more for the sake of optimising JBoss Logging, not to proof if it's
OK to switch all of Infinispan 's internal conventions)
Based on the testing both myself and Radim did WRT inlining, it
doesn't make a huge difference. But I agree 100% that before removing
the isTraceEnabled() caching we'd have to benchmark the whole of
Infinispan, and not just JBoss Logging in isolation.
Cheers
Dan
On Mon, 3 Oct 2016, 13:45 Sebastian Laskawiec, <slaskawi(a)redhat.com> wrote:
>
> Awesome Dan! May I ask for a Pull Request:
>
https://github.com/slaskawi/jboss-logging-perf-test?
>
>
>
> On Mon, Oct 3, 2016 at 2:40 PM, Dan Berindei <dan.berindei(a)gmail.com>
> wrote:
>>
>> Hi Sebastian
>>
>> I modified your benchmark so that the logger and the trace field are
>> static final and looked at the generated assembly with JITWatch [1].
>> Based on the attached assembly listings, caching isTraceEnabled() in a
>> constant field is "infinitely faster", because there are no assembly
>> instructions that can be traced back to the test method.
>>
>> JBossLogManagerLogger::translate() is inlined in this listing, but it
>> still goes through the switch machinery, I'm guessing because the
>> ordinal of an Enum is not considered a constant.
>>
>> Cheers
>> Dan
>>
>>
>> [1]:
https://github.com/AdoptOpenJDK/jitwatch
>>
>> On Mon, Oct 3, 2016 at 11:28 AM, Sebastian Laskawiec
>> <slaskawi(a)redhat.com> wrote:
>> > Hey!
>> >
>> > Please have a look at the latest perf test results [1][2]:
>> >
>> > Benchmark Mode Cnt Score
>> > Error Units
>> > MyBenchmark.noVariable thrpt 20 681131269.875 ±
>> > 3961932.923 ops/s
>> > MyBenchmark.withIsTraceEnabledCheck thrpt 20 676307984.921 ±
>> > 14305970.393 ops/s
>> > MyBenchmark.withVariable thrpt 20 2411000894.582 ±
>> > 17382438.635 ops/s
>> >
>> > I think there is no surprise here.. using a field, which stores the
>> > result
>> > of `logger.isTraceEnabled()` evaluation is 3 times faster than other
>> > options.
>> >
>> > If anyone is interested in printing out JIT stuff, I also ran it with
>> > "-XX:+PrintCompilation", "-XX:+PrintCompilation2" and
>> > "-XX:+PrintInlining"
>> > here [3].
>> >
>> > I'm not a performance expert but it seems that JIT could not inline the
>> > "translate" method because of its size (see line 1861). However
it
>> > tried
>> > several times with different optimizations (and some of them were
>> > thrown
>> > away - "made not entrant" messages [4]).
>> >
>> > Let's wait for James' opinion on this, but I think we should
address
>> > this
>> > issue on JBoss Logging/LogManager side (so I agree with David here) and
>> > make
>> > those parts inlinable (wow, does this word even exist? :D). Once this
>> > is
>> > done, we could experiment further in Infinispan codebase and see how
>> > this
>> > relates to some real world benchmarks...
>> >
>> > Thanks
>> > Sebastian
>> >
>> > [1]
https://gist.github.com/slaskawi/6766b6e17c7a28ac8d8962293c48a53c
>> > [2] Test repository:
>> >
https://github.com/slaskawi/jboss-logging-perf-test
>> > [3]
https://gist.github.com/slaskawi/6f317bb05539611434bc91d66924bae0
>> > [4]
>> >
>> >
https://www.safaribooksonline.com/library/view/java-performance-the/97814...
>> >
>> > On Mon, Oct 3, 2016 at 7:32 AM, Sebastian Laskawiec
>> > <slaskawi(a)redhat.com>
>> > wrote:
>> >>
>> >> Hey!
>> >>
>> >> Adding James to this thread.
>> >>
>> >> @Dennis - I think Dan has a point here. The trick with checking a
>> >> field in
>> >> a class is 3 times faster. Most of the checks are done in core so they
>> >> are
>> >> executed multiple times per operation. Changing all those places is
>> >> probably
>> >> not an option.
>> >>
>> >> @David - Let me run a test with JBoss Log Manager and get back to you
>> >> with
>> >> some results. But if Dan is right, and the problem is with enum
>> >> mapping, I
>> >> will get similar results.
>> >>
>> >> Thanks
>> >> Sebastian
>> >>
>> >> On Sat, Oct 1, 2016 at 11:53 AM, Dan Berindei
<dan.berindei(a)gmail.com>
>> >> wrote:
>> >>>
>> >>> On Fri, Sep 30, 2016 at 10:15 PM, David M. Lloyd
>> >>> <david.lloyd(a)redhat.com>
>> >>> wrote:
>> >>> > The performance problem that this trick is meant to resolve is
>> >>> > really a
>> >>> > problem in the logging backend. It *should* be faster inside
of
>> >>> > WildFly, where JBoss LogManager is used, because that project
just
>> >>> > checks a single volatile field for the level check... and the
path
>> >>> > to
>> >>> > that code *should* be inline-friendly.
>> >>> >
>> >>>
>> >>> Indeed, we started using this trick because of log4j 1.2, which
needs
>> >>> to walk the logger hierarchy in order to check the level, and it
made
>> >>> a significant difference there.
>> >>>
>> >>> Nowadays I think it's pretty close to optimal in all logging
>> >>> frameworks. The only nitpick is that they all use enums for the
>> >>> levels, and the JIT can't inline Level.TRACE.value as it would
with a
>> >>> Level.TRACE_VALUE int constant. If JDK9 fixes that, then it's
going
>> >>> to
>> >>> be more or less equivalent to using a volatile "trace"
field in each
>> >>> class, so it should be suitable even for local read operations that
>> >>> take < 200ns.
>> >>>
>> >>> We'd probably still need to weed out some of the trace messages,
as
>> >>> we
>> >>> probably have more than 10 of them during such a read operation. I
>> >>> confess that I added way too many trace logs myself, precisely
>> >>> because
>> >>> I knew we are using a static final field and the JIT compiler
doesn't
>> >>> even generate code for that branch.
>> >>>
>> >>> Cheers
>> >>> Dan
>> >>> _______________________________________________
>> >>> infinispan-dev mailing list
>> >>> infinispan-dev(a)lists.jboss.org
>> >>>
https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> >>
>> >>
>> >
>> >
>> > _______________________________________________
>> > infinispan-dev mailing list
>> > infinispan-dev(a)lists.jboss.org
>> >
https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev(a)lists.jboss.org
>>
https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev(a)lists.jboss.org
>
https://lists.jboss.org/mailman/listinfo/infinispan-dev
_______________________________________________
infinispan-dev mailing list
infinispan-dev(a)lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev