[infinispan-dev] if (trace) logger.tracef - it makes sense

Dennis Reed dereed at redhat.com
Mon Oct 3 13:31:56 EDT 2016


On 10/03/2016 08:43 AM, Radim Vansa wrote:
> Aren't we investing a bit too much time to this? The trace level was 
> cached for ages, has anyone ever complained?

I have.  https://issues.jboss.org/browse/JBCACHE-1625

Looks like I just never copied it over to the ISPN JIRA.  :)

> Turning on trace level is completely impractical in production,

We do it all the time when there's no better option to debug an issue.
(generally for as short a period as possible if it's production)

-Dennis

> but that info (or, I'd say 80% of 
> that) is very useful when debugging race conditions etc., so it makes 
> sense to keep it there and reduce the cost to minimum.
> 
> My 2c
> 
> Radim
> 
> On 10/03/2016 03:25 PM, Dan Berindei wrote:
>> On Mon, Oct 3, 2016 at 3:58 PM, Sanne Grinovero <sanne at infinispan.org> wrote:
>>> I've not looked at any such details but I always  assumed that since the
>>> logger frameworks (any of them) need to be able to change the logging level,
>>> there must be some way of switching state - at least a volatile read?
>>>
>>> And while that could be very cheap, I doubt it would be as easily optimised
>>> as a constant.
>>>
>> True, at some point you need a volatile read -- you can only avoid
>> that by caching isTraceEnabled(), like we do in Infinispan.
>>
>> My point was related to this assembly instruction, which reads the
>> ordinal of Level.TRACE:
>>
>> 0x00007f93951e9fdf: mov rdi,0x76d4e6d48  ;   {oop(a
>> 'org/jboss/logging/Logger$Level')}
>> 0x00007f93951e9fe9: mov ebp  << missing an operand here >>; -
>> java.lang.Enum::ordinal at 1 (line 103)
>>                               ; -
>> org.jboss.logging.JBossLogManagerLogger::translate at 8 (line 58)
>>                               ; -
>> org.jboss.logging.JBossLogManagerLogger::isEnabled at 5 (line 35)
>>                               ; - org.jboss.logging.Logger::tracef at 4 (line 287)
>>                               ; - org.infinispan.MyBenchmark::tracef at 7 (line 26)
>>                               ; -
>> org.infinispan.generated.MyBenchmark_tracef_jmhTest::tracef_thrpt_jmhStub at 15
>> (line 113)
>>
>> In an ideal world, the JIT would replace translate(Level.TRACE) with a
>> constant. But here, even though `0x76d4e6d48` is clearly a constant,
>> the ordinal is still loaded and checked every time.
>>
>>> Since we might not be interested only in the logger method to be inlined, we
>>> also want most of the client methods to be considered for inlining.. some of
>>> these can be extremely sensitive so such a decision requires a full
>>> Infinispan performance test rather than a microbenchmark. (which is useful
>>> too but more for the sake of optimising JBoss Logging, not to proof if it's
>>> OK to switch all of Infinispan 's internal conventions)
>>>
>> Based on the testing both myself and Radim did WRT inlining, it
>> doesn't make a huge difference. But I agree 100% that before removing
>> the isTraceEnabled() caching we'd have to benchmark the whole of
>> Infinispan, and not just JBoss Logging in isolation.
>>
>> Cheers
>> Dan
>>
>>> On Mon, 3 Oct 2016, 13:45 Sebastian Laskawiec, <slaskawi at redhat.com> wrote:
>>>> Awesome Dan! May I ask for a Pull Request:
>>>> https://github.com/slaskawi/jboss-logging-perf-test?
>>>>
>>>>
>>>>
>>>> On Mon, Oct 3, 2016 at 2:40 PM, Dan Berindei <dan.berindei at gmail.com>
>>>> wrote:
>>>>> Hi Sebastian
>>>>>
>>>>> I modified your benchmark so that the logger and the trace field are
>>>>> static final and looked at the generated assembly with JITWatch [1].
>>>>> Based on the attached assembly listings, caching isTraceEnabled() in a
>>>>> constant field is "infinitely faster", because there are no assembly
>>>>> instructions that can be traced back to the test method.
>>>>>
>>>>> JBossLogManagerLogger::translate() is inlined in this listing, but it
>>>>> still goes through the switch machinery, I'm guessing because the
>>>>> ordinal of an Enum is not considered a constant.
>>>>>
>>>>> Cheers
>>>>> Dan
>>>>>
>>>>>
>>>>> [1]: https://github.com/AdoptOpenJDK/jitwatch
>>>>>
>>>>> On Mon, Oct 3, 2016 at 11:28 AM, Sebastian Laskawiec
>>>>> <slaskawi at redhat.com> wrote:
>>>>>> Hey!
>>>>>>
>>>>>> Please have a look at the latest perf test results [1][2]:
>>>>>>
>>>>>> Benchmark                             Mode  Cnt           Score
>>>>>> Error  Units
>>>>>> MyBenchmark.noVariable               thrpt   20   681131269.875 ±
>>>>>> 3961932.923  ops/s
>>>>>> MyBenchmark.withIsTraceEnabledCheck  thrpt   20   676307984.921 ±
>>>>>> 14305970.393  ops/s
>>>>>> MyBenchmark.withVariable             thrpt   20  2411000894.582 ±
>>>>>> 17382438.635  ops/s
>>>>>>
>>>>>> I think there is no surprise here.. using a field, which stores the
>>>>>> result
>>>>>> of `logger.isTraceEnabled()` evaluation is 3 times faster than other
>>>>>> options.
>>>>>>
>>>>>> If anyone is interested in printing out JIT stuff, I also ran it with
>>>>>> "-XX:+PrintCompilation", "-XX:+PrintCompilation2" and
>>>>>> "-XX:+PrintInlining"
>>>>>> here [3].
>>>>>>
>>>>>> I'm not a performance expert but it seems that JIT could not inline the
>>>>>> "translate" method because of its size (see line 1861). However it
>>>>>> tried
>>>>>> several times with different optimizations (and some of them were
>>>>>> thrown
>>>>>> away - "made not entrant" messages [4]).
>>>>>>
>>>>>> Let's wait for James' opinion on this, but I think we should address
>>>>>> this
>>>>>> issue on JBoss Logging/LogManager side (so I agree with David here) and
>>>>>> make
>>>>>> those parts inlinable (wow, does this word even exist? :D). Once this
>>>>>> is
>>>>>> done, we could experiment further in Infinispan codebase and see how
>>>>>> this
>>>>>> relates to some real world benchmarks...
>>>>>>
>>>>>> Thanks
>>>>>> Sebastian
>>>>>>
>>>>>> [1] https://gist.github.com/slaskawi/6766b6e17c7a28ac8d8962293c48a53c
>>>>>> [2] Test repository:
>>>>>> https://github.com/slaskawi/jboss-logging-perf-test
>>>>>> [3] https://gist.github.com/slaskawi/6f317bb05539611434bc91d66924bae0
>>>>>> [4]
>>>>>>
>>>>>> https://www.safaribooksonline.com/library/view/java-performance-the/9781449363512/ch04.html
>>>>>>
>>>>>> On Mon, Oct 3, 2016 at 7:32 AM, Sebastian Laskawiec
>>>>>> <slaskawi at redhat.com>
>>>>>> wrote:
>>>>>>> Hey!
>>>>>>>
>>>>>>> Adding James to this thread.
>>>>>>>
>>>>>>> @Dennis - I think Dan has a point here. The trick with checking a
>>>>>>> field in
>>>>>>> a class is 3 times faster. Most of the checks are done in core so they
>>>>>>> are
>>>>>>> executed multiple times per operation. Changing all those places is
>>>>>>> probably
>>>>>>> not an option.
>>>>>>>
>>>>>>> @David - Let me run a test with JBoss Log Manager and get back to you
>>>>>>> with
>>>>>>> some results. But if Dan is right, and the problem is with enum
>>>>>>> mapping, I
>>>>>>> will get similar results.
>>>>>>>
>>>>>>> Thanks
>>>>>>> Sebastian
>>>>>>>
>>>>>>> On Sat, Oct 1, 2016 at 11:53 AM, Dan Berindei <dan.berindei at gmail.com>
>>>>>>> wrote:
>>>>>>>> On Fri, Sep 30, 2016 at 10:15 PM, David M. Lloyd
>>>>>>>> <david.lloyd at redhat.com>
>>>>>>>> wrote:
>>>>>>>>> The performance problem that this trick is meant to resolve is
>>>>>>>>> really a
>>>>>>>>> problem in the logging backend.  It *should* be faster inside of
>>>>>>>>> WildFly, where JBoss LogManager is used, because that project just
>>>>>>>>> checks a single volatile field for the level check... and the path
>>>>>>>>> to
>>>>>>>>> that code *should* be inline-friendly.
>>>>>>>>>
>>>>>>>> Indeed, we started using this trick because of log4j 1.2, which needs
>>>>>>>> to walk the logger hierarchy in order to check the level, and it made
>>>>>>>> a significant difference there.
>>>>>>>>
>>>>>>>> Nowadays I think it's pretty close to optimal in all logging
>>>>>>>> frameworks. The only nitpick is that they all use enums for the
>>>>>>>> levels, and the JIT can't inline Level.TRACE.value as it would with a
>>>>>>>> Level.TRACE_VALUE int constant. If JDK9 fixes that, then it's going
>>>>>>>> to
>>>>>>>> be more or less equivalent to using a volatile "trace" field in each
>>>>>>>> class, so it should be suitable even for local read operations that
>>>>>>>> take < 200ns.
>>>>>>>>
>>>>>>>> We'd probably still need to weed out some of the trace messages, as
>>>>>>>> we
>>>>>>>> probably have more than 10 of them during such a read operation. I
>>>>>>>> confess that I added way too many trace logs myself, precisely
>>>>>>>> because
>>>>>>>> I knew we are using a static final field and the JIT compiler doesn't
>>>>>>>> even generate code for that branch.
>>>>>>>>
>>>>>>>> Cheers
>>>>>>>> Dan
>>>>>>>> _______________________________________________
>>>>>>>> infinispan-dev mailing list
>>>>>>>> infinispan-dev at lists.jboss.org
>>>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> infinispan-dev mailing list
>>>>>> infinispan-dev at lists.jboss.org
>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>>> _______________________________________________
>>>>> infinispan-dev mailing list
>>>>> infinispan-dev at lists.jboss.org
>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>>
>>>> _______________________________________________
>>>> infinispan-dev mailing list
>>>> infinispan-dev at lists.jboss.org
>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> infinispan-dev at lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> 
> 



More information about the infinispan-dev mailing list