ahhh, for some reason I missed that. Sorry guys...
In that case there's probably not much we can do. The only missing bit is
what to do with JBCACHE-1625?
On Tue, Oct 4, 2016 at 12:24 PM, Dan Berindei <dan.berindei(a)gmail.com>
wrote:
On Tue, Oct 4, 2016 at 10:44 AM, Sebastian Laskawiec
<slaskawi(a)redhat.com> wrote:
> I think we reached the point where we mention the same arguments over and
> over again...
>
> So let me try to sum it up and possibly iron an action plan out:
>
> The main reason we cache the "isTraceEnabled" value is speed. It's
more
than
> 3 times faster then checking it before each trace invocation.
Actually, if you look at the the assembly code in const.txt, all the
assembly instructions are from JMH itself, not from the test method.
So caching in a static final is "infinitely faster", not just 3 times
faster.
> This decreases maintainability since we can't change the logging level in
> runtime. Dennis filed a JIRA for it:
>
https://issues.jboss.org/browse/JBCACHE-1625
We can change the logging level to DEBUG at runtime, which I would
argue is enough in most cases.
> It doesn't matter which logging backend we use (JDK, JBoss etc).
> The main problem is the "translate" method in logger, which translates
JBoss
> Logging levels into backend's native levels. The "translate" method
is
used
> for all logging levels (since "doLogf" method checks "isEnabled"
method
> result, which in turn calls "translate").
> As shown on benchmarks with JIT activity, there are two reasons why
> "translate" method is not inlined - it's too big and it contains a
switch
> statement.
>
Hmmm, I thought the assembly listing in tracef.txt showed that it *is*
inlined :)
> If you agree with the above, I would like to propose the following action
> plan:
>
> @Dennis - please move the JBCACHE-1625 to Infinispan project and assign
it
> to me.
> @James - I would like to ask you to experiment with optimizing these
> methods. It would be great if both "translate" and "isEnabled"
methods
could
> be inlined. I would be more than happy to verify if it works in both my
> private benchmarks and if that goes well, with Infinispan stress tests.
>
IMO there are way too many TRACE logs in the core, and I don't think
we can avoid affecting performance without removing 80% of them. And I
also think the time spent arguing about this would have been better
spent trying to fix some of the random failures in the test suite,
i.e. the thing that the excessive TRACE logs are supposed to help us
with :)
Cheers
Dan
> Thanks
> Sebastian
>
> On Mon, Oct 3, 2016 at 7:31 PM, Dennis Reed <dereed(a)redhat.com> wrote:
>>
>> On 10/03/2016 08:43 AM, Radim Vansa wrote:
>> > Aren't we investing a bit too much time to this? The trace level was
>> > cached for ages, has anyone ever complained?
>>
>> I have.
https://issues.jboss.org/browse/JBCACHE-1625
>>
>> Looks like I just never copied it over to the ISPN JIRA. :)
>>
>> > Turning on trace level is completely impractical in production,
>>
>> We do it all the time when there's no better option to debug an issue.
>> (generally for as short a period as possible if it's production)
>>
>> -Dennis
>>
>> > but that info (or, I'd say 80% of
>> > that) is very useful when debugging race conditions etc., so it makes
>> > sense to keep it there and reduce the cost to minimum.
>> >
>> > My 2c
>> >
>> > Radim
>> >
>> > On 10/03/2016 03:25 PM, Dan Berindei wrote:
>> >> On Mon, Oct 3, 2016 at 3:58 PM, Sanne Grinovero <
sanne(a)infinispan.org>
>> >> wrote:
>> >>> I've not looked at any such details but I always assumed that
since
>> >>> the
>> >>> logger frameworks (any of them) need to be able to change the
logging
>> >>> level,
>> >>> there must be some way of switching state - at least a volatile
read?
>> >>>
>> >>> And while that could be very cheap, I doubt it would be as easily
>> >>> optimised
>> >>> as a constant.
>> >>>
>> >> True, at some point you need a volatile read -- you can only avoid
>> >> that by caching isTraceEnabled(), like we do in Infinispan.
>> >>
>> >> My point was related to this assembly instruction, which reads the
>> >> ordinal of Level.TRACE:
>> >>
>> >> 0x00007f93951e9fdf: mov rdi,0x76d4e6d48 ; {oop(a
>> >> 'org/jboss/logging/Logger$Level')}
>> >> 0x00007f93951e9fe9: mov ebp << missing an operand here >>;
-
>> >> java.lang.Enum::ordinal@1 (line 103)
>> >> ; -
>> >> org.jboss.logging.JBossLogManagerLogger::translate@8 (line 58)
>> >> ; -
>> >> org.jboss.logging.JBossLogManagerLogger::isEnabled@5 (line 35)
>> >> ; - org.jboss.logging.Logger::tracef@4
>> >> (line 287)
>> >> ; - org.infinispan.MyBenchmark::
tracef@7
>> >> (line 26)
>> >> ; -
>> >>
>> >> org.infinispan.generated.MyBenchmark_tracef_jmhTest::
tracef_thrpt_jmhStub@15
>> >> (line 113)
>> >>
>> >> In an ideal world, the JIT would replace translate(Level.TRACE) with
a
>> >> constant. But here, even though `0x76d4e6d48` is clearly a constant,
>> >> the ordinal is still loaded and checked every time.
>> >>
>> >>> Since we might not be interested only in the logger method to be
>> >>> inlined, we
>> >>> also want most of the client methods to be considered for
inlining..
>> >>> some of
>> >>> these can be extremely sensitive so such a decision requires a
full
>> >>> Infinispan performance test rather than a microbenchmark. (which
is
>> >>> useful
>> >>> too but more for the sake of optimising JBoss Logging, not to
proof
if
>> >>> it's
>> >>> OK to switch all of Infinispan 's internal conventions)
>> >>>
>> >> Based on the testing both myself and Radim did WRT inlining, it
>> >> doesn't make a huge difference. But I agree 100% that before
removing
>> >> the isTraceEnabled() caching we'd have to benchmark the whole of
>> >> Infinispan, and not just JBoss Logging in isolation.
>> >>
>> >> Cheers
>> >> Dan
>> >>
>> >>> On Mon, 3 Oct 2016, 13:45 Sebastian Laskawiec,
<slaskawi(a)redhat.com
>
>> >>> wrote:
>> >>>> Awesome Dan! May I ask for a Pull Request:
>> >>>>
https://github.com/slaskawi/jboss-logging-perf-test?
>> >>>>
>> >>>>
>> >>>>
>> >>>> On Mon, Oct 3, 2016 at 2:40 PM, Dan Berindei <
dan.berindei(a)gmail.com>
>> >>>> wrote:
>> >>>>> Hi Sebastian
>> >>>>>
>> >>>>> I modified your benchmark so that the logger and the trace
field
are
>> >>>>> static final and looked at the generated assembly with
JITWatch
[1].
>> >>>>> Based on the attached assembly listings, caching
isTraceEnabled()
in
>> >>>>> a
>> >>>>> constant field is "infinitely faster", because
there are no
assembly
>> >>>>> instructions that can be traced back to the test method.
>> >>>>>
>> >>>>> JBossLogManagerLogger::translate() is inlined in this
listing,
but
>> >>>>> it
>> >>>>> still goes through the switch machinery, I'm guessing
because the
>> >>>>> ordinal of an Enum is not considered a constant.
>> >>>>>
>> >>>>> Cheers
>> >>>>> Dan
>> >>>>>
>> >>>>>
>> >>>>> [1]:
https://github.com/AdoptOpenJDK/jitwatch
>> >>>>>
>> >>>>> On Mon, Oct 3, 2016 at 11:28 AM, Sebastian Laskawiec
>> >>>>> <slaskawi(a)redhat.com> wrote:
>> >>>>>> Hey!
>> >>>>>>
>> >>>>>> Please have a look at the latest perf test results
[1][2]:
>> >>>>>>
>> >>>>>> Benchmark Mode Cnt
Score
>> >>>>>> Error Units
>> >>>>>> MyBenchmark.noVariable thrpt 20
681131269.875 ±
>> >>>>>> 3961932.923 ops/s
>> >>>>>> MyBenchmark.withIsTraceEnabledCheck thrpt 20
676307984.921
±
>> >>>>>> 14305970.393 ops/s
>> >>>>>> MyBenchmark.withVariable thrpt 20
2411000894.582 ±
>> >>>>>> 17382438.635 ops/s
>> >>>>>>
>> >>>>>> I think there is no surprise here.. using a field,
which stores
the
>> >>>>>> result
>> >>>>>> of `logger.isTraceEnabled()` evaluation is 3 times
faster than
>> >>>>>> other
>> >>>>>> options.
>> >>>>>>
>> >>>>>> If anyone is interested in printing out JIT stuff, I
also ran it
>> >>>>>> with
>> >>>>>> "-XX:+PrintCompilation",
"-XX:+PrintCompilation2" and
>> >>>>>> "-XX:+PrintInlining"
>> >>>>>> here [3].
>> >>>>>>
>> >>>>>> I'm not a performance expert but it seems that JIT
could not
inline
>> >>>>>> the
>> >>>>>> "translate" method because of its size (see
line 1861). However
it
>> >>>>>> tried
>> >>>>>> several times with different optimizations (and some of
them were
>> >>>>>> thrown
>> >>>>>> away - "made not entrant" messages [4]).
>> >>>>>>
>> >>>>>> Let's wait for James' opinion on this, but I
think we should
>> >>>>>> address
>> >>>>>> this
>> >>>>>> issue on JBoss Logging/LogManager side (so I agree with
David
here)
>> >>>>>> and
>> >>>>>> make
>> >>>>>> those parts inlinable (wow, does this word even exist?
:D). Once
>> >>>>>> this
>> >>>>>> is
>> >>>>>> done, we could experiment further in Infinispan
codebase and see
>> >>>>>> how
>> >>>>>> this
>> >>>>>> relates to some real world benchmarks...
>> >>>>>>
>> >>>>>> Thanks
>> >>>>>> Sebastian
>> >>>>>>
>> >>>>>> [1]
>> >>>>>>
https://gist.github.com/slaskawi/6766b6e17c7a28ac8d8962293c48a5
3c
>> >>>>>> [2] Test repository:
>> >>>>>>
https://github.com/slaskawi/jboss-logging-perf-test
>> >>>>>> [3]
>> >>>>>>
https://gist.github.com/slaskawi/6f317bb05539611434bc91d66924ba
e0
>> >>>>>> [4]
>> >>>>>>
>> >>>>>>
>> >>>>>>
https://www.safaribooksonline.com/library/view/java-
performance-the/9781449363512/ch04.html
>> >>>>>>
>> >>>>>> On Mon, Oct 3, 2016 at 7:32 AM, Sebastian Laskawiec
>> >>>>>> <slaskawi(a)redhat.com>
>> >>>>>> wrote:
>> >>>>>>> Hey!
>> >>>>>>>
>> >>>>>>> Adding James to this thread.
>> >>>>>>>
>> >>>>>>> @Dennis - I think Dan has a point here. The trick
with checking
a
>> >>>>>>> field in
>> >>>>>>> a class is 3 times faster. Most of the checks are
done in core
so
>> >>>>>>> they
>> >>>>>>> are
>> >>>>>>> executed multiple times per operation. Changing all
those places
>> >>>>>>> is
>> >>>>>>> probably
>> >>>>>>> not an option.
>> >>>>>>>
>> >>>>>>> @David - Let me run a test with JBoss Log Manager
and get back
to
>> >>>>>>> you
>> >>>>>>> with
>> >>>>>>> some results. But if Dan is right, and the problem
is with enum
>> >>>>>>> mapping, I
>> >>>>>>> will get similar results.
>> >>>>>>>
>> >>>>>>> Thanks
>> >>>>>>> Sebastian
>> >>>>>>>
>> >>>>>>> On Sat, Oct 1, 2016 at 11:53 AM, Dan Berindei
>> >>>>>>> <dan.berindei(a)gmail.com>
>> >>>>>>> wrote:
>> >>>>>>>> On Fri, Sep 30, 2016 at 10:15 PM, David M.
Lloyd
>> >>>>>>>> <david.lloyd(a)redhat.com>
>> >>>>>>>> wrote:
>> >>>>>>>>> The performance problem that this trick is
meant to resolve is
>> >>>>>>>>> really a
>> >>>>>>>>> problem in the logging backend. It
*should* be faster inside
of
>> >>>>>>>>> WildFly, where JBoss LogManager is used,
because that project
>> >>>>>>>>> just
>> >>>>>>>>> checks a single volatile field for the
level check... and the
>> >>>>>>>>> path
>> >>>>>>>>> to
>> >>>>>>>>> that code *should* be inline-friendly.
>> >>>>>>>>>
>> >>>>>>>> Indeed, we started using this trick because of
log4j 1.2, which
>> >>>>>>>> needs
>> >>>>>>>> to walk the logger hierarchy in order to check
the level, and
it
>> >>>>>>>> made
>> >>>>>>>> a significant difference there.
>> >>>>>>>>
>> >>>>>>>> Nowadays I think it's pretty close to
optimal in all logging
>> >>>>>>>> frameworks. The only nitpick is that they all
use enums for the
>> >>>>>>>> levels, and the JIT can't inline
Level.TRACE.value as it would
>> >>>>>>>> with a
>> >>>>>>>> Level.TRACE_VALUE int constant. If JDK9 fixes
that, then it's
>> >>>>>>>> going
>> >>>>>>>> to
>> >>>>>>>> be more or less equivalent to using a volatile
"trace" field in
>> >>>>>>>> each
>> >>>>>>>> class, so it should be suitable even for local
read operations
>> >>>>>>>> that
>> >>>>>>>> take < 200ns.
>> >>>>>>>>
>> >>>>>>>> We'd probably still need to weed out some
of the trace
messages,
>> >>>>>>>> as
>> >>>>>>>> we
>> >>>>>>>> probably have more than 10 of them during such
a read
operation.
>> >>>>>>>> I
>> >>>>>>>> confess that I added way too many trace logs
myself, precisely
>> >>>>>>>> because
>> >>>>>>>> I knew we are using a static final field and
the JIT compiler
>> >>>>>>>> doesn't
>> >>>>>>>> even generate code for that branch.
>> >>>>>>>>
>> >>>>>>>> Cheers
>> >>>>>>>> Dan
>> >>>>>>>>
_______________________________________________
>> >>>>>>>> infinispan-dev mailing list
>> >>>>>>>> infinispan-dev(a)lists.jboss.org
>> >>>>>>>>
https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> >>>>>>>
>> >>>>>>
>> >>>>>> _______________________________________________
>> >>>>>> infinispan-dev mailing list
>> >>>>>> infinispan-dev(a)lists.jboss.org
>> >>>>>>
https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> >>>>> _______________________________________________
>> >>>>> infinispan-dev mailing list
>> >>>>> infinispan-dev(a)lists.jboss.org
>> >>>>>
https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> >>>>
>> >>>> _______________________________________________
>> >>>> infinispan-dev mailing list
>> >>>> infinispan-dev(a)lists.jboss.org
>> >>>>
https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> >>>
>> >>> _______________________________________________
>> >>> infinispan-dev mailing list
>> >>> infinispan-dev(a)lists.jboss.org
>> >>>
https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> >> _______________________________________________
>> >> infinispan-dev mailing list
>> >> infinispan-dev(a)lists.jboss.org
>> >>
https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> >
>> >
>>
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev(a)lists.jboss.org
>>
https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
>
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev(a)lists.jboss.org
>
https://lists.jboss.org/mailman/listinfo/infinispan-dev
_______________________________________________
infinispan-dev mailing list
infinispan-dev(a)lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev