Pedro Zapata Fernandez updated ISPN-11005:
------------------------------------------
Sprint: DataGrid Sprint #37, DataGrid Sprint #38, DataGrid Sprint #39 (was: DataGrid
Sprint #37, DataGrid Sprint #38)
HotRod decoder small performance improvements
---------------------------------------------
Key: ISPN-11005
URL:
https://issues.redhat.com/browse/ISPN-11005
Project: Infinispan
Issue Type: Enhancement
Components: Server
Affects Versions: 10.1.0.Beta1
Reporter: Dan Berindei
Assignee: Dan Berindei
Priority: Minor
Labels: performace
I noticed some small inefficiencies in the flight recordings from the client-server dist
read benchmarks:
* {{Intrinsics.string()}} allocates a temporary {{byte[]}}; we could use
{{ByteBuf.toString(start, length, Charset)}} instead (which reuses a thread-local
buffer).
* For reading the cache name it would be even better to use {{ByteString}} and avoid the
UTF8 decoding.
* {{MediaType.hashCode()}} allocates an iterator for the params map even though it's
empty.
* {{JBossMarshallingTranscoder.transcode()}} is called twice for each request, and even
when there is no transcoding to perform it does a lot of {{String.equals()}} checks.
* {{CacheImpl.getCacheEntryAsync()}} allocates a new {{CompletableFuture}} via
{{thenApply()}} just to change the return type; it could do the same thing by casting to
the erased type.
* {{EncoderCache.getCacheEntryAsync()}} could also avoid allocating a
{{CompletableFuture}} when the read was synchronous.
* {{Encoder2x}} is stateless, and yet a new instance is created for each request.
* {{Encoder2x.writeHeader()}} looks up the cache info a second time, even though most
requests already needed that info to execute the operation, and it also does one useless
(I think) {{String.equals()}} check for the counter cache.
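
To illustrate the {{CompletableFuture}} point above, here is a minimal sketch of the two ways to change a future's element type. It is not the Infinispan code: {{ErasedCastSketch}}, {{CacheEntry}} and {{InternalEntry}} are hypothetical stand-ins, and only {{java.util.concurrent.CompletableFuture}} is real API. Chaining a stage such as {{thenApply()}} allocates a new future, while a cast through the raw (erased) type returns the same instance.

```java
import java.util.concurrent.CompletableFuture;

public class ErasedCastSketch {
    // Hypothetical stand-ins for the real entry types (assumption, not Infinispan API).
    public interface CacheEntry {
        Object getValue();
    }

    public static final class InternalEntry implements CacheEntry {
        private final Object value;

        public InternalEntry(Object value) {
            this.value = value;
        }

        @Override
        public Object getValue() {
            return value;
        }
    }

    // Widening via a chained stage: allocates a second CompletableFuture
    // whose only job is to change the static type.
    public static CompletableFuture<CacheEntry> widenWithThenApply(
            CompletableFuture<InternalEntry> future) {
        return future.thenApply(entry -> (CacheEntry) entry);
    }

    // Widening via the raw type: no allocation, the very same future is returned.
    // This is only safe if nobody later completes the returned future with a
    // CacheEntry that is not an InternalEntry.
    @SuppressWarnings({"unchecked", "rawtypes"})
    public static CompletableFuture<CacheEntry> widenWithCast(
            CompletableFuture<InternalEntry> future) {
        return (CompletableFuture) future;
    }
}
```

The cast trades compile-time type safety for one less allocation per read, so it probably only pays off on hot paths like this one.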
There are also a few issues with the benchmark itself:
* The load stage took less than 3 mins according to the logs, but flight recordings show
{{PutKeyValueCommand}}s being executed at least 1 minute after the end of the load phase.
* Either RadarGun or FlightRecorder itself is doing lots of JMX calls that throw
exceptions constantly throughout the benchmark, allocating lots of {{StackTraceElement}}
instances.
* Finally, the cluster is unstable, and some nodes are excluded even though the network
seems to be fine and GC pauses are quite small.