[JBoss JIRA] (ISPN-11005) HotRod decoder small performance improvements
by Dan Berindei (Jira)
[ https://issues.redhat.com/browse/ISPN-11005?page=com.atlassian.jira.plugi... ]
Dan Berindei updated ISPN-11005:
--------------------------------
Status: Pull Request Sent (was: Open)
Git Pull Request: https://github.com/infinispan/infinispan/pull/8615/files
> HotRod decoder small performance improvements
> ---------------------------------------------
>
> Key: ISPN-11005
> URL: https://issues.redhat.com/browse/ISPN-11005
> Project: Infinispan
> Issue Type: Enhancement
> Components: Server
> Affects Versions: 10.1.0.Beta1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Minor
> Labels: performance
>
> I noticed some small inefficiencies in the flight recordings from the client-server dist read benchmarks:
> * {{Intrinsics.string()}} allocates a temporary {{byte[]}}; we could use {{ByteBuf.toString(start, length, Charset)}} instead, which reuses a thread-local buffer (see the sketch after this list).
> * For reading the cache name it would be even better to use {{ByteString}} and avoid the UTF-8 decoding.
> * {{MediaType.hashCode()}} allocates an iterator for the params map even though it's empty.
> * {{JBossMarshallingTranscoder.transcode()}} is called twice for each request, and even when there is no transcoding to perform it does a lot of {{String.equals()}} checks.
> * {{CacheImpl.getCacheEntryAsync()}} allocates a new {{CompletableFuture}} via {{thenApply()}} just to change the return type; it could do the same thing by casting to the erased type (see the cast sketch below).
> * {{EncoderCache.getCacheEntryAsync()}} could also avoid allocating a {{CompletableFuture}} when the read was synchronous.
> * {{Encoder2x}} is stateless, and yet a new instance is created for each request.
> * {{Encoder2x.writeHeader()}} looks up the cache info a second time, even though most requests already needed that info to execute the operation, plus one useless (I think) {{String.equals()}} check for the counter cache.
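>
> A minimal sketch of the first point, assuming Netty's {{ByteBuf.toString(int, int, Charset)}} overload; the helper names below are made up for illustration and are not the actual Infinispan code:
> {code:java}
> import io.netty.buffer.ByteBuf;
> import java.nio.charset.StandardCharsets;
>
> final class StringReadSketch {
>    // allocating a temporary byte[] per string (roughly what Intrinsics.string() does today)
>    static String readStringViaTempArray(ByteBuf buf, int length) {
>       byte[] tmp = new byte[length];
>       buf.readBytes(tmp);
>       return new String(tmp, StandardCharsets.UTF_8);
>    }
>
>    // decode straight from the buffer, then advance the reader index manually
>    static String readStringDirect(ByteBuf buf, int length) {
>       String s = buf.toString(buf.readerIndex(), length, StandardCharsets.UTF_8);
>       buf.skipBytes(length);
>       return s;
>    }
> }
> {code}
>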
> There are also a few issues with the benchmark itself:
> * The load stage took less than 3 mins according to the logs, but flight recordings show {{PutKeyValueCommand}}s being executed at least 1 minute after the end of the load phase.
> * Either RadarGun or Flight Recorder itself makes lots of JMX calls that constantly throw exceptions throughout the benchmark, allocating lots of {{StackTraceElement}} instances.
> * Finally, the cluster is unstable, and some nodes are excluded even though the network seems to be fine and GC pauses are quite small.
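>
> And a minimal sketch of the {{CompletableFuture}} cast idea; the entry types here are simplified stand-ins, not the real Infinispan interfaces:
> {code:java}
> import java.util.concurrent.CompletableFuture;
>
> final class FutureCastSketch {
>    interface CacheEntry<K, V> {}
>    interface InternalCacheEntry<K, V> extends CacheEntry<K, V> {}
>
>    // allocates a second CompletableFuture just to widen the result type
>    static <K, V> CompletableFuture<CacheEntry<K, V>> viaThenApply(CompletableFuture<InternalCacheEntry<K, V>> future) {
>       return future.thenApply(entry -> entry);
>    }
>
>    // the erased types are identical, so an unchecked cast avoids the extra allocation
>    @SuppressWarnings("unchecked")
>    static <K, V> CompletableFuture<CacheEntry<K, V>> viaCast(CompletableFuture<InternalCacheEntry<K, V>> future) {
>       return (CompletableFuture<CacheEntry<K, V>>) (CompletableFuture<?>) future;
>    }
> }
> {code}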
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
[JBoss JIRA] (ISPN-11005) HotRod decoder small performance improvements
by Dan Berindei (Jira)
[ https://issues.redhat.com/browse/ISPN-11005?page=com.atlassian.jira.plugi... ]
Dan Berindei updated ISPN-11005:
--------------------------------
Status: Open (was: New)
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
[JBoss JIRA] (ISPN-12208) Operator Docs: Disabling autoscale
by Donald Naro (Jira)
Donald Naro created ISPN-12208:
----------------------------------
Summary: Operator Docs: Disabling autoscale
Key: ISPN-12208
URL: https://issues.redhat.com/browse/ISPN-12208
Project: Infinispan
Issue Type: Enhancement
Components: Documentation
Reporter: Donald Naro
Assignee: Donald Naro
If the OpenShift service CA is available, the Operator enables encryption by default.
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
[JBoss JIRA] (ISPN-11176) XSite Max Idle
by Dan Berindei (Jira)
[ https://issues.redhat.com/browse/ISPN-11176?page=com.atlassian.jira.plugi... ]
Dan Berindei edited comment on ISPN-11176 at 8/10/20 7:05 AM:
--------------------------------------------------------------
{quote}The problem if it is async is you have a window of consistency loss if a node is taken down. We can mitigate this issue by performing a touch all when it occurs. I was told this is not acceptable, and I agree, for clustered max idle where it would occur for a single node. However, I believe that it should be okay when it would occur only when an entire site is lost.
{quote}
Does the consistency loss mean a write could be undone, or is it something else that's maybe more palatable to users?
I agree that sending a touch command for all non-expired entries would be very expensive, especially for x-site, where the user would have estimated Infinispan's bandwidth requirements before deployment, and having to send and process lots of max-idle commands would increase the latency of all the other x-site commands going through the same site masters.
One thing that's not clear to me: when you say "an entire site is lost", are you talking about a site being taken offline? A site can easily disappear from the bridge cluster's view for a while just because its site master crashed.
OTOH the take-offline policies are maybe too lenient (although I haven't checked whether Pedro made any recent changes with IRAC), so a sync x-site RPC is likely to time out before the site is taken offline.
Edit: Forgot to mention one more way a site can become unavailable: if it splits and the cache is configured with {{when-split="DENY_READ_WRITES"}}. Unless IRAC doesn't allow that configuration anyway?
> XSite Max Idle
> --------------
>
> Key: ISPN-11176
> URL: https://issues.redhat.com/browse/ISPN-11176
> Project: Infinispan
> Issue Type: Enhancement
> Components: Cross-Site Replication, Expiration
> Reporter: Will Burns
> Assignee: Will Burns
> Priority: Major
> Fix For: 12.0.0.Final
>
>
> Max idle expiration currently doesn't work with xsite. That is, if an entry is written and replicated to both sites but only one site ever reads the value, then a read of that value on the other site will find it already expired (assuming the max idle time has elapsed).
> There are a few ways we can do this.
> 1. Keep access times local to every site. When a site finds an entry is expired, it asks the other site(s) whether they have a more recent access. If a site is known to have gone down, we should touch all entries, since they may not have updated access times. Requires very little additional xsite communication.
> 2. Batch touch commands and only send them every so often. Has a window of loss, but it should be small. Requires more cross-site traffic. Wouldn't work for really low max idle times, as an entry could expire before the touch command is replicated (a rough sketch of the batching idea follows below).
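>
> A rough sketch of the batching in option 2; all names are illustrative, none of this is existing Infinispan API:
> {code:java}
> import java.util.Map;
> import java.util.concurrent.ConcurrentHashMap;
> import java.util.concurrent.Executors;
> import java.util.concurrent.ScheduledExecutorService;
> import java.util.concurrent.TimeUnit;
> import java.util.function.Consumer;
>
> final class TouchBatcher<K> {
>    private final Map<K, Long> pendingTouches = new ConcurrentHashMap<>();
>    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
>
>    TouchBatcher(long flushIntervalMillis, Consumer<Map<K, Long>> sendToRemoteSites) {
>       scheduler.scheduleAtFixedRate(() -> {
>          if (pendingTouches.isEmpty())
>             return;
>          // copy the pending accesses and ship them as one asynchronous x-site command
>          Map<K, Long> batch = new ConcurrentHashMap<>(pendingTouches);
>          batch.forEach((key, ts) -> pendingTouches.remove(key, ts));
>          sendToRemoteSites.accept(batch);
>       }, flushIntervalMillis, flushIntervalMillis, TimeUnit.MILLISECONDS);
>    }
>
>    // called on every local read that keeps a max-idle entry alive;
>    // anything recorded after the last flush is the "window of loss" mentioned above
>    void recordAccess(K key) {
>       pendingTouches.put(key, System.currentTimeMillis());
>    }
> }
> {code}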
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
[JBoss JIRA] (ISPN-11176) XSite Max Idle
by Dan Berindei (Jira)
[ https://issues.redhat.com/browse/ISPN-11176?page=com.atlassian.jira.plugi... ]
Dan Berindei commented on ISPN-11176:
-------------------------------------
I have an alternative proposal that I've been mulling over the last few days.
*TLDR;* Pre-expire entries at time {{last access + max-idle timeout}}, actually expire them at time {{last access + max-idle timeout + max-idle delay}}, or as soon as we know all other sites have also pre-expired the entry. Reads of pre-expired entries do not keep the entries alive.
How about, instead of sending touch commands based on the time the entry was last read, we send them based on the time the entry is supposed to expire?
The purge-expired-entries task would scan not just for entries that are already expired, but also for entries that are expired based on the last positive-touch command sent to the remote sites. It would then send batches of *-touch commands to the remote sites for all the entries in the second set (positive-touch for the ones that have newer accesses locally and negative-touch for the ones that don't).
When a read wants to remove an expired entry, it will first check whether the other sites sent a *-touch command for that entry. If there is a negative-touch from all backups, or if the entry should have expired more than {{max-idle delay}} (e.g. 1 min) ago, then the entry is considered truly expired and is removed in the local cluster.
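A hypothetical sketch of that read-path check; the metadata layout and names are invented here just to make the rule concrete:
{code:java}
import java.util.Map;
import java.util.Set;

final class PreExpirationCheck {
   // per-entry, per-remote-site touch metadata we would have to keep
   record SiteTouch(boolean positive, long sentAtMillis) {}

   static boolean isTrulyExpired(long lastAccessMillis, long maxIdleMillis, long maxIdleDelayMillis,
                                 Set<String> backupSites, Map<String, SiteTouch> touchesFromSites,
                                 long nowMillis) {
      long preExpireAt = lastAccessMillis + maxIdleMillis;
      if (nowMillis < preExpireAt)
         return false;                                 // not even pre-expired yet
      if (nowMillis >= preExpireAt + maxIdleDelayMillis)
         return true;                                  // grace period elapsed, expire regardless
      // inside the grace period: expired only if every backup site has pre-expired it too
      return backupSites.stream().allMatch(site -> {
         SiteTouch touch = touchesFromSites.get(site);
         return touch != null && !touch.positive();
      });
   }
}
{code}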
Since there is no synchronous communication with the remote sites, reads that happen after the entry should have expired (or maybe just after the local site sent the negative-touch command) must not extend the lifespan of the entry.
If site 1's positive-touch command takes more than {{max-idle delay}} to reach site 2 and site 2 has already removed the entry, site 2 will send back (through IRAC) a remove-expired command to force site 1 to remove the entry.
In order to reduce the likelihood of this happening, we can send the positive-touch {{max-idle delay}} before the time the entry would expire based on the last sent positive-touch command. This requires the max-idle timeout to be at least twice as big as {{max-idle delay}}, otherwise it wouldn't be very efficient, but that seems like a reasonable limitation.
The advantage over option 2 is that for entries that are read often, we only send a remote touch command once per {{max-idle timeout - max-idle delay}}.
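To make the timing concrete, here is a small sketch of when the purge task would decide to send the next positive-touch for a frequently read entry (again, the names are illustrative only):
{code:java}
final class TouchScheduling {
   // The remote sites believe the entry expires at (lastSentTouch + maxIdle).
   // Sending the next positive-touch maxIdleDelay before that moment leaves the command
   // that much slack, and a hot entry generates at most one cross-site touch
   // every (maxIdle - maxIdleDelay).
   static boolean shouldSendPositiveTouch(long lastSentTouchMillis, long lastLocalAccessMillis,
                                          long maxIdleMillis, long maxIdleDelayMillis, long nowMillis) {
      boolean accessedSinceLastTouch = lastLocalAccessMillis > lastSentTouchMillis;
      long remoteExpiryEstimate = lastSentTouchMillis + maxIdleMillis;
      return accessedSinceLastTouch && nowMillis >= remoteExpiryEstimate - maxIdleDelayMillis;
   }
}
{code}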
The advantage over option 1 is that all the x-site RPCs are asynchronous, so all read operations are fast.
The main disadvantage over option 1 is that we can have reads that see a value but don't keep it alive. But IMO it matches how IRAC would handle the application updating a value in site 1 (in order to "touch" it) and removing the same value in site 2 (because the application considers it expired).
Another disadvantage over option 1 is that we would need to send *-touch commands from all owners; if we rely on the backup owners sending positive-touch commands only when they become primary owner, that might be too late.
The disadvantage over option 2 is that, just like option 1, we can't have a purely active-backup relationship between sites any more. Here, each site has to know which other sites are active and keep per-entry metadata about which sites sent *-touch commands, so the take-offline behaviour may need tweaking.
Just like option 2, when a remote site is inaccessible but not yet offline, *-touch commands may be lost and max-idle entries may expire prematurely. Option 1 would make reads on entries that should expire time out instead.
When a remote site is brought back online, x-site state transfer would need to include the timestamp of the last access and the timestamp of the last sent *-touch command, so that the lifetime of the transferred entries stays in sync. This also requires the clock skew between all machines in all sites to be smaller than {{max-idle delay}}, but I think that's a reasonable assumption, and we could warn when it is not satisfied. I think this is a middle ground between option 1 (more tolerant, at most delaying the expiration by the clock skew) and option 2 (less tolerant, possibly expiring entries on just one site).
--
This message was sent by Atlassian Jira
(v7.13.8#713008)