[infinispan-issues] [JBoss JIRA] (ISPN-11176) XSite Max Idle

Dan Berindei (Jira) issues at jboss.org
Mon Aug 10 06:40:00 EDT 2020


    [ https://issues.redhat.com/browse/ISPN-11176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375439#comment-14375439 ] 

Dan Berindei commented on ISPN-11176:
-------------------------------------

{quote}
The problem if it is async is you have a window of consistency loss if a node is taken down. We can mitigate this issue by performing a touch all when it occurs. I was told this is not acceptable, and I agree, for clustered max idle where it would occur for a single node. However, I believe that it should be okay when it would occur only when an entire site is lost.
{quote}

Does the consistency loss mean a write could be undone, or is it something else that's maybe more palatable to users?

I agree that sending a touch command for all non-expired entries would be very expensive, especially for x-site. Users typically estimate before deployment how much bandwidth Infinispan will require, and having to send and process lots of max-idle commands would increase the latency of all the other x-site commands going through the same site masters.

One thing that's not clear to me: when you say "an entire site is lost", are you talking about a site being taken offline? A site can easily disappear from the bridge cluster's view for a while just because its site master crashed.

On the other hand, the take-offline policies may be too lenient (although I haven't checked whether Pedro made any recent changes with IRAC), so a sync x-site RPC is likely to time out before the site is taken offline.



> XSite Max Idle
> --------------
>
>                 Key: ISPN-11176
>                 URL: https://issues.redhat.com/browse/ISPN-11176
>             Project: Infinispan
>          Issue Type: Enhancement
>          Components: Cross-Site Replication, Expiration
>            Reporter: Will Burns
>            Assignee: Will Burns
>            Priority: Major
>             Fix For: 12.0.0.Final
>
>
> Max idle expiration currently doesn't work with xsite. Consider an entry that was written and replicated to both sites, where one site keeps reading the value but the other never does. If the value then needs to be read from the site that never accessed it, the entry will already be expired there (assuming the max idle time has elapsed).
> There are a few ways we can fix this.
> 1. Keep access times local to every site. When a site finds that an entry is expired, it asks the other site(s) whether they have a more recent access. If a site is known to have gone down, we should touch all entries, since they may not have up-to-date access times. Requires very little additional xsite communication.
> 2. Batch touch commands and send them only every so often. Has a window of loss, but it should be small. Requires more xsite traffic. Wouldn't work for really low max idle times, as an entry could expire before the touch command is replicated.
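
Option 1 above can be sketched as a small in-memory model. This is only an illustration under stated assumptions, not Infinispan code: the `Site` class, its fields, and the `read`, `write` and `touchAll` methods are hypothetical names, timestamps are passed in explicitly instead of coming from a clock, and the "remote query" is a direct method call rather than an x-site RPC.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of one site; none of these names are Infinispan API. */
class Site {
    final String name;
    final long maxIdleMillis;
    /** Last access time per key, tracked locally and never replicated. */
    final Map<String, Long> lastAccess = new HashMap<>();
    final List<Site> remotes = new ArrayList<>();
    /** Counts simulated x-site lookups, to show how rarely they happen. */
    int remoteQueries = 0;

    Site(String name, long maxIdleMillis) {
        this.name = name;
        this.maxIdleMillis = maxIdleMillis;
    }

    /** A replicated write lands on every site; here each site records it itself. */
    void write(String key, long now) {
        lastAccess.put(key, now);
    }

    /**
     * Returns true if the entry is still live. The common case only updates
     * the local access time. Other sites are asked only when the entry looks
     * expired locally.
     */
    boolean read(String key, long now) {
        Long last = lastAccess.get(key);
        if (last == null) {
            return false;
        }
        if (now - last >= maxIdleMillis) {
            long newest = last;
            for (Site remote : remotes) {
                remoteQueries++;
                Long remoteLast = remote.lastAccess.get(key);
                if (remoteLast != null) {
                    newest = Math.max(newest, remoteLast);
                }
            }
            if (now - newest >= maxIdleMillis) {
                lastAccess.remove(key); // idle everywhere: expire it
                return false;
            }
        }
        lastAccess.put(key, now); // this read counts as an access
        return true;
    }

    /** Fallback when another site is known to be down: refresh every entry. */
    void touchAll(long now) {
        lastAccess.replaceAll((key, old) -> now);
    }
}

class MaxIdleSketch {
    public static void main(String[] args) {
        Site lon = new Site("LON", 1000);
        Site nyc = new Site("NYC", 1000);
        lon.remotes.add(nyc);
        nyc.remotes.add(lon);

        lon.write("k", 0);
        nyc.write("k", 0);

        System.out.println(lon.read("k", 900));  // local touch only
        System.out.println(nyc.read("k", 1500)); // idle locally, so LON is asked
        System.out.println(lon.read("k", 5000)); // idle on both sites
    }
}
```

In this model x-site traffic happens only on the expiration path, which matches the "very little additional xsite communication" claim. The cost is the `touchAll` fallback, which is the expensive step discussed in the comment above.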



--
This message was sent by Atlassian Jira
(v7.13.8#713008)

