[
https://issues.redhat.com/browse/ISPN-11176?page=com.atlassian.jira.plugi...
]
Dan Berindei commented on ISPN-11176:
-------------------------------------
{quote}At the last F2F, while discussing locking w/x-site, I brought up the possibility of
partitioning entries with a primary site (in the similar way that we partition them within
a single cluster). With this in place, max-idle processing would be initiated only on the
primary site for a given entry and cascaded to backup sites accordingly. It would,
however, require that reads from the backup sites "touch" the primary site.
Ideally, reads from a backup site would be rare - and only happen on failure of the
primary site. For normal operation, this strategy has the advantage of minimal x-site
traffic and simplifies max-idle processing by using a single authority for determining
when to expire a given entry thereby preventing premature expirations.{quote}
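To make the primary-site idea in the quote concrete, here is a minimal sketch. It is not Infinispan code: the class and method names are hypothetical, it assumes every site sees the same ordered list of site names, and it uses plain modulo hashing rather than a real consistent hash.

```java
import java.util.List;

// Hypothetical helper, not an Infinispan API: maps each key to one "primary" site.
final class PrimarySiteSketch {

    // Assumption: 'sites' is identical (same names, same order) on every site.
    // Plain modulo hashing stands in for a real consistent hash here.
    static String primarySiteFor(Object key, List<String> sites) {
        if (sites.isEmpty()) {
            throw new IllegalArgumentException("no sites");
        }
        int bucket = Math.floorMod(key.hashCode(), sites.size());
        return sites.get(bucket);
    }

    // A read served by a backup site would have to "touch" the primary site,
    // because only the primary decides when the entry has been idle too long.
    static boolean readMustTouchPrimary(Object key, String localSite, List<String> sites) {
        return !primarySiteFor(key, sites).equals(localSite);
    }
}
```

With modulo hashing, adding or removing a site remaps most keys, so a real implementation would probably want something closer to the consistent hash used within a single cluster.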
Thanks for this [~pferraro], I meant to mention partitioning entries w/ a
meta-consistent-hash, but I forgot about it.
Unfortunately, I see 2 problems with this approach:
1. We don't have a proper meta-cluster-view. Each site only knows which other sites
still have a site master in the bridge cluster view, and which sites are
"online" as backups of caches from the local cluster. None of them are good
enough as a source of truth to determine the primary site of a key IMO, so we'd need
something new.
2. We don't know 100% what the users want from x-site max-idle, but one of the
requirements seems to be that it has to work with a "dumb" load balancer,
directing requests to different sites randomly.
XSite Max Idle
--------------
Key: ISPN-11176
URL: https://issues.redhat.com/browse/ISPN-11176
Project: Infinispan
Issue Type: Enhancement
Components: Cross-Site Replication, Expiration
Reporter: Will Burns
Assignee: Will Burns
Priority: Major
Fix For: 12.0.0.Final
Max idle expiration currently doesn't work with xsite. For example, an entry is written
and replicated to both sites, and afterwards only one site reads the value. If the value
later has to be read from the other site, it will already be expired there (assuming
the max idle time has elapsed).
There are a few ways we can do this.
1. Keep access times local to every site. When a site finds that an entry is expired, it
asks the other site(s) whether they have a more recent access. If a site is known to have
gone down we should touch all entries, since they may not have updated access times.
Requires very little additional xsite communication.
2. Batch touch commands and send them only every so often. This leaves a window in which
touches can be lost, but it should be small. Requires more xsite traffic than option 1.
Wouldn't work for really low max idle times, as an entry could expire before the touch
command is replicated.
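
Option 1 above could look roughly like the following. This is a sketch only, not Infinispan code: `SiteSketch` and its methods are hypothetical, remote sites are modelled as in-process objects instead of xsite RPCs, and time is passed in explicitly to keep the example deterministic.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of one site that keeps its access times local.
final class SiteSketch {
    private final Map<String, Long> lastAccessMillis = new ConcurrentHashMap<>();
    private final long maxIdleMillis;

    SiteSketch(long maxIdleMillis) {
        this.maxIdleMillis = maxIdleMillis;
    }

    // Record a local read or write; never move the access time backwards.
    void touch(String key, long now) {
        lastAccessMillis.merge(key, now, Math::max);
    }

    // Last access known to this site, or -1 if the key is unknown here.
    long lastAccess(String key) {
        return lastAccessMillis.getOrDefault(key, -1L);
    }

    // Only when the entry looks expired locally do we ask the other sites.
    boolean isExpired(String key, long now, List<SiteSketch> remoteSites) {
        long latest = lastAccess(key);
        if (latest < 0) {
            return true; // unknown key: treat as absent
        }
        if (now - latest < maxIdleMillis) {
            return false; // fresh locally, no xsite communication needed
        }
        for (SiteSketch remote : remoteSites) {
            latest = Math.max(latest, remote.lastAccess(key));
        }
        if (now - latest < maxIdleMillis) {
            touch(key, latest); // another site read it more recently
            return false;
        }
        lastAccessMillis.remove(key); // idle everywhere: expire
        return true;
    }
}
```

The sketch does not model the "site has gone down" case, where all entries would be touched because the lost site's access times can no longer be queried.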