[infinispan-dev] X-Site: Site Unreachable vs. Site Down

Erik Salter an1310 at hotmail.com
Sun Sep 16 21:21:08 EDT 2012


Hi all,

For the X-Site pull request, Bela, Mircea and I had a design review.  One of
the items that came up was the ability to mark a site as being “down” –
where a site has been unreachable for a period of time.  This mostly applies
to the synchronous replication case where the backup failure policy has been
configured as “FAIL”, i.e.:

<namedCache name="importantCache">
   <sites>
      <backups>
         <backup site="NYC" strategy="SYNC" backupFailurePolicy="FAIL" timeout="160000"/>
      </backups>
   </sites>
</namedCache>

With the current implementation, all requests would fail until an SA
realizes the site is offline and marks it as such through a JMX operation
(provided in this release?).  Since I cannot afford a 100% failure rate until
somebody gets called, I think we need to take it a step further and add a
configuration element that marks a site as offline automatically once it has
been unreachable for a period of time.  (Note, though, that a site can only
be brought back online manually.)
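
To make the proposal concrete, here is a rough sketch of the behaviour I have
in mind.  Nothing here is existing Infinispan API: the class name
SiteStatusTracker, its methods, and the offlineAfterMillis parameter are all
made up for illustration, and the caller is assumed to pass in the current
time.

```java
/**
 * Hypothetical sketch (not Infinispan API): tracks backup results for one
 * remote site and takes it offline once it has been failing continuously
 * for a configured window.
 */
public class SiteStatusTracker {
    private final long offlineAfterMillis; // assumed config value
    private boolean failing;               // true while a failure streak is in progress
    private long firstFailureMillis;       // start of the current failure streak
    private boolean offline;

    public SiteStatusTracker(long offlineAfterMillis) {
        this.offlineAfterMillis = offlineAfterMillis;
    }

    /** Record a failed backup attempt observed at nowMillis. */
    public synchronized void backupFailed(long nowMillis) {
        if (offline) {
            return;
        }
        if (!failing) {
            failing = true;
            firstFailureMillis = nowMillis;
        }
        if (nowMillis - firstFailureMillis >= offlineAfterMillis) {
            offline = true;
        }
    }

    /** A success ends the failure streak but never revives an offline site. */
    public synchronized void backupSucceeded() {
        if (!offline) {
            failing = false;
        }
    }

    public synchronized boolean isOffline() {
        return offline;
    }

    /** Manual step, e.g. what an SA would trigger through a management operation. */
    public synchronized void bringOnline() {
        offline = false;
        failing = false;
    }
}
```

The idea would be that the backup sender checks isOffline() before
replicating and skips the site when it returns true, so local requests stop
failing once the window has elapsed.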

Mircea talked about adding an element in the configuration for a custom
callback implementation.  However, I think this is useful enough, not only
for me but for other ISPN/JDG users as well, to ship as a built-in option.
(Not to mention we can't add configuration for callbacks.)

Comments?

Thanks,

Erik



