Hi, I'm getting regular but intermittent invalidation failures from my distributed JBoss Cache / Hibernate setup, and I'm hoping someone can give me a few pointers on how to track down the problem.
I can easily recreate the problem in our live environment by updating a cached entity on one node, then checking the same entity on another node. Approximately 50% of the time, the cache entry hasn't been invalidated on the second node, so the older version of the entity is shown.
I'm using a TCP-based JGroups stack rather than UDP multicast (this is on EC2, so I can't use UDP), and there are currently 8 nodes in the cache cluster.
Debug logs aren't showing any messages that would indicate a problem, and the logs are identical whether the invalidation succeeds or fails. Can anyone suggest suitable log settings that might show something more useful? (Switching everything to trace level produced far too much output to make sense of.)
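To make the logging question concrete: what I'm after is something narrower than everything at trace, along the lines of the log4j.xml fragment below. This is only a sketch. I'm assuming log4j 1.x XML configuration, and the logger names are my guess at the relevant JBoss Cache and JGroups categories, not a confirmed list of what actually logs the failure.

```xml
<!-- Sketch of a log4j.xml fragment (log4j 1.x assumed). Logger names are assumptions. -->

<!-- Invalidation decisions on the node performing the update -->
<logger name="org.jboss.cache.interceptors.InvalidationInterceptor">
  <level value="TRACE"/>
</logger>

<!-- Outgoing RPC calls from JBoss Cache to the rest of the cluster -->
<logger name="org.jboss.cache.RPCManagerImpl">
  <level value="TRACE"/>
</logger>

<!-- Incoming commands being dispatched on the receiving nodes -->
<logger name="org.jboss.cache.marshall.CommandAwareRpcDispatcher">
  <level value="TRACE"/>
</logger>

<!-- JGroups request/response handling behind the synchronous calls -->
<logger name="org.jgroups.blocks">
  <level value="TRACE"/>
</logger>

<!-- Everything else in JBoss Cache stays at debug to keep the volume down -->
<logger name="org.jboss.cache">
  <level value="DEBUG"/>
</logger>
```

The idea would be to enable this on both the updating node and one receiving node, then compare a successful invalidation against a failed one.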
Thanks in advance for any help or suggestions!
JBoss Cache v3.1.0
JGroups v2.6.7
Hibernate v3.3.2
The debug logs I get on the node that is updating the entity are as follows:
23 Jan 2013 17:04:38,536 [http-bio-8443-exec-8] DEBUG InvalidationInterceptor:244 - Is a CRUD method
23 Jan 2013 17:04:38,550 [http-bio-8443-exec-8] DEBUG InvalidationInterceptor:244 - Is a CRUD method
23 Jan 2013 17:04:38,550 [http-bio-8443-exec-8] DEBUG InvalidationInterceptor:381 - Cache [XX.XX.XX.XX:6800] replicating InvalidateCommand{fqn=/mbCache/configData/ENTITY/com.package.EntityName#4881}
My cache settings are:
<!-- A config appropriate for entity/collection caching that
uses pessimistic locking -->
<cache-config name="pessimistic-entity">
<!-- Node locking scheme -->
<attribute name="NodeLockingScheme">PESSIMISTIC</attribute>
<!--
READ_COMMITTED is as strong as necessary for most
2nd Level Cache use cases.
-->
<attribute name="IsolationLevel">READ_COMMITTED</attribute>
<!-- Mode of communication with peer caches.
INVALIDATION_SYNC is highly recommended as the mode for use
with entity and collection caches.
-->
<attribute name="CacheMode">INVALIDATION_SYNC</attribute>
<!-- Name of cluster. Needs to be the same for all members, in order
to find each other -->
<attribute name="ClusterName">pessimistic-entity</attribute>
<!-- Use a TCP based stack, since UDP (multicast) is not
available on EC2. -->
<attribute name="MultiplexerStack">tcp</attribute>
<!-- Whether or not to fetch state on joining a cluster. -->
<attribute name="FetchInMemoryState">false</attribute>
<!--
The max amount of time (in milliseconds) we wait until the
state (ie. the contents of the cache) are retrieved from
existing members at startup. Ignored if FetchInMemoryState=false.
-->
<attribute name="StateRetrievalTimeout">20000</attribute>
<!--
Number of milliseconds to wait until all responses for a
synchronous call have been received.
-->
<attribute name="SyncReplTimeout">20000</attribute>
<!-- Max number of milliseconds to wait for a lock acquisition -->
<attribute name="LockAcquisitionTimeout">15000</attribute>
<!--
Indicate whether to use marshalling or not. Set this to true if you
are running under a scoped class loader, e.g., inside an application
server.
-->
<attribute name="UseRegionBasedMarshalling">true</attribute>
<!-- Must match the value of "useRegionBasedMarshalling" -->
<attribute name="InactiveOnStartup">true</attribute>