Manik,
Many thanks for your reply - just to let you know we found out what the problem was.
We have a listener attached to each cache which helps us manage indexes on the cache (we
index some entities to avoid having to traverse all the entities when retrieving
a selection of items - through JBoss Cache we have pretty much killed off any need to
access the database, with the exception of the initial load, writes and updates).
Within the listener there was a HashMap instance variable - it was this variable which was
causing the JVM to lock up. When we did a thread dump we noticed that the replication thread
was stuck in the put method of this HashMap at the same time as another normal
application thread was - hence the replication queue was backing up and causing the
replication exceptions to be thrown to the other nodes on the cluster. Searching the net I
found a number of reports of an unsynchronized HashMap spinning out of control if two threads hit
the resize (rehash) of the HashMap at the same time. We have since added a read/write lock around
this variable and we have yet to experience this issue again. Unfortunately it was fairly hard to
find, as you would have expected to see a deadlock - and it was only when we took a number of
thread dumps over a minute that we picked this up.
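
For anyone hitting the same thing, here is a minimal sketch of the kind of fix described above: a plain HashMap guarded by a read/write lock. This is not our actual listener code. The class name GuardedIndex, its methods and the String key/value types are made up for illustration, and it assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical index of the kind a cache listener might maintain:
// maps an index key to the ids of the entities filed under it.
class GuardedIndex {
    // Plain HashMap: not safe for concurrent writers on its own.
    private final Map<String, List<String>> index = new HashMap<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Writers (e.g. the replication thread and application threads)
    // take the exclusive write lock, so only one thread at a time can
    // trigger a resize of the underlying map.
    public void put(String key, String entityId) {
        lock.writeLock().lock();
        try {
            index.computeIfAbsent(key, k -> new ArrayList<>()).add(entityId);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Readers share the read lock and get a defensive copy, so callers
    // never iterate over a list that a writer is modifying.
    public List<String> get(String key) {
        lock.readLock().lock();
        try {
            List<String> ids = index.get(key);
            return ids == null ? new ArrayList<>() : new ArrayList<>(ids);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

The unlock calls sit in finally blocks so that an exception inside the map operation cannot leave the lock held. Depending on the access pattern, java.util.concurrent.ConcurrentHashMap may be a simpler alternative, but that is not what we used.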
We have briefly tried version three and things seemed to go really well until we ramped up the
load on the clustered cache (the local caches were absolutely flying), at which point it started
to throw lock errors (which was actually the last issue I was expecting given the new
locking system). Before pointing the finger at the new version I want to do more analysis
at our end, as it could be some silly code at our end or the database running really
slowly.
View the original post :
http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4185792#...