Hi.
I am currently using JBoss Cache 2.1.0 GA and JGroups 2.6.1 with buddy replication. Buddy
replication is configured to use one buddy only.
The setup is four nodes with ip addresses like:
172.16.0.5
| 172.16.0.6
| 172.16.0.7
| 172.16.0.8
|
| They are all started in the stated order, so .5 is the coordinator. The first node
up (.5) inserts data into the cache, and the data then gravitates to the other nodes when
needed. This happens mostly at the start, when load is first applied to the system. Data
affinity is handled by a layer above the cache.
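
To make the expected layout concrete, here is a minimal sketch of which backups each node should hold with one buddy per node. It assumes buddies are picked as the next member in the view (the behaviour I would expect from a next-member locator such as NextMemberBuddyLocator; the actual locator in use is not stated in this post). `BuddyRing` and its methods are hypothetical names for illustration, not JBoss Cache API.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative model of a "next member in the view" buddy ring. Not JBoss Cache code. */
final class BuddyRing {
    private BuddyRing() {}

    /** Returns the single buddy that the given node replicates its data to. */
    static String buddyOf(List<String> view, String node) {
        int i = view.indexOf(node);
        if (i < 0) {
            throw new IllegalArgumentException("not in view: " + node);
        }
        // Assumption: the buddy is the next member in view order, wrapping around.
        return view.get((i + 1) % view.size());
    }

    /** Returns the members whose data the given node should hold under /_BUDDY_BACKUP_. */
    static List<String> backupsHeldBy(List<String> view, String node) {
        List<String> owners = new ArrayList<>();
        for (String member : view) {
            // A node never counts as a backup holder for its own data.
            if (!member.equals(node) && buddyOf(view, member).equals(node)) {
                owners.add(member);
            }
        }
        return owners;
    }
}
```

Under that assumption, with the view ordered .5, .6, .7, .8, node .5 should hold a backup for .8 only, and node .8 should hold a backup for .7 only.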
|
| Using this scenario with 2.0.0 GA presented no problems, except when adding new nodes
under load, so we are now investigating 2.1.0.
|
| The issue I'm facing is that the coordinator seems to get two buddy backups, one
being itself.
|
| This is the contents on 172.16.0.5 (coordinator):
|
|
| | null {}
| | /91 {91=com.cubeia.firebase.game.table.InternalMetaData@1134d9b
com.cubeia.testgame.server.game.TestGame@1cdc8ce}
| | /63 {63=com.cubeia.firebase.game.table.InternalMetaData@cc9ff5
com.cubeia.testgame.server.game.TestGame@951aeb}
| | /92 {92=com.cubeia.firebase.game.table.InternalMetaData@e60a39
com.cubeia.testgame.server.game.TestGame@185c84e}
| | ...
| |
| | /_BUDDY_BACKUP_ {}
| |   /172.16.0.8_8786 {}
| |     /15 {15=com.cubeia.firebase.game.table.InternalMetaData@d9e1c0 null}
| |     /16 {16=com.cubeia.firebase.game.table.InternalMetaData@742062 null}
| |   /172.16.0.5_8786 {}
| |     /31 {}
|
| Notice that there are two members listed under _BUDDY_BACKUP_: one is .8 and the other
is .5, which is the node itself.
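
A small sketch of the check I am doing by eye on the dump above: compare the region names found under /_BUDDY_BACKUP_ with the data owners the node is expected to back up, and flag a region named after the node itself. The "host:port" to "host_port" mapping is inferred from the addresses and Fqns in this post, and `BuddyBackupCheck` is a hypothetical helper, not part of JBoss Cache.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative consistency check for the children of /_BUDDY_BACKUP_. Not JBoss Cache code. */
final class BuddyBackupCheck {
    private BuddyBackupCheck() {}

    /** Maps "host:port" to the backup region name seen in the dumps ("host_port"). */
    static String regionName(String address) {
        return address.replace(':', '_');
    }

    /** Returns the observed region names that do not belong to any expected data owner. */
    static List<String> unexpectedRegions(List<String> expectedOwners, List<String> observedRegions) {
        List<String> expected = new ArrayList<>();
        for (String owner : expectedOwners) {
            expected.add(regionName(owner));
        }
        List<String> unexpected = new ArrayList<>();
        for (String region : observedRegions) {
            if (!expected.contains(region)) {
                unexpected.add(region);
            }
        }
        return unexpected;
    }

    /** Returns true if the node holds a backup region named after its own address. */
    static boolean backsUpItself(String localAddress, List<String> observedRegions) {
        return observedRegions.contains(regionName(localAddress));
    }
}
```

For the coordinator, passing 172.16.0.8:8786 as the only expected owner and the two observed regions from the dump leaves 172.16.0.5_8786 as the unexpected one, and `backsUpItself` returns true for 172.16.0.5:8786.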
|
| Now, on 172.16.0.8 we get a lot of lock timeouts like the one below:
|
| Caused by: org.jboss.cache.lock.TimeoutException: read lock for
/_BUDDY_BACKUP_/172.16.0.5_8786 could not be acquired by
GlobalTransaction:<172.16.0.6:8786>:41 after 5000 ms. Locks: Read lock owners: []
| | Write lock owner: GlobalTransaction:<172.16.0.6:8786>:1
| | , lock info: write owner=GlobalTransaction:<172.16.0.6:8786>:1
(activeReaders=0, activeWriter=Thread[Incoming,TableSpace,172.16.0.8:8786,5,Thread Pools],
waitingReaders=25, waitingWriters=0, waitingUpgrader=0)
| |
|
| 172.16.0.8 also shows two members under the buddy backup:
|
| null {}
| | /28 {}
| | /29 {}
| | /92 {}
| | ...
| | /_BUDDY_BACKUP_ {}
| |   /172.16.0.7_8786 {}
| |     /91 {91=com.cubeia.firebase.game.table.InternalMetaData@1fbeed6 null}
| |     /41 {41=com.cubeia.firebase.game.table.InternalMetaData@fd3922 null}
| |     /115 {115=com.cubeia.firebase.game.table.InternalMetaData@b215d9 null}
| |     ...
| |   /172.16.0.5_8786 {}
| |     /31 {}
| |
|
| It seems that the node .8 backs up is in fact .7, but .8 still holds a buddy backup
region for the .5 member as well. In fact, all the lock timeouts on .8 are related to the
.5 buddy backup Fqn:
|
| failure acquiring lock: fqn=/_BUDDY_BACKUP_/172.16.0.5_8786
View the original post :
http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4115924#...