Jason T. Greene wrote:
> Bela Ban wrote:
>> So it *does* throw an exception ?
>>
>> I didn't look in the code, but my TestNG errors on my Mac were
>> caused by none of the members being able to provide initial state,
>> and therefore the get() assertions after the 2nd node joined, failed.
>>
>> I would have expected an exception right ? I'll dig into this
>> tomorrow again..
>>
>
> Moving this to the dev list since Brian mentioned we are skipping
> folks.
>
>
Right, it *should* fail completely if no state transfer is successful
(it throws a CacheException in RPCManagerImpl.start()). If it doesn't
then it's a bug somewhere. It would take quite a while for this to
happen though. It loops through every server 5 times, and waits
increasing amounts of time (2 ^ iteration seconds).
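
For reference, the retry behaviour described above can be sketched roughly as follows. This is not the actual RPCManagerImpl code: the class and method names, the use of IllegalStateException in place of CacheException, and the assumption that the pass counter starts at 1 are all made up for illustration.

```java
import java.util.List;
import java.util.function.LongConsumer;
import java.util.function.Predicate;

// Hypothetical sketch of the loop described above, not the real implementation.
public class StateTransferRetrySketch {
    // "It loops through every server 5 times"
    public static final int MAX_PASSES = 5;

    // "waits increasing amounts of time (2 ^ iteration seconds)"
    // Assumption: the first pass is numbered 1, so the first wait is 2 seconds.
    public static long waitSeconds(int pass) {
        return 1L << pass;
    }

    // tryFetch returns true if the given member provided the state.
    // sleepSeconds is injected so the sketch can be exercised without sleeping.
    public static void fetchInitialState(List<String> members,
                                         Predicate<String> tryFetch,
                                         LongConsumer sleepSeconds) {
        for (int pass = 1; pass <= MAX_PASSES; pass++) {
            for (String member : members) {
                if (tryFetch.test(member)) {
                    return; // state transfer succeeded
                }
            }
            if (pass < MAX_PASSES) {
                sleepSeconds.accept(waitSeconds(pass)); // back off before the next pass
            }
        }
        // Stand-in for the CacheException thrown in RPCManagerImpl.start()
        throw new IllegalStateException("no member could provide initial state");
    }
}
```

In this sketch the time spent per pass is the wait plus one state retrieval timeout for every member tried, so the total grows with the cluster size as well as with the number of passes.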
I think we should also adjust configuration.getStateRetrievalTimeout(),
similarly to 'wait' (maybe double it ?).
The state retrieval timeout is always 15 secs, but if I don't manage to
fetch the state on the first time, I will certainly also fail the 2nd
and subsequent times because the test keeps adding nodes to the cache.
Rather than multiplying 'wait' by 4, we should double the state retrieval
timeout. I don't see why we have a wait in there anyway.
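
To make the suggestion concrete, here is a minimal sketch of doubling the timeout per attempt, starting from the 15 secs mentioned above. The class and method names are hypothetical, and how the value would be fed back into the configuration is left open.

```java
// Hypothetical sketch: double the state retrieval timeout on each attempt.
public class TimeoutDoublingSketch {
    // attempt 0 uses the configured timeout; each later attempt doubles it.
    public static long timeoutMillis(long baseMillis, int attempt) {
        return baseMillis << attempt;
    }

    public static void main(String[] args) {
        long base = 15_000L; // the 15 secs state retrieval timeout mentioned above
        for (int attempt = 0; attempt < 5; attempt++) {
            System.out.println("attempt " + attempt + ": " + timeoutMillis(base, attempt) + " ms");
        }
    }
}
```

With a 15 secs base, five attempts would use 15, 30, 60, 120 and 240 secs.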
I can see where you throw an exception if the initial state transfer
fails, after ca. 1000 secs.
Question: where do you ship the logs ? RPCManagerImpl.start() only
transfers the initial state, right ?
--
Bela Ban
Lead JGroups / Clustering Team
JBoss - a division of Red Hat