[jbosscache-dev] Re: NBST failing on my Mac

Tue Feb 17 03:49:09 EST 2009

Jason T. Greene wrote:
> Jason T. Greene wrote:
>> Bela Ban wrote:
>>> So it *does* throw an exception ?
>>>
>>> I didn't look in the code, but my TestNG errors on my Mac were 
>>> caused by none of the members being able to provide initial state, 
>>> and therefore the get() assertions after the 2nd node joined, failed.
>>>
>>> I would have expected an exception right ? I'll dig into this 
>>> tomorrow again..
>>>
>>
>> Moving this to the dev list since Brian mentioned we are skipping 
>> folks.
>>
>>
>
> Right, it *should* fail completely if no state transfer is successful 
> (it throws a CacheException in RPCManagerImpl.start()). If it doesn't 
> then it's a bug somewhere. It would take quite a while for this to 
> happen though. It loops through every server 5 times, and waits 
> increasing amounts of time (2 ^ iteration seconds).

I think we should also adjust configuration.getStateRetrievalTimeout(), 
similarly to 'wait' (maybe double it ?).

The state retrieval timeout is always 15 secs, but if I don't manage to 
fetch the state on the first attempt, I will almost certainly fail on the 
2nd and subsequent attempts too, because the test keeps adding nodes to 
the cache.

Rather than multiplying 'wait' by 4, we should double the state retrieval 
timeout. I don't see why we have a wait in there anyway.
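
A minimal sketch of the doubling idea, assuming a retry loop that goes 
through every member once per round and doubles the timeout between rounds 
instead of sleeping. All names here (StateRetrievalRetrySketch, 
StateFetcher, fetchState, timeoutForAttempt) are hypothetical and not 
taken from RPCManagerImpl, and IllegalStateException stands in for the 
CacheException mentioned above:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these names come from JBoss Cache.
class StateRetrievalRetrySketch {

    /** Stand-in for a single state-fetch attempt against one member. */
    interface StateFetcher {
        boolean fetch(String member, long timeoutMillis);
    }

    /** Timeout for a 0-based round: the initial timeout doubled once per retry. */
    static long timeoutForAttempt(long initialTimeoutMillis, int round) {
        return initialTimeoutMillis << round;
    }

    /**
     * Tries every member once per round, doubling the timeout between rounds
     * and never sleeping in between. Returns the member that provided the
     * state, or throws once all rounds have failed.
     */
    static String fetchState(List<String> members, StateFetcher fetcher,
                             long initialTimeoutMillis, int maxRounds) {
        for (int round = 0; round < maxRounds; round++) {
            long timeout = timeoutForAttempt(initialTimeoutMillis, round);
            for (String member : members) {
                if (fetcher.fetch(member, timeout)) {
                    return member;
                }
            }
        }
        // Stand-in for the CacheException thrown by the real code.
        throw new IllegalStateException(
                "Unable to fetch state from any member after " + maxRounds + " rounds");
    }

    public static void main(String[] args) {
        // Fake fetcher: succeeds only once the timeout has grown to 60 secs.
        StateFetcher fake = (member, timeout) -> timeout >= 60_000;
        String provider = fetchState(Arrays.asList("A", "B"), fake, 15_000, 5);
        System.out.println("State provided by " + provider);
    }
}
```

With an initial timeout of 15 secs and 5 rounds, the per-round timeouts in 
this sketch are 15, 30, 60, 120 and 240 secs, so a joiner that needs more 
time to pull a growing state gets it on a later round.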

I can see where you throw an exception if the initial state transfer 
fails, after ca. 1000 secs.

Question: where do you ship the logs ? RPCManagerImpl.start() only 
transfers the initial state, right ?

-- 
Bela Ban
Lead JGroups / Clustering Team
JBoss - a division of Red Hat