On Tue, Dec 4, 2012 at 11:32 AM, Mircea Markus <mmarkus@redhat.com> wrote:

On 4 Dec 2012, at 09:22, Sanne Grinovero wrote:

[...]

I don't think the cache should ever be in an illegal state to be used
after being started. So Infinispan should not require tests to wait
for a "cluster to be formed", I'd rather guarantee that after a cache
is started it's usable.
+1. Unless the test relies on or verifies internal state, e.g. locks being acquired, data present in the data container, etc.


It's not just a question of what you want to check, it's also a question of what you don't want to check... I think in general a test should focus on a specific issue, and we know state transfer is always a potential source of (unrelated) failures. So I'd rather have tests that do test state transfer and command forwarding, and tests that avoid state transfer and command forwarding (by waiting for the cluster to form completely).

I'm pretty sure this is another instance of ISPN-2473, and once we have a fix (and a unit test) for this particular failure, MarshallExternalPojosTest could very well wait for the cluster to form and ignore any state transfer-related issues.
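To make "wait for the cluster to form" concrete, here is a minimal sketch of a test-side wait. It is not the helper our test suite actually uses: ClusterWait, waitUntil and the viewSize counter are hypothetical names, and the counter only stands in for whatever the test would really check (such as the size of the cache manager's member list).

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

public final class ClusterWait {

    // Polls the condition until it holds or the timeout expires.
    // Returns true if the condition became true in time, false otherwise
    // (including when the waiting thread is interrupted).
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical stand-in for the number of members in the current view.
        AtomicInteger viewSize = new AtomicInteger(1);
        int expectedMembers = 2;

        // Simulates a second node joining a little later.
        Thread joiner = new Thread(() -> {
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            viewSize.incrementAndGet();
        });
        joiner.start();

        // Block the test body until the expected view size is reached.
        boolean formed = waitUntil(() -> viewSize.get() >= expectedMembers, 5_000, 10);
        joiner.join();
        if (!formed) {
            throw new AssertionError("cluster did not form within the timeout");
        }
        System.out.println("cluster formed, running test body");
    }
}
```

A test that deliberately exercises state transfer and command forwarding would skip this wait and start issuing commands right after the cache starts.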

BTW, I also got an exception yesterday in MarshallExternalPojosTest and I investigated it, but in my case the error was much weirder: two nodes each opened a TCP connection to the other, yet neither of them received the forwarded command. I've asked Bela to investigate as well, but he didn't find anything suspicious in JGroups.