I think it may still make sense in both cases (repl and dist). E.g., in
dist, if a node joins, existing owners could, rather than push data to the joiner, just
push a list of {key: version} tuples, which may be significantly smaller than the values.
The joiner can then load the matching entries from a cache loader based on
key/version - we'd need a new API on the CacheLoader, like
load(Set<KeyVersionPair> keys) - this can be implemented pretty efficiently in
many cache stores such as JDBC. Any keys that the cache loader can't supply at
the requested version would then need to be pulled across the network.
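
To make the idea concrete, here is a rough sketch of the joiner side. Everything in it is hypothetical: KeyVersionPair, VersionedCacheLoader, and the in-memory loader standing in for something like a JDBC store are made up for illustration and are not existing Infinispan APIs. It assumes the loader only returns an entry when it holds exactly the requested version.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

class JoinerStateTransferSketch {

    /** Hypothetical key/version tuple that owners send instead of full values. */
    static final class KeyVersionPair {
        final String key;
        final long version;

        KeyVersionPair(String key, long version) {
            this.key = key;
            this.version = version;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof KeyVersionPair)) return false;
            KeyVersionPair other = (KeyVersionPair) o;
            return key.equals(other.key) && version == other.version;
        }

        @Override
        public int hashCode() {
            return Objects.hash(key, version);
        }
    }

    /** Hypothetical loader extension; not an existing Infinispan interface. */
    interface VersionedCacheLoader {
        /** Returns values only for pairs whose exact version is in the store. */
        Map<KeyVersionPair, String> load(Set<KeyVersionPair> keys);
    }

    /** In-memory stand-in for a real store such as JDBC. */
    static final class InMemoryLoader implements VersionedCacheLoader {
        private final Map<KeyVersionPair, String> store = new HashMap<>();

        void put(String key, long version, String value) {
            store.put(new KeyVersionPair(key, version), value);
        }

        @Override
        public Map<KeyVersionPair, String> load(Set<KeyVersionPair> keys) {
            Map<KeyVersionPair, String> found = new HashMap<>();
            for (KeyVersionPair kv : keys) {
                String value = store.get(kv);
                if (value != null) found.put(kv, value);
            }
            return found;
        }
    }

    /** Pairs the loader could not supply; these must be pulled from the owners. */
    static Set<KeyVersionPair> missing(Set<KeyVersionPair> manifest,
                                       Map<KeyVersionPair, String> loaded) {
        Set<KeyVersionPair> result = new HashSet<>(manifest);
        result.removeAll(loaded.keySet());
        return result;
    }

    public static void main(String[] args) {
        InMemoryLoader loader = new InMemoryLoader();
        loader.put("a", 3, "value-a");
        loader.put("b", 1, "stale-b"); // the owners hold version 2 of "b"

        // The {key: version} list pushed by the existing owners.
        Set<KeyVersionPair> manifest = new HashSet<>();
        manifest.add(new KeyVersionPair("a", 3));
        manifest.add(new KeyVersionPair("b", 2));
        manifest.add(new KeyVersionPair("c", 1));

        Map<KeyVersionPair, String> loaded = loader.load(manifest);
        Set<KeyVersionPair> toFetch = missing(manifest, loaded);
        System.out.println("loaded locally: " + loaded.size()
                + ", to fetch remotely: " + toFetch.size());
    }
}
```

In this toy run only "a" comes from the local store; "b" (stale version) and "c" (absent) still have to cross the network. With a JDBC store the load call could presumably map to a single batched query over (key, version), though that depends on the store's schema.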
I don't know a lot about the subject, but Merkle trees seem to be heavily used
for comparing state efficiently [1].
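
For what it's worth, a minimal sketch of how a hash tree comparison could look over {key: version} state. The class name, the fixed bucket count, and the choice of SHA-256 are all assumptions made for illustration; nothing here reflects how Infinispan does or would do it. Keys are assigned to buckets by String.hashCode(), each bucket is hashed into a leaf, and two nodes compare trees top-down, descending only into subtrees whose hashes differ.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class MerkleCompareSketch {
    static final int LEAVES = 8; // number of buckets; must be a power of two

    static byte[] sha256(byte[]... parts) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            for (byte[] p : parts) md.update(p);
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** Heap-shaped tree: node i has children 2i and 2i+1; leaves sit at LEAVES..2*LEAVES-1. */
    static byte[][] build(Map<String, Long> versions) {
        StringBuilder[] buckets = new StringBuilder[LEAVES];
        for (int i = 0; i < LEAVES; i++) buckets[i] = new StringBuilder();
        // Iterate in sorted key order so both sides serialize each bucket identically.
        // Note: this naive encoding is ambiguous if keys contain '=' or ';'.
        for (Map.Entry<String, Long> e : new TreeMap<String, Long>(versions).entrySet()) {
            int b = Math.floorMod(e.getKey().hashCode(), LEAVES);
            buckets[b].append(e.getKey()).append('=').append(e.getValue()).append(';');
        }
        byte[][] tree = new byte[2 * LEAVES][];
        for (int i = 0; i < LEAVES; i++) {
            tree[LEAVES + i] = sha256(buckets[i].toString().getBytes(StandardCharsets.UTF_8));
        }
        for (int i = LEAVES - 1; i >= 1; i--) {
            tree[i] = sha256(tree[2 * i], tree[2 * i + 1]);
        }
        return tree;
    }

    /** Returns the indexes of buckets whose contents differ between the two trees. */
    static List<Integer> diff(byte[][] a, byte[][] b) {
        List<Integer> out = new ArrayList<>();
        walk(a, b, 1, out);
        return out;
    }

    private static void walk(byte[][] a, byte[][] b, int node, List<Integer> out) {
        if (Arrays.equals(a[node], b[node])) return; // identical subtree, skip it
        if (node >= LEAVES) {
            out.add(node - LEAVES); // mismatching leaf: report its bucket
            return;
        }
        walk(a, b, 2 * node, out);
        walk(a, b, 2 * node + 1, out);
    }
}
```

If the two roots match, nothing else needs to be exchanged. Otherwise only the {key: version} entries of the differing buckets have to be compared, rather than the whole key set.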
Certainly not high prio, but something to think about for
Infinispan.next().
+1.
[1] http://en.wikipedia.org/wiki/Hash_tree