I think it may still make sense in some cases, for both repl and dist. E.g., in dist, when a node joins, existing owners could push a list of {key: version} tuples rather than the data itself; that list may be significantly smaller than the values. The joiner can then load entries from a cache loader based on key/version. We'd need a new API on the CacheLoader, something like load(Set<KeyVersionPair> keys), which can be implemented pretty efficiently in many cache stores such as JDBC. Any keys the cache loader can't retrieve would then need to be pulled across the network.
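To make the idea concrete, here is a rough sketch of what such an API might look like. Everything in it is hypothetical: KeyVersionPair, VersionedCacheLoader, the in-memory store and the JoinerStateTransfer helper are illustration names, not existing Infinispan classes, and String keys/values with long versions are assumed only to keep it short.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical {key: version} tuple that an existing owner pushes instead of the value.
record KeyVersionPair(String key, long version) {}

// Hypothetical extension of a cache loader; NOT an existing Infinispan interface.
interface VersionedCacheLoader {
    // Returns values only for the pairs whose stored version matches exactly.
    Map<String, String> load(Set<KeyVersionPair> keys);
}

// Toy in-memory store standing in for a real cache store (e.g. a JDBC-backed one).
class InMemoryVersionedLoader implements VersionedCacheLoader {
    private final Map<String, Long> versions = new HashMap<>();
    private final Map<String, String> values = new HashMap<>();

    void put(String key, long version, String value) {
        versions.put(key, version);
        values.put(key, value);
    }

    @Override
    public Map<String, String> load(Set<KeyVersionPair> keys) {
        Map<String, String> found = new HashMap<>();
        for (KeyVersionPair pair : keys) {
            Long stored = versions.get(pair.key());
            // A stale or absent entry is simply skipped; the caller fetches it remotely.
            if (stored != null && stored == pair.version()) {
                found.put(pair.key(), values.get(pair.key()));
            }
        }
        return found;
    }
}

class JoinerStateTransfer {
    // Loads what it can from the local store into target and returns the keys
    // that still have to be pulled across the network.
    static Set<String> missingAfterLocalLoad(VersionedCacheLoader loader,
                                             Set<KeyVersionPair> advertised,
                                             Map<String, String> target) {
        Map<String, String> loaded = loader.load(advertised);
        target.putAll(loaded);
        Set<String> missing = new HashSet<>();
        for (KeyVersionPair pair : advertised) {
            if (!loaded.containsKey(pair.key())) {
                missing.add(pair.key());
            }
        }
        return missing;
    }
}
```

In a JDBC store the same lookup could presumably be a batched SELECT filtered on key and version columns, but that depends on the store actually persisting a version per entry, which is an assumption here.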
I don't know a lot about the subject, but Merkle trees seem to be heavily used for comparing state efficiently [1].
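For what it's worth, a minimal sketch of how a Merkle-tree comparison over {key: version} state could work, assuming keys are bucketed by String.hashCode() into a fixed number of leaves and hashed with SHA-256. MerkleSketch and its methods are made-up names for illustration; a real protocol would exchange hashes level by level over the network instead of holding both trees locally, and would need a sturdier leaf encoding than TreeMap.toString().

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class MerkleSketch {
    static final int LEAVES = 8; // number of leaf buckets; must be a power of two

    // Builds an array-backed complete binary tree of hashes.
    // Node i has children 2i+1 and 2i+2; leaves occupy indexes LEAVES-1 .. 2*LEAVES-2.
    static byte[][] build(Map<String, Long> keyVersions) {
        List<TreeMap<String, Long>> buckets = new ArrayList<>();
        for (int i = 0; i < LEAVES; i++) {
            buckets.add(new TreeMap<>());
        }
        for (Map.Entry<String, Long> e : keyVersions.entrySet()) {
            int bucket = Math.floorMod(e.getKey().hashCode(), LEAVES);
            buckets.get(bucket).put(e.getKey(), e.getValue());
        }
        byte[][] tree = new byte[2 * LEAVES - 1][];
        for (int i = 0; i < LEAVES; i++) {
            // Sorted map gives a deterministic string form for each bucket.
            byte[] leaf = buckets.get(i).toString().getBytes(StandardCharsets.UTF_8);
            tree[LEAVES - 1 + i] = sha256(leaf);
        }
        for (int i = LEAVES - 2; i >= 0; i--) {
            tree[i] = sha256(concat(tree[2 * i + 1], tree[2 * i + 2]));
        }
        return tree;
    }

    // Returns the indexes of the leaf buckets whose contents differ.
    static List<Integer> differingBuckets(byte[][] a, byte[][] b) {
        List<Integer> out = new ArrayList<>();
        collect(a, b, 0, out);
        return out;
    }

    private static void collect(byte[][] a, byte[][] b, int node, List<Integer> out) {
        if (Arrays.equals(a[node], b[node])) {
            return; // identical subtree: nothing below needs to be compared
        }
        if (node >= LEAVES - 1) {
            out.add(node - (LEAVES - 1));
            return;
        }
        collect(a, b, 2 * node + 1, out);
        collect(a, b, 2 * node + 2, out);
    }

    private static byte[] sha256(byte[] input) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(input);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is required on every Java platform
        }
    }

    private static byte[] concat(byte[] x, byte[] y) {
        byte[] out = Arrays.copyOf(x, x.length + y.length);
        System.arraycopy(y, 0, out, x.length, y.length);
        return out;
    }
}
```

If the two root hashes match, nothing needs to be transferred; otherwise only the {key: version} entries of the differing buckets would have to be exchanged.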