[infinispan-dev] Versioned entries - overview of design, looking for comments

Manik Surtani manik at jboss.org
Wed Sep 14 10:03:49 EDT 2011


So I've been hacking on versioned entries for a bit now, and want to run the design by everyone. Adding an EntryVersion to each entry is easy, as is making it optional and null by default. There would be two implementations: SimpleVersion, a wrapper around a long, and PartitionTolerantVersion, a vector clock. Also easy stuff: changing the entry hierarchy and the marshalling to ensure versions, if available, are shipped.

Comparing versions would happen in Mircea's optimistic locking code, on prepare, when the write skew check is done.  In a non-clustered environment, the simple object-identity check we currently have is enough; otherwise an EntryVersion.compare() will need to happen, with one of four possible results: equal, newer than, older than, or concurrently modified.  The last one can only happen with a PartitionTolerantVersion, and indicates a split brain with simultaneous updates.
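To make the four outcomes concrete, here is a minimal sketch of how such a comparison could look. All names (VersionSketch, Result, VectorClockVersion, the of() helper) are hypothetical and chosen for illustration; this is not the actual Infinispan code, and it assumes that a node missing from a vector clock counts as zero.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these names are the real Infinispan API.
class VersionSketch {
    // The four possible comparison results described above.
    enum Result { EQUAL, NEWER, OLDER, CONCURRENT }

    interface EntryVersion {
        // How "this" relates to "other". Assumes both are the same implementation.
        Result compare(EntryVersion other);
    }

    // Wrapper around a long: can never report CONCURRENT.
    static final class SimpleVersion implements EntryVersion {
        final long value;

        SimpleVersion(long value) { this.value = value; }

        public Result compare(EntryVersion other) {
            long theirs = ((SimpleVersion) other).value;
            if (value == theirs) return Result.EQUAL;
            return value > theirs ? Result.NEWER : Result.OLDER;
        }
    }

    // Vector clock: one counter per node name.
    static final class VectorClockVersion implements EntryVersion {
        final Map<String, Long> counters;

        VectorClockVersion(Map<String, Long> counters) {
            this.counters = new HashMap<>(counters);
        }

        // Convenience factory for a two-node clock (illustration only).
        static VectorClockVersion of(String n1, long c1, String n2, long c2) {
            Map<String, Long> m = new HashMap<>();
            m.put(n1, c1);
            m.put(n2, c2);
            return new VectorClockVersion(m);
        }

        public Result compare(EntryVersion other) {
            Map<String, Long> theirs = ((VectorClockVersion) other).counters;
            Set<String> nodes = new HashSet<>(counters.keySet());
            nodes.addAll(theirs.keySet());
            boolean ahead = false;
            boolean behind = false;
            for (String node : nodes) {
                long mine = counters.getOrDefault(node, 0L);
                long their = theirs.getOrDefault(node, 0L);
                if (mine > their) ahead = true;
                if (mine < their) behind = true;
            }
            // Ahead on some nodes and behind on others: neither clock dominates.
            if (ahead && behind) return Result.CONCURRENT;
            if (ahead) return Result.NEWER;
            if (behind) return Result.OLDER;
            return Result.EQUAL;
        }
    }

    public static void main(String[] args) {
        EntryVersion a = VectorClockVersion.of("A", 2, "B", 1);
        EntryVersion b = VectorClockVersion.of("A", 1, "B", 2);
        System.out.println(a.compare(b)); // each side saw an update the other missed
    }
}
```

A plain counter is totally ordered, so only the vector clock variant can ever return CONCURRENT, which matches the point above about PartitionTolerantVersion.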

Now the hard part.  Who increments the version?  We have a couple of options, and both seem expensive.

1) The modifying node.  If the modifying node is a data owner, then easy.  Otherwise the modifying node *has* to do a remote GET first (or at least a GET_VERSION) before doing a PUT.  Extra RPC per entry.  Sucks.

2) The data owner.  This would have to happen on the primary data owner only, and the primary data owner, NOT the modifying node, would need to perform the write skew check.  The modifying node would also need to increment and ship its own NodeClock along with the modification.  Extra info to ship per commit.
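As a rough illustration of option 2, the sketch below has the primary owner do both the write skew check and the increment, so the modifying node only ships the version it originally read. It is deliberately simplified: the version is a plain counter rather than a vector clock, the NodeClock is left out, prepare and commit are collapsed into one call, and every name (PrimaryOwnerSketch, Versioned, prepareAndCommit) is hypothetical rather than Infinispan API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of option 2; none of these names are Infinispan API.
class PrimaryOwnerSketch {
    // A stored value together with its version (a plain counter here).
    static final class Versioned {
        final String value;
        final long version;

        Versioned(String value, long version) {
            this.value = value;
            this.version = version;
        }
    }

    private final Map<String, Versioned> store = new HashMap<>();

    // What a modifying node gets back on a read: the value plus the version
    // it must ship with any later write. Returns null if the key is absent.
    synchronized Versioned read(String key) {
        return store.get(key);
    }

    // Runs on the primary owner. versionRead is the version the writer
    // observed, or 0 if it saw no entry. Returns false on write skew.
    synchronized boolean prepareAndCommit(String key, long versionRead, String newValue) {
        Versioned current = store.get(key);
        long currentVersion = current == null ? 0L : current.version;
        if (currentVersion != versionRead) {
            return false; // someone else committed since the writer read
        }
        // Only the primary owner ever increments the version.
        store.put(key, new Versioned(newValue, currentVersion + 1));
        return true;
    }

    public static void main(String[] args) {
        PrimaryOwnerSketch owner = new PrimaryOwnerSketch();
        System.out.println(owner.prepareAndCommit("k", 0L, "a")); // first write
        long seen = owner.read("k").version;
        System.out.println(owner.prepareAndCommit("k", seen, "b")); // up-to-date writer
        System.out.println(owner.prepareAndCommit("k", seen, "c")); // stale writer
    }
}
```

The appeal of this shape is that the writer never needs the extra remote GET from option 1; the cost is that the version it read has to travel with each commit.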

I'm guessing we go with #2, but would like to hear your thoughts.

Cheers
Manik

--
Manik Surtani
manik at jboss.org
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org





