[
https://issues.jboss.org/browse/ISPN-5309?page=com.atlassian.jira.plugin....
]
Richard Achmatowicz commented on ISPN-5309:
-------------------------------------------
For the record:
Hi Dan
After reviewing your document Consistency Guarantees in Infinispan, I agree the starting
point would be to validate get() and put() data consistency semantics in the absence of
membership changes on the simplest of caches: no data store, no transactions, no
passivation or eviction. These are all refinements of the more basic model.
I have a few comments/clarifications on the following issues:
- the expected data consistency for interleavings of gets and puts
- use of cache topology and CH in replicated mode and distributed mode
- some more detail on the processing of get and put operations in the context
of a set of infinispan cache instances in distributed mode
- message order assumptions
data consistency
---------------
In order to describe input sequences of get/put operations and the expected output values
they return after processing by Infinispan,
I assume the following:
- operations to be considered are ordered (or not) by the happens-before
relation based on causality (as opposed to wall clock time)
- initial data consistency model for get and put operations:
- get(): an operation that happens-after a get operation get(k) -> v sees the same
value v (or the value of a later write)
- put(): an operation that happens-after a write operation put(k=v) sees the updated
value v (or the value of a later write)
NB: this rules out any required data consistency between operations initiated at
different nodes, even though one may occur after the other based on wall clock time
This sounds good.
Because a single client can connect to multiple nodes, in the future we should also model
consistency for operations that happen on different nodes but do not overlap in time (the
first operation completely finishes before the second one starts).
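The two rules above can be phrased as a small executable check over a causally ordered
history of operations on a single key. This is a hand-written sketch (the history format
and function name are assumptions, not Infinispan code):

```python
# Hypothetical checker for the rules above: in a happens-before-ordered
# history of operations on one key, every get() must return the value of
# the most recent write that precedes it.
def check_history(history):
    """history: list of ('put', v) or ('get', v) tuples in causal order."""
    last = None
    for op, v in history:
        if op == 'put':
            last = v
        elif v != last:          # a get() saw a stale or phantom value
            return False
    return True

assert check_history([('put', 1), ('get', 1), ('put', 2), ('get', 2)])
assert not check_history([('put', 1), ('put', 2), ('get', 1)])  # stale read
```

Note that operations initiated at different nodes are simply not placed in the same
history unless a happens-before edge relates them, which is what rules out wall-clock
requirements.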
replicated mode vs distributed mode
-----------------------------------
I'm a little unclear on the way in which replicated mode and distributed mode
make use of cache topology information (which we can assume
has been correctly distributed to all nodes):
- do both of these modes make use of the current cache topology (containing the
CH) to look up the nodes that own the keys they need to access?
Yes
- is the only difference in the way that the CH is set up (so that for
cluster size N, there is one primary owner and N-1 backup owners)?
There is one
more difference: replicated mode always uses broadcasts instead of unicasts. It should
only be relevant when the JGroups view changes, though.
- I assume only one CH is used in regular mode (as opposed to
rehashing mode)
Yes, we only have 2 CHs during state transfer.
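The owner-lookup question above can be illustrated with a toy consistent hash. The
segment arithmetic here is a stand-in for illustration only, not Infinispan's actual
CH factory logic; `owners`, `members`, and `num_owners` are assumed names:

```python
import hashlib

# Illustrative owner lookup from a cache topology's consistent hash (CH).
def owners(key, members, num_owners):
    """Return [primary, backup, ...] owners of a key."""
    segment = int(hashlib.sha1(key.encode()).hexdigest(), 16) % len(members)
    return [members[(segment + i) % len(members)] for i in range(num_owners)]

members = ["A", "B", "C"]
# Distributed mode: num_owners < cluster size, so each key lives on a subset.
dist = owners("k1", members, num_owners=2)
# Replicated mode: the CH is set up with num_owners == N, so every node
# owns every key (one primary, N-1 backups).
repl = owners("k1", members, num_owners=len(members))
assert len(dist) == 2 and len(repl) == 3
assert set(repl) == set(members)
```

This matches the point that both modes share the same lookup mechanism and differ mainly
in how many owners the CH assigns per key.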
distributed mode get()/put() design
-----------------------------------
Here is a summary of the way in which gets and puts are processed:
(taken almost verbatim from section 4 of Consistency Guarantees in Infinispan,
reorganized a little)
get():
- when an originator performs a get(K):
- if originator == owner, return the local value
- if originator != owner, the originator sends a remote GET_REQ command to
all of the key's owners in the read-CH and uses the first successful
non-null response (GET_RSP)
Correct. We do a retry when the topology id changes, but we
can ignore that for now.
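The get() flow above can be sketched as an in-process simulation. The helpers
(`cache_get`, `stores`, `read_ch`) are illustrative stand-ins, not Infinispan
internals; the real system sends a ClusteredGetCommand over the network:

```python
# Sketch of the get() flow: local read when the originator is an owner,
# otherwise a simulated GET_REQ to the key's owners from the read-CH.
def cache_get(originator, key, read_ch, stores):
    owners = read_ch[key]
    if originator in owners:
        return stores[originator].get(key)        # local read, no RPC
    for owner in owners:                          # GET_REQ to the owners
        value = stores[owner].get(key)            # simulated remote call
        if value is not None:
            return value                          # first non-null GET_RSP
    return None

stores = {"A": {"k": "v"}, "B": {"k": "v"}, "C": {}}
read_ch = {"k": ["A", "B"]}
assert cache_get("A", "k", read_ch, stores) == "v"   # originator == owner
assert cache_get("C", "k", read_ch, stores) == "v"   # remote get
```

The real implementation contacts the owners concurrently rather than in sequence, but
the "first successful non-null response wins" rule is the same.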
put():
- when an originator performs a put(K,V):
- if originator == primary owner
- primary owner acquires the key lock, and replicates (PUT_REPL_REQ)
the write to the backup owner(s)
- backup owners confirm the write to the primary owner (PUT_REPL_RSP)
- primary owner updates the value locally, unlocks the key lock, and
replies to the originator
- if originator != primary owner
- originator will send a remote PUT_REQ to the primary owner of the
modified key (looked up from which CH?)
- primary owner acquires the key lock, and replicates (PUT_REPL_REQ)
the write to the backup owner(s)
- backup owners confirm the write to the primary owner (PUT_REPL_RSP)
- primary owner updates the value locally, unlocks the key lock, and
replies to the originator (PUT_RSP)
Correct.
Are there any corrections to these descriptions that you would need
to make?
Do you have the actual message types for these messages passed back and
forth?
Remote get uses ClusteredGetCommand for replication, and the response is a
SuccessfulResponse with the entry value (in some cases with an InternalCacheEntry, but we
can ignore that for now).
Remote put uses PutKeyValueCommand, both for the originator to talk to the primary and for
the primary to talk to the backup owners. The responses are again SuccessfulResponses,
containing the previous value.
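The put() flow above can likewise be sketched as a simulation. The class and helper
names here are assumptions for illustration; in Infinispan the messages are
PutKeyValueCommands and the reply is a SuccessfulResponse with the previous value:

```python
import threading

# Sketch of the put() flow: the originator forwards to the primary owner,
# which locks the key, replicates to the backups, applies locally, and
# replies with the previous value.
class Node:
    def __init__(self, name):
        self.name = name
        self.store = {}
        self.locks = {}  # key -> threading.Lock

    def _lock(self, key):
        return self.locks.setdefault(key, threading.Lock())

def cache_put(originator, key, value, write_ch, nodes):
    owners = write_ch[key]
    primary, backups = owners[0], owners[1:]
    # If originator != primary, this step models the PUT_REQ to the primary
    # (owners looked up from the write-CH of the current topology).
    with nodes[primary]._lock(key):          # primary acquires the key lock
        for b in backups:                    # PUT_REPL_REQ to each backup
            nodes[b].store[key] = value      # backup applies, PUT_REPL_RSP
        prev = nodes[primary].store.get(key)
        nodes[primary].store[key] = value    # primary applies locally
    return prev                              # PUT_RSP with previous value

nodes = {n: Node(n) for n in "ABC"}
write_ch = {"k": ["A", "B"]}
assert cache_put("C", "k", "v1", write_ch, nodes) is None
assert cache_put("A", "k", "v2", write_ch, nodes) == "v1"
assert nodes["B"].store["k"] == "v2"  # backup is in sync
```

The key lock at the primary is what serializes concurrent writers to the same key, which
is the property the later consistency model needs to capture.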
messaging assumptions
---------------------
The processing of get()/put() makes use of unicast and RPC to communicate
between nodes.
We automatically use broadcasts instead of unicasts when a message
is being sent to all the nodes in the JGroups view. So we always use broadcasts in
replicated mode, and mostly unicasts in distributed mode.
What assumptions are there on message order for:
- the unicasts from originator to primary
- the RPCs from primary owner to backup owners
In a DIST_SYNC/REPL_SYNC cache,
all messages are marked as OOB, so we don't assume any ordering.
We do assume that messages are ordered in DIST_ASYNC/REPL_ASYNC mode. Let's stick with
synchronous caches, though.
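The no-ordering assumption can be made concrete with a toy model: two concurrent
PUT_REQs may reach the primary in either order, and it is the primary's key lock, not
the transport, that decides the final value. This is illustrative, not Infinispan code:

```python
from itertools import permutations

# OOB messages carry no ordering guarantee, so consider every delivery
# order of two concurrent writes at the primary.
final_values = set()
for order in permutations([("put", "v1"), ("put", "v2")]):
    store = {}
    for _, v in order:        # primary applies the writes one at a time
        store["k"] = v
    final_values.add(store["k"])

# Both outcomes are legal: last-writer-wins as serialized by the primary.
assert final_values == {"v1", "v2"}
```

This is why the model can treat message order as nondeterministic in the SYNC case
without losing any correctness guarantees.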
Model data consistency for get()/put() operations in Infinispan
---------------------------------------------------------------
Key: ISPN-5309
URL: https://issues.jboss.org/browse/ISPN-5309
Project: Infinispan
Issue Type: Task
Reporter: Richard Achmatowicz
Assignee: Richard Achmatowicz
This will be the first in a series of modelling/validation exercises for some of the more
critical Infinispan protocols.
We shall use TLA+ / PlusCal / TLA+ Tools to do the following:
- model the design of processing of get()/put() operations in an Infinispan cluster
- model client interactions with that cluster
- describe the data consistency requirements of get()/put() operations
- verify that the data consistency semantics of Infinispan are preserved in the face of
concurrent client interactions
TLA+ / PlusCal can be thought of as a pseudo-code language which has well-defined
semantics and is testable using the tools in the TLA+ Toolkit.
The benefits of such an exercise are that we end up with:
- a specification of data consistency guarantees that Infinispan provides
- a semantically precise pseudo-code description of the design of get()/put() processing
- a verification that the protocol design is correct
We start here with the simple case of modelling data consistency in the absence of
failures. In later exercises, we aim to tackle rebalancing and non-blocking state transfer
in the face of membership changes and partitions.
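The verification goal can be previewed with a toy explorer: enumerate every interleaving
of two clients' steps (as TLC would for a PlusCal spec), run each against a shared store,
and check a simplified invariant. This is a hand-rolled sketch, not an actual TLA+ model;
all names are assumptions:

```python
# Enumerate interleavings and check that a get() which happens-after the
# same client's put() never observes a missing value.
def interleavings(seqs):
    """Yield every interleaving preserving each client's program order."""
    if all(not s for s in seqs):
        yield []
        return
    for i, s in enumerate(seqs):
        if s:
            rest = [q[1:] if j == i else q for j, q in enumerate(seqs)]
            for tail in interleavings(rest):
                yield [(i, s[0])] + tail

def violates(history):
    store, wrote = {}, set()
    for client, (op, val) in history:
        if op == "put":
            store["k"] = val
            wrote.add(client)
        elif client in wrote and store.get("k") is None:
            return True                  # the client's own write was lost
    return False

clients = [[("put", 1), ("get", None)], [("put", 2)]]
runs = list(interleavings(clients))
assert len(runs) == 3                    # interleavings of 2 + 1 steps
assert not any(violates(run) for run in runs)
```

A real TLA+ spec replaces the hand-rolled enumeration with TLC's state-space search and
states the invariant declaratively, which is what makes the exercise scale to membership
changes and partitions later.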
--
This message was sent by Atlassian JIRA
(v6.3.11#6341)