Inter-site state transfer is the mechanism by which a running site pushes its state to another site. This is currently a manual process, applicable when a data center is being brought online or a failed data center is being recovered. It follows the same design principles as non-blocking state transfer (NBST). It is not enough to simply iterate over the key set and apply the current values, since they may have been modified or deleted during normal operation.

JMX OPERATIONS
----------
Per cache manager:

1. pushState( String siteName ): Invokes the inter-site state transfer on a per-cache manager basis. This uses the configured backup policy of the cluster.
2. pushState( String siteName, String cacheName ): Similar to the above, but invokes it using only the policy of the given cache.
3. pushState( String siteName, String cacheName, int chunkSize ): Similar to the above, with the number of keys to transmit at once.
4. stop( String siteName ): Aborts any state transfer operation.
5. stop( String siteName, String cacheName ):
6. KeyState getStatus( String siteName, String cacheName ): Returns the number of keys that have been successfully pushed to the remote site vs. the number of keys to push.

COMPONENTS:
----------
There are two main components of inter-site state transfer: the XSiteStateProvider on the pushing site, and the XSiteStateConsumer on the remote node. Note that these are logical components, and their implementation may not follow the same pattern.

XSiteStateProducer
- OutboundTransferTask

XSiteStateConsumer
- remote2LocalTx map
- removedKey map
- Running/not running state management (may be more than one variable)

COMMANDS
---------
- XSiteStateTransferRequestCommand (producer --> consumer): Invoked on a per-cache manager basis. It alerts the receiving site to incoming requests and sets internal state accordingly.
- XSiteStateRequestCommand (producer --> consumer): Invoked to transmit current tx information to the remote site.
- XSiteFinalizeStateTransferCommand (producer --> consumer): Invoked when the state provider has generated all the state necessary for transmitting the data.

Note that the above are issued on a per-cache manager basis. The receiving cluster does not know which caches will be transmitted, since the XSiteBackup impl tracks them on tx boundaries anyway. It is the responsibility of the Producer to aggregate all status before issuing the finalize command.

DESIGN OVERVIEW
---------------
Conceptually, inter-site state transfer uses the NBST design pattern for producing and consuming state (and all its related semantics), while leveraging the XSite implementation for replaying modifications on the receiving site.

To start the inter-site state transfer mechanism, a system administrator connects to any data grid node in the reference site and issues a JMX command (admin console, JMX). It is important to note that the state transfer may be invoked from any node on a local cluster.

After this command is invoked, the XSiteStateProvider sends the XSiteStateTransferRequestCommand to the SiteMaster of the remote site. When the XSiteStateConsumer on the remote site receives this message, it sets up its internal state and replies. The local site can then start pushing state.

The XSiteStateProvider will calculate the set of nodes that can generate the state. These are the nodes that own the primary segments for a set of keys. This is a subset of the total nodes in the local cluster.

A great deal of the state provider implementation is already complete. It can leverage the following class for the necessary functionality:

https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/statetransfer/OutboundTransferTask.java#L217

During this operation, the current transaction state (the current "prepares") from the ORIGINATING site is pushed first (similar to the NBST method) in an XSiteStateRequestCommand. The XSiteStateConsumer will process this tx information and place these commands into a local map. The key is the GlobalTransaction as generated by the pushing site. The value is a composite of the LocalTransaction and the current status:

- prepare_received:
- commit_received:
- committed

i.e. ConcurrentMap< GlobalTransaction, CompositeTxStatus > remote2LocalTx map.

For each state, the behavior is as follows:

- prepare_received, committed: Ignore this message, as it was already handled.
- commit_received: We grab the local transaction, commit it, and set the state to "committed".
- No entry: Start a local transaction and add it to the map with the status "prepare_received".

AFTER the transaction state is pushed, the XSiteStateConsumer will ask for the rest of the state. The state is applied to the remote site as a putIfAbsent. Note that putIfAbsent does nothing if a write to that key has already happened across the cluster, which guarantees that the most current copy is the one kept on the backup.

A note about deletes: We are waiting for tombstoning functionality (ISPN-2362), but it is not available yet, so we have an alternate implementation. If the remote state coordinator intercepts commands (or receives commands of some sort), we know when a delete has been processed, so we can keep some sort of state in addition to the map above. This means that if the key being processed from the state transfer has already been deleted by a separate write operation, we can simply skip over that key.

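The consumer-side rules above (per-status handling of pushed transactions, putIfAbsent for pushed state, and skipping keys that were deleted in the meantime) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the Infinispan implementation: the class StateConsumerSketch and its methods are hypothetical, a String stands in for GlobalTransaction, a bare status enum stands in for CompositeTxStatus, and a plain ConcurrentHashMap stands in for the backup cache. The check against the removed-key set and the putIfAbsent are not atomic with respect to each other here; a real implementation would need to close that gap.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the consumer-side bookkeeping; not the Infinispan API.
class StateConsumerSketch {
    enum TxStatus { PREPARE_RECEIVED, COMMIT_RECEIVED, COMMITTED }

    // Keyed by the pushing site's transaction id (String stands in for GlobalTransaction).
    final Map<String, TxStatus> remote2LocalTx = new ConcurrentHashMap<>();
    // Keys deleted by regular cross-site traffic while the transfer is running.
    final Set<String> removedKeys = ConcurrentHashMap.newKeySet();
    // Stand-in for the backup cache on the receiving site.
    final Map<String, String> cache = new ConcurrentHashMap<>();

    // Handles one transaction from the pushed tx state.
    void onPushedTransaction(String globalTxId) {
        remote2LocalTx.compute(globalTxId, (id, status) -> {
            if (status == null) {
                // No entry: a local transaction would be started here.
                return TxStatus.PREPARE_RECEIVED;
            }
            if (status == TxStatus.COMMIT_RECEIVED) {
                // The local transaction would be committed here.
                return TxStatus.COMMITTED;
            }
            // prepare_received or committed: already handled, ignore.
            return status;
        });
    }

    // A regular cross-site write observed during the transfer.
    void onLiveWrite(String key, String value) {
        cache.put(key, value);
    }

    // A regular cross-site delete observed during the transfer.
    void onLiveRemove(String key) {
        removedKeys.add(key);
        cache.remove(key);
    }

    // Applies one pushed entry; returns true only if the value was stored.
    boolean applyState(String key, String value) {
        if (removedKeys.contains(key)) {
            return false; // deleted by a newer operation, skip it
        }
        return cache.putIfAbsent(key, value) == null;
    }
}
```

With this sketch, a pushed entry for a key that was written or deleted by live traffic is dropped, while a pushed entry for an untouched key is stored.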
Once all state has been sent for all backed-up caches, the XSiteStateProducer will send a new command to finalize the state transfer operations: XSiteFinalizeStateTransferCommand

This may involve some of the following:

- Remove all the transactions marked as "committed".
- The XSiteStateConsumer must ONLY handle CommitCommands that are still kept in the remote2LocalTx map in order to finish them up. This can mean committing all local transactions, or rolling them back in the case no messages are received during
- Clean up the maps. (???)

TBD: Can this be leveraged from the StaleTransactionCleanupService?
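
One possible reading of the finalize step is sketched below. Since the steps above are still tentative, this is only an assumption-laden illustration: FinalizeSketch and finalizeTransfer are hypothetical names, a String stands in for GlobalTransaction, and clearing the removed-key set at this point is a guess at what "clean up the maps" would mean. Transactions that are not yet committed are left in the map so that later CommitCommands can still finish them.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of consumer-side finalization; not the Infinispan API.
class FinalizeSketch {
    enum TxStatus { PREPARE_RECEIVED, COMMIT_RECEIVED, COMMITTED }

    // Drops committed transactions, clears the removed-key tracking, and
    // returns the ids of transactions that still await a commit or rollback.
    static Set<String> finalizeTransfer(Map<String, TxStatus> remote2LocalTx,
                                        Set<String> removedKeys) {
        // Committed transactions need no further handling.
        remote2LocalTx.values().removeIf(status -> status == TxStatus.COMMITTED);
        // Assumption: delete tracking is only needed while state is being applied.
        removedKeys.clear();
        // Whatever remains must still be finished by incoming CommitCommands
        // (or rolled back by some cleanup mechanism).
        return new HashSet<>(remote2LocalTx.keySet());
    }
}
```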