Hi,
Re: https://jira.jboss.org/jira/browse/ISPN-399
This has been blocking my progress on https://jira.jboss.org/jira/browse/ISPN-384, but I
think I've found a way to fix the issue. Before that though, let me explain the
approaches I tried that would not work:
- When a state transfer request is received for a cache that has not started, one option
would be to start the cache there and then. However, this won't work because clients
might be starting caches with programmatically defined configurations, which
InboundInvocationHandlerImpl has no way of knowing about.
- The most sensible solution IMO would be for the state provider to simply provide empty
state and for the receiver to get on with it. Initially I thought I could just have
InboundInvocationHandlerImpl get hold of the marshaller, start an object output on the
passed stream and write a false boolean indicating that the state provider cannot provide
state. However, that boolean has a different meaning, and the receiver throws an
Exception when it sees it.
- Finally, the only solution I could find was to add another boolean to the state
provider/receiver protocol that indicates whether the state transfer manager for that
cache is started. So, if the state provider found the cache was not started, it would do
as above and write a false boolean, this time for the new flag. On the receiver side,
we'd read the started flag first and then read the rest of the existing protocol, i.e.
canProvideState, etc.
Although the solution is relatively simple, it has some caveats. One is that it changes
the state transfer protocol format, which, if we want to maintain backwards compatibility
with 4.0, requires adding versioning to the state transfer protocol itself, something we
haven't been doing so far. IOW, we do have a version at the marshaller level, but
there's no versioning for the sequence of calls that StateTransferManagerImpl makes
to figure out the state. In spite of all this, we might be able to work around this if
we change Marshaller.startObjectInput to return not only the ObjectInput but also the
version id that was read, e.g. a MyObjectInput that implements ObjectInput and wraps both
the underlying ObjectInput and the version. StateTransferManagerImpl could do a hacky
cast to figure out the version of the remote node and hence switch between the different
protocol versions. To avoid such hacks in the future, including 4.1,
StateTransferManagerImpl could also write the version before writing anything else.
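
To make the last bullet a bit more concrete, here is a rough, standalone sketch of a
header that carries the version first, then the new started flag, then the existing
canProvideState boolean. The class name, the Header type and the two version ids are all
made up for illustration and are not what is in the patch. Note also that the sketch
writes the version in-band even for the old layout, which a real 4.0 node would not do;
that gap is exactly why the version would have to come from Marshaller.startObjectInput.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// Hypothetical sketch only; none of these names exist in Infinispan.
final class VersionedStateHeaderSketch {
    // Made-up version ids standing in for "old layout" and "layout with started flag".
    static final short V_NO_STARTED_FLAG = 1;
    static final short V_STARTED_FLAG = 2;

    /** What the receiver learns from the header. */
    static final class Header {
        final short version;
        final boolean started;
        final boolean canProvideState;

        Header(short version, boolean started, boolean canProvideState) {
            this.version = version;
            this.started = started;
            this.canProvideState = canProvideState;
        }
    }

    /** Provider side: version first, then the started flag if the layout has one. */
    static void writeHeader(ObjectOutput out, short version, boolean started,
                            boolean canProvideState) throws IOException {
        out.writeShort(version);
        if (version >= V_STARTED_FLAG) {
            out.writeBoolean(started);
            if (!started) {
                return; // cache not started: nothing else follows
            }
        }
        out.writeBoolean(canProvideState);
    }

    /** Receiver side: read the version, then pick the layout to parse. */
    static Header readHeader(ObjectInput in) throws IOException {
        short version = in.readShort();
        if (version >= V_STARTED_FLAG && !in.readBoolean()) {
            // Provider's cache is not started; treat it as "no state" instead of failing.
            return new Header(version, false, false);
        }
        // The old layout has no started flag, so the provider is assumed to be started.
        return new Header(version, true, in.readBoolean());
    }

    /** Writes a header to a byte array and reads it back, for experimenting. */
    static Header roundTrip(short version, boolean started, boolean canProvideState) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                writeHeader(out, version, started, canProvideState);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return readHeader(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this layout, a receiver that sees started == false can carry on without state,
whereas a false canProvideState keeps its current meaning.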
The other caveat is that InboundInvocationHandlerImpl ends up containing logic for
responding to state generation requests, so protocol details leak out of
StateTransferManagerImpl. However, there's no StateTransferManagerImpl for the
unstarted cache, so there's not much I can do about that, unless the protocol itself is
abstracted into a separate class that is independent of the per-cache
StateTransferManagerImpl component.
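
As a rough illustration of that last option, the wire format could live in a small
stateless class that both the handler and the per-cache managers call into. Everything
below (class names, method names, the set of started caches) is hypothetical and only
meant to show the shape of the idea, not how the attached patch does it:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical: the only place that knows the order of the flags on the wire.
final class StateTransferProtocolSketch {
    private StateTransferProtocolSketch() {
    }

    /** Response for a cache that has no started state transfer manager. */
    static void writeNotStarted(ObjectOutput out) throws IOException {
        out.writeBoolean(false); // started flag
    }

    /** Start of the response for a started cache; the state itself would follow. */
    static void writeStarted(ObjectOutput out, boolean canProvideState) throws IOException {
        out.writeBoolean(true); // started flag
        out.writeBoolean(canProvideState);
    }

    /** Reads just the started flag from a serialized response. */
    static boolean startedFlagOf(byte[] response) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(response))) {
            return in.readBoolean();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

// Hypothetical stand-in for the handler: it decides who answers, not how the bytes look.
final class InboundHandlerSketch {
    private final Set<String> startedCaches = ConcurrentHashMap.newKeySet();

    void markStarted(String cacheName) {
        startedCaches.add(cacheName);
    }

    void generateState(String cacheName, OutputStream sink) {
        try (ObjectOutputStream out = new ObjectOutputStream(sink)) {
            if (!startedCaches.contains(cacheName)) {
                // No per-cache component exists, so the handler answers via the shared class.
                StateTransferProtocolSketch.writeNotStarted(out);
                return;
            }
            // In the real code the per-cache state transfer manager would take over here.
            StateTransferProtocolSketch.writeStarted(out, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The handler would still have to answer on behalf of unstarted caches, but it would no
longer need to know which booleans go on the stream or in what order.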
I've attached a patch to the JIRA with v1 of the fix. Note that it does not contain
any of my suggestions wrt supporting a multi-versioned state transfer protocol.
Thoughts?
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache