Emmanuel Bernard commented on HSEARCH-880:
------------------------------------------
h2.Cluster with one way communication - minor bump case
{quote}
You say it's allowed to fail when a new feature is being used. Is transmitting a new
feature not something that we're supposed to bump the major version for?
{quote}
The reason for bumping the major protocol version is that a previous version is unable to
deserialize the message.
Let's say Lucene introduces a new way of optimizing or a new Fieldable type. You can
safely parse the message (at least with Avro) and you can process the message safely as
long as the new optimization operation and the new Fieldable type are not used. Note that
your Delete+Add example has such an impact that we might want to bump the major version
number anyway.
Let me return the question: when would you bump only the minor version? In which
scenario?
{quote}I would rather have expected to have an option on the sender side to send
"backwards compatible messages", i.e. convert each Update to the couple of
operations. So people could define a version number in their configuration, then update
the software but have it still send messages the old way.
{quote}
That's kind of orthogonal and has to be decided on a case-by-case basis. Update has a
backward compatible mode, but not all operations have one. Imagine boost did not exist and
is introduced in Lucene: if the user makes use of boost, the message has to fail on the
other side. If the user does not, we can process it.
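To make the boost example concrete, here is a rough sketch of a receiver that rejects a message only when it actually uses a feature the node does not know. The class name, the feature names and the idea of listing used features per message are all assumptions for illustration, not something Hibernate Search provides.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class FeatureCheck {
    // Features this (older) node understands; the names are made up for illustration.
    static final Set<String> KNOWN = new HashSet<>(Arrays.asList("add", "delete", "update"));

    // Throws if the message relies on a feature this node cannot honour;
    // returns normally when only known features are used.
    static void requireSupported(Set<String> usedFeatures) {
        for (String feature : usedFeatures) {
            if (!KNOWN.contains(feature)) {
                throw new UnsupportedOperationException(
                        "feature not supported by this node: " + feature);
            }
        }
    }
}
```

A message from a newer node that only adds documents passes the check, while one that sets a boost fails on the receiving side.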
{quote}
bq. If message_major or message_minor < node_major or node_minor, we use the older
protocol deserializer.
In practice, how are "older" implementations loaded by the factory? I assume
not with a classloader actually loading the older jar? By duplicating the packages under
different names for each byte-format change?
{quote}
I have no clear idea, to be honest.
It depends on the serialization provider. For Avro, we would probably have different
versions of Works.avpr and the corresponding .avro files. Should the parsing code be
different? Maybe in some cases, but not in all.
For a provider like the JavaSerializationProvider, then yes, you need different packages
(assuming you don't play with read/writeObject).
{quote}
could there ever be a problem where a new HSearch Engine cannot deal with an old HSearch
engine's message?
I think we should always be able to compensate, in theory. The problem is how to
compensate for our mistakes, i.e. how should the engine deal with the fact that we might
not do it in practice: even with our best effort we might fail to test unexpected message
combinations.
{quote}
Right. The idea IMO is to allow people to migrate smoothly from HSearch n to n+1 (probably
not even for major bumps). We won't try to support reading v1.0 messages when we are
at 23.45 :o)
{quote}
bq. What happens if B goes down and back up? Does it have a "new" name that
uniquely identify it?
What do you mean by name? JGroups networks are identified mainly by their network
address and by a string defined in the configuration, the cluster name.
So if it's the same node coming up again, it will have the same name (assuming I
understood your question). It becomes trickier if a different node takes over the role
of Master and happens to have different protocol versions. We will receive an event when
the cluster members change, and then we should start a new handshake.
{quote}
Right, my concern was that every time a node comes back up, a new handshake has to be
initiated to potentially update the protocol version.
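One way to picture this: the sender keeps the negotiated version per node and forgets it as soon as the node leaves the cluster, so a node that comes back always triggers a fresh handshake. This is only a sketch; the class, the use of plain strings for node addresses and versions, and the assumption that the "leave" event is always observed before the node rejoins are mine, and nothing here uses the JGroups API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

final class NegotiatedVersions {
    // node address -> version agreed during the last handshake (both as plain strings here)
    private final Map<String, String> byNode = new HashMap<>();

    void record(String node, String version) {
        byNode.put(node, version);
    }

    boolean needsHandshake(String node) {
        return !byNode.containsKey(node);
    }

    // To be called from whatever membership-change callback the transport offers:
    // drops every node that is no longer a member, keeping the others untouched.
    void onMembershipChange(Set<String> currentMembers) {
        byNode.keySet().retainAll(currentMembers);
    }
}
```

With this shape, a restart of B shows up as "B left" followed by "B joined", and the next message to B starts with a handshake again.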
{quote}
bq. could it be that Serializer / Deserializer / LuceneWorksBuilder lead to the inability
to support a version n-1 (by adding new methods or stuff like that)?
Didn't understand this question. Do you mean we won't be able to support the previous
major version?
{quote}
It depends. What I am saying is that if you change, remove or add an essential method
on Serializer / Deserializer / LuceneWorksBuilder, you need to implement the behavior
related to this method on all protocols you aim to support. This is a legacy cost. So my
question was: could it be that we might not be able to do that at times? I'm trying to
think of scenarios where the model breaks.
Discussion on how to support backward / forward compatible
serialization layer
------------------------------------------------------------------------------
Key: HSEARCH-880
URL:
http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-880
Project: Hibernate Search
Issue Type: New Feature
Components: serialization
Reporter: Emmanuel Bernard
Fix For: 4.0
h1. General principles
The serialized message needs the following elements:
* index name: to route the message to the appropriate backend
* serialization provider id: if not present, a cluster must make sure to use the same
SerializationProvider for a given IndexManager
* protocol version: today the version is major.minor where the major increase means
incompatibility at the stream level, whereas minor means compatibility but with missing
features
* stream: this is the SerializationProvider specific byte[]
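As a rough illustration of the four elements above, the envelope could look like the class below. The class and field names are hypothetical, and how the envelope itself is written to the wire is left open.

```java
// Hypothetical envelope for a serialized message; all names are illustrative.
final class WorkMessage {
    final String indexName;   // routes the message to the appropriate backend
    final String providerId;  // may be null if the cluster agrees on a single provider
    final int major;          // a bump means incompatibility at the stream level
    final int minor;          // a bump means compatible, but with missing features
    final byte[] stream;      // opaque, SerializationProvider specific payload

    WorkMessage(String indexName, String providerId, int major, int minor, byte[] stream) {
        this.indexName = indexName;
        this.providerId = providerId;
        this.major = major;
        this.minor = minor;
        this.stream = stream.clone(); // defensive copy, the envelope is immutable
    }

    String describe() {
        return indexName + " via " + providerId + " v" + major + "." + minor
                + " (" + stream.length + " bytes)";
    }
}
```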
bq. Do we need a serialization provider id? In other words, do we need to be able to
hot-upgrade the SerializationProvider in a cluster?
h1. Exchanging messages in a heterogeneous cluster
h2. Cluster with one way communication (JMS)
In this case the master receives a message and must try to process it.
It receives an index name + serialization provider id.
It uses the serialization provider id to deserialize the message.
If message_major > node_major, the serialization provider fails.
If message_minor > node_minor, the serialization provider proceeds but some features
might not be supported and the deserialization might fail.
bq. Supporting message_minor > node_minor requires sending the Avro schema with each
message, which would be a huge loss.
In the minor bump case:
* some features might not be deserialized and are simply ignored. The user is expected to
be aware of the feature differences between the nodes.
* the stream might not be readable by an old version after all, due to the use of some new
features => Exception
If message_major or message_minor < node_major or node_minor, we use the older
protocol deserializer.
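The rules for the one way case can be summed up in a small decision function. This is a sketch only: the enum and class names are made up, and it assumes the "older" rule is meant as a major-then-minor comparison (a message with a lower major is treated as older whatever its minor).

```java
enum Action { FAIL, BEST_EFFORT, USE_CURRENT, USE_OLDER }

final class VersionPolicy {
    // Decides what the receiving node does with a message, given both versions.
    static Action decide(int msgMajor, int msgMinor, int nodeMajor, int nodeMinor) {
        if (msgMajor > nodeMajor) return Action.FAIL;        // stream level incompatibility
        if (msgMajor < nodeMajor) return Action.USE_OLDER;   // older protocol deserializer
        if (msgMinor > nodeMinor) return Action.BEST_EFFORT; // may still fail on new features
        if (msgMinor < nodeMinor) return Action.USE_OLDER;   // older protocol deserializer
        return Action.USE_CURRENT;                           // same version on both sides
    }
}
```

For example, a node at 1.2 receiving a 1.3 message gets BEST_EFFORT, while the same node receiving a 2.0 message gets FAIL.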
bq. could there ever be a problem where a new HSearch Engine cannot deal with an old
HSearch engine's message?
h2. Cluster with two way communication (JGroups)
The first time a node A needs to send a message to a node B, it sends the list of
supported SerializationProvider ids and, for each, the list of supported versions.
The first SerializationProvider id is preferred and the latest versions are preferred.
Version A is more recent than version B if majorA > majorB, or if majorA = majorB and
minorA > minorB.
Node B receives the handshake message and returns the appropriate serialization provider
id and version. Subsequent messages between A and B are exchanged with this accepted
version.
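Node B's side of the handshake could be sketched as follows: walk A's provider ids in A's preference order and, for the first one B also supports, pick the most recent version both sides have. The class names, the string result format and the use of a LinkedHashMap to carry A's preference order are assumptions made for this example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class Version implements Comparable<Version> {
    final int major;
    final int minor;

    Version(int major, int minor) { this.major = major; this.minor = minor; }

    // More recent if the major is higher, or the majors are equal and the minor is higher.
    @Override public int compareTo(Version o) {
        return major != o.major ? Integer.compare(major, o.major) : Integer.compare(minor, o.minor);
    }
    @Override public boolean equals(Object o) {
        return o instanceof Version && compareTo((Version) o) == 0;
    }
    @Override public int hashCode() { return 31 * major + minor; }
    @Override public String toString() { return major + "." + minor; }
}

final class Handshake {
    // offer: A's provider ids in preference order, each with A's supported versions.
    // local: what node B supports. Returns "providerId:major.minor", or null if nothing matches.
    static String accept(LinkedHashMap<String, List<Version>> offer,
                         Map<String, List<Version>> local) {
        for (Map.Entry<String, List<Version>> entry : offer.entrySet()) {
            List<Version> mine = local.get(entry.getKey());
            if (mine == null) continue; // B does not know this provider
            Version best = null;
            for (Version v : entry.getValue()) {
                if (mine.contains(v) && (best == null || v.compareTo(best) > 0)) best = v;
            }
            if (best != null) return entry.getKey() + ":" + best;
        }
        return null;
    }
}
```

If A offers avro at 1.0, 1.2 and 2.0 and B supports avro at 1.0 and 1.2, the accepted pair is avro at 1.2.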
bq. Does the JGroups clustering use multicast to send change messages, i.e. does it know
which node it is sending the message to, so that it can do the handshake?
bq. What happens if B goes down and back up? Does it have a "new" name that
uniquely identify it?
h1. API changes
SerializationProvider will need the following adjustments:
* a getSupportedVersions()
* a getSerializer(Version)
* a getDeserializer(Version)
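To give an idea of the shape, the three methods could sit on the provider as sketched below. Only the three method names come from the list above; the interface name, ProtocolVersion, SketchSerializer and SketchDeserializer are simplified stand-ins invented for this example and are not the Hibernate Search contracts.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

final class ProtocolVersion {
    final int major;
    final int minor;
    ProtocolVersion(int major, int minor) { this.major = major; this.minor = minor; }
}

// Simplified stand-ins, NOT the real Serializer / Deserializer interfaces.
interface SketchSerializer { byte[] write(String work); }
interface SketchDeserializer { String read(byte[] stream); }

interface VersionedProvider {
    List<ProtocolVersion> getSupportedVersions();
    SketchSerializer getSerializer(ProtocolVersion version);
    SketchDeserializer getDeserializer(ProtocolVersion version);
}

// Toy provider that supports only 1.0 and encodes a work item as UTF-8 text.
final class ToyProvider implements VersionedProvider {
    private static final ProtocolVersion V1_0 = new ProtocolVersion(1, 0);

    public List<ProtocolVersion> getSupportedVersions() { return Arrays.asList(V1_0); }

    public SketchSerializer getSerializer(ProtocolVersion version) {
        check(version);
        return work -> work.getBytes(StandardCharsets.UTF_8);
    }

    public SketchDeserializer getDeserializer(ProtocolVersion version) {
        check(version);
        return stream -> new String(stream, StandardCharsets.UTF_8);
    }

    private static void check(ProtocolVersion v) {
        if (v.major != 1 || v.minor != 0) {
            throw new IllegalArgumentException("unsupported version " + v.major + "." + v.minor);
        }
    }
}
```

Each additional supported version would add one serializer/deserializer pair behind the same two lookup methods, which is where the legacy cost discussed in the comments shows up.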
bq. could it be that Serializer / Deserializer / LuceneWorksBuilder lead to the inability
to support a version n-1 (by adding new methods or stuff like that)?