[infinispan-dev] Stream encoding of Flags and future compatibility

Tue May 15 10:36:23 EDT 2012

On 15 May 2012 15:26, Manik Surtani <manik at jboss.org> wrote:
> Interesting discussion for sure.  So what have we done in this regard?  I see (a) an optimisation for marshalling collections of Flags, and the ability to mark unknown flags accordingly (and log this rather than crash).

Exactly.

And as Galder suggested: https://issues.jboss.org/browse/ISPN-2044
thanks!

Sanne

>
>
> On 15 May 2012, at 11:11, Sanne Grinovero wrote:
>
>> On 15 May 2012 09:36, Dan Berindei <dan.berindei at gmail.com> wrote:
>>> On Mon, May 14, 2012 at 2:11 PM, Sanne Grinovero <sanne at infinispan.org> wrote:
>>>> On 11 May 2012 22:30, Dan Berindei <dan.berindei at gmail.com> wrote:
>>>>> On Fri, May 11, 2012 at 7:23 PM, Sanne Grinovero <sanne at infinispan.org> wrote:
>>>>>> On 11 May 2012 16:37, Galder Zamarreño <galder at redhat.com> wrote:
>>>>>>> Quickly tried this and caused no issues:
>>>>>>> https://github.com/galderz/infinispan/commit/7718926e5a4a6763506250362d7bd5cbdccd2931
>>>>>>
>>>>>> Looks good! I'm sure this doesn't solve all future migration problems,
>>>>>> but if we could keep this kind of tricks around it should improve
>>>>>> odds.
>>>>>> IMHO, this is a kind of sensitivity that we should apply across all
>>>>>> areas (not just flags).
>>>>>>
>>>>>
>>>>> Looks interesting, but then you have the opposite problem: not all new
>>>>> flags can be ignored, so you need a way to specify that a new flag is
>>>>> "required". E.g. if we had just added a ZERO_LOCK_ACQUISITION_TIMEOUT
>>>>> flag then the client would be expecting spurious failures, but not
>>>>> extra long delays.
>>>>
>>>> You're right - but did you read the conversation on github? We already
>>>> pointed this out, still I believe we should have an option to ignore
>>>> unknown flags if/when/exclusively we think the migration is safe: we
>>>> should be able to tell after the fact, possibly even write migration
>>>> tests, but can't predict the future.
>>>>
>>>> We could also use a single bit in the externalized representation of a
>>>> flag to mean "safe to be ignored" for any flag, but I'm not sure that
>>>> all cases would be black/white .. more likely it will depend on use
>>>> case or actual configuration.
>>>>
>>>
>>> Right, the flag serialized format should tell the receiver whether it
>>> can be ignored. But if we encode an EnumSet<Flag> as a bitset, each
>>> flag will have only one bit, so we wouldn't have a place for the
>>> "migration safe" bit. I guess we could send two bit sets, one with the
>>> flags and one with the "migration safe" bits...
>>
>> I know I proposed this myself, but I would rather avoid this path..
>> even if some flags are unsafe, they might be safe in the configuration
>> / use case, and we might want to force it to proceed nevertheless. I
>> think we should always have the option to "force it to proceed".
>> In worst case one might want to block all transactions and kill the
>> clients, but you have to propose a way to migrate the data.
>>
>> #On two bitsets being sent..
>>
>> First problem is that we're sending redundant information; second is
>> that it should not be responsibility of the older node (in terms of
>> version of the code being run) to properly understand the newer nodes,
>> it should be the other way around.
>>
>> So rather than sending two bitsets, in this case the responsibility
>> belongs to the *sender*: since it's the one containing the newer code,
>> it might know how to deal with older nodes, or might be patched/fixed
>> to know how to deal with older nodes. It's obviously harder to patch
>> existing nodes! Especially if they are running already (and you don't
>> have Byteman).
>> If we could keep track of the version of other nodes in the View, we
>> could have a translation table plugged in on how to talk to a
>> different version node.. maintaining such a table would be much
>> simpler as it would live independently from the main code, and be
>> fixed and hacked on.. possibly even loaded dynamically. So that's what
>> I would do in the long term.. in the short term, let's just save some
>> bytes using optimal encoding for the EnumSet.
>>
>> Especially of note, if the sender (newer node) knows by looking into
>> the translaction table that an operation/command can't be possibly
>> sent to the older node, it should be easier to find an alternative or
>> workaround.
>> when migrating an existing cluster by a rolling upgrade, for sure the
>> existing application is NOT using features which exist only in the
>> never version.. that's granted and we can take advantage of that.
>>
>>>
>>>>
>>>>>> On a totally different page, why are we serializing Flags one-by-one ?
>>>>>> We mostly need to serialize EnumSets right?
>>>>>> An EnumSet can be encoded by using the bits of a couple of bytes.
>>>>>> Three bytes looks like enough for all our needs.. we could even be
>>>>>> clever and reserve a special Externalizer-ID for the empty set, to
>>>>>> avoid 3 bytes where none are needed.
>>>>>> While currently we need an integer (4 bytes) to encode the header for
>>>>>> "EnumSet", plus (4 bytes header + 1 byte value) * each flag -> a lot.
>>>>>>
>>>>>
>>>>> RiverMarshaller already has an optimization for the empty set:
>>>>> https://github.com/dmlloyd/jboss-marshalling/blob/master/river/src/main/java/org/jboss/marshalling/river/RiverMarshaller.java#L613
>>>>
>>>> That code makes perfectly sense in a general purpose use case, but it
>>>> still needs to serialize the Class definition: we can avoid that, so
>>>> we should!
>>>>
>>>
>>> I think we need to encode the Flag class name as long as we consider
>>> the flag set as just another parameter in the Object[] parameter array
>>> - so the externalizer doesn't know how to distinguish between an
>>> EnumSet<Flag> and an EnumSet<UserEnumType>.
>>
>> Can't we write it using the special-purpose Externalizer explicitly?
>> To read it, it will have it's own Externalizer ID.
>>
>>
>>>>> I'm not sure why it doesn't encode each element as a bit, it might be
>>>>> to keep wire compatibility when the order of values in an enum
>>>>> changes.
>>>>
>>>> That's a safe behaviour, expected for a default use case. But if we
>>>> decide to add the UNKNOWN flag, we could use bitsets.
>>>>
>>>
>>> I think the UNKNOWN flag is useful as long as we parse each flag
>>> independently, but if we start encoding the flags as a bitset (or
>>> two), then we don't need UNKNOWN any more. We would only need to
>>> ensure the position of a flag in the bitset remains constant between
>>> versions, which means either relying on the declaration order and
>>> keeping that constant (no removing or reordering of flags) or adding a
>>> position field to the enum.
>>
>> Right, I only asked for the UNKNOWN for the time being, as we don't
>> have anything better yet.
>> And I guess it's safe to have such a flag for future.. you never know.
>>
>>> Assuming the order in the bitset remains constant from one version to
>>> the next, there are only two scenarios:
>>> a) The added flags are not important, and the receiver can just ignore
>>> them - in which case the parsed flag set will only include the flags
>>> that the receiver knows about.
>>> b) The added flags are important, and the receiver should reject the
>>> command - in which case the externalizer should just throw an
>>> exception instead of reading the flag set and setting the UNKNOWN
>>> flag.
>>>
>>>
>>>>> However, because there is only one EnumSet for all Enum types, a
>>>>> hypothetical EnumSetExternalizer also needs to write the name of the
>>>>> enum class - if we wanted to serialize EnumSet<Flag> in 2 bytes then
>>>>> we'd need to make the transformation in ReplicableCommandExternalizer.
>>>>>
>>>>> Cheers
>>>>> Dan
>>>>>
>>>>> _______________________________________________
>>>>> infinispan-dev mailing list
>>>>> infinispan-dev at lists.jboss.org
>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>>
>>>> _______________________________________________
>>>> infinispan-dev mailing list
>>>> infinispan-dev at lists.jboss.org
>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> infinispan-dev at lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
> --
> Manik Surtani
> manik at jboss.org
> twitter.com/maniksurtani
>
> Lead, Infinispan
> http://www.infinispan.org
>
>
>
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev