[infinispan-dev] Stream encoding of Flags and future compatibility

Galder Zamarreño galder at redhat.com
Wed May 16 06:21:12 EDT 2012


On May 15, 2012, at 10:36 AM, Dan Berindei wrote:

> On Mon, May 14, 2012 at 2:11 PM, Sanne Grinovero <sanne at infinispan.org> wrote:
>> On 11 May 2012 22:30, Dan Berindei <dan.berindei at gmail.com> wrote:
>>> On Fri, May 11, 2012 at 7:23 PM, Sanne Grinovero <sanne at infinispan.org> wrote:
>>>> On 11 May 2012 16:37, Galder Zamarreño <galder at redhat.com> wrote:
>>>>> Quickly tried this and caused no issues:
>>>>> https://github.com/galderz/infinispan/commit/7718926e5a4a6763506250362d7bd5cbdccd2931
>>>> 
>>>> Looks good! I'm sure this doesn't solve all future migration problems,
>>>> but if we could keep this kind of tricks around it should improve
>>>> odds.
>>>> IMHO, this is a kind of sensitivity that we should apply across all
>>>> areas (not just flags).
>>>> 
>>> 
>>> Looks interesting, but then you have the opposite problem: not all new
>>> flags can be ignored, so you need a way to specify that a new flag is
>>> "required". E.g. if we had just added a ZERO_LOCK_ACQUISITION_TIMEOUT
>>> flag then the client would be expecting spurious failures, but not
>>> extra long delays.
>> 
>> You're right - but did you read the conversation on github? We already
>> pointed this out, still I believe we should have an option to ignore
>> unknown flags if/when/exclusively we think the migration is safe: we
>> should be able to tell after the fact, possibly even write migration
>> tests, but can't predict the future.
>> 
>> We could also use a single bit in the externalized representation of a
>> flag to mean "safe to be ignored" for any flag, but I'm not sure that
>> all cases would be black/white .. more likely it will depend on use
>> case or actual configuration.
>> 
> 
> Right, the flag serialized format should tell the receiver whether it
> can be ignored. But if we encode an EnumSet<Flag> as a bitset, each
> flag will have only one bit, so we wouldn't have a place for the
> "migration safe" bit. I guess we could send two bit sets, one with the
> flags and one with the "migration safe" bits…

Hmmmm, not sure..

> 
>> 
>>>> On a totally different page, why are we serializing Flags one-by-one ?
>>>> We mostly need to serialize EnumSets right?
>>>> An EnumSet can be encoded by using the bits of a couple of bytes.
>>>> Three bytes looks like enough for all our needs.. we could even be
>>>> clever and reserve a special Externalizer-ID for the empty set, to
>>>> avoid 3 bytes where none are needed.
>>>> While currently we need an integer (4 bytes) to encode the header for
>>>> "EnumSet", plus (4 bytes header + 1 byte value) * each flag -> a lot.
>>>> 
>>> 
>>> RiverMarshaller already has an optimization for the empty set:
>>> https://github.com/dmlloyd/jboss-marshalling/blob/master/river/src/main/java/org/jboss/marshalling/river/RiverMarshaller.java#L613
>> 
>> That code makes perfectly sense in a general purpose use case, but it
>> still needs to serialize the Class definition: we can avoid that, so
>> we should!
>> 
> 
> I think we need to encode the Flag class name as long as we consider
> the flag set as just another parameter in the Object[] parameter array
> - so the externalizer doesn't know how to distinguish between an
> EnumSet<Flag> and an EnumSet<UserEnumType>.

-1 to encoding class name. You don't even know it at runtime cos of lack of reified generics.

Easiest is to wrap EnumSet<Flag> around a custom type and register it in ExternalizerTable.

> 
>>> I'm not sure why it doesn't encode each element as a bit, it might be
>>> to keep wire compatibility when the order of values in an enum
>>> changes.
>> 
>> That's a safe behaviour, expected for a default use case. But if we
>> decide to add the UNKNOWN flag, we could use bitsets.
>> 
> 
> I think the UNKNOWN flag is useful as long as we parse each flag
> independently, but if we start encoding the flags as a bitset (or
> two), then we don't need UNKNOWN any more. We would only need to
> ensure the position of a flag in the bitset remains constant between
> versions, which means either relying on the declaration order and
> keeping that constant (no removing or reordering of flags) or adding a
> position field to the enum.
> 
> Assuming the order in the bitset remains constant from one version to
> the next, there are only two scenarios:
> a) The added flags are not important, and the receiver can just ignore
> them - in which case the parsed flag set will only include the flags
> that the receiver knows about.

Fine

> b) The added flags are important, and the receiver should reject the
> command - in which case the externalizer should just throw an
> exception instead of reading the flag set and setting the UNKNOWN
> flag.

Why should the receiver reject the command? It might work in non-optimal way, or ignore certain stuff, but I don't see a case where it should reject.

Rejecting would in fact stop normal rolling upgrades.

> 
> 
>>> However, because there is only one EnumSet for all Enum types, a
>>> hypothetical EnumSetExternalizer also needs to write the name of the
>>> enum class - if we wanted to serialize EnumSet<Flag> in 2 bytes then
>>> we'd need to make the transformation in ReplicableCommandExternalizer.
>>> 
>>> Cheers
>>> Dan
>>> 
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> infinispan-dev at lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> 
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> 
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache




More information about the infinispan-dev mailing list