[infinispan-dev] ProtoStream and ease of use

Mircea Markus mmarkus at redhat.com
Tue Jul 16 12:57:05 EDT 2013


On 16 Jul 2013, at 09:29, Emmanuel Bernard <emmanuel at hibernate.org> wrote:

> I've been thinking about it overnight.
> 
> ## Protofiles provided by the server
> 
> Let's assume the Protofiles are retrieved from the server for a given
> (class) name.

I don't think class would work as a name as not all the languages have the concept.
I think it should be numeric/int (let's call it protoId), as the type is sent together with the proto-marshalled object to the server on each request; otherwise the server would not know what it needs to deserialize. The protoId needs to be compact (not a string).
In order to understand each other, both the client and the server should be aware of the (protoId, protofile) mapping.
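
A minimal sketch of such a mapping, assuming the protoId travels as a fixed 4-byte prefix in front of the marshalled payload. The class and method names are hypothetical and are not part of ProtoStream or the Hot Rod protocol; a real wire format would more likely use a varint.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these names come from ProtoStream.
final class ProtoIdRegistry {
    // The (protoId, protofile) mapping that client and server must share.
    private final Map<Integer, String> protofileById = new HashMap<>();

    // Binds a protoId to a protofile; rebinding to a different protofile is an error.
    void register(int protoId, String protofile) {
        String previous = protofileById.putIfAbsent(protoId, protofile);
        if (previous != null && !previous.equals(protofile)) {
            throw new IllegalStateException("protoId " + protoId + " is already bound to another protofile");
        }
    }

    // Looks up the protofile needed to deserialize a payload tagged with protoId.
    String protofileFor(int protoId) {
        String protofile = protofileById.get(protoId);
        if (protofile == null) {
            throw new IllegalArgumentException("unknown protoId " + protoId);
        }
        return protofile;
    }

    // Prefixes the marshalled payload with the protoId (assumed layout: 4 bytes, big-endian).
    static byte[] frame(int protoId, byte[] payload) {
        return ByteBuffer.allocate(Integer.BYTES + payload.length)
                .putInt(protoId)
                .put(payload)
                .array();
    }

    // Reads the protoId back from a framed message.
    static int readProtoId(byte[] framed) {
        return ByteBuffer.wrap(framed).getInt();
    }
}
```

The receiving side would call `readProtoId` first and then `protofileFor` to find out how to decode the rest of the bytes.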

> And that if it's not there, the client pushes a protofile
> and registers it. (Is that what you guys had in mind?)

A bit more basic in the first stage: the user defines both the protofile and the (protoId, protofile) mapping and makes it available to both client and server.

> 
> In this case, ProtoStream uses the protofile to create a binding between
> a fieldname and a field id. If the convention based approach is built on
> top of ProtoStream, then, we just need the fieldnames by reflection and
> we are good.
> 
> If the protofile is not present on the server, then we are free to
> create one from either the protostream method calls or from the
> convention based approach.

If the protofile is not present on the server, we also need to build the (protoId, protofile) mapping and make sure that the server and all other clients see the same mapping.
In order to do this, the server should keep a (className, protofile) mapping in parallel to the (protoId, protofile) one. Alternatively, it could replace (protoId, protofile) with (className, protofile), but then the serialisation overhead would increase significantly, since a class name would be sent with every request instead of an int.
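
A rough sketch of the server keeping both mappings side by side, assuming the server hands out protoIds on first registration. All names are hypothetical, and the sketch ignores how the mapping would be replicated across a cluster.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical server-side registry; not an Infinispan API.
final class ServerSchemaRegistry {
    private final Map<String, Integer> protoIdByName = new HashMap<>();
    private final Map<Integer, String> protofileById = new HashMap<>();
    private int nextProtoId = 1;

    // Returns the protoId for typeName, assigning a fresh one on first registration,
    // so that every client asking for the same name sees the same id.
    synchronized int registerIfAbsent(String typeName, String protofile) {
        Integer existing = protoIdByName.get(typeName);
        if (existing != null) {
            return existing;
        }
        int protoId = nextProtoId++;
        protoIdByName.put(typeName, protoId);
        protofileById.put(protoId, protofile);
        return protoId;
    }

    // Resolves the compact id sent on the wire back to its protofile (null if unknown).
    synchronized String protofileFor(int protoId) {
        return protofileById.get(protoId);
    }
}
```

The name is only exchanged once, at registration time; after that each request carries just the int.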

> 
> ## Protofiles are packaged with the application
> 
> Assuming the previous hypothesis is not correct and that you had in mind
> to require the protofile to be present in the application classpath
> (bundled), then we have two options
> 
> ### Have the user write the protofile manually
> 
> (I believe that's ProtoStream's approach as well, the user has to write
> a protofile).

yes

> If we have the protofile, then we are good, we can use a convention or a
> simple annotation binding the domain model classes to their protofile
> resource and we can use fieldnames and optional @PSType.

That relies on the assumption that the field name in the protofile matches the name defined in the class. I think that would work, and it doesn't rely on the order of the fields.
It also requires the domain model and the fields within the class to be under the user's control, i.e. not third-party classes.
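
A small sketch of name-based binding, assuming the protofile has already been parsed into a field name to field number map (the parsing itself is left out). `FieldNameBinder` and its nested `Account` are made-up names used only for illustration.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of ProtoStream.
final class FieldNameBinder {

    // Stand-in domain class; a real one would be the user's own POJO.
    static final class Account {
        private int id;
        private String description;
    }

    // Binds each instance field of 'type' to the field number the protofile declares
    // for the same name. The result does not depend on the order in which reflection
    // returns the fields.
    static Map<Integer, Field> bind(Class<?> type, Map<String, Integer> fieldNumbersByName) {
        Map<Integer, Field> fieldsByNumber = new TreeMap<>();
        for (Field field : type.getDeclaredFields()) {
            int modifiers = field.getModifiers();
            if (field.isSynthetic() || Modifier.isStatic(modifiers) || Modifier.isTransient(modifiers)) {
                continue; // not part of the marshalled state
            }
            Integer number = fieldNumbersByName.get(field.getName());
            if (number == null) {
                throw new IllegalArgumentException(
                        "No protofile field named '" + field.getName() + "' for " + type.getName());
            }
            fieldsByNumber.put(number, field);
        }
        return fieldsByNumber;
    }
}
```

A class field with no protofile field of the same name fails fast here; that is the case where an explicit annotation would be needed.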

> 
> ### Infer the Protofile from the convention
> 
> (that was my original idea as discussed yesterday)
> If a protofile is not present, then we could infer one from the domain
> model structure (plus some annotation hints). Here field ordering
> is a problem if we don't store the inferred protofile.
> 
> JVMs do not guarantee that the reflection provides the fields in a
> deterministic order. From some reading we could use an
> AnnotationProcessor to record the source ordering of fields, this is
> guaranteed. We would need an annotation to mark the classes Infinispan
> is interested in unfortunately
> 
>    @Data
>    class Address {
>        private String street;
>        ...
>    }
> 
> And the AP would react to @Data

interesting

> 
> ## Conclusion
> 
> Totally inferring the schema is more complicated than I initially
> imagined but assuming that the user has to write the protofile, then the
> convention based approach seems to work smoothly enough.

I think Adrian suggested a similar approach in a previous email :-)
I think this is a nice solution and would work for most of our Java users.
It has two limitations:
- it doesn't work for 3rd party domain classes
- it's not API portable between different languages, e.g. we won't have a similar API in C++

> 
> Emmanuel
> 
> On Mon 2013-07-15 20:29, Adrian Nistor wrote:
>> BaseMessage is not mandatory. It's there to 'help' you implement the Message interface, which is optional anyway and trivially simple to implement; it's only a holder for the bag of unknown fields. Ignore that and you lose support for maintaining unknown fields but otherwise everything works fine.
>> 
>> The class field ordering issue is problematic. Besides that, schema evolution might lead to removed fields so we must be able to skip a range of ids easily. I think the sane (but verbose) way to solve this is to make @PSType mandatory for all fields to be marshalled as you suggested, and PSType should have an 'index' or 'id' parameter. But that's mostly goodbye convention then.
>> 
>> Let's also note that we're striving to be cross-language here and on the server side we won't have the user's domain classes available in the classpath (because they might not even be written in Java). So the annotations are not going to help there. We would still need the protobuf file, which could be generated based on the annotated Java code as we discussed earlier today, but ... generators kill kittens :( so I would try to avoid that.
>> 
>> Since we want to be cross-language I think we should go schema-first. So maybe we could have the protobuf hand-written and the Java classes (POJOs) minimally annotated with just the full name of the corresponding protobuf message type. From here everything could work automagically provided that class field names match the field names in the proto file and field types also match, so no value converter needs to be specified. If they don't, then annotate.
>> 
>> Does this seem a little better?
>> 
>> On 07/15/2013 06:57 PM, Emmanuel Bernard wrote:
>> 
>>> ProtoStream is like this
>>> 
>>> https://github.com/infinispan/protostream/blob/master/core/src/test/java/org/infinispan/protostream/domain/Account.java
>>> https://github.com/infinispan/protostream/blob/master/core/src/test/java/org/infinispan/protostream/domain/marshallers/AccountMarshaller.java
>>> (I believe the BaseMessage superclass of Account is optional, not sure).
>>> 
>>> A convention based approach would be like this
>>> 
>>>    package org.infinispan.protostream.domain;
>>>    /**
>>>     * @author ebernard at redhat.com
>>>     */
>>>    public class Account {
>>>       private int id;
>>>       private String description;
>>>       public int getId() {
>>>          return id;
>>>       }
>>>       public void setId(int id) {
>>>          this.id = id;
>>>       }
>>>       public String getDescription() {
>>>          return description;
>>>       }
>>>       public void setDescription(String description) {
>>>          this.description = description;
>>>       }
>>>       @Override
>>>       public String toString() {
>>>          return "Account{" +
>>>                "id=" + id +
>>>                ", description='" + description + '\'' +
>>>                '}';
>>>       }
>>>    }
>>> 
>>> Or let's imagine that we need to make id use a specific protobuf type
>>> 
>>>    package org.infinispan.protostream.domain;
>>>    /**
>>>     * @author ebernard at redhat.com
>>>     */
>>>    public class Account {
>>>       @PSType(UINT32)
>>>       private int id;
>>>       private String description;
>>>       public int getId() {
>>>          return id;
>>>       }
>>>       public void setId(int id) {
>>>          this.id = id;
>>>       }
>>>       public String getDescription() {
>>>          return description;
>>>       }
>>>       public void setDescription(String description) {
>>>          this.description = description;
>>>       }
>>>       @Override
>>>       public String toString() {
>>>          return "Account{" +
>>>                "id=" + id +
>>>                ", description='" + description + '\'' +
>>>                '}';
>>>       }
>>>    }
>>> 
>>> Note that a concern is that field ordering (in the bytecode) is not guaranteed across VMs and compilations, and I believe that is an important factor for ProtoBuf. So somehow we would need a way to express field indexes, which would make the annotation approach more verbose.
>>> 
>>> On Mon 2013-07-15 16:04, Manik Surtani wrote:
>>>> I'm sorry I missed this.  Is there an example of each API somewhere?
>>>> 
>>>> On 15 Jul 2013, at 14:01, Emmanuel Bernard <emmanuel at hibernate.org> wrote:
>>>> 
>>>>> Mircea, Adrian and I had an IRC chat on ProtoStream and ProtoStuff.
>>>>> 
>>>>> check out
>>>>> http://transcripts.jboss.org/channel/irc.freenode.org/%23infinispan/2013/%23infinispan.2013-07-15.log.html
>>>>> starting at 11:00 and finishing at 12:30
>>>>> 
>>>>> A short summary of what has been discussed:
>>>>> 
>>>>> - ProtoStream is a good cross-platform solution but
>>>>> - complicated for the simple pure Java case
>>>>> - encourages a technical superclass (EJB 2 !!!!!)
>>>>> - ProtoStuff convention + annotation based approach
>>>>> https://code.google.com/p/protostuff/wiki/ProtostuffRuntime is nice
>>>>> for the pure Java case
>>>>> - ProtoStuff is many things and has a non ProtoBuf compliant format for
>>>>> cycle ref and polymorphism
>>>>> - ProtoStream supports unknown fields (future version of a schema),
>>>>> ProtoBuf does not
>>>>> - we could build a convention based solution atop ProtoStream
>>>>> - assuming UnknownFieldSet and BaseMessage are optional
>>>>> - using (cross platform) conventions
>>>>> - with metadata to go beyond conventions (annotation, programmatic
>>>>>   API, XML...)
>>>>> 
>>>>>       public long size; //uses fixed64 by default
>>>>>       @PSType(UINT64) long size; //override protobuf type
>>>>> 
>>>>> - Infinispan will/could(?) have a repo of schemas that can be queried
>>>>> - we are talking about how the schema is resolved / generated
>>>>> - what we send through the wire is independent of the Proto*
>>>>> 
>>>>> ProtoBuf vs ProtoStream comparison points:
>>>>> https://gist.github.com/mmarkus/5999646
>>>>> 
>>>>> Emmanuel
>>>>> _______________________________________________
>>>>> infinispan-dev mailing list
>>>>> infinispan-dev at lists.jboss.org
>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>> --
>>>> Manik Surtani
>>>> manik at jboss.org
>>>> twitter.com/maniksurtani
>>>> 
>>>> Platform Architect, JBoss Data Grid
>>>> http://red.ht/data-grid
>> 

Cheers,
-- 
Mircea Markus
Infinispan lead (www.infinispan.org)