Good morning Summers, I was reviewing the whole document and I think it is a great start.
Even so, it looked a bit confusing to me, so here is my attempt to reorganize it (it is
just a suggestion):
Another change I would suggest is adding a “revision” field, as Qmx already
suggested, alongside your checksum idea. I would also add a “signature” field
to make sure that the data has not been corrupted or tampered with.
Thoughts?
--
abstractj
On January 7, 2014 at 12:21:35 PM, Summers Pittman (supittma(a)redhat.com) wrote:
I've updated the sync spec on the data-sync branch on aerogear.org with
what qmx posted yesterday and some ideas I had as well. If I don't get
any tomatoes I will try to see what a POC on Android looks like this
afternoon.
Sync doc follows.
# Status: Experimental
# AeroGear Data Sync
## basics?
Since we've been catering to the enterprise market, this essentially means
we need to get the __boring__ stuff right first, then move over to the
__shiny__ stuff, like realtime data sync, update policies & friends.
### data model
For starters, I think that the most important thing that needs to be
agreed upon is the data model and the atomic operations around it. As
previously discussed, I really like CouchDB's data model -- and hate erlang ;)
`{_id:, content:, rev:}`
#### JS
Well, it's JSON, it _Just Works_™
#### Java
I didn't want to pick on Java, but its fame forces me to. First
stab (courtesy of our friend Dan Bevenius):
public interface Document {
    // Accessors rather than fields: interface fields are implicitly constants.
    String getId();
    String getContent();
    String getRev();
}
We naturally want to kick this up a notch, and use objects instead of plain
strings:
public interface Document<ID, T> {
    ID getId();
    T getContent();
    String getRev();
}
In this case, we can use the convention requiring that `T` is any
**object serializable to JSON**. `ID` is a convenience shorthand, since
it's a **GUID/UUID**. I think this key isn't necessarily a natural key
(it can be a surrogate key instead).
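
As a rough illustration of the convention above, here is a minimal sketch of an immutable value type following the `{_id, content, rev}` shape. The names `SimpleDocument`, `create` and `withContent`, the use of `java.util.UUID` as the id type, and the decimal-counter revision format are all assumptions made for this example; none of them come from the spec.

```java
import java.util.Objects;
import java.util.UUID;

// Hypothetical sketch: an immutable document with a surrogate UUID key.
// T is assumed to be any object that can be serialized to JSON.
final class SimpleDocument<T> {
    private final UUID id;     // surrogate key, not derived from the content
    private final T content;
    private final String rev;  // assumed format: a decimal counter ("1", "2", ...)

    SimpleDocument(UUID id, T content, String rev) {
        this.id = Objects.requireNonNull(id);
        this.content = content;
        this.rev = Objects.requireNonNull(rev);
    }

    // Creates a brand new document with a random id and revision "1".
    static <T> SimpleDocument<T> create(T content) {
        return new SimpleDocument<>(UUID.randomUUID(), content, "1");
    }

    // Returns a copy with new content and the revision counter bumped by one.
    SimpleDocument<T> withContent(T newContent) {
        String nextRev = Integer.toString(Integer.parseInt(rev) + 1);
        return new SimpleDocument<>(id, newContent, nextRev);
    }

    UUID getId() { return id; }
    T getContent() { return content; }
    String getRev() { return rev; }
}
```

Keeping the type immutable means every change produces a new revision instead of mutating the old one, which should make it easier to tell which revision a given change was based on.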
#### Objective-C
volunteers needed ;)
### Transactions
These are the most basic parts of sync I can think of that our system
should be able to do/manage. Our internal representation of the client
documents and collections should make implementing this automatically
and without user intervention as simple as possible.
* Detect Change
When a user changes her local data, the system should note the
change and generate a sync message to send to the server. This can be
done automatically or manually but SHOULD be done automatically.
* Send update
When a sync message is ready to be sent, and the system allows for
it to be sent (network available, not in a blackout window from
exponential backoff, etc.), then the sync message should be sent.
Sending automatically should be the default, but the developer can
override this behavior.
* Receive Update
When a client updates its data and successfully syncs to the remote
server, the remote server will notify all of the relevant clients. The
client must automatically and without user intervention receive this
update and either act on it or store it for later processing.
* Apply Update
Once a client application has an update message from the server, it
can apply the message to its local data. This should be done
automatically as part of receiving the update, but it may be done
manually or may be delayed and automatically executed later.
* Detect Conflict
When applying an update fails, the system must detect this. The
system will provide state to the application and/or the user to handle
the conflict. The user MUST NOT have to check for conflicts on her own.
* Resolve Conflict
There must be a mechanism for resolving a conflict. This CAN be
done automatically using default resolvers provided by AeroGear, by
using a resolver provided by the developer/user, or by the app user
selecting the correct merge. This will possibly generate a new sync
message.
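
The Detect Conflict and Resolve Conflict steps above could be sketched roughly as follows. This assumes a conflict is detected by comparing the revision the client based its change on with the store's current revision (in the spirit of CouchDB's `rev`). `ConflictResolver`, `SyncStore` and `applyUpdate` are hypothetical names, integer revisions are an assumption, and the in-memory maps only stand in for real storage.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: revision-based conflict detection with a pluggable resolver.
interface ConflictResolver {
    // Returns the content that should win when both sides changed.
    // serverContent is null if the store has no copy of the document.
    String resolve(String serverContent, String clientContent);
}

final class SyncStore {
    private final Map<String, String> contents = new HashMap<>();
    private final Map<String, Integer> revisions = new HashMap<>();
    private final ConflictResolver resolver;

    SyncStore(ConflictResolver resolver) { this.resolver = resolver; }

    // Applies an update that was made against baseRev and returns the new
    // revision number. A document that does not exist yet is at revision 0.
    int applyUpdate(String id, int baseRev, String newContent) {
        int currentRev = revisions.getOrDefault(id, 0);
        String merged = newContent;
        if (baseRev != currentRev) {
            // Detect Conflict: the client edited a stale revision.
            // Resolve Conflict: delegate to the configured resolver.
            merged = resolver.resolve(contents.get(id), newContent);
        }
        contents.put(id, merged);
        revisions.put(id, currentRev + 1);
        return currentRev + 1;
    }

    String get(String id) { return contents.get(id); }
}
```

With this shape, a "server wins" default would be `new SyncStore((server, client) -> server)`, and a developer-provided resolver is just another implementation of the same interface.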
### API levels
As soon as we have a rough data-model defined, we can start dabbling
around different API levels to be served:
(parts **I think** are potentially deliverable for a 1.0)
- level 0: explodes when there's a conflict
- level 1: semi-automatic conflict resolution via something like
Google's diff-match-patch
- level 2: business rules determine who wins a conflicting update
(supervisor wins over normal user)
(parts **I think** are potentially deliverable for a 2.0)
- level 3: real-time updates via diff-match-patch
- level 4: real-time updates via OT/EC
All those proposed API operations should be serializable, meaning I can
potentially keep making changes offline and then just replay them to the
server when online.
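
One way to read the "serializable operations" requirement is as an ordered queue of pending changes that gets replayed once the server is reachable again. The sketch below assumes exactly that. `PendingChange` and `OfflineQueue` are hypothetical names, and the `Predicate` passed to `replay` stands in for whatever transport ends up being used.

```java
import java.io.Serializable;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Predicate;

// Hypothetical sketch: queue changes while offline, replay them in order later.
final class PendingChange implements Serializable {
    private static final long serialVersionUID = 1L;
    final String documentId;
    final String data;  // a whole document or a diff, as a string

    PendingChange(String documentId, String data) {
        this.documentId = documentId;
        this.data = data;
    }
}

final class OfflineQueue {
    private final Deque<PendingChange> pending = new ArrayDeque<>();

    void record(PendingChange change) { pending.addLast(change); }

    int size() { return pending.size(); }

    // Sends changes oldest-first. The sender returns false when a send fails;
    // replay then stops, so the failed change and everything after it stay
    // queued. Returns the number of changes sent successfully.
    int replay(Predicate<PendingChange> sender) {
        int sent = 0;
        while (!pending.isEmpty() && sender.test(pending.peekFirst())) {
            pending.removeFirst();
            sent++;
        }
        return sent;
    }
}
```

Stopping at the first failure keeps the changes in their original order, which matters if later changes depend on earlier ones.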
### transport
Since we know about the future-looking ideas on v2.0, it would be really
nice for us to specify a very simple/dumb JSON-based protocol for those
change messages. Something that could accommodate both the full document
updates and the OT/EC incremental bits too. I have no ideas on this, tbh.
#### Strawman - Summers
{id : Object, data : String, checksum: long}
**id** :
This is the global identifier for the object. This field is optional.
**data** :
This is the sync data for the application. It may be a diff, a whole
object, etc. This field is required.
**checksum** :
This is the client's idea of what a known good sync will look like.
If, post merge, the server's checksum and the client's checksum do not
match, then the client is out of sync and must resync and handle the
conflict.
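
The checksum comparison could look roughly like this. The strawman does not say which algorithm produces the checksum; this sketch assumes CRC32 over the UTF-8 bytes of the content, simply because it fits in a `long`. `SyncMessage`, `checksumOf` and `isInSync` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical sketch of the strawman message {id, data, checksum}.
final class SyncMessage {
    final Object id;      // optional global identifier, may be null
    final String data;    // required: a diff, a whole object, etc.
    final long checksum;  // client's expected checksum of the merged result

    SyncMessage(Object id, String data, long checksum) {
        this.id = id;
        this.data = data;
        this.checksum = checksum;
    }

    // Assumption: CRC32 over UTF-8 bytes; the spec does not name an algorithm.
    static long checksumOf(String content) {
        CRC32 crc = new CRC32();
        crc.update(content.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    // True when the server's post-merge content matches what the client expected.
    boolean isInSync(String serverMergedContent) {
        return checksumOf(serverMergedContent) == checksum;
    }
}
```

CRC32 only catches accidental divergence. It would not protect against tampering, so it is not a substitute for a signature.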
## Appendix Use Cases:
Here are a few contrived use cases that we may want to keep in mind.
1. Legacy Bug Trackers From Hell
a. It is a webapp written in COBOL, no one will ever EVER update or
change the code
b. It has TONS of legacy but important data
c. It has TONS of users
d. It only has a few transactions per day, all creating and updating
bug reports
e. Multiple users can edit the same report
2. Slacker Gallery
a. Each user has multiple galleries, each gallery has multiple photos
b. A gallery has only one user, but the user may be on multiple devices
c. Galleries may be renamed, created, and deleted
d. Photos may only be created or deleted. Photos also have metadata
which may be updated, but its creation and deletion are tied to the
Photo object.
3. Dropbox clone
a. A folder of files may be shared among users
b. There is a size limit to files and how much storage may be used
per folder
c. Files are not updated. If there is a new file, there is an atomic
delete and create operation
4. Email client
a. This is an AG-controller which accesses a mail account.
b. There are mobile offline and sync enabled clients which connect
to this controller.
5. Google Docs clone
a. Operational Transform out the wazoo
b. What would the server need?
c. What would the client need?
6. Building Inspector app
Building inspector system - we have mobile apps that store relevant info
and are bound to be accessed in places where we won't have any kind of
connection, or only a very poor signal.
You can have several inspectors screening the same building
simultaneously.
Let's say Agnes and Joe are doing the fire extinguisher
inspection in a new hospital building. Technically each fire
extinguisher has its own identifier and can be an independent document.
In this case we would have no conflict happening.
Now they start finding expired fire extinguishers and start to add them
to the report. This report could potentially have two divergent lists of
fire extinguishers to be replenished/revalidated, as well as divergent
values for the building's compliance status.
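
To make the divergent-lists case concrete, here is a minimal sketch of one possible resolution policy: take the union of both inspectors' lists. This is only an assumption for illustration. `ReportMerge` and `mergeExpired` are hypothetical names, and a union cannot express removals or settle a single-valued field like the compliance status.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: merge two divergent lists of extinguisher ids by union.
final class ReportMerge {
    // Keeps first-seen order and drops duplicates reported by both inspectors.
    static List<String> mergeExpired(List<String> fromAgnes, List<String> fromJoe) {
        Set<String> merged = new LinkedHashSet<>(fromAgnes);
        merged.addAll(fromJoe);
        return new ArrayList<>(merged);
    }
}
```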
7. Census App
Census system - we have mobile apps focused on offline data collection.
We have the previous year's info that needs to be updated on the server.
The interviewee needs to take a call, then asks the interviewer to come
back later. This results in two sets of changes for the same document,
stacked together, which should work flawlessly.
# Appendix Reference (Open Source) Products:
- Wave-in-a-box
- CouchDB
- Google Drive RealtimeAPI
- [diff-match-patch algorithm](http://code.google.com/p/google-diff-match-patch/)
- [Summers' Realtime Sync Demo](http://www.youtube.com/watch?v=WEkZGbVk4Lc)
- [Summers' Devnexus Sync Demo](https://plus.google.com/103442292643366117394/posts/HGVHwtPArPW)
- Google Android Sync Architecture