Following are the results of my investigation of fh-sync (together with fh-sync-server and fh-sync-js). To start off, I felt that fh-sync was somewhat hard to use and get into. It could definitely use a getting-started guide and more simple examples.

## Overview

**Mobile platforms:** We support Cordova, iOS and Android. Support for React Native is possible and has been tried here: https://github.com/emilioicai/react-native-fh-sync

**Mobile API:** I have mostly looked at the JavaScript API, but at least the Android API looks reasonably similar. The JavaScript API is somewhat non-standard: it doesn't use Promises and relies on older-style success/fail callbacks instead of error-first callbacks. This makes it hard to use in combination with other libraries like async (which expects error-first callbacks) or Promise-based libraries.

The API is built around datasets and offers CRUDL operations. A dataset seems to translate to a MongoDB collection, but it can also be backed by another data store on the backend. A TypeScript typing for the API is available which provides good code-completion support (even when used with regular JavaScript). One problem with the typing is that the return types of the callbacks are not typed (they use TypeScript's `any` type). Having code completion on the return types would help a lot, considering there is no documentation that explains what the different operations (create, list, update) actually return.

## Syncing mechanism to a remote database & other mobile apps

The backend component (fh-sync-server) depends on MongoDB and Redis. If no external storage is used (and possibly also if it is), it creates a collection for each dataset in its Mongo database. I have not been able to make this work, so I can't say whether this storage is sufficient for a setup where no external storage is available, but I would assume that it usually is. The syncing mechanism relies on sync loops on the frontend and backend. The frequency of those loops is configurable.
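Since the client API uses success/failure callbacks rather than Promises, a small adapter can bridge it to Promise-based code. A minimal sketch, assuming a `doList`-style call that takes its arguments first and then a success and a failure callback (the `doList` stub below is illustrative, not the real library):

```javascript
// Wrap a success/fail-callback style function (the convention used by
// fh-sync-js) into one that returns a Promise instead.
function promisify(fn) {
  return (...args) =>
    new Promise((resolve, reject) => {
      fn(...args, resolve, (code, msg) => reject(new Error(code + ': ' + msg)));
    });
}

// Illustrative stand-in for an fh-sync-js style call such as
// doList(datasetId, success, failure) -- NOT the real fh-sync-js API.
function doList(datasetId, success, failure) {
  if (datasetId === 'tasks') {
    success({ '1': { data: { title: 'buy milk' } } });
  } else {
    failure('unknown_dataset', datasetId);
  }
}

const listP = promisify(doList);
listP('tasks').then((records) => console.log(Object.keys(records))); // → [ '1' ]
```

With a wrapper like this, sync calls compose naturally with `async`/`await` and Promise-based libraries, which would go a long way towards making the API feel standard.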
One of the problems of the mechanism is that its hashing algorithm traverses the data and can become slow (also outlined in Wei's analysis). Syncing can take quite a while (10-30 seconds) with the default configuration.

## Offline capability

The JavaScript library has offline capabilities and offers a number of storage strategies. Again, this needs documentation and examples: when to use which strategy, and so on. There was some strange behavior with the 'dom' strategy where I deleted the database but the records reappeared (which could also be seen as a feature). This did not happen with the 'memory' strategy.

## Conflicts

Clients can list collisions (two updates to the same record from two different clients). Custom conflict-resolution strategies can be implemented on the backend by overriding handlers. The default implementation seems to accept a record that causes a conflict and overwrite the existing one.

## Authentication and authorization

fh-sync does not provide built-in solutions for this, but there are ways to achieve it. Basic Keycloak support has been implemented in fh-sync-server. Role-based access could be implemented by overriding the CRUDL handlers on the server side. We might need to do some work to make it possible to pass the HTTP request/response/session objects into those handlers. On the client side, authentication could be achieved either by handling it outside of fh-sync-js (e.g. storing the auth token in an HTTP header) or by overriding a handler that lets you add metadata to records before they are sent to the server.

## Data source integrations

This seems to be one of the strong points of fh-sync (as long as we want to store JSON objects). The backend allows us to override the handlers for all CRUDL operations. This means we usually get the ID/record and can decide what to do with them: we could talk to Postgres, call external REST endpoints, store the records on disk, and so on. One of the problems (as Wei also outlined) is that storing binary data is not well supported in JSON.
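To put a rough number on the binary-over-JSON problem: base64, the usual workaround, expands every 3 input bytes into 4 output characters, so payloads grow by about a third before JSON string escaping is even considered. A quick check in Node.js:

```javascript
// Measure the size overhead of shipping binary data inside JSON via base64.
const binary = Buffer.alloc(3 * 1024 * 1024); // 3 MiB of (zeroed) binary data
const encoded = binary.toString('base64');

const factor = encoded.length / binary.length;
console.log('base64 size factor:', factor); // → base64 size factor: 1.3333333333333333
```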
Base64 encoding could work but increases the payload size. We might want to look into frameworks like Apache Thrift to improve our binary data exchange story.

## Using with an API gateway (3scale)

This should not be a problem since fh-sync is completely HTTP request based.

## Monitoring/Metrics

fh-sync-server already provides a good metrics endpoint.

## Alerting/Hooks

Since there is a Prometheus-compatible metrics endpoint and Prometheus supports alerting, we might want to look into that for alerting support.

## Compatibility with Kubernetes/OpenShift

There should be nothing preventing us from running fh-sync-server on OpenShift (e.g. the process does not need to run as root). APBs and images already exist. We have full control over this.

## Analysis based on Trello cards

**As a developer I want real-time sync with web sockets so that data is synchronized as quickly as possible**

Not met. fh-sync does not support WebSockets at the moment, but given the way the sync protocol works I don't see any reason why we could not make it work over WebSockets. The authentication story might look a bit different in that case (no HTTP headers?). If we want to go for full binary support with Apache Thrift we're also fine: Thrift does support WebSockets.

**As a developer I want to be able to trigger events when specific conditions (e.g. number of pending messages for a specific client) are reached so that I can proactively take action in response to such events.**

Not met. As suggested above, one way to achieve this would be to use Prometheus' built-in alerting capabilities. The drawback is that this would only work if the metrics-apb is also deployed.

**As a developer I would like to be able to protect Sync with an API gateway**

Met. Currently sync only uses HTTP requests, which makes it nicely compatible with 3scale. Even with WebSockets it should be possible (I haven't tested it though).

**As a developer I want to be able to expose metrics data about Sync in Prometheus format from an endpoint of the service.**
Met. fh-sync-server provides this already, and it can be extended if needed.

**As a developer I want to be able to trigger events when specific conditions (e.g. number of pending messages for a specific client) are reached so that I can proactively take action in response to such events.**

Not met. But this could also be achieved by making use of Prometheus alerts and gauges in the metrics endpoint, e.g.
```
sync_pending_messages{clientId=<clientId>} = <really high number>
```
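To illustrate what such a gauge would look like on the wire, here is a small helper that renders a sample in the Prometheus text exposition format (the metric name follows the example above; the helper itself is hypothetical, not part of fh-sync-server):

```javascript
// Render a gauge sample in the Prometheus text exposition format,
// e.g. for a hypothetical per-client pending-message count.
function renderGauge(name, labels, value) {
  const labelStr = Object.entries(labels)
    .map(([key, val]) => `${key}="${val}"`)
    .join(',');
  return `${name}{${labelStr}} ${value}`;
}

console.log(renderGauge('sync_pending_messages', { clientId: 'client-123' }, 42));
// → sync_pending_messages{clientId="client-123"} 42
```

A Prometheus alerting rule could then fire whenever this gauge crosses a threshold for a given client.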
**As a developer I want a pluggable storage engine available on the client side so that I can use the most appropriate storage engine for my needs**

Met. fh-sync does allow overriding the CRUDL handlers. This can be used to implement different storage strategies such as filesystem, relational database, etc.

**As a developer I want the ability to synchronise binary data (e.g. photos, videos, audio files) via the Sync Framework.**

Not met. It might be possible by abusing JSON and base64 encoding, but that's not ideal. We might want to look into frameworks like Apache Thrift for our protocol.

**As a security-conscious developer I want to be able to apply role/permission checks on incoming sync requests so that I can ensure only authorised users are able to access specific datasets.**

Mostly met, but left to the implementation: fh-sync allows overriding all CRUDL handlers, and permission and role checks can be performed there. We might need to do some work to make it possible to pass the HTTP request/response/session objects to those handlers. In the case of WebSockets this could be more complex.

**As a developer I want the ability to encrypt key pieces of data while still being able to query against non-encrypted fields so that I can still query data even when it has encrypted pieces.**

Not met. As far as I am aware there are no built-in encryption capabilities in fh-sync. I'm not sure this belongs in fh-sync at all, but it might be possible to extend fh-sync-server and make it encrypt data with a (user-)provided key before sending it on to an external data store.

**As a developer I want the option of using diff-sync so that only the changed data is synced and not the entire payload.**

Unsure. According to https://github.com/feedhenry/fh-sync/blob/master/docs/sync_protocol.adoc (point 8) this is at least partly the case.

**As a developer I want to be able to choose where on the server side I store binary data that is being synced.**

Not met, since we don't provide a good binary story at the moment.
Otherwise this sounds like it could be achieved by overriding the handlers. Or do we want to make it an easier option for users?

**As a developer I want to be able to define the structure of my data sets so that I can use structured data stores on the client and provide the most efficient sync experience possible**

Not met. We don't have that at the moment. GraphQL could be an option if it allows us to form a query that contains all the fields that should be synced and leaves out the ones we don't care about. A rather large piece of work, I'd assume.

**As a sync customer I want to be able to review results of performance / load / stress / soak tests so that I know what levels of performance are possible and with what resource requirements (CPU / RAM etc.).**

Not met. As far as I am aware we don't have an easy way to review the data, but there might be some scenarios prepared by the QE team from the time when Sync was being prepared for production use.

**As a developer I want client-side permission checks on token validity and roles/permissions so that I can fail fast on the client side without the need for a network request.**

Not met. The fh-sync client libraries don't provide this at the moment. The JS client has a handler that is executed before a network request and can be overridden, but this still needs work.

**As a developer I want to be able to choose the collision handling strategy to use for data collisions so that I can use the most appropriate strategy for my dataset.**

Not met. At the moment custom strategies can be implemented in the backend by overriding 'handleCollision', but we don't provide an easy way for users to pick a predefined strategy.

**As a developer who does not trust the Internet and the powers that run it, I want to be able to set up a distributed peer-to-peer sync mesh so that I can achieve data synchronisation between clients without the need for a central server.**

Not met. At the moment the sync clients require a backend to work.
I'm not sure how we could achieve this in the JS client. On Android and iOS we could make use of UDP discovery on the local network and hole punching over the Internet, although I don't know how well that would work on a phone.

**As a developer I want to be able to set upper limits on payload sizes (record size, dataset size, binary size) so that I can control the volume of data that is being synced.**

Not met. As far as I am aware we don't provide that option (at least I didn't see it defined in SyncOptions).

**As a developer I want to be able to consume datasets in "pages" - i.e. not get the entire dataset up front so that I can control the number of records returned.**

Not met. As far as I am aware doList does not take any options and does not provide pagination capabilities.

**As a developer I want to be able to control sync frequencies on the client and server so that I can co-ordinate syncing and remove unnecessary sync loops.**

Met. It looks like fh-sync allows us to define the frequency of the sync loop on both the backend and the clients.

**As a developer I want clear instructions (e.g. docs, examples, videos) on how to integrate the Sync SDK into my apps so that I can quickly get the SDK working and connect to Sync.**

Not met. A big pain point IMO. We need a central place for docs, a clear structure, a getting-started guide, and examples that guide the user from a minimal demo to a complex setup with custom storage.

**As a developer I want single sign-on between OpenShift/MCP and all of the mobile services so that I only need to sign in once and am authenticated everywhere.**

Not met. But this is a problem we need to solve for other services as well (SDK metrics, etc.).
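Several of the cards above come down to overriding server-side handlers. As an illustration of what a predefined, pluggable collision strategy could look like, here is a minimal last-write-wins resolver of the kind one might plug into an overridden 'handleCollision'; the function name, arguments, and record shape are hypothetical, not the fh-sync API:

```javascript
// Hypothetical last-write-wins collision resolver: given the record state
// before the conflicting update ("pre"), the incoming conflicting update
// ("post") and their timestamps, keep whichever write happened last.
function lastWriteWins(pre, post, preTimestamp, postTimestamp) {
  return postTimestamp >= preTimestamp ? post : pre;
}

const serverRecord = { title: 'buy milk', updated: 100 };
const clientUpdate = { title: 'buy oat milk', updated: 200 };

const winner = lastWriteWins(serverRecord, clientUpdate,
  serverRecord.updated, clientUpdate.updated);
console.log(winner.title); // → buy oat milk
```

Shipping a handful of named strategies like this (last-write-wins, server-wins, client-wins) and letting users pick one by name would address the "choose the collision handling strategy" card without requiring everyone to write a custom handler.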