[Hawkular-dev] Reform Inventory REST API

Mon May 2 16:13:16 EDT 2016

Hi everyone,

tl;dr
Inventory's REST API is ambiguous and doesn't reflect the generic structure of 
the inventory well. Let's change that before it's too late!

The access patterns in inventory's REST API predate the concept of the 
canonical path that we now use extensively throughout the model to uniquely 
identify entities. Because of that we're running into various issues with the 
REST API, from slight inconveniences to outright breakage due to ambiguous 
URLs.

Here I propose a reformed REST API centered around the canonical paths. It 
should have the same expressive power as the original REST API but should not 
suffer from the ambiguities and should be much more cohesive and "logical".
The only thing that was possible using 1 call in the old API that will require 
2 calls in the new API is disassociation of 2 entities (e.g. disassociating a 
metric from a resource). These operations are IMHO rather rare so I am not too 
worried about this.

I'd like to know your opinions on the new API:

URIs below are defined in EBNF,
CP stands for canonical path of some entity,
REL stands for a name of some relationship (pre-defined or user defined)

== sync endpoint
URI = "/", "sync", "/", CP;

This is the same as it is currently. There is a parallel thread from last 
week about the evolution of sync that will be addressed, too.

== bulk endpoint
URI = "/", "bulk";
might go away - we have sync doing almost the same thing

== GET URLs

This is a little bit complex, but what it does is enhance the canonical path 
as currently known with the ability to define filters on each path 
progression step.
Basically, this is an attempt to express a graph traversal using a URL.

ANY = ? just a URI path-escaped string representing an entity ID or 
relationship name ? ;
FILTER_NAME = "type" | "id" | "name" | "cp" | "identical" | "propertyName" | 
"propertyValue" ;
FILTER = FILTER_NAME , [ "=", ANY ] ; (* value is not required for the 
"identical" filter *)
PATH_SEGMENT_TYPE = "t" | "e" | "mp" | "f" | "rt" | "mt" | "ot" | "r" | "m" | 
"d" ;
PATH_SEGMENT_ID = ANY ;
PATH_STEP = "/", PATH_SEGMENT_TYPE, ";", PATH_SEGMENT_ID, { ";", FILTER } ;
WELL_KNOWN_REL = "contains" | "defines" | "incorporates" | "isParentOf" ;
ANY_REL = ANY ;
DIR_FILTER = ";", ( "in" | "out" | "both" ) ;
REL_FILTER = ( "propertyName" | "propertyValue" ), "=", ANY ;
REL_STEP = "/rl;", ( WELL_KNOWN_REL | ANY_REL ), [ DIR_FILTER ], 
{ ";", REL_FILTER } ;
FILTER_STEP = FILTER, { ";", FILTER } ;
RETURN_TYPE = "" | "treeHash" ;
PATH_END = ( PATH_STEP | FILTER_STEP ), RETURN_TYPE ;
URI = { ( PATH_STEP | FILTER_STEP ), [ REL_STEP ] }, [ PATH_END ];
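To make the grammar a bit more tangible, here is a rough Python sketch of how a 
traversal path could be split into steps. The function `parse_traversal` and 
the tuple shapes it returns are made up for illustration and are not part of 
inventory. It expects only the path (no query string) and ignores the 
trailing-slash distinction used in the examples.

```python
from urllib.parse import unquote

SEGMENT_TYPES = {"t", "e", "mp", "f", "rt", "mt", "ot", "r", "m", "d"}
DIRECTIONS = {"in", "out", "both"}


def _filter(text):
    # "name=My%20EAP" -> ("name", "My EAP"); "identical" -> ("identical", None)
    name, sep, value = text.partition("=")
    return (name, unquote(value) if sep else None)


def parse_traversal(path):
    """Split a traversal path into steps (hypothetical helper).

    Each step is one of:
      ("path", segment_type, id, filters)
      ("rel", name, direction_or_None, filters)
      ("filter", filters)
      ("return", "treeHash")
    """
    steps = []
    for raw in path.split("/"):
        if not raw:
            continue  # leading or trailing slash
        parts = raw.split(";")
        head, rest = parts[0], parts[1:]
        if head == "rl" and rest:
            # REL_STEP: rl;<name>[;in|out|both]{;<filter>}
            direction = None
            if len(rest) > 1 and rest[1] in DIRECTIONS:
                direction = rest[1]
                filters = rest[2:]
            else:
                filters = rest[1:]
            steps.append(("rel", unquote(rest[0]), direction,
                          [_filter(f) for f in filters]))
        elif head in SEGMENT_TYPES and rest:
            # PATH_STEP: <type>;<id>{;<filter>}
            steps.append(("path", head, unquote(rest[0]),
                          [_filter(f) for f in rest[1:]]))
        elif raw == "treeHash":
            steps.append(("return", "treeHash"))
        else:
            # FILTER_STEP: <filter>{;<filter>}
            steps.append(("filter", [_filter(f) for f in parts]))
    return steps
```

For instance, `parse_traversal("/f;feed_id/rl;contains;out/type=rt")` yields a 
path step, a relationship step and a filter step, in that order.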

The "identical" bit is currently not present in the REST API but is in the 
Java API. What it'd do here is that it would "widen" the start of the query 
from the one entity specified by the CP to all entities that are identical to 
it according to the identity hash rules (same id, same significant structure).

This is useful for scenarios like "querying all EAPs". The way this would work 
is that you'd have the resource type that you expect defined at a global 
level, possibly contained in a metadata pack. You'd then look for resources 
that are defined by the resource types identical to yours. Because feeds are 
free to (re-)define their resource types, this would match resources from 
feeds that have types identical to the global one. Note that there is no 
special relationship needed between the types - inventory figures this out 
automagically. This way we loosen the requirement for synchronizing the 
updates to the types defined by feeds and the user at the cost of "eventually 
consistent behavior" once the parties upgrade at their own pace.

=== Examples

==== Return a tenant

  /

==== Access Entity By Canonical Path

  /t;tenant_id/f;feed_id/rt;resourceType_id

This is actually equivalent to:

  /t;tenant_id/rl;contains;out/f;feed_id/rl;contains;out/rt;resourceType_id

which is no longer a canonical path but showcases how we declare "hops" over
specific relationships. "rl;contains;out" translates to "relationship with
name contains in the outgoing direction" and is implicit, if no other "hop"
between entities is specified.
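
The implicit hop rule can be sketched in a few lines of Python. This is only 
an illustration of the rule as described above; `expand_implicit_hops` is a 
hypothetical name, and the sketch drops any trailing slash and expects the 
path without a query string.

```python
def expand_implicit_hops(path):
    """Insert "rl;contains;out" wherever two steps are not separated by a hop."""
    segments = [s for s in path.split("/") if s]
    expanded = []
    previous_was_hop = True  # nothing precedes the first segment
    for segment in segments:
        if segment == "treeHash":
            # return type marker, not a traversal step
            expanded.append(segment)
            continue
        is_hop = segment.startswith("rl;")
        if not is_hop and not previous_was_hop:
            expanded.append("rl;contains;out")
        expanded.append(segment)
        previous_was_hop = is_hop
    return "/" + "/".join(expanded)
```

Applied to `/t;tenant_id/f;feed_id/rt;resourceType_id` this produces the 
longer form shown above, while a path that already spells out its hops is 
returned unchanged.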

To return the tree hash of the entity instead of the entity itself, one can use:

  /f;feed_id/r;resource/treeHash

Note that the tenant in the path is optional because it can be deduced from 
the authorization details.

==== Accessing Targets of Relationships

  /f;feed_id/r;resource_id/rl;incorporates/type=metric

This is equivalent to the current 
`/feeds/feed_id/resources/resource_id/metrics`.

  /f;feed_id/r;resource_id/rl;isParentOf/

(notice the trailing slash)
"give me all children of resource with id 'resource_id'".

==== Accessing Relationships

  /f;feed_id/rl;contains

(notice the lack of trailing slash)
"find all the 'contains' relationships going out of the feed with id 
'feed_id'."

To access a single relationship with known id:

  /rl;relationship_id

==== More Complex Example

  /f;feed_id/type=rt;name=My%20EAP;identical/rl;defines/type=resource/rl;isParentOf/type=resource?recursive=true

"Get the feed with id 'feed_id', find all resource types called 'My EAP' that 
it contains and all other resource types that are identical to them. Then 
find all the resources that those resource types define and find all the 
resources (recursively) that those resources are parents of."
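
A client would probably want to assemble such URLs from parts rather than by 
hand, mainly to get the escaping right. Below is a minimal Python sketch of 
that; all helper names (`entity`, `filter_step`, `hop`, `build_url`) are 
hypothetical, and I'm assuming that "/", ";" and "=" inside IDs and values 
must be percent-encoded.

```python
from urllib.parse import quote, urlencode


def esc(value):
    # Escape everything with a meaning in the URL syntax, including "/", ";", "=".
    return quote(value, safe="")


def entity(segment_type, entity_id):
    # PATH_STEP without filters, e.g. "f;feed_id"
    return f"{segment_type};{esc(entity_id)}"


def filter_step(*filters):
    # Each filter is a bare name ("identical") or a (name, value) pair.
    parts = []
    for f in filters:
        if isinstance(f, tuple):
            parts.append(f"{f[0]}={esc(f[1])}")
        else:
            parts.append(f)
    return ";".join(parts)


def hop(name, direction=None):
    # REL_STEP without filters, e.g. "rl;contains;out"
    return "rl;" + esc(name) + (f";{direction}" if direction else "")


def build_url(steps, **query):
    url = "/" + "/".join(steps)
    return url + ("?" + urlencode(query) if query else "")


url = build_url(
    [
        entity("f", "feed_id"),
        filter_step(("type", "rt"), ("name", "My EAP"), "identical"),
        hop("defines"),
        filter_step(("type", "resource")),
        hop("isParentOf"),
        filter_step(("type", "resource")),
    ],
    recursive="true",
)
```

Here `url` ends up being the same string as the example URL above.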

=== Query Parameters

==== Paging `per_page`, `page`, `sort`, `order`
Paging is very expensive, because it implies fully iterating through the 
result set (to be able to sort or determine the total). We may think about 
some kind of server-side "live result set" that would hold the results on the 
server and be accessed using some kind of token (with a TTL). This is how 
neo4j does it and would avoid having to fetch the full result set on each 
paged request.
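
To illustrate the "live result set" idea (and only the idea, nothing like this 
exists in inventory), here is a small in-memory Python sketch. The class name, 
the zero-based `page` index and the TTL refresh on access are all my 
assumptions.

```python
import time
import uuid


class LiveResultSets:
    """Keeps fully computed result sets on the server, addressed by a token."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._sets = {}  # token -> (expiry, results)

    def open(self, results):
        # Run the (expensive) query once and remember its results.
        token = uuid.uuid4().hex
        self._sets[token] = (self._clock() + self._ttl, list(results))
        return token

    def page(self, token, page, per_page):
        # Serve a page from the remembered results; page index is zero-based.
        self._evict_expired()
        if token not in self._sets:
            raise KeyError("unknown or expired token")
        _, results = self._sets[token]
        # Assumption: each access extends the lifetime of the result set.
        self._sets[token] = (self._clock() + self._ttl, results)
        start = page * per_page
        return results[start:start + per_page], len(results)

    def _evict_expired(self):
        now = self._clock()
        expired = [t for t, (expiry, _) in self._sets.items() if expiry <= now]
        for token in expired:
            del self._sets[token]
```

The first paged request would call `open()` and hand the token back to the 
client; subsequent requests pass the token and only slice the stored list.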

==== `recursive`
This causes the last hop (relationship + entity filter) to be recursively
searched and added to the results. TinkerPop defines a more generic concept of
"loop" using a label as a marker of the start of the "recursive hop" but I 
don't think we need to be that powerful in a REST interface. Advanced users 
may want to use the `query` endpoint with the full power of a Gremlin query.
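
A toy Python model of what `recursive` would mean for the last hop, assuming 
the graph is just a dict from (source, relationship name) to a list of 
targets. The function `follow` and this representation are made up for the 
example, and the entity filter part of the hop is left out.

```python
def follow(graph, starts, relationship, recursive=False):
    """Follow `relationship` from `starts`; repeat from the results if recursive."""
    results = []
    seen = set()  # guards against cycles and duplicate results
    frontier = list(starts)
    while frontier:
        next_frontier = []
        for source in frontier:
            for target in graph.get((source, relationship), []):
                if target not in seen:
                    seen.add(target)
                    results.append(target)
                    next_frontier.append(target)
        # Without `recursive` the hop is applied exactly once.
        frontier = next_frontier if recursive else []
    return results


graph = {
    ("r1", "isParentOf"): ["r2", "r3"],
    ("r2", "isParentOf"): ["r4"],
}
children = follow(graph, ["r1"], "isParentOf")
descendants = follow(graph, ["r1"], "isParentOf", recursive=True)
```

With this graph, `children` holds the direct children of `r1` only, while 
`descendants` also includes `r4`.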

== POST URLs
URI = "/", [ CP, "/" ], ( "resource" | "metric" | "resourceType" | ... ) ;

The idea here is that you can create the entities only at their canonical 
positions. While the Java API allows for creation after a non-canonical 
traversal, I think this would be unnecessarily complicated for the REST API. 
The users would pass the familiar blueprint objects to these endpoints as they 
do currently.

Examples:

/feed - send a feed blueprint and this will create a new feed
/resourceType - send a resource type blueprint and it will create a new global 
resource type
/metricType
...
/f;feed/resourceType - creates a feed-local resource type

== PUT URLs
URI = "/", CP ;

just send an update object corresponding to the type of entity on the path

== DELETE URLs
URI = "/", CP ;

deletes the entity on the path

disassociation needs to be 2 steps - find the relationship in question and 
then delete the relationship, e.g.:

  GET /f;feed_id/r;resource/rl;defines
  DELETE /rl;id-found-in-the-results-from-the-previous-query
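
On the client side the two steps could be wrapped in a small helper. The 
Python sketch below is transport-agnostic: `http` is any callable taking a 
method and a URL and returning the parsed response. The response shape (a 
list of objects with "id" and "target" keys) is purely my assumption, as is 
the helper name `disassociate`.

```python
from urllib.parse import quote


def disassociate(http, source_path, relationship, target_cp):
    """Delete relationships named `relationship` from source to target.

    ASSUMPTION: the GET returns a list of dicts with "id" and "target" keys.
    """
    # Step 1: find the relationships going out of the source entity.
    found = http("GET", f"{source_path}/rl;{quote(relationship, safe='')}")
    deleted = []
    for rel in found:
        if rel["target"] == target_cp:
            # Step 2: delete each matching relationship by its id.
            http("DELETE", "/rl;" + quote(rel["id"], safe=""))
            deleted.append(rel["id"])
    return deleted
```

A caller would pass something like a thin wrapper around its HTTP client as 
`http`, so the helper itself stays free of any particular library.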

== Advanced Querying
URI = "/", "query"

free form, read-only gremlin query for more complex queries (this needs
to wait for the port to Tinkerpop3)