On Fri, Dec 9, 2016 at 9:13 AM, Radim Vansa <rvansa@redhat.com> wrote:
1) 'cache that would persist the events with a monotonically increasing id'

I assume that you mean globally (for all entries) monotonic. How will
you obtain such an ID? Currently, commands have unique IDs of the form
<Address, Long>, where the numeric part is monotonic per node. That's
easy to achieve. But introducing a globally monotonic counter means that
there will be a single contention point. (You can introduce additional
contention points by adding backups, but this is probably unnecessary as
you can find out the last id from the indexed cache data.) Per-segment
monotonic ids would probably be more scalable, though that increases complexity.

Having it per segment would imply that only operations on keys in the same segment
(in particular, on the same key) would be ordered relative to each other; that is probably fine for most cases.
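
Just to make the per-segment option concrete, the counter itself could be as simple as the
sketch below (SegmentSequencer and nextId are made-up names, not existing Infinispan API):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical per-segment sequence: ids are only comparable within a segment,
// so contention is spread across segments instead of a single global counter.
public class SegmentSequencer {

   private final ConcurrentMap<Integer, AtomicLong> counters = new ConcurrentHashMap<>();

   // Next id for the given segment; monotonic per segment, not globally.
   public long nextId(int segment) {
      return counters.computeIfAbsent(segment, s -> new AtomicLong()).incrementAndGet();
   }
}

The hard part is of course not the counter itself but keeping the sequence consistent when
primary ownership moves, which ties into the topology question below.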

Could this order be affected during topology changes, though? From what I have observed, there is a small
window during which there is more than one primary owner for a given key, because the CH propagation
is not complete.
 

2) 'The write to the event log would be async in order to not affect
normal data writes'

Who should write to the cache?
a) originator - what if the originator crashes (despite the change having been
applied)? Besides, the originator would have to do an (async) RPC to the primary
owner (which will be the primary owner of the event, too).
b) primary owner - with triangle, the primary does not really know if the
change has been written on the backup. Piggybacking that info won't be
trivial - we don't want to send another message explicitly. But even if
we get the confirmation, since the write to the event cache is async, the
event is lost if the primary owner crashes before replicating it to the
backup.
c) all owners, but locally - that would require more complex reconciliation
of whether the event really happened on all surviving nodes or not. And
backups could have trouble resolving the order, too.

IIUC, clustered listeners are called from the primary owner before the change
is really confirmed on the backups (@Pedro correct me if I am wrong,
please), but for this reliable event cache you need a higher level of
consistency.

Async writes to an event log cache would not provide strong guarantees, agreed.

OTOH, to have the writes done synchronously, it'd be hard to avoid extra RPCs.
Some of them could be prevented by using a KeyPartitioner similar to the one used by the AffinityIndexManager [1],
so that Segment(K) = Segment(KE), where K is the data key and KE is the related event log key.
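
Roughly along the lines of the AffinityPartitioner linked at [1], something like the sketch
below could do it (EventLogKey and EventLogPartitioner are hypothetical names):

import org.infinispan.distribution.ch.impl.HashFunctionPartitioner;

// Hypothetical key of an event log entry: it carries the data key it describes
// plus a per-node sequence, so the event can be co-located with its data.
final class EventLogKey {
   final Object dataKey;
   final long localId;

   EventLogKey(Object dataKey, long localId) {
      this.dataKey = dataKey;
      this.localId = localId;
   }
}

// Event keys are hashed by the data key they wrap, so Segment(K) == Segment(KE)
// and writing the event does not add an extra ownership hop.
public class EventLogPartitioner extends HashFunctionPartitioner {

   @Override
   public int getSegment(Object key) {
      if (key instanceof EventLogKey) {
         return super.getSegment(((EventLogKey) key).dataKey);
      }
      return super.getSegment(key);
   }
}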

Still, RPCs would be needed to replicate the events, and as you pointed out, it is not trivial to piggyback them on the triangle
data RPCs.

I'm starting to think that an extra cache to store events is overkill.

An alternative could be to bypass the event log cache altogether and store the events in the Lucene index directly.
For this, a custom interceptor would write them to a local index when it is "safe" to do so, similar to what the QueryInterceptor
does with the Index.ALL flag, but writing only on the primary and backup owners, more like a hypothetical "Index.OWNER" setup.

This index does not necessarily need to be stored in extra caches (like the Infinispan directory does) but could use a local MMap-based
directory, making it OS cache friendly. At event consumption time, though, queries would have to be broadcast to the primary owners
to collect the events from each node and merge them before serving them to the clients.
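
To illustrate the local index idea, a per-node event log on top of an MMap-based Lucene
directory could look something like this (just a sketch against a recent Lucene API;
LocalEventLogIndex and the field names are made up):

import java.nio.file.Paths;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.LongPoint;
import org.apache.lucene.document.StoredField;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.MMapDirectory;

// Hypothetical per-node event log backed by a local, mmap-based Lucene index.
public class LocalEventLogIndex implements AutoCloseable {

   private final IndexWriter writer;

   public LocalEventLogIndex(String path) throws Exception {
      // MMapDirectory keeps the index on the local filesystem and lets the
      // OS page cache do the heavy lifting, as mentioned above.
      MMapDirectory directory = new MMapDirectory(Paths.get(path));
      writer = new IndexWriter(directory, new IndexWriterConfig(new StandardAnalyzer()));
   }

   // Called from the hypothetical "Index.OWNER"-style interceptor once the
   // write is considered safe on this owner.
   public void append(long sequence, String cacheKey, String operation) throws Exception {
      Document doc = new Document();
      doc.add(new LongPoint("seq", sequence));    // indexed for "events after X" range queries
      doc.add(new StoredField("seq", sequence));  // stored so the value can be returned
      doc.add(new StringField("key", cacheKey, Field.Store.YES));
      doc.add(new StringField("op", operation, Field.Store.YES));
      writer.addDocument(doc);
   }

   @Override
   public void close() throws Exception {
      writer.close();
   }
}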


[1] https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/distribution/ch/impl/AffinityPartitioner.java


3) The log will also have to filter out retried operations (based on the
command ID - though this can be indexed, too). Though I would prefer to
see a per-event command-id log to deal with retries properly.
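
Right, retries need to be filtered. Since a retried command reuses its <Address, Long> id,
keeping track of which ids were already logged would be enough; a naive sketch follows
(CommandId and RetryFilter are made-up names, and in practice the 'applied' set would have
to be pruned or, as you say, indexed):

import java.util.Objects;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical command identifier: origin node plus its per-node counter,
// mirroring the <Address, Long> ids mentioned above.
final class CommandId {
   final Object origin;   // simplification: Infinispan uses an Address here
   final long id;

   CommandId(Object origin, long id) {
      this.origin = origin;
      this.id = id;
   }

   @Override
   public boolean equals(Object o) {
      if (!(o instanceof CommandId)) return false;
      CommandId other = (CommandId) o;
      return id == other.id && origin.equals(other.origin);
   }

   @Override
   public int hashCode() {
      return Objects.hash(origin, id);
   }
}

// A retried command carries the same id, so the second append becomes a no-op.
final class RetryFilter {
   private final Set<CommandId> applied = ConcurrentHashMap.newKeySet();

   boolean shouldLog(CommandId commandId) {
      return applied.add(commandId);   // true only the first time this id is seen
   }
}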

4) The client should pull data, but I would keep push notifications that
'something happened' (throttled on the server). There could be use cases for
rarely updated caches, where polling the servers would be excessive.

Radim


Makes sense, the push could be a notification that the event log changed and the
client would then proceed with its normal pull.
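
So on the client side the flow would be something like the following sketch
(EventLogClient, pull and onLogChanged are purely hypothetical names, not the Hot Rod API):

import java.util.List;

// Hypothetical consumption loop: the server push is only a throttled
// "the log changed" hint, the events themselves are always pulled.
public class EventLogConsumer {

   /** Hypothetical remote API: fetch events with id greater than lastSeenId. */
   interface EventLogClient {
      List<EventRecord> pull(long lastSeenId);
      void onLogChanged(Runnable callback);   // throttled notification from the server
   }

   /** Hypothetical event shape: the per-segment/per-node id plus a payload. */
   interface EventRecord {
      long id();
      String payload();
   }

   private final EventLogClient client;
   private volatile long lastSeenId;

   public EventLogConsumer(EventLogClient client) {
      this.client = client;
      // Whether triggered by a push hint or a periodic safety-net poll,
      // the client always does the same thing: pull everything after lastSeenId.
      client.onLogChanged(this::pull);
   }

   private synchronized void pull() {
      List<EventRecord> events = client.pull(lastSeenId);
      for (EventRecord e : events) {
         lastSeenId = Math.max(lastSeenId, e.id());
         System.out.println(e.payload());   // hand off to the application listener
      }
   }
}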


