On Fri, Dec 9, 2016 at 9:13 AM, Radim Vansa <rvansa(a)redhat.com> wrote:
1) 'cache that would persist the events with a monotonically
increasing id'
I assume that you mean globally (for all entries) monotonic. How will
you obtain such an ID? Currently, commands have unique IDs that are
<Address, Long>, where the number part is monotonic per node. That's
easy to achieve. But introducing a globally monotonic counter means
that there will be a single contention point. (You can introduce
additional contention points by adding backups, but this is probably
unnecessary, as you can find out the last id from the indexed cache
data.) A per-segment monotonic counter would probably be more scalable,
though it increases complexity.
Having it per segment would imply that only operations within the same
segment (and in particular on the same key) are ordered relative to
each other, which is probably fine for most cases.
Could this order be affected by topology changes, though? From what I
have observed, there is a small window where there is more than one
primary owner for a given key, because the CH propagation is not
complete.
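
To make the per-segment idea concrete, a minimal sketch could look like
the following (EventLogKey and its counter map are made-up names, not
existing Infinispan API):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Sketch only: a per-segment event id, totally ordered within a segment
// but not across segments. equals()/hashCode() omitted for brevity.
public final class EventLogKey implements Comparable<EventLogKey> {

   private static final ConcurrentMap<Integer, AtomicLong> COUNTERS =
         new ConcurrentHashMap<>();

   private final int segment;   // segment of the original data key
   private final long sequence; // monotonically increasing within that segment

   private EventLogKey(int segment, long sequence) {
      this.segment = segment;
      this.sequence = sequence;
   }

   // Meant to be called on the primary owner of the segment, so the counter
   // is local and contention is limited to writes hitting the same segment.
   public static EventLogKey next(int segment) {
      long seq = COUNTERS.computeIfAbsent(segment, s -> new AtomicLong())
            .incrementAndGet();
      return new EventLogKey(segment, seq);
   }

   public int getSegment() {
      return segment;
   }

   @Override
   public int compareTo(EventLogKey other) {
      int cmp = Integer.compare(segment, other.segment);
      return cmp != 0 ? cmp : Long.compare(sequence, other.sequence);
   }
}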
2) 'The write to the event log would be async in order to not affect
normal data writes'
Who should write to the cache?
a) originator - what if the originator crashes (even though the change
has been applied)? Besides, the originator would have to do an (async)
RPC to the primary owner (which would be the primary owner of the
event, too).
b) primary owner - with triangle, the primary does not really know if
the change has been written on the backup. Piggybacking that info won't
be trivial - we don't want to send another message explicitly. But even
if we get the confirmation, since the write to the event cache is
async, if the primary owner crashes before replicating the event to the
backup, we lose the event.
c) all owners, but locally - that will require more complex
reconciliation to figure out whether the event really happened on all
surviving nodes or not. And backups could have trouble resolving the
order, too.
IIUC clustered listeners are called from the primary owner before the
change is really confirmed on backups (@Pedro correct me if I am wrong,
please), but for this reliable event cache you need a higher level of
consistency.
Async writes to a cache event log would not provide the best of guarantees,
agreed.
OTOH, with the writes done synchronously, it'd be hard to avoid extra
RPCs.
Some can be prevented by using a KeyPartitioner similar to the one used
by the AffinityIndexManager [1], so that Segment(K) = Segment(KE),
where K is the data key and KE the related event log key.
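
A rough sketch of such a partitioner, modeled on the AffinityPartitioner
and using the hypothetical EventLogKey above, could be:

import org.infinispan.distribution.ch.impl.HashFunctionPartitioner;

// Sketch only: route an event-log key to the segment of the data key it
// refers to, so that Segment(K) == Segment(KE). Everything else falls
// back to the default hash-based segmentation.
public class EventLogKeyPartitioner extends HashFunctionPartitioner {

   @Override
   public int getSegment(Object key) {
      if (key instanceof EventLogKey) {
         // Segment captured from the original data key at event creation time
         return ((EventLogKey) key).getSegment();
      }
      return super.getSegment(key);
   }
}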
Still, RPCs would happen to replicate the events and, as you pointed
out, it is not trivial to piggyback this on the triangle data RPCs.
I'm starting to think that an extra cache to store events is overkill.
An alternative could be to bypass the event log cache altogether and
store the events in a Lucene index directly.
For this, a custom interceptor would write them to a local index when
it's "safe" to do so, similar to what the QueryInterceptor does with
the Index.ALL flag, but writing only on primary + backup, more like a
hypothetical "Index.OWNER" setup.
This index does not necessarily need to be stored in extra caches (like
the Infinispan directory does) but can use a local MMap-based
directory, making it OS-cache friendly. At event consumption time,
though, broadcast queries to the primary owners would be needed to
collect the events on each of the nodes and merge them before serving
them to the clients.
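
As a sketch of what that interceptor could write locally (plain Lucene;
field names and the payload layout are just illustrative):

import java.nio.file.Paths;

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.LongPoint;
import org.apache.lucene.document.NumericDocValuesField;
import org.apache.lucene.document.StoredField;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.MMapDirectory;

// Sketch only: a per-node, MMap-backed event log index written by a custom
// interceptor once the write is considered "safe" on this owner.
public class LocalEventLogIndex implements AutoCloseable {

   private final MMapDirectory directory;
   private final IndexWriter writer;

   public LocalEventLogIndex(String path) throws Exception {
      // MMapDirectory keeps the index in local files and lets the OS page
      // cache do most of the work.
      directory = new MMapDirectory(Paths.get(path));
      writer = new IndexWriter(directory, new IndexWriterConfig());
   }

   public void append(int segment, long sequence, String commandId,
         byte[] serializedEvent) throws Exception {
      Document doc = new Document();
      doc.add(new LongPoint("segment", segment));
      doc.add(new LongPoint("sequence", sequence));
      // Doc values allow sorting by sequence when the events are pulled
      doc.add(new NumericDocValuesField("sequence", sequence));
      // Indexing the command id makes it possible to filter out retries
      doc.add(new StringField("commandId", commandId, Field.Store.YES));
      doc.add(new StoredField("event", serializedEvent));
      writer.addDocument(doc);
   }

   @Override
   public void close() throws Exception {
      writer.close();
      directory.close();
   }
}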
[1]
https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/distribution/ch/impl/AffinityPartitioner.java
3) The log will also have to filter out retried operations (based on
command ID - though this can be indexed, too). Still, I would prefer to
see a per-event command-id log to deal with retries properly.
4) The client should pull data, but I would keep push notifications
that 'something happened' (throttled on the server). There could be use
cases for rarely updated caches, and polling the servers would be
excessive there.
Radim
Makes sense, the push could be a notification that the event log
changed and the client would then proceed with its normal pull.
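
Purely as an illustration of that interaction (none of these client
types or methods exist today), the client side could look roughly like:

// Sketch only: the server pushes a throttled "the event log changed"
// hint and the client pulls everything newer than what it has already
// seen. Both interfaces are hypothetical, just to show the flow.
interface HypotheticalEvent {
   long sequence();
}

interface HypotheticalEventLogClient {
   Iterable<HypotheticalEvent> pullEventsAfter(long sequence);
}

public class EventLogPoller {

   private long lastSeenSequence; // per segment in practice, simplified here

   // Invoked when the push notification arrives
   public void onEventLogChanged(HypotheticalEventLogClient client) {
      for (HypotheticalEvent event : client.pullEventsAfter(lastSeenSequence)) {
         process(event);
         lastSeenSequence = event.sequence();
      }
   }

   private void process(HypotheticalEvent event) {
      // application callback
   }
}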
>
>
> [1]
> https://github.com/infinispan/infinispan/wiki/Remote-Listeners-improvement-proposal
>
> Thanks,
> Gustavo
>
>
>
--
Radim Vansa <rvansa(a)redhat.com>
JBoss Performance Team
_______________________________________________
infinispan-dev mailing list
infinispan-dev(a)lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev