[infinispan-dev] Infinispan and change data capture

Radim Vansa rvansa at redhat.com
Fri Dec 9 04:13:52 EST 2016


On 12/08/2016 10:13 AM, Gustavo Fernandes wrote:
>
> I recently updated a proposal [1], based on several discussions we had 
> in the past, that is essentially about introducing an event storage 
> mechanism (write-ahead log) in order to improve reliability, failover 
> and "replayability" for the remote listeners. Any feedback is greatly 
> appreciated.

Hi Gustavo,

While I really like the pull-style architecture and the reliable 
events, I see some problematic parts here:

1) 'cache that would persist the events with a monotonically increasing id'

I assume that you mean globally (across all entries) monotonic. How 
will you obtain such an ID? Currently, commands have unique IDs of the 
form <Address, Long>, where the numeric part is monotonic per node. 
That's easy to achieve. But introducing a globally monotonic counter 
means that there will be a single contention point. (You could add 
further contention points by adding backups, but that is probably 
unnecessary, as you can find out the last id from the indexed cache 
data.) Per-segment monotonic ids would probably be more scalable, 
though that increases complexity.
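
For illustration, a minimal sketch of what a per-segment sequencer 
could look like (SegmentSequencer and nextEventId are made-up names, 
not existing Infinispan API); each segment keeps its own counter, so 
there is no single global contention point:

    import java.util.concurrent.atomic.AtomicLongArray;

    public class SegmentSequencer {

        private final AtomicLongArray counters;

        public SegmentSequencer(int numSegments) {
            this.counters = new AtomicLongArray(numSegments);
        }

        // Monotonic within a segment; the pair <segment, id> orders
        // events per segment. After failover, the new primary owner
        // would have to recover the last id, e.g. from the indexed
        // event cache mentioned above.
        public long nextEventId(int segment) {
            return counters.incrementAndGet(segment);
        }
    }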

2) 'The write to the event log would be async in order to not affect 
normal data writes'

Who should write to the cache? (A sketch of the asynchronous write 
follows the options below.)
a) the originator - but what if the originator crashes after the 
change has been applied (so the event is never logged)? Besides, the 
originator would have to do an (async) RPC to the primary owner (which 
will be the primary owner of the event, too).
b) the primary owner - with the triangle algorithm, the primary does 
not really know whether the change has been written on the backup. 
Piggybacking that info won't be trivial - we don't want to send 
another message explicitly. And even if we get the confirmation, since 
the write to the event cache is async, if the primary owner crashes 
before replicating the event to a backup, we lose the event.
c) all owners, but locally - that will require more complex 
reconciliation to decide whether the event really happened on all 
surviving nodes. And backups could have trouble resolving the order, 
too.
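
To make the async window concrete, here is a minimal sketch of the 
append under option a) or b). EventLogWriter, EventRecord and the 
segment + ":" + id key format are assumptions for illustration, not 
existing classes; eventCache is an ordinary Cache:

    import java.util.concurrent.CompletableFuture;
    import org.infinispan.Cache;

    class EventLogWriter {

        // Illustrative placeholder for the event value, not an
        // Infinispan type. A real one would also have to be
        // marshallable.
        static final class EventRecord {
            final String commandId; // for retry filtering, see point 3
            final Object key;
            EventRecord(String commandId, Object key) {
                this.commandId = commandId;
                this.key = key;
            }
        }

        private final Cache<String, EventRecord> eventCache;

        EventLogWriter(Cache<String, EventRecord> eventCache) {
            this.eventCache = eventCache;
        }

        CompletableFuture<EventRecord> append(int segment, long id,
                                              EventRecord record) {
            // putAsync returns before the event is replicated; if the
            // event's primary owner crashes in that window, the event
            // is lost even though the data write itself succeeded.
            return eventCache.putAsync(segment + ":" + id, record);
        }
    }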

IIUC, clustered listeners are called from the primary owner before the 
change is really confirmed on the backups (@Pedro, correct me if I am 
wrong, please), but for this reliable event cache you need a higher 
level of consistency.

3) The log will also have to filter out retried operations (based on 
the command ID - though this can be indexed, too). I would prefer to 
see a per-event command-id log to deal with retries properly.
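
A trivial sketch of that filtering, assuming each event carries the 
originating command id (the <Address, Long> pair flattened to a String 
here; in practice the seen-ids set would have to be bounded and 
persisted/indexed alongside the event log):

    import java.util.Set;
    import java.util.concurrent.ConcurrentHashMap;

    class RetryFilter {

        private final Set<String> seenCommandIds =
                ConcurrentHashMap.newKeySet();

        // True only for the first occurrence; a retried command must
        // not append a second copy of the event.
        boolean shouldAppend(String commandId) {
            return seenCommandIds.add(commandId);
        }
    }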

4) The client should pull data, but I would keep push notifications 
that 'something happened' (throttled on the server). There could be 
use cases with rarely updated caches, and polling the servers would be 
excessive there.
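
Such throttling could be as simple as the sketch below 
(ThrottledNotifier is a made-up name and notifyClients stands in for 
whatever push channel the server uses to say "something happened, come 
poll"):

    import java.util.concurrent.atomic.AtomicLong;

    class ThrottledNotifier {

        private final long minIntervalNanos;
        private final Runnable notifyClients;
        private final AtomicLong lastSent = new AtomicLong();

        ThrottledNotifier(long minIntervalNanos, Runnable notifyClients) {
            this.minIntervalNanos = minIntervalNanos;
            this.notifyClients = notifyClients;
            this.lastSent.set(System.nanoTime() - minIntervalNanos);
        }

        // Called whenever an event is appended to the log; sends at
        // most one notification per interval, so even rarely updated
        // caches reach clients without constant polling.
        void onEventAppended() {
            long now = System.nanoTime();
            long last = lastSent.get();
            if (now - last >= minIntervalNanos
                    && lastSent.compareAndSet(last, now)) {
                notifyClients.run();
            }
        }
    }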

Radim

>
>
> [1] 
> https://github.com/infinispan/infinispan/wiki/Remote-Listeners-improvement-proposal
>
> Thanks,
> Gustavo
>


-- 
Radim Vansa <rvansa at redhat.com>
JBoss Performance Team


