[infinispan-dev] Design of Remote Hot Rod events

Radim Vansa rvansa at redhat.com
Mon Dec 2 04:57:46 EST 2013


On 11/26/2013 04:10 PM, Galder Zamarreño wrote:
> Hi Radim,
>
> Thanks for the excellent feedback, comments below:
>
> On Nov 13, 2013, at 11:33 AM, Radim Vansa <rvansa at redhat.com> wrote:
>
>> Hi, a couple of questions & remarks:
>>
>> 1. Why is there no RemoteCacheEntryCreated? I guess you had a good reason
>> to exclude it, but you could at least explain it. For the event lifecycle,
>> creation sounds to me as important as removal.
> When designing this, I looked at the near cache use case as the main driver (that doesn't mean there aren't others, but it's the most obvious one IMO). For near caches, updates and removals are crucial. IOW, you could not build a near cache without receiving notifications of those. Creation could be a "nice to have", so that clients can lazily fetch newly created entries in advance, but it could be wasteful if the client never requests that cached data.
>
> "If in doubt, leave it out" <- I applied that principle, but I'm happy to add create events if I hear about a use case that must have them. As a side note, we could make this more sophisticated by allowing clients to express which operations they're interested in, potentially allowing those that are interested in created events to receive them. This would help reduce unnecessary traffic, i.e. by not sending notifications for events a client is not interested in, but I wanted to keep it simple to start with.

I often think about Infinispan as shared memory providing communication
between nodes. Messaging would probably be a better fit for such a
use case, but I can imagine a listener waiting for some value to be
inserted into the cache.
Nevertheless, you can always use putIfAbsent(K, DummyValue) and wait for
the modification.
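
A minimal sketch of that pattern, using today's embedded listener API
(the key name and timeout are made up; a remote client would do the same
thing with whatever listener API comes out of this design):

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

import org.infinispan.Cache;
import org.infinispan.notifications.Listener;
import org.infinispan.notifications.cachelistener.annotation.CacheEntryModified;
import org.infinispan.notifications.cachelistener.event.CacheEntryModifiedEvent;

@Listener
public class WaitForValue {

   private final CountDownLatch latch = new CountDownLatch(1);

   @CacheEntryModified
   public void onModified(CacheEntryModifiedEvent<String, String> event) {
      // Fires on the post notification when the dummy value is replaced.
      if (!event.isPre() && "myKey".equals(event.getKey()))
         latch.countDown();
   }

   public boolean waitForValue(Cache<String, String> cache) throws InterruptedException {
      cache.addListener(this);
      // Insert a dummy value; the producer will later replace it,
      // which triggers the modified event we are waiting for.
      cache.putIfAbsent("myKey", "DUMMY");
      return latch.await(30, TimeUnit.SECONDS);
   }
}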

>
>> 2. Does removal due to expiration map to Removed as well? What about
>> invalidation in an invalidation cache?
> Removal notifications based on expiration are tricky, particularly because of the implications they have on plugged cache stores. See discussion [1]. These are not yet available for embedded caches, so we'd need to tackle that first before adding them for remote events.
>
> Invalidations in invalidation caches are really just normal removals sent to other nodes, so events would be produced for them.
>
>> 3. IMO, registering listeners for particular keys is not that optional. If
>> you allow only an all-keys listener, you end up with users ruining
>> performance by registering listeners with if (key.equals(myKey)) {…}.
> Yeah, if users do that, there's a lot of traffic wasted, but again, I had the near cache use case in mind where you're interested in all data in the cache, as opposed to a subset. However, it could be added to the design.

I can imagine the near cache caching only the entries the client was
previously interested in. You don't want to cache all the petabytes of
data Infinispan will hold in the cluster on one client. That does not
scale, and Infinispan is all about scaling.
Besides that, being interested in all data while not providing the CREATE
event seems somewhat contradictory to me.
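
To make the concern from point 3 concrete, this is roughly the
anti-pattern I would expect with an all-keys-only API. RemoteEvent and
RemoteEventListener are made-up placeholder interfaces, since the remote
listener API is exactly what is being designed here:

// Placeholder types: the remote listener API is still being designed,
// so these interfaces are made up for illustration only.
interface RemoteEvent {
   Object getKey();
}

interface RemoteEventListener {
   void onEvent(RemoteEvent event);
}

// The anti-pattern: an all-keys listener that discards everything but
// one key on the client side. Every event still crosses the wire and
// is deserialized, only to be thrown away here.
class SingleKeyListener implements RemoteEventListener {

   private final Object myKey;

   SingleKeyListener(Object myKey) {
      this.myKey = myKey;
   }

   @Override
   public void onEvent(RemoteEvent event) {
      if (!myKey.equals(event.getKey()))
         return; // wasted traffic and processing
      // ... react to the single interesting key ...
   }
}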

>
>> 4. It seems to me that one global listener per client per cache is
>> enough. Will the client code register such a single listener and multiplex
>> all the events to the registered listeners? Related to 3.: if you don't
>> implement key filtering on the server, you should at least provide it
>> as a client API and do the equals check locally.
>> Nevertheless, this would require key equality on the client.
> Not sure I understand your point ^.

The application could register multiple identical listeners. If the
client code was dumb, it would register the same listener twice on the
server -> the server would send notifications twice -> redundant traffic
& processing on both client and server.
Let's decide whether it is the responsibility of the application code to
avoid this scenario or whether the client should handle it.
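
Here is a rough sketch of the multiplexing I have in mind, reusing the
placeholder RemoteEvent/RemoteEventListener interfaces from the sketch
above: the client registers a single wire-level listener per cache,
fans incoming events out to the application listeners, and the set
semantics keep duplicate registrations out:

import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;

class ListenerMultiplexer {

   // Set semantics (based on listener equals/hashCode) make sure the
   // same listener is never registered, and therefore never notified,
   // twice.
   private final Set<RemoteEventListener> listeners = new CopyOnWriteArraySet<>();

   // Returns false for duplicates, so no redundant server-side
   // registration and no double notification.
   boolean addListener(RemoteEventListener listener) {
      return listeners.add(listener);
   }

   boolean removeListener(RemoteEventListener listener) {
      return listeners.remove(listener);
   }

   // The single wire-level listener registered with the servers routes
   // every incoming event through here and fans it out locally.
   void onWireEvent(RemoteEvent event) {
      for (RemoteEventListener listener : listeners)
         listener.onEvent(event);
   }
}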

>
>> 5. Are pre/post events supported here? I guess not, but this is
>> something to note.
> No, there won't be pre/post events. Too much traffic. There will only be post events.
>
>> 6. Are the events in fact async? It seems to me that they are (the ACKs
>> are only for delivery).
> Of course, we can't afford to have a server thread blocked waiting for an ACK from the client.
>
>> 7. The reliability guarantees should be specified more closely. From the
>> document it seems that we try to support the near-cache use case by
>> always sending the last update (the intermediate updates can be lost
>> according to ACK tracking), but the events themselves are not guaranteed
>> to be delivered. So is the target reliability "eventually synced cache"?
> Yeah, that's the idea. It's a trade-off I made in order to avoid overloading clients when they've been disconnected.
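
Just to spell out the "last update wins" behaviour, a sketch of how the
sending side could coalesce events (made-up class, not taken from the
design document): only the newest undelivered value per key is kept, so
a slow or reconnecting client converges on the latest state but may
never see the intermediate updates.

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Per-key coalescing of not-yet-delivered updates for one client.
class PendingEvents<K, V> {

   private final ConcurrentMap<K, V> latest = new ConcurrentHashMap<>();

   // Called on every modification: overwrite any older pending update
   // for the same key.
   void onUpdate(K key, V value) {
      latest.put(key, value);
   }

   // Called by the thread pushing events to the client: take what is
   // pending now; anything arriving later stays queued for next time.
   Map<K, V> drain() {
      Map<K, V> batch = new HashMap<>();
      for (K key : latest.keySet()) {
         V value = latest.get(key);
         // Remove only if no newer update sneaked in meanwhile.
         if (value != null && latest.remove(key, value))
            batch.put(key, value);
      }
      return batch;
   }
}
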
>
>> 8. As the client itself is responsible for contacting each server and
>> registering the listener, there's another scenario besides server
>> failure. It takes some time before the client receives the new topology,
>> so another server might join and become the primary owner - the client
>> does not register with that server until it's too late and therefore
>> does not receive the update. Even after the client connects, the server
>> has not tracked the listener and can't see that it should send the update.
>> A solution for this would be to keep a cache of listeners (replicated for
>> global ones, distributed for key-filtered ones), delay all writes until
>> this cache is replicated, and then keep the event in memory even if the
>> client is not yet connected.
> That's certainly an interesting scenario. I'm not sure there's a need for a replicated/distributed cache at all here. In fact, in the design I've done, I've tried to avoid any type of clustered state for this work. Any newly joining node could keep a buffer of events for some amount of time X, to allow all clients time to register their listeners with the new server and receive events in case they are late.
OK, keeping some history would solve that as well.
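
A sketch of what such a history could look like (class name and window
length are made up, not part of the design document): the new node
records events for a bounded window and replays them to clients that
register their listener late.

import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// A newly started node buffers events for a bounded window so that
// clients which learn about the topology change late can still be fed
// the events they missed once they register their listener.
class EventHistory<E> {

   private static final long WINDOW_MILLIS = 60_000; // the "X amount of time"

   private static final class Timestamped<T> {
      final long timestamp;
      final T event;

      Timestamped(long timestamp, T event) {
         this.timestamp = timestamp;
         this.event = event;
      }
   }

   private final Deque<Timestamped<E>> history = new ArrayDeque<>();

   synchronized void record(E event) {
      long now = System.currentTimeMillis();
      history.addLast(new Timestamped<E>(now, event));
      expire(now);
   }

   // Replayed to a client that registers after the events were produced.
   synchronized List<E> eventsSince(long timestamp) {
      expire(System.currentTimeMillis());
      List<E> replay = new ArrayList<>();
      for (Timestamped<E> t : history)
         if (t.timestamp >= timestamp)
            replay.add(t.event);
      return replay;
   }

   private void expire(long now) {
      while (!history.isEmpty() && now - history.peekFirst().timestamp > WINDOW_MILLIS)
         history.removeFirst();
   }
}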

Now, as there will be some code feeding the client with updates, I think
that information about topology changes should go through that channel as
well, in order to shorten the required history period.

Radim

>
> Cheers,
>
> [1] https://issues.jboss.org/browse/ISPN-694
>
>> Radim
>>
>>
>> On 11/12/2013 04:17 PM, Galder Zamarreño wrote:
>>> Hi all,
>>>
>>> Re: https://github.com/infinispan/infinispan/wiki/Remote-Hot-Rod-Events
>>>
>>> I've just finished writing up the Hot Rod remote events design document. Amongst many other use cases, this will enable near caching use cases with the help of Hot Rod client callbacks.
>>>
>>> Cheers,
>>> --
>>> Galder Zamarreño
>>> galder at redhat.com
>>> twitter.com/galderz
>>>
>>> Project Lead, Escalante
>>> http://escalante.io
>>>
>>> Engineer, Infinispan
>>> http://infinispan.org
>>>
>>>
>>
>> -- 
>> Radim Vansa <rvansa at redhat.com>
>> JBoss DataGrid QA
>>
>
> --
> Galder Zamarreño
> galder at redhat.com
> twitter.com/galderz
>
> Project Lead, Escalante
> http://escalante.io
>
> Engineer, Infinispan
> http://infinispan.org
>
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev


-- 
Radim Vansa <rvansa at redhat.com>
JBoss DataGrid QA


