[infinispan-dev] ClusteredListeners: message delivered twice

Fri Feb 21 11:03:11 EST 2014

On Mon, Feb 17, 2014 at 7:44 PM, William Burns <mudokonman at gmail.com> wrote:

> On Mon, Feb 17, 2014 at 7:53 AM, Sanne Grinovero <sanne at infinispan.org>
> wrote:
> > On 12 February 2014 10:40, Mircea Markus <mmarkus at redhat.com> wrote:
> >> Hey Will,
> >>
> >> With the current design, during a topology change, an event might be
> >> delivered twice to a cluster listener. I think we might be able to
> >> identify such situations (a node becomes a key owner as a result of the
> >> topology change) and add this information to the event we send, e.g. a
> >> flag "potentiallyDuplicate" or something like that. Event implementors
> >> might be able to make good use of this, e.g. checking their internal
> >> state if an event is redelivered or not. What do you think? Are there
> >> any other more-than-once delivery situations we can't keep track of?
>
> I agree, this would be important to track.  I have thus added a new
> flag to listeners that is set to true when a modification, removal, or
> create is done on behalf of a command that was retried because of a
> topology change in the middle of it.  This also benefits regular
> listeners, not just cluster listeners, since even today they can be
> notified twice.
>
> >
> > I really wish we would not push such a burden to the API
> > consumer. If we at least had a modification counter associated with
> > each entry this could help to identify duplicate triggers as well (on
> > top of ordering of modification events as already discussed many
> > times).
>
> The particular issue we have with listeners is when the primary owner
> replicates the update to the backup owners and then crashes before the
> notification is sent.  In this case we have no idea, from the
> originator's perspective, whether the backup owner has the update.  When
> the topology changes, the update (if the backup has it) will be
> persisted to the new owners, possibly without notification.  We could
> add a counter, but the backup owner then has no idea whether the
> primary owner has sent the notification or not.  Without adding some
> kind of 2PC for the primary owner to tell the backup that it occurred,
> the backup won't know.  And even that doesn't reliably tell the backup
> owner whether the notification was fired if the node goes down during
> this period.  Without seriously rewriting our nontx dist code, I don't
> see a viable way to do this without alerting the API consumer.
>

There's always going to be the possibility that replication to one of the
backup owners fails and the command is aborted after the listener was
notified (but not on the successful backup owners). And even in tx mode,
the listeners are notified during the prepare phase, not during the
commit.

So I don't think we'll ever be able to make listeners 100% reliable, but
the "potentially duplicate" flag should be good enough.
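
To make the consumer side concrete, here is a rough sketch of how a
listener might use such a flag. Everything in it is an assumption for
illustration: EntryEvent, its potentiallyDuplicate field and
DedupingListener are hypothetical names, not the Infinispan listener API.
It also assumes non-null values and a single notifying thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical event shape, not Infinispan's actual event API. */
final class EntryEvent {
    final String key;
    final String value;
    // Assumed to be set by the cache when the command was retried
    // because of a topology change.
    final boolean potentiallyDuplicate;

    EntryEvent(String key, String value, boolean potentiallyDuplicate) {
        this.key = key;
        this.value = value;
        this.potentiallyDuplicate = potentiallyDuplicate;
    }
}

/** Listener that checks its own state only when an event is flagged. */
final class DedupingListener {
    private final Map<String, String> lastSeen = new HashMap<>();
    private final List<String> applied = new ArrayList<>();

    void onEntryModified(EntryEvent e) {
        // Unflagged events are always applied. A flagged event is skipped
        // only if we already handled the same value for the same key.
        if (e.potentiallyDuplicate && e.value.equals(lastSeen.get(e.key))) {
            return;
        }
        lastSeen.put(e.key, e.value);
        applied.add(e.key + "=" + e.value);
    }

    List<String> applied() {
        return applied;
    }
}
```

Comparing by value is only safe when handling the same value twice would
be a no-op for the application anyway; a listener that counts events
would need a different check.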

Cheers
Dan