[infinispan-dev] Strict Expiration

William Burns mudokonman at gmail.com
Tue Jul 21 10:25:42 EDT 2015


So I wanted to sum up what the plan looks like for cluster expiration in
ISPN 8.

First off, to remove any ambiguity: using maxIdle with a clustered cache
will provide undefined and unsupported behavior.  It can and will expire
entries on a single node without notifying other cluster members
(essentially it will operate unchanged, as it does today).
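
For reference, a minimal sketch of how the two options are configured with
the programmatic API (values are arbitrary; only lifespan gets the clustered
guarantees described below):

import java.util.concurrent.TimeUnit;

import org.infinispan.configuration.cache.CacheMode;
import org.infinispan.configuration.cache.Configuration;
import org.infinispan.configuration.cache.ConfigurationBuilder;

public class ExpirationConfigExample {
   public static void main(String[] args) {
      // Lifespan on a distributed cache: entries expire cluster-wide,
      // coordinated by the primary owner as described below.
      Configuration distWithLifespan = new ConfigurationBuilder()
            .clustering().cacheMode(CacheMode.DIST_SYNC)
            .expiration().lifespan(10, TimeUnit.MINUTES)
            .build();

      // maxIdle is shown only for contrast: with a clustered cache its
      // behavior stays node-local and unsupported.
      Configuration localWithMaxIdle = new ConfigurationBuilder()
            .expiration().maxIdle(30, TimeUnit.SECONDS)
            .build();
   }
}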

This leaves me to talk solely about lifespan cluster expiration.

Lifespan expiration events are fired by the primary owner of an expired key

- when an expired entry is accessed, or

- by the reaper thread.

If the expiration is detected by a node other than the primary owner, an
expiration command is sent to the primary owner and null is returned
immediately without waiting for a response.
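
To make the read path concrete, here is a simplified, purely illustrative
sketch.  The data container and the isPrimaryOwner / expireAndNotify /
sendExpirationCommandToPrimaryAsync methods are hypothetical stand-ins for
the clustering machinery, not the actual internal API:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of the read path described above; not the real
// Infinispan internals, just an illustration of who expires what.
final class LifespanReadPathSketch<K, V> {

   static final class Entry<V> {
      final V value;
      final long expiresAtMillis; // lifespan deadline

      Entry(V value, long expiresAtMillis) {
         this.value = value;
         this.expiresAtMillis = expiresAtMillis;
      }

      boolean isExpired(long now) {
         return now >= expiresAtMillis;
      }
   }

   private final Map<K, Entry<V>> dataContainer = new ConcurrentHashMap<>();

   V read(K key) {
      Entry<V> e = dataContainer.get(key);
      if (e == null) {
         return null;
      }
      if (e.isExpired(System.currentTimeMillis())) {
         if (isPrimaryOwner(key)) {
            // The primary owner removes the entry and raises the single
            // clustered expiration event while holding the key's lock.
            expireAndNotify(key, e);
         } else {
            // Any other node just tells the primary owner asynchronously
            // and returns null without waiting for a response.
            sendExpirationCommandToPrimaryAsync(key);
         }
         return null;
      }
      return e.value;
   }

   // Stand-ins for ownership lookup, event dispatch and the async RPC.
   boolean isPrimaryOwner(K key) {
      return true;
   }

   void expireAndNotify(K key, Entry<V> e) {
      dataContainer.remove(key, e);
   }

   void sendExpirationCommandToPrimaryAsync(K key) {
      // fire-and-forget RPC in the real implementation
   }
}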

Expiration event listeners follow the usual rules for sync/async: in the
case of a sync listener, the handler is invoked while holding the lock,
whereas an async listener will not hold locks.

It is desirable for expiration events to contain both the key and the value.
However, cache stores currently do not provide the value when they expire
entries.  Thus we can only guarantee the value is present when an in-memory
expiration event occurs.  We could plan on adding this later.
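
For illustration, a sketch of what a listener for these events could look
like, assuming the @CacheEntryExpired annotation and CacheEntryExpiredEvent
added for clustered expiration (the class name and log output here are just
examples):

import org.infinispan.Cache;
import org.infinispan.notifications.Listener;
import org.infinispan.notifications.cachelistener.annotation.CacheEntryExpired;
import org.infinispan.notifications.cachelistener.event.CacheEntryExpiredEvent;

// sync = true (the default) means the callback runs while the primary owner
// still holds the lock for the key; sync = false runs it without locks held.
@Listener(sync = true)
public class ExpiredEntryLogger {

   @CacheEntryExpired
   public void entryExpired(CacheEntryExpiredEvent<String, String> event) {
      // The value is only guaranteed to be present when the expiration was
      // detected in memory; store-originated expirations may pass null.
      System.out.printf("Expired %s -> %s%n", event.getKey(), event.getValue());
   }

   public static void register(Cache<String, String> cache) {
      cache.addListener(new ExpiredEntryLogger());
   }
}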

Also, as you may have guessed, this doesn't touch strict expiration, which I
think we have concluded should only apply to maxIdle, and as such it is not
explored in this iteration.

Let me know if you guys think this approach is okay.

Cheers,

 - Will

On Tue, Jul 14, 2015 at 1:51 PM Radim Vansa <rvansa at redhat.com> wrote:

> Yes, I know about [1]. I've worked around that by storing a timestamp in
> the entry as well, and when a new record is added, the 'expired'
> invalidations are purged. But I can't purge them if I don't access them -
> Infinispan needs to handle that internally.
>
> Radim
>
> [1] https://hibernate.atlassian.net/browse/HHH-6219
>
> On 07/14/2015 05:45 PM, Dennis Reed wrote:
> > On 07/14/2015 11:08 AM, Radim Vansa wrote:
> >> On 07/14/2015 04:19 PM, William Burns wrote:
> >>>
> >>> On Tue, Jul 14, 2015 at 9:37 AM William Burns <mudokonman at gmail.com> wrote:
> >>>
> >>>       On Tue, Jul 14, 2015 at 4:41 AM Dan Berindei
> >>>       <dan.berindei at gmail.com> wrote:
> >>>
> >>>           Processing expiration only on the reaper thread sounds nice,
> >>>           but I have one reservation: processing 1 million entries to
> >>>           see that 1 of
> >>>           them is expired is a lot of work, and in the general case we
> >>>           will not
> >>>           be able to ensure an expiration precision of less than 1
> >>>           minute (maybe
> >>>           more, with a huge SingleFileStore attached).
> >>>
> >>>
> >>>       This isn't much different than before.  The only difference is
> >>>       that if a user touched a value after it expired it wouldn't show
> >>>       up (which is unlikely, especially with maxIdle).
> >>>
> >>>
> >>>           What happens to users who need better precision? In
> >>>           particular, I know
> >>>           some JCache tests were failing because HotRod was only
> >>>           supporting 1-second resolution instead of the 1-millisecond
> >>>           resolution
> >>>           they were
> >>>           expecting.
> >>>
> >>>
> >>>       JCache is an interesting piece.  The thing about JCache is that
> >>>       the spec is only defined for local caches.  However I wouldn't
> >>>       want to muddy up the waters in regards to it behaving differently
> >>>       for local/remote.  In the JCache scenario we could add an
> >>>       interceptor to prevent it returning such values (we do something
> >>>       similar already for events).  JCache behavior vs ISPN behavior
> >>>       seems a bit easier to differentiate.  But like you are getting
> >>>       at, either way is not very appealing.
> >>>
> >>>
> >>>
> >>>           I'm even less convinced about the need to guarantee that a
> >>>           clustered
> >>>           expiration listener will only be triggered once, and that the
> >>>           entry
> >>>           must be null everywhere after that listener was invoked.
> >>>           What's the
> >>>           use case?
> >>>
> >>>
> >>>       Maybe Tristan would know more to answer.  To be honest this work
> >>>       seems fruitless unless we know what our end users want here.
> >>>       Spending time on something only for it to be thrown out is never fun :(
> >>>
> >>>       And the more I thought about this, the more I question the
> >>>       validity of maxIdle even.  It seems like a very poor way to
> >>>       prevent memory
> >>>       exhaustion, which eviction does in a much better way and has much
> >>>       more flexible algorithms.  Does anyone know what maxIdle would be
> >>>       used for that wouldn't be covered by eviction?  The only thing I
> >>>       can think of is cleaning up the cache store as well.
> >>>
> >>>
> >>> Actually I guess for session/authentication related information this
> >>> would be important.  However maxIdle isn't really as usable in that
> >>> case since most likely you would have a sticky session to go back to
> >>> that node which means you would never refresh the last used date on
> >>> the copies (current implementation).  Without cluster expiration you
> >>> could lose that session information on a failover very easily.
> >> I would say that maxIdle can be used for memory management as a kind of
> >> WeakHashMap - e.g. in 2LC maxIdle is used to store some record for a
> >> short while (regular transaction lifespan ~ seconds to minutes), and the
> >> record is regularly removed. However, to make sure that we don't leak
> >> records in this cache (if something goes wrong and the remove does not
> >> occur), maxIdle removes it.
> > Note that just relying on maxIdle doesn't guarantee you won't leak
> > records in this use case (specifically with the way the current
> > hibernate-infinispan 2LC implementation uses it).
> >
> > Hibernate-infinispan adds entries to its own Map stored in Infinispan,
> > and expects maxIdle to remove the map if it skips a remove.  But in a
> > current case, we found that due to frequent accesses to that same map
> > the entries never idle out and it ends up in an OOME.
> >
> > -Dennis
> >
> >> I can guess how long the transaction takes, but not how many parallel
> >> transactions there are. With eviction algorithms (where I am not sure
> >> about the exact guarantees) I can set the cache to not hold more than N
> >> entries, but I can't know for sure that my record does not suddenly get
> >> evicted after a shorter period, possibly causing some inconsistency.
> >> So this is similar to WeakHashMap by removing the key "when it can't be
> >> used anymore" because I know that the transaction will finish before the
> >> deadline. I don't care about the exact size, I don't want to tune that,
> >> I just don't want to leak.
> >>
> >>    From my POV the non-strict maxIdle and strict expiration would be a
> >> nice compromise.
> >>
> >> Radim
> >>
> >>>           Note that this would make the reaper thread less efficient:
> >>>           with numOwners=2 (best case), half of the entries that the
> >>>           reaper touches cannot be expired, because the node isn't the
> >>>           primary node.  And to make matters worse, the same reaper
> >>>           thread would have to perform a (synchronous?) RPC for each
> >>>           entry to ensure it expires everywhere.
> >>>
> >>>
> >>>       I have debated about this; it could be something like a sync
> >>>       removeAll which has a special marker to tell it is due to
> >>>       expiration (which would raise listeners there), while also
> >>>       sending a cluster expiration event to other non-owners.
> >>>
> >>>
> >>>           For maxIdle I'd like to know more information about how
> >>>           exactly the
> >>>           owners would coordinate to expire an entry.  I'm pretty sure
> >>>           we cannot avoid ignoring some reads (expiring an entry
> >>>           immediately after it was
> >>>           read), and ensuring that we don't accidentally extend an
> >>>           entry's life
> >>>           (like the current code does, when we transfer an entry to a
> >>>           new owner)
> >>>           also sounds problematic.
> >>>
> >>>
> >>>       For lifespan it is simple: the primary owner just expires it when
> >>>       it expires there.  There is no coordination needed in this case;
> >>>       it just sends the expired remove to the other owners etc.
> >>>
> >>>       Max idle is more complicated, as we all know.  The primary owner
> >>>       would send a request for the last used time for a given key or
> >>>       set of keys.  Then the owner would take those times and check for
> >>>       a new access it isn't aware of.  If there isn't one, it would
> >>>       send a remove command for the key(s).  If there is a new access,
> >>>       the owner would instead send the last used time to all of the
> >>>       owners.  The expiration obviously would have a window in which a
> >>>       read that occurred after sending a response could be ignored.
> >>>       This could be resolved by using some sort of 2PC and blocking
> >>>       reads during that period, but I would say it isn't worth it.
> >>>
> >>>       The issue with transferring to a new node refreshing the last
> >>>       update/lifespan seems like just a bug we need to fix irrespective
> >>>       of this issue IMO.
> >>>
> >>>
> >>>           I'm not saying expiring entries on each node independently is
> >>>           perfect,
> >>>           far from it.  But I wouldn't want us to provide new
> >>>           guarantees that could hurt performance without a really good
> >>>           use case.
> >>>
> >>>
> >>>       I would guess that user-perceived performance should be a little
> >>>       faster with this.  But this also depends on the alternative that
> >>>       we decide on :)
> >>>
> >>>       Also the expiration thread pool is set to min priority atm, so it
> >>>       may delay removal of said objects, but hopefully (if the JVM
> >>>       supports it) it wouldn't overrun a CPU while processing unless it
> >>>       has availability.
> >>>
> >>>
> >>>           Cheers
> >>>           Dan
> >>>
> >>>
> >>>           On Mon, Jul 13, 2015 at 9:25 PM, Tristan Tarrant
> >>>           <ttarrant at redhat.com> wrote:
> >>>           > After re-reading the whole original thread, I agree with
> >>>           > the proposal with two caveats:
> >>>           >
> >>>           > - ensure that we don't break JCache compatibility
> >>>           > - ensure that we document this properly
> >>>           >
> >>>           > Tristan
> >>>           >
> >>>           > On 13/07/2015 18:41, Sanne Grinovero wrote:
> >>>           >> +1
> >>>           >> You had me convinced at the first line, although "A lot
> >>>           >> of code can now be removed and made simpler" makes it look
> >>>           >> extremely nice.
> >>>           >>
> >>>           >> On 13 Jul 2015 18:14, "William Burns"
> >>>           >> <mudokonman at gmail.com> wrote:
> >>>           >>
> >>>           >>     This is a necro of [1].
> >>>           >>
> >>>           >>     With Infinispan 8.0 we are adding in clustered
> >>>           >>     expiration.  That includes an expiration event that is
> >>>           >>     raised cluster-wide as well.  Unfortunately expiration
> >>>           >>     events currently occur multiple times (if numOwners >
> >>>           >>     1) at different times across nodes in a cluster.  This
> >>>           >>     makes coordinating a single cluster expiration event
> >>>           >>     quite difficult.
> >>>           >>
> >>>           >>     To work around this I am proposing that the expiration
> >>>           >>     of an entry is done solely by the owner of the given
> >>>           >>     key that is now expired.  This would fix the issue of
> >>>           >>     having multiple events, and the event can be raised
> >>>           >>     while holding the lock for the given key, so concurrent
> >>>           >>     modifications would not be an issue.
> >>>           >>
> >>>           >>     The problem arises when you have other nodes that have
> >>>           >>     expiration set but expire at different times.  Max idle
> >>>           >>     is the biggest offender here, as a read on an owner
> >>>           >>     only refreshes that owner's timestamp, meaning other
> >>>           >>     owners would not be updated and would expire
> >>>           >>     preemptively.  To have expiration work properly in this
> >>>           >>     case you would need coordination between the owners to
> >>>           >>     see if anyone has a higher value.  This requires
> >>>           >>     blocking, and it would have to be done while accessing
> >>>           >>     a key that is expired, to be sure whether expiration
> >>>           >>     happened or not.
> >>>           >>
> >>>           >>     The linked dev listing proposed instead to only expire
> >>>           >>     an entry via the reaper thread and not on access.  In
> >>>           >>     this case a read will return a non-null value until it
> >>>           >>     is fully expired, possibly increasing hit ratios.
> >>>           >>
> >>>           >>     There are quite a few real benefits to this:
> >>>           >>
> >>>           >>     1. Cluster cache reads would be much simpler and
> >>>           >>     wouldn't have to block to verify whether the object
> >>>           >>     exists or not, since this would only be done by the
> >>>           >>     reaper thread (note this would have only happened if
> >>>           >>     the entry was expired locally).  An access would just
> >>>           >>     return the value immediately.
> >>>           >>     2. Each node only expires entries it owns in the
> >>>           >>     reaper thread, reducing how many entries it must check
> >>>           >>     or remove.  This also provides a single point where
> >>>           >>     events would be raised, as we need.
> >>>           >>     3. A lot of code can now be removed and made simpler,
> >>>           >>     as it no longer has to check for expiration.  The
> >>>           >>     expiration check would only be done in one place, the
> >>>           >>     expiration reaper thread.
> >>>           >>
> >>>           >>     The main issue with this proposal, as the other
> >>>           >>     listing mentions, is if user code expects the value to
> >>>           >>     be gone after expiration for correctness.  I would say
> >>>           >>     this use case is not as compelling for maxIdle,
> >>>           >>     especially since we never supported it properly.  And
> >>>           >>     in the case of lifespan the user could very easily
> >>>           >>     store the expiration time in the object, which they
> >>>           >>     can check after a get, as pointed out in the other
> >>>           >>     thread.
> >>>           >>
> >>>           >>     [1]
> >>>           >>     http://infinispan-developer-list.980875.n3.nabble.com/infinispan-dev-strictly-not-returning-expired-values-td3428763.html
> >>>           >>
> >>>           >
> >>>           > --
> >>>           > Tristan Tarrant
> >>>           > Infinispan Lead
> >>>           > JBoss, a division of Red Hat
> >>>
> >>>
> >>>
> >>
>
>
> --
> Radim Vansa <rvansa at redhat.com>
> JBoss Performance Team
>