[keycloak-dev] Issues with clustered invalidation caches
mposolda at redhat.com
Fri Nov 4 04:02:24 EDT 2016
On 03/11/16 14:09, Stian Thorgersen wrote:
> Sounds like a plan to me and I can't see us fixing it in a more
> trivial manner either.
> This also aligns better with what we've recently discussed with
> regards to cross DC support. Cross DC support would use JDG and have a
> replicated cache between DCs to send invalidation messages, which is
> exactly what you are proposing so this should be a step towards that.
> It may make sense that you work that in straight away. Basically we
> should support propagating invalidation messages by using an external
> Infinispan/JDG cluster as the store for the cache.
So it seems we may need something like an InvalidationManager SPI, as an
abstraction just for the actual transport (sending/receiving
invalidation messages)?
By default, it will just be a replicated Infinispan cache, but it will be
possible to use something completely different for the actual message
transport (JMS messaging, external JDG for cross DC support etc). WDYT?
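
To make the idea concrete, here is a minimal Java sketch of what such a transport abstraction could look like. Only the name InvalidationManager comes from the proposal above; the method names, the String message type and the in-memory implementation are assumptions for illustration, not existing Keycloak API. A real default provider would sit on top of the replicated Infinispan cache instead.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical SPI: it abstracts only the transport of invalidation messages.
// Method names and the String payload are assumptions made for this sketch.
interface InvalidationManager {
    // Publish a message such as "realm XY was updated" to all nodes.
    void send(String message);

    // Register a callback that is invoked for every received message.
    void addListener(Consumer<String> listener);
}

// Illustrative single-JVM stand-in for a real transport (replicated cache,
// JMS, external JDG). It delivers each message to all registered listeners.
final class InMemoryInvalidationManager implements InvalidationManager {
    private final List<Consumer<String>> listeners = new CopyOnWriteArrayList<>();

    @Override
    public void send(String message) {
        for (Consumer<String> listener : listeners) {
            listener.accept(message);
        }
    }

    @Override
    public void addListener(Consumer<String> listener) {
        listeners.add(listener);
    }
}
```

With an interface of this shape, the cache layer would only ever call send and addListener, so swapping the replicated cache for JMS or an external JDG cluster would not touch the eviction logic.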
> On 3 November 2016 at 12:06, Marek Posolda <mposolda at redhat.com>
> wrote:
> I was looking at the cache issue reported by a customer. I found the
> cause of it and a couple of other related issues:
> KEYCLOAK-3857 - Bad performance with clustered invalidation cache when
> updating object
> KEYCLOAK-3858 - Removing model object send lots of invalidation
> across cluster
> KEYCLOAK-3859 - Lots of userCache invalidation messages when
> invalidating realm
> KEYCLOAK-3860 - All realm users are invalidated from cache when
> some realm object
> In short, our cache works fine in local mode. But in a cluster, there
> are issues with the invalidation caches. We don't have issues with
> stale entries, but this is paid for with lots of various performance
> issues like:
> - There are lots of invalidation messages sent across the cluster
> - Eviction on the node which received the invalidation event is also
> ineffective. For example, evicting a realm with 1000 roles needs to call
> 1000 predicates, which iterate the cache 1000 times.
> - The invalidation cache doesn't allow us to distinguish the context in
> which the object was invalidated. For example, when I update realm
> settings on node1, I need to invalidate just the CachedRealm object, but
> not all the other objects dependent on the realm. However, the
> invalidation event received on node2 doesn't know if I invalidated
> CachedRealm because of a realm update or because of a realm removal. So
> for more safety, it assumes removal, which evicts all realm objects! See
> <https://issues.jboss.org/browse/KEYCLOAK-3857> for details.
> - Finally we have the workaround with the "invalidation.key" objects in
> our invalidation caches. This is currently needed because when
> invalidating an object on node1, the invalidation event is NOT received
> on node2 unless the object is there. Hence the workaround with the
> "invalidation.key" records just to avoid this limitation of
> To solve all these issues, I propose:
> - Instead of relying on invalidation caches, we will send a notification
> across the cluster about what happened (e.g. the message "realm XY was
> updated"). All the nodes will receive this notification, evict their
> locally cached objects accordingly and bump their revisions locally.
> This would be much more stable and performant, and will allow us to
> remove some workarounds.
> Some details:
> - The caches "realms" and "users" won't be "invalidation" caches, but
> they will be "local" caches.
> - When any object needs to be removed from the cache because of some
> event (e.g. updating a realm), a notification message will be sent from
> node1 to all other cluster nodes. We will use the replicated cache for
> that. Node1 will send a notification message like "realm XY was updated".
> - Other cluster nodes will receive this message and they will locally
> trigger evictions of all the objects dependent on particular realm. In
> case of realm update, it's just the CachedRealm object itself. In case
> of realm removal, it is all realm objects etc.
> - Note that the message will also contain the context: "realm XY was
> updated" or "realm XY was removed", not just "invalidate realm XY". This
> allows much more flexibility and in particular avoids issues like
> <https://issues.jboss.org/browse/KEYCLOAK-3857>.
> - We already have a replicated cache called "work", which we are using to
> notify other cluster nodes about various events. So we will just use
> this one. No need to add another replicated cache; we will probably just
> need to configure LRU eviction for the existing one.
> - Also note that messages will always be received. We won't need the
> workaround with "invalidation.key" objects anymore.
> - Also we don't need recursive evictions (which have very poor
> performance, see <https://issues.jboss.org/browse/KEYCLOAK-3857>),
> because the receiving node will know exactly what happened. It will
> remove objects in just the same way as the "sender" node.
> - Finally the amount of traffic sent across the cluster will be
> much lower.
> This sounds like a big step, but IMO it's not that bad :) Note that we
> already have all the predicates in place for individual objects. The
> only change will be about sending/receiving notifications across the
> cluster. I think I am able to prototype something by tomorrow to
> double-check that this approach works and then finish it sometime in the
> middle of next week. WDYT?
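
As a rough illustration of the notification scheme proposed in the quoted message, the following Java sketch shows how a receiving node could react differently to "realm XY was updated" and "realm XY was removed". All class, enum and method names are hypothetical, and the cache is modelled as a plain map from cache key to owning realm id, which is an assumption of this sketch and not how the Keycloak caches are actually structured.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical event kinds: the message carries context, not just "invalidate".
enum RealmEventKind { UPDATED, REMOVED }

// Hypothetical notification payload, e.g. "realm XY was updated".
final class RealmEvent {
    final RealmEventKind kind;
    final String realmId;

    RealmEvent(RealmEventKind kind, String realmId) {
        this.kind = kind;
        this.realmId = realmId;
    }
}

// Stand-in for one node's local cache: cache key -> id of the owning realm.
// Assumption of this sketch: the realm's own entry is keyed by the realm id.
final class LocalRealmCache {
    private final Map<String, String> realmByKey = new ConcurrentHashMap<>();

    void put(String key, String realmId) {
        realmByKey.put(key, realmId);
    }

    boolean contains(String key) {
        return realmByKey.containsKey(key);
    }

    // Called on every node that receives the notification.
    void onEvent(RealmEvent event) {
        if (event.kind == RealmEventKind.UPDATED) {
            // Update: evict only the realm entry itself.
            realmByKey.remove(event.realmId);
        } else {
            // Removal: evict everything owned by the realm in a single pass.
            realmByKey.values().removeIf(event.realmId::equals);
        }
    }
}
```

Because the receiver knows which of the two events happened, an update touches a single entry, and a removal needs one pass over the cache rather than one pass per dependent object.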