On 17/02/16 15:06, Pedro Igor Silva wrote:
I think it makes more sense not to spread administrative operations
across different nodes, but to run them only on the coordinator. That would make the design more
predictable and make life easier when something goes wrong, given that you know that
only a specific node is able to perform the operation.
Not sure how manual sync works, but in theory you can have a specific cache or just use a
known entry to propagate coordinator-related events. So when you trigger a sync you
don't really start the work, but indicate to the coordinator that a sync was
triggered. You still need the lock though, but that will be a coordinator-specific
thing only.
Yeah, some kind of "locking" is always needed, as the issue can be seen
even on a single node. When the locking mechanism is cluster-aware, even
better; regarding implementation it's not a big difference.
Regarding running every background task on the coordinator, the
disadvantage IMO is that the coordinator is under a bigger workload.
For future versions, we discussed some planned improvements like:
- Possibility to see the progress in admin console (how many users were
synced already, possibility to cancel the task etc.)
- Possibility to run the sync in a "distributable" manner, so that the
sync can always be started on the coordinator as you suggested, but the
coordinator will share the workload with other cluster nodes and
"manage" the work (i.e. node1 is supposed to do page1+page2 and node2
page3+page4 etc.). We already use something like this based on the Infinispan
distributed executor service [1], which is very cool stuff IMO. Don't
you think this is better regarding workload (and time of the task) than
always executing everything on the coordinator?
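To make the "node1 does page1+page2, node2 does page3+page4" idea concrete, here is a minimal sketch of how a coordinator might split pages into contiguous chunks. It is only an illustration: the class PagePlanner and the method assign are hypothetical names, and the sketch does not use the Infinispan distributed executor API or any Keycloak code. It assumes the node names are distinct.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper; not part of Keycloak or Infinispan. */
final class PagePlanner {

    /**
     * Splits pages 1..pageCount into contiguous chunks, one chunk per node,
     * following the order of the given node list. When the pages do not
     * divide evenly, the first nodes each receive one extra page.
     */
    static Map<String, List<Integer>> assign(List<String> nodes, int pageCount) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("at least one node is required");
        }
        if (pageCount < 0) {
            throw new IllegalArgumentException("pageCount must not be negative");
        }
        Map<String, List<Integer>> plan = new LinkedHashMap<>();
        int base = pageCount / nodes.size();   // pages every node gets
        int extra = pageCount % nodes.size();  // leftover pages for the first nodes
        int next = 1;                          // next page number to hand out
        for (int i = 0; i < nodes.size(); i++) {
            int size = base + (i < extra ? 1 : 0);
            List<Integer> pages = new ArrayList<>();
            for (int j = 0; j < size; j++) {
                pages.add(next++);
            }
            plan.put(nodes.get(i), pages);
        }
        return plan;
    }

    public static void main(String[] args) {
        // Two nodes and four pages, as in the example above.
        System.out.println(assign(Arrays.asList("node1", "node2"), 4));
    }
}
```

In a real cluster the coordinator would still have to ship each chunk to its node and collect the results; that part is left out here.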
Unfortunately we are in the "bugfixing" phase and just addressing bugs,
so it seems that all the other features will need to wait...
[1]
http://infinispan.org/docs/8.2.x/user_guide/user_guide.html#DistributedEx...
Marek
Regards.
Pedro Igor
----- Original Message -----
From: "Marek Posolda" <mposolda(a)redhat.com>
To: "Pedro Igor Silva" <psilva(a)redhat.com>
Cc: keycloak-dev(a)lists.jboss.org
Sent: Wednesday, February 17, 2016 11:48:43 AM
Subject: Re: [keycloak-dev] Concurrent sync in cluster
I was thinking about it. The thing is that we support both periodic and
manual sync, and the manual sync can be triggered on any cluster node.
You can even reproduce the issue in a non-cluster environment with a single host
if you trigger 2 syncs at the same time (or if a periodic sync is
in progress and you trigger a manual one etc.).
Triggering only on the coordinator should work for scheduled
periodic cleanup tasks though. We don't support manual triggering for
them. I wonder if I should change this to always trigger them just on the
coordinator.
Btw. I am not using any real long-lived lock, just a kind of
"pseudo-lock" (based on the presence of some particular item in the
cache, which is removed once the task is finished).
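A minimal sketch of such a presence-based "pseudo-lock" could look like the following. It is an assumption-laden illustration, not the actual ClusterProvider code: the class PseudoLock and the method executeIfNotRunning are hypothetical names, and a local ConcurrentHashMap stands in for the cluster cache, so it only models the single-node case.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical illustration; not Keycloak's ClusterProvider. */
final class PseudoLock {

    // Stand-in for the cluster cache (assumption: a plain local map).
    private final ConcurrentMap<String, Boolean> cache = new ConcurrentHashMap<>();

    /**
     * Runs the task unless an entry for taskKey is already present.
     * Returns true if the task ran, false if it was skipped.
     */
    boolean executeIfNotRunning(String taskKey, Runnable task) {
        // putIfAbsent is atomic: only one caller sees null and proceeds.
        if (cache.putIfAbsent(taskKey, Boolean.TRUE) != null) {
            return false;
        }
        try {
            task.run();
            return true;
        } finally {
            // Remove the marker once the task is finished, even on failure.
            cache.remove(taskKey);
        }
    }

    public static void main(String[] args) {
        PseudoLock lock = new PseudoLock();
        boolean[] nested = new boolean[1];
        // The nested call simulates a manual sync triggered while one is in progress.
        boolean outer = lock.executeIfNotRunning("sync-users", () -> {
            nested[0] = lock.executeIfNotRunning("sync-users", () -> { });
        });
        System.out.println("outer ran: " + outer + ", nested ran: " + nested[0]);
    }
}
```

The second trigger is simply skipped rather than queued, which matches the behaviour where clicking "Sync users" during a periodic sync does nothing.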
Marek
On 17/02/16 14:14, Pedro Igor Silva wrote:
> Instead of locking, could you identify the coordinator and only sync from federation
> from the corresponding node?
>
> Regards.
> Pedro Igor
>
> ----- Original Message -----
> From: "Marek Posolda" <mposolda(a)redhat.com>
> To: keycloak-dev(a)lists.jboss.org
> Sent: Wednesday, February 17, 2016 10:50:08 AM
> Subject: [keycloak-dev] Concurrent sync in cluster
>
> We had a bug https://issues.jboss.org/browse/KEYCLOAK-2412 where there
> are errors when sync of users from federationProvider is triggered
> concurrently on multiple cluster nodes. This affects periodic sync as well.
>
> To avoid concurrent executions of the same task, I've added ClusterProvider.
> This is based on Infinispan and it provides some locking functionality
> to ensure that sync from federation can be executed by just one cluster
> node at a time. Even on a single node (non-cluster setup), you now can't
> trigger sync multiple times concurrently. So for example if there is a
> periodic sync in progress and you click "Sync users" in the admin
> console, the sync won't happen.
>
> The same mechanism is now also used for scheduled tasks (removing
> expired user sessions and expired events). Nobody has reported any bug yet,
> however when removal of expired events/sessions is triggered
> concurrently by multiple cluster nodes, it can be an issue too. So this is now
> avoided. Maybe we can improve even more and ensure that just the cluster
> coordinator will run scheduled tasks and other nodes will just ignore them?
>
> ClusterProvider also adds the possibility to register a ClusterListener with
> any task that should be executed once a notification from any cluster
> node comes. This ensures that when some federation provider is
> created/updated/removed, all nodes are aware of the change and will
> immediately change (or remove) the scheduled timer.
>
> PR is here https://github.com/keycloak/keycloak/pull/2234
>
> Marek