<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">On 11/11/15 15:51, Marek Posolda wrote:<br>
</div>
<blockquote cite="mid:564355E0.1040609@redhat.com" type="cite">
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
<div class="moz-cite-prefix">On 11/11/15 15:36, Stian Thorgersen
wrote:<br>
</div>
<blockquote
cite="mid:CAJgngAeLDWnyE=OZ5k+0uZT9gV+jNvs=r_2WA4acXQhCEMxF1g@mail.gmail.com"
type="cite">
<div dir="ltr"><br>
<div class="gmail_extra"><br>
<div class="gmail_quote">On 11 November 2015 at 15:23, Marek
Posolda <span dir="ltr"><<a moz-do-not-send="true"
href="mailto:mposolda@redhat.com" target="_blank">mposolda@redhat.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"><span class="">
<div>On 11/11/15 09:01, Stian Thorgersen wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr"><br>
<div class="gmail_extra"><br>
<div class="gmail_quote">On 10 November 2015
at 16:11, Marek Posolda <span dir="ltr"><<a
moz-do-not-send="true"
class="moz-txt-link-abbreviated"
href="mailto:mposolda@redhat.com">mposolda@redhat.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote"
style="margin:0 0 0 .8ex;border-left:1px
#ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"><span>
<div>On 09/11/15 14:09, Stian
Thorgersen wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr"><br>
<div class="gmail_extra"><br>
<div class="gmail_quote">On 9
November 2015 at 13:35,
Sebastien Blanc <span
dir="ltr"><<a
moz-do-not-send="true"
class="moz-txt-link-abbreviated"
href="mailto:sblanc@redhat.com">sblanc@redhat.com</a>></span> wrote:<br>
<blockquote
class="gmail_quote"
style="margin:0 0 0
.8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div dir="ltr">
<div>That would be really
nice indeed ! <br>
</div>
But are the markers files
not enough, instead of
also having a table in the
DB ?<br>
</div>
</blockquote>
<div><br>
</div>
<div>We need a way to prevent
multiple nodes in a cluster
from importing the same file. For
example on Kerberos you end
up spinning up multiple
instances of the same Docker
image. <br>
</div>
</div>
</div>
</div>
</blockquote>
</span> I bet you meant 'Kubernetes' <span><span>
:-) </span></span></div>
</blockquote>
<div><br>
</div>
<div>Yup</div>
<div> </div>
<blockquote class="gmail_quote"
style="margin:0 0 0 .8ex;border-left:1px
#ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"><span><span>
</span></span><br>
<br>
+1 for the improvements. Besides those, I
think that sooner or later we will
need to solve long-running export+import
where you want to import 100.000 users.
<br>
</div>
</blockquote>
<div><br>
</div>
<div>+1</div>
<div> </div>
<blockquote class="gmail_quote"
style="margin:0 0 0 .8ex;border-left:1px
#ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"> <br>
As I mentioned in another mail a few weeks
ago, we can have:<br>
<br>
1) Table with the progress (51.000 users
already imported, around 49.000
remaining etc.)<br>
</div>
</blockquote>
<div><br>
</div>
<div>We would still need to split into
multiple files in either case. Having a
single json file with 100K users is
probably not going to perform very well.
So what I proposed would actually work for
long-running import as well. If each file
has a manageable number of users (say ~5
min to import) then each file will be
marked as imported or failed. At least for
now I don't think we should do smaller
batches than one file. As long as one file
is imported within the same TX then it's
an all or nothing import.</div>
</div>
</div>
</div>
</blockquote>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div> </div>
<blockquote class="gmail_quote"
style="margin:0 0 0 .8ex;border-left:1px
#ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"> 2)
Concurrency and dividing the work among
cluster nodes (Node1 will import 50.000
users and node2 another 50.000 users)<br>
</div>
</blockquote>
<div><br>
</div>
<div>This would be solved as well. Each node
picks up a file that has not been processed yet,
marks it in the DB, and then
processes it.</div>
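<div>A minimal sketch of the "mark it in the DB, then process it" step described above. The in-memory map is only a stand-in for the proposed table, where the checksum primary key would play the same role; the names ImportClaims, tryClaim and owner are hypothetical.</div>
<pre>
```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for the import table: key = file checksum, value = claiming node.
class ImportClaims {
    private final ConcurrentMap<String, String> claimedBy = new ConcurrentHashMap<>();

    /**
     * Atomically claims a file for a node. Returns true only for the first caller;
     * with a real database the same effect would come from an INSERT that fails
     * on a duplicate primary key.
     */
    boolean tryClaim(String checksum, String nodeId) {
        return claimedBy.putIfAbsent(checksum, nodeId) == null;
    }

    /** Returns the node that claimed the file, or null if nobody has yet. */
    String owner(String checksum) {
        return claimedBy.get(checksum);
    }
}
```
</pre>
<div>A node would call tryClaim before it starts importing and simply skip the file when the call returns false.</div>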
<div> </div>
<blockquote class="gmail_quote"
style="margin:0 0 0 .8ex;border-left:1px
#ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"> 3)
Failover (the import won't be completely
broken if a cluster node crashes after
importing 90.000 users; it can continue on the
other cluster nodes)<br>
<br>
I think the work I did recently for
pre-loading offline sessions at startup
could be reused here too, and
it can handle (2) and (3). It can also
handle parallel imports triggered from
multiple cluster nodes. <br>
<br>
For example: currently, if you start
Kubernetes with 2 cluster nodes, both
nodes will start importing the same file at
the same time, because the import triggered
by node1 has not finished when
node2 starts, so there is no
DB record yet saying that the file is already
imported. With the work I did, only the
coordinator (node1) will start the
import. Node2 will wait until the import
triggered by node1 is finished, but in
the meantime it can "help" by importing
some users (pages) if the coordinator asks
it to. This implementation is based on the
Infinispan distributed executor service
<a moz-do-not-send="true"
class="moz-txt-link-freetext"
href="http://infinispan.org/docs/5.3.x/user_guide/user_guide.html#_distributed_execution_framework">http://infinispan.org/docs/5.3.x/user_guide/user_guide.html#_distributed_execution_framework</a>
.</div>
</blockquote>
<div><br>
</div>
<div>The DB record needs to be created
before a node tries to import the file,
including a timestamp of when it started the
import. It should then be updated with the
result once the import is completed.
Using the distributed execution framework
sounds like a good idea though. How do you
prevent scheduling the same job multiple
times? For example, if all nodes scan the
import folder on startup and simply import
everything they find, then the same job
will be scheduled multiple times. Not really a big
deal, as the first thing the job should do
is check if there's a record in the DB
already.</div>
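<div>A rough sketch of the record lifecycle described above: created with a start timestamp before the import, then updated with the result. The class and field names are hypothetical, and the isAbandoned check with a caller-supplied timeout is an assumption about how a record left behind by a crashed node might be detected.</div>
<pre>
```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical shape of one row in the import table.
class ImportRecord {
    enum Status { IN_PROGRESS, IMPORTED, FAILED }

    final String checksum;    // primary key in the proposal
    final String nodeId;      // node that started the import
    final Instant startedAt;  // written before the import begins
    private Status status = Status.IN_PROGRESS;

    ImportRecord(String checksum, String nodeId, Instant startedAt) {
        this.checksum = checksum;
        this.nodeId = nodeId;
        this.startedAt = startedAt;
    }

    /** Updates the record once the import has finished, successfully or not. */
    void complete(boolean success) {
        status = success ? Status.IMPORTED : Status.FAILED;
    }

    Status status() {
        return status;
    }

    /** True if the import never completed and has been running longer than the timeout. */
    boolean isAbandoned(Instant now, Duration timeout) {
        return status == Status.IN_PROGRESS
                && Duration.between(startedAt, now).compareTo(timeout) > 0;
    }
}
```
</pre>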
</div>
</div>
</div>
</blockquote>
</span> With the distributed executor, it's the cluster
coordinator that decides which node imports
what. It will send messages to cluster nodes like
"Hey, please import the file testrealm-users-3.json
with timestamp abcd123". <br>
<br>
After a node finishes the job, it notifies the coordinator,
and the coordinator inserts the DB record and marks it as
finished. So no DB record is inserted before
a node starts the import, because the whole coordination is
handled by the coordinator. Also, the same file will never be
imported multiple times by different cluster
nodes. <br>
<br>
The only exception would be if a cluster node crashes before
the import is finished. Then the file needs to be reimported by
another cluster node, but that's the case with DB locks
as well.<br>
<br>
IMO the DB locks approach doesn't handle the crash of
a cluster node well. For example, when node2 crashes
unexpectedly while importing the file
testrealm-users-3.json, the DB lock is still held by this
node, so other cluster nodes can't start importing
the file (until a timeout occurs).<br>
<br>
On the other hand, the distributed executor approach may
have issues if the content of the
standalone/import directory is inconsistent among cluster nodes.
However, this can be solved: each node sends
the checksums of the files it has, and the coordinator
ensures that the file with checksum "abcd123"
is assigned only to a node which has this file.</div>
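<div>A sketch of the checksum-based assignment suggested above, under the assumption that the coordinator receives a map from node id to the checksums that node reports. Picking the least-loaded eligible node is an illustrative choice, and ChecksumAssigner is a hypothetical name; neither comes from the thread or from Infinispan.</div>
<pre>
```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical coordinator-side helper.
class ChecksumAssigner {
    /**
     * nodeFiles maps a node id to the checksums of the files that node has.
     * Returns checksum -> node, where every checksum is assigned exactly once
     * and only to a node that reported having the file.
     */
    static Map<String, String> assign(Map<String, Set<String>> nodeFiles) {
        Map<String, String> assignment = new TreeMap<>();
        Map<String, Integer> load = new HashMap<>();

        // Sorted iteration keeps the result deterministic.
        Set<String> allChecksums = new TreeSet<>();
        for (Set<String> checksums : nodeFiles.values()) {
            allChecksums.addAll(checksums);
        }

        for (String checksum : allChecksums) {
            String best = null;
            for (String node : new TreeSet<>(nodeFiles.keySet())) {
                if (!nodeFiles.get(node).contains(checksum)) {
                    continue; // this node does not have the file
                }
                if (best == null || load.getOrDefault(node, 0) < load.getOrDefault(best, 0)) {
                    best = node;
                }
            }
            assignment.put(checksum, best);
            load.merge(best, 1, Integer::sum);
        }
        return assignment;
    }
}
```
</pre>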
</blockquote>
<div><br>
</div>
<div>With Docker/Kubernetes all nodes would have the same
files. At least initially. Would be nice if we could
come up with a solution where you can just drop an
additional file onto any node and have it imported.</div>
</div>
</div>
</div>
</blockquote>
Exactly, was thinking about Docker too. Here we don't have any
issue at all.<br>
<br>
The main question here is: do we want to support the scenario where
different cluster nodes have different content? As I mentioned, the
distributed coordinator can handle it: each cluster node
sends the checksums of the files it has, and the coordinator
only ever assigns to a node the checksums which that node has.<br>
<br>
However, regardless of whether we use the distributed executor approach or the DB locks
approach, there may still be issues. For example:<br>
1) The file testrealm.json with checksum "abc" is triggered for
import on node1<br>
2) At the same time, an admin makes a minor change to this file
on node2 and saves it. This means that the checksum of the file on
node2 changes to "def"<br>
3) Node2 triggers an import of that file. So we have both node1
and node2 importing the same file concurrently, because the previously
acquired lock was for the "abc" checksum, but now the checksum is "def" <br>
<br>
This problem exists with both the DB lock and DistributedExecutor
approaches though...<br>
</blockquote>
A possible solution for this issue is that, while an import is already in
progress, newly added or changed checksums are ignored. The
checksums are always checked only at the start of the import.<br>
<br>
Marek<br>
<blockquote cite="mid:564355E0.1040609@redhat.com" type="cite"> <br>
Marek<br>
<blockquote
cite="mid:CAJgngAeLDWnyE=OZ5k+0uZT9gV+jNvs=r_2WA4acXQhCEMxF1g@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div> </div>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"><span
class="HOEnZb"><font color="#888888"><br>
<br>
Marek</font></span>
<div>
<div class="h5"><br>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div> </div>
<blockquote class="gmail_quote"
style="margin:0 0 0 .8ex;border-left:1px
#ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"><span><font
color="#888888"><br>
<br>
Marek</font></span>
<div>
<div><br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div> </div>
<blockquote
class="gmail_quote"
style="margin:0 0 0
.8ex;border-left:1px
#ccc
solid;padding-left:1ex">
<div dir="ltr"> <br>
</div>
<div class="gmail_extra"><br>
<div
class="gmail_quote">
<div>
<div>On Mon, Nov
9, 2015 at 1:20
PM, Stian
Thorgersen <span
dir="ltr"><<a
moz-do-not-send="true" class="moz-txt-link-abbreviated"
href="mailto:sthorger@redhat.com">sthorger@redhat.com</a>></span>
wrote:<br>
</div>
</div>
<blockquote
class="gmail_quote"
style="margin:0 0
0
.8ex;border-left:1px
#ccc
solid;padding-left:1ex">
<div>
<div>
<div dir="ltr">Currently
we support
importing a
complete realm
definition
using the
import/export
feature.
Issues with
the current
approach are:
<div><br>
</div>
<div>* Only
complete realm
- not possible
to add to an
existing realm</div>
<div>* No good
feedback if
import was
successful or
not</div>
<div>* Use of
system
properties to
initiate the
import is not
very user
friendly</div>
<div>* Not
very elegant
for
provisioning.
For example a
Docker image
that wants to
bundle some
initial setup
ends up always
running the
import of a
realm, which
is skipped if
realm exists</div>
<div><br>
</div>
<div>To solve
this I've come
up with the
following
proposal:</div>
<div><br>
</div>
<div>Allow
dropping
representations
to be imported
into
'standalone/import'.
This should
support
creating a new
realm as well
as importing
into an
existing
realm. When
importing into
an existing
realm we will
have an import
strategy that
is used to
configure what
happens if a
resource
exists (user,
role, identity
provider, user
federation
provider). The
import
strategies
are:</div>
<div><br>
</div>
<div>* Skip -
existing
resources are
skipped,</div>
<div>* Fail -
if any
resource
exists nothing
is imported</div>
<div>*
Overwrite -
any existing
resources are
deleted.</div>
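<div>The three strategies above can be sketched as a small enum plus a planning function. This is only an illustration under assumptions: resources are reduced to plain names, and ImportStrategy, ImportPlanner and resourcesToWrite are hypothetical names, not part of Keycloak.</div>
<pre>
```java
import java.util.Arrays;

// Hypothetical names: the proposal describes the strategies only in prose.
enum ImportStrategy { SKIP, FAIL, OVERWRITE }

class ImportPlanner {
    /**
     * Returns the names of the incoming resources that would be written.
     * An empty result under FAIL means that nothing is imported.
     */
    static String[] resourcesToWrite(String[] existing, String[] incoming,
                                     ImportStrategy strategy) {
        switch (strategy) {
            case SKIP:
                // Existing resources are left untouched; only new ones are written.
                return Arrays.stream(incoming)
                        .filter(name -> !Arrays.asList(existing).contains(name))
                        .toArray(String[]::new);
            case FAIL:
                // A single conflict rejects the whole import.
                boolean conflict = Arrays.stream(incoming)
                        .anyMatch(name -> Arrays.asList(existing).contains(name));
                return conflict ? new String[0] : incoming.clone();
            case OVERWRITE:
                // Existing resources are deleted and replaced, so everything is written.
                return incoming.clone();
            default:
                throw new IllegalArgumentException("Unknown strategy: " + strategy);
        }
    }
}
```
</pre>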
<div><br>
</div>
<div>The
directory will
be scanned at
startup, but
there will
also be an
option to
monitor this
directory at
runtime.</div>
<div><br>
</div>
<div>To prevent a file
being imported multiple times
(also to make sure only one
node in a cluster imports) we
will have a table in the
database that records which
files were imported, from
which node, the date and
the result (including a
list of which resources
were imported, which were
not, and a stack trace if
applicable). The primary
key will be the checksum
of the file. We will also
add marker files
(<json file>.imported
or <json file>.failed).
The contents of the marker
files will be a json object
with the date imported, the
outcome (including a stack
trace if applicable) as
well as a complete list
of which resources were
successfully imported and
which were not.</div>
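<div>A small sketch of the two identifiers mentioned above: the file checksum used as the primary key and the marker file name. The proposal does not name a hash algorithm, so SHA-256 is an assumption, and ImportFileId is a hypothetical helper name.</div>
<pre>
```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper; the hash algorithm is an assumption.
class ImportFileId {
    /** Returns the lowercase hex SHA-256 digest of the file content. */
    static String checksum(byte[] content) {
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : sha256.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256.
            throw new IllegalStateException(e);
        }
    }

    /** Builds the marker file name by appending .imported or .failed to the json file name. */
    static String markerName(String jsonFileName, boolean success) {
        return jsonFileName + (success ? ".imported" : ".failed");
    }
}
```
</pre>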
<div><br>
</div>
<div>The files
will also
allow
resolving
system
properties and
environment
variables. For
example:</div>
<div><br>
</div>
<div>{</div>
<div>
"secret":
"${env.MYCLIENT_SECRET}"</div>
<div>}</div>
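<div>A minimal sketch of how such placeholders might be resolved, assuming ${env.NAME} reads an environment variable (as in the example above) and a bare ${name} reads a system property. The bare syntax, the name PlaceholderResolver and the choice to leave unknown placeholders untouched are assumptions, not something the proposal specifies. Lookups go through java.util.Properties so the sketch can be tried without touching the real environment.</div>
<pre>
```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical resolver for ${env.NAME} and ${name} placeholders.
class PlaceholderResolver {
    // Group 1 is the optional "env." prefix, group 2 is the variable or property name.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(env\\.)?([^}]+)\\}");

    static String resolve(String text, Properties env, Properties sys) {
        Matcher matcher = PLACEHOLDER.matcher(text);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            boolean isEnv = matcher.group(1) != null;
            String value = (isEnv ? env : sys).getProperty(matcher.group(2));
            // Unknown placeholders are left as they are (an assumption of this sketch).
            String replacement = value != null ? value : matcher.group();
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```
</pre>
<div>In a real run the two lookups could be filled from System.getenv() and System.getProperties().</div>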
<div><br>
</div>
<div>This will
be very
convenient for
example with
Docker as it
would be very
easy to create
a Docker image
that extends
ours to add a
few clients
and users.</div>
<div><br>
</div>
<div>It will
also be
convenient for
examples as it
will make it
possible to
add the
required
clients and
users to an
existing
realm.</div>
<div><br>
</div>
<div><br>
</div>
</div>
<br>
</div>
</div>
_______________________________________________<br>
keycloak-dev
mailing list<br>
<a
moz-do-not-send="true"
class="moz-txt-link-abbreviated"
href="mailto:keycloak-dev@lists.jboss.org">keycloak-dev@lists.jboss.org</a><br>
<a
moz-do-not-send="true"
class="moz-txt-link-freetext"
href="https://lists.jboss.org/mailman/listinfo/keycloak-dev">https://lists.jboss.org/mailman/listinfo/keycloak-dev</a><br>
</blockquote>
</div>
<br>
</div>
</blockquote>
</div>
<br>
</div>
</div>
<br>
<fieldset></fieldset>
<br>
</blockquote>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</blockquote>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</blockquote>
<br>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<br>
</blockquote>
<br>
</body>
</html>