<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">Hi Randall,<br>
<br>
Infinispan supports both push and pull access models. The push
model is supported by events (and listeners), which are
cluster-wide and are available in both library and remote (Hot Rod)
mode. The notification system is fairly advanced: it has a filtering
mechanism that can use either a hand-coded filter/converter
or one specified in JPQL (experimental at the moment). Getting a
snapshot of the initial data is also possible. But Infinispan does
not produce a transaction log that could be used to determine all
changes that happened since a previous connection time, so you'll
always have to get a new full snapshot when reconnecting. <br>
<br>
So if Infinispan is the data store I would base the Debezium
connector implementation on Infinispan's event notification
system. Not sure about the other use case though.<br>
<br>
Adrian<br>
<br>
On 07/09/2016 04:38 PM, Randall Hauch wrote:<br>
</div>
<blockquote
cite="mid:2569A6BA-FBC2-40A7-A821-26676F10BEB0@redhat.com"
type="cite">
The Debezium project [1] is working on building change data
capture connectors for a variety of databases. MySQL is available
now, MongoDB will be soon, and PostgreSQL and Oracle are next on
our roadmap.
<div class=""><br class="">
</div>
<div class="">One way in which Debezium and Infinispan can be used
together is when Infinispan is being used as a cache for data
stored in a database. In this case, Debezium can capture the
changes to the database and produce a stream of events; a
separate process can consume these changes and evict entries from
an Infinispan cache.</div>
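<div class="">The eviction flow described above can be sketched in plain Java. This is only an illustration under stated assumptions: RowChange, EvictableCache, and CacheInvalidator are hypothetical names (not Debezium or Infinispan APIs), and the "table:primaryKey" cache key scheme is assumed for the example.</div>
<pre wrap="">
```java
// Hypothetical change event derived from the database's log: identifies the changed row.
final class RowChange {
    final String table;
    final String primaryKey;

    RowChange(String table, String primaryKey) {
        this.table = table;
        this.primaryKey = primaryKey;
    }
}

// Hypothetical minimal view of the cache: the consumer only needs to evict by key.
interface EvictableCache {
    void evict(String cacheKey);
}

// Stand-in for the separate process that consumes change events and evicts cache entries.
final class CacheInvalidator {
    private final EvictableCache cache;
    private int evictions = 0;

    CacheInvalidator(EvictableCache cache) {
        this.cache = cache;
    }

    // Assumed key scheme for this sketch: "table:primaryKey".
    static String cacheKeyFor(RowChange change) {
        return change.table + ":" + change.primaryKey;
    }

    // Evict the cached copy of the changed row so the next read reloads it from the database.
    void onChange(RowChange change) {
        cache.evict(cacheKeyFor(change));
        evictions++;
    }

    int evictions() {
        return evictions;
    }
}

final class CacheInvalidationDemo {
    public static void main(String[] args) {
        // The lambda stands in for a real cache and just reports what would be evicted.
        CacheInvalidator invalidator =
                new CacheInvalidator(key -> System.out.println("evict " + key));
        invalidator.onChange(new RowChange("customers", "42"));
        invalidator.onChange(new RowChange("orders", "7"));
    }
}
```
</pre>
<div class="">Evicting rather than updating keeps the consumer simple: it never needs the new row contents, only the identity of the row that changed.</div>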
<div class=""><br class="">
</div>
<div class="">If Infinispan is to be used as a data store, then it
would be useful for Debezium to be able to capture those changes
so other apps/services can consume the changes. First of all,
does this make sense? Secondly, if it does, then Debezium would
need an Infinispan connector, and it's not clear to me how that
connector might capture the changes from Infinispan.<br class="">
<div class=""><br class="">
</div>
<div class="">Debezium typically monitors the log of
transactions/changes that are committed to a database. Of
course how this works varies for each type of database. For
example, MySQL internally produces a transaction log that
contains information about every committed row change, and
MySQL ensures that every committed change is included and that
non-committed changes are excluded. The MySQL mechanism is
actually part of the replication mechanism, so slaves update
their internal state by reading the master's log. The Debezium
MySQL connector [2] simply reads the same log.</div>
<div class=""><br class="">
</div>
<div class="">Infinispan has several mechanisms that may be
useful:</div>
<div class=""><br class="">
</div>
<div class="">
<ul class="MailOutline">
<li class="">Interceptors - See [3]. This seems pretty
straightforward and IIUC provides access to all internal
operations. However, it's not clear to me whether a single
interceptor will see all the changes in a cluster (perhaps
in local and replicated modes) or only those changes that
happen on that particular node (in distributed mode). It's
also not clear whether this interceptor is called within
the context of the cache's transaction; if a failure
happens at just the wrong time, could a change be
made to the cache but not be seen by the interceptor (or
vice versa)?</li>
<li class="">Cross-site replication - See [4][5]. A
potential advantage of this mechanism appears to be that
it is defined (more) globally, and it appears to function
if the remote backup comes back online after being offline
for a period of time.</li>
<li class="">State transfer - is it possible to participate
as a non-active member of the cluster, and to effectively
read all state transfer activities that occur within the
cluster?</li>
<li class="">Cache store - tie into the cache store
mechanism, perhaps by wrapping an existing cache store and
sitting between the cache and the cache store</li>
<li class="">Monitor the cache store - don't monitor
Infinispan at all, and instead monitor the store in which
Infinispan is storing entries. (This is probably the least
attractive, since some stores can't be monitored, or
because the store is persisting an opaque binary value.)</li>
</ul>
<div class=""><br class="">
</div>
</div>
<div class="">Are there other mechanisms that might be used?</div>
<div class=""><br class="">
</div>
<div class="">There are a couple of important requirements for
change data capture to be able to work correctly:</div>
<div class=""><br class="">
</div>
<div class="">
<ol class="MailOutline">
<li class="">Upon initial connection, the CDC connector must
be able to obtain a snapshot of all existing data,
followed by seeing all changes to data that may have
occurred since the snapshot was started. If the connector
is stopped/fails, upon restart it needs to be able to
reconnect and either see all changes that occurred since
it last was capturing changes, or perform a snapshot.
(Performing a snapshot upon restart is very inefficient
and undesirable.) This works as follows: the CDC connector
only records the "offset" in the source's sequence of
events; what this "offset" entails depends on the source.
Upon restart, the connector can use this offset
information to coordinate with the source where it wants
to start reading. (In MySQL and PostgreSQL, every event
includes the filename of the log and position in that
file. MongoDB includes in each event the monotonically
increasing timestamp of the transaction.)</li>
<li class="">No change can be missed, even when things go
wrong and components crash.</li>
<li class="">When a new entry is added, the "after" state of
the entity will be included. When an entry is updated, the
"after" state will be included in the event; if possible,
the event should also include the "before" state. When an
entry is removed, the "before" state should be included in
the event.</li>
</ol>
</div>
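<div class="">The offset and before/after requirements above can be sketched in plain Java. This is a minimal illustration, not a connector design: ChangeLog, ChangeEvent, ChangeType, and OffsetTrackingConnector are hypothetical names (not Debezium or Infinispan APIs), an in-memory array stands in for a source's replayable log, and a simple counter is assumed as the offset.</div>
<pre wrap="">
```java
import java.util.Arrays;

enum ChangeType { CREATE, UPDATE, REMOVE }

// Hypothetical event shape: "before" is null for a new entry, "after" is null for a removal.
final class ChangeEvent {
    final long offset;   // position of this event in the source's sequence of events
    final String key;
    final String before;
    final String after;

    ChangeEvent(long offset, String key, String before, String after) {
        this.offset = offset;
        this.key = key;
        this.before = before;
        this.after = after;
    }

    ChangeType type() {
        if (before == null) return ChangeType.CREATE;
        if (after == null) return ChangeType.REMOVE;
        return ChangeType.UPDATE;
    }
}

// Stand-in for a source that keeps an ordered, replayable log of committed changes.
final class ChangeLog {
    private ChangeEvent[] events = new ChangeEvent[0];

    // Append a committed change; its offset is its index in the log.
    void commit(String key, String before, String after) {
        int offset = events.length;
        events = Arrays.copyOf(events, offset + 1);
        events[offset] = new ChangeEvent(offset, key, before, after);
    }

    // Events with an offset strictly greater than lastSeen; pass -1 to read from the start.
    ChangeEvent[] readAfter(long lastSeen) {
        int from = (int) Math.max(0, Math.min(events.length, lastSeen + 1));
        return Arrays.copyOfRange(events, from, events.length);
    }
}

// Connector that remembers only the offset of the last event it handled.
final class OffsetTrackingConnector {
    private long lastOffset;
    private int handled = 0;

    OffsetTrackingConnector(long savedOffset) {
        this.lastOffset = savedOffset;
    }

    void poll(ChangeLog log) {
        for (ChangeEvent event : log.readAfter(lastOffset)) {
            handled++;                  // a real connector would forward the event here
            lastOffset = event.offset;  // and persist the offset after forwarding succeeds
        }
    }

    long lastOffset() { return lastOffset; }

    int handled() { return handled; }
}

final class OffsetResumeDemo {
    public static void main(String[] args) {
        ChangeLog log = new ChangeLog();
        log.commit("k1", null, "v1");
        log.commit("k1", "v1", "v2");

        OffsetTrackingConnector first = new OffsetTrackingConnector(-1);
        first.poll(log);
        long saved = first.lastOffset();

        log.commit("k1", "v2", null);   // committed while the connector is down

        // After a restart the connector resumes from the saved offset, with no new snapshot.
        OffsetTrackingConnector restarted = new OffsetTrackingConnector(saved);
        restarted.poll(log);
        System.out.println("events handled after restart: " + restarted.handled());
    }
}
```
</pre>
<div class="">The sketch only works because the source can replay events from an arbitrary offset; a source that offers notifications but no such log would force the restarted connector back to a full snapshot.</div>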
<div class=""><br class="">
</div>
<div class="">Any thoughts or advice would be greatly
appreciated.</div>
<div class=""><br class="">
</div>
<div class="">Best regards,</div>
<div class=""><br class="">
</div>
<div class="">Randall</div>
<div class=""><br class="">
</div>
<div class=""><br class="">
</div>
<div class="">[1] <a moz-do-not-send="true"
href="http://debezium.io" class="">http://debezium.io</a></div>
<div class="">[2] <a moz-do-not-send="true"
href="http://debezium.io/docs/connectors/mysql/" class="">http://debezium.io/docs/connectors/mysql/</a></div>
<div class="">[3] <a moz-do-not-send="true"
href="http://infinispan.org/docs/stable/user_guide/user_guide.html#_custom_interceptors_chapter"
class="">http://infinispan.org/docs/stable/user_guide/user_guide.html#_custom_interceptors_chapter</a></div>
<div class="">[4] <a moz-do-not-send="true"
href="http://infinispan.org/docs/stable/user_guide/user_guide.html#CrossSiteReplication"
class="">http://infinispan.org/docs/stable/user_guide/user_guide.html#CrossSiteReplication</a></div>
<div class="">[5] <a moz-do-not-send="true"
href="https://github.com/infinispan/infinispan/wiki/Design-For-Cross-Site-Replication"
class="">https://github.com/infinispan/infinispan/wiki/Design-For-Cross-Site-Replication</a></div>
</div>
<br>
<br>
<pre wrap="">_______________________________________________
infinispan-dev mailing list
<a class="moz-txt-link-abbreviated" href="mailto:infinispan-dev@lists.jboss.org">infinispan-dev@lists.jboss.org</a>
<a class="moz-txt-link-freetext" href="https://lists.jboss.org/mailman/listinfo/infinispan-dev">https://lists.jboss.org/mailman/listinfo/infinispan-dev</a></pre>
</blockquote>
<br>
</body>
</html>