Thanks Manik! <br><br>Israel Lacerra<br><br><div class="gmail_quote">On Wed, May 5, 2010 at 6:27 AM, Manik Surtani <span dir="ltr">&lt;<a href="mailto:manik@jboss.org">manik@jboss.org</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">


Hi there<br>

<div class="im"><br>

On 4 May 2010, at 20:42, Israel Lacerra wrote:<br>

<br>

&gt; I&#39;m studying ISPN-200 because I&#39;m thinking about resolving this issue as my M.Sc. topic. I&#39;d like to ask a couple of questions about it (and maybe they don&#39;t make sense):<br>

&gt;<br>

&gt; - Currently, if we have &quot;-Dinfinispan.query.indexLocalOnly=true&quot;, the indexes are just local, right? And if &quot;-Dinfinispan.query.indexLocalOnly=false&quot;, the indexes are global and shared. Am I right?<br>


<br>

</div>Yes.  Basically Lucene handles and stores the indexes.  Now you could have 2 scenarios.  Scenario 1: where each node has its own private, non-shared set of indexes.  Scenario 2: there is a shared, global index, where each node writes to and updates this global index (perhaps stored on NFS, etc).  The relevant scenario depends on how you configure Lucene.<br>


<br>

Now the switch in Infinispan controls which node(s) in the cluster actually do the indexing whenever data changes in the cluster.  If you have configured Lucene to maintain non-shared indexes, then *every* node in the cluster needs to update its own private index whenever any entry changes, anywhere in the cluster.  -Dinfinispan.query.indexLocalOnly=false will force Infinispan nodes to index changes that happen anywhere in the cluster.<br>


<br>

If the indexes are global and shared, then there is no need for each node to update the indexes.  Only the node that initiated the change should update the indexes, and -Dinfinispan.query.indexLocalOnly=true will force this behaviour.<br>
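<br>
To make the two settings concrete, here is a minimal Java sketch of the decision the switch controls.  The class and method names (IndexingPolicy, shouldIndex) are hypothetical and not Infinispan API; only the system property name comes from the discussion above.  Treating an unset property as false is a property of this sketch, not a statement about Infinispan&#39;s default.<br>

```java
// Hypothetical helper, not part of Infinispan: it models the decision only.
final class IndexingPolicy {
    private final boolean indexLocalOnly;

    IndexingPolicy(boolean indexLocalOnly) {
        this.indexLocalOnly = indexLocalOnly;
    }

    // Reads -Dinfinispan.query.indexLocalOnly.
    // Boolean.getBoolean returns false when the property is unset (sketch behaviour only).
    static IndexingPolicy fromSystemProperty() {
        return new IndexingPolicy(Boolean.getBoolean("infinispan.query.indexLocalOnly"));
    }

    // originLocal is true when this node initiated the change.
    // indexLocalOnly=true  -> only the originating node indexes (shared index).
    // indexLocalOnly=false -> every node indexes every change (private indexes).
    boolean shouldIndex(boolean originLocal) {
        return originLocal || !indexLocalOnly;
    }
}
```

With indexLocalOnly=true, a node that merely receives a replicated change skips indexing; with indexLocalOnly=false, it indexes the change as well.<br>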


<div class="im"><br>

&gt; - So, how will ISPN-200 work in these two cases?<br>

<br>

</div>As for ISPN-200, this is part of what we need to think about.  Ideally, the only approach that will truly scale is for each node to maintain neither a fully shared nor a fully private index, but a fragment of the global index: one that pertains to just the data it owns.  So, assume we have this setup with 4 nodes:<br>


<br>

Caches: {A, B, C, D}<br>

<br>

Keys:<br>

<br>

K1 -&gt; {A, B}<br>

K2 -&gt; {B, C}<br>

K3 -&gt; {C, D}<br>

<br>

A&#39;s index would have {K1}<br>

B&#39;s index would have {K1, K2}<br>

C&#39;s index would have {K2, K3}<br>

D&#39;s index would have {K3}<br>
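<br>
The mapping above is just the key-to-owners table inverted.  A rough Java sketch, assuming ownership is available as a plain map (IndexFragments and byNode are made-up names for illustration, not existing Infinispan classes):<br>

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration: derives each node's index fragment from key ownership.
final class IndexFragments {

    // Inverts "key -> owning nodes" into "node -> keys that node indexes".
    static Map<String, Set<String>> byNode(Map<String, List<String>> owners) {
        Map<String, Set<String>> fragments = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : owners.entrySet()) {
            for (String node : entry.getValue()) {
                fragments.computeIfAbsent(node, n -> new TreeSet<>()).add(entry.getKey());
            }
        }
        return fragments;
    }
}
```

Calling IndexFragments.byNode(Map.of("K1", List.of("A", "B"), "K2", List.of("B", "C"), "K3", List.of("C", "D"))) gives {A=[K1], B=[K1, K2], C=[K2, K3], D=[K3]}, matching the list above.<br>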

<br>

So if we were to write a query that matches K1, that query would be sent to every node in the cluster and the results returned would look like:<br>

<br>

A: {K1}<br>

B: {K1}<br>

C: {}<br>

D: {}<br>

<br>

Similarly, if we were to write a query that matches K1 and K2, that query would be sent to every node in the cluster and the results returned would look like:<br>

<br>

A: {K1}<br>

B: {K1, K2}<br>

C: {K2}<br>

D: {}<br>
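<br>
The scatter step in both examples can be sketched as follows.  This is only a model: a real query would be a Lucene query evaluated against each node&#39;s fragment, whereas here the query is represented simply by the set of keys it matches, and ScatterQuery is a hypothetical name.<br>

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration: every node answers a query from its own index fragment.
final class ScatterQuery {

    // fragments: node -> keys in that node's index.
    // matchingKeys: stand-in for the query, i.e. the keys it would match.
    // Returns node -> partial result (possibly empty) for that node.
    static Map<String, Set<String>> run(Map<String, Set<String>> fragments,
                                        Set<String> matchingKeys) {
        Map<String, Set<String>> partials = new TreeMap<>();
        for (Map.Entry<String, Set<String>> node : fragments.entrySet()) {
            Set<String> hits = new TreeSet<>(node.getValue());
            hits.retainAll(matchingKeys); // keep only keys this node indexes AND the query matches
            partials.put(node.getKey(), hits);
        }
        return partials;
    }
}
```

With the four fragments above, a query matching K1 and K2 yields {A=[K1], B=[K1, K2], C=[K2], D=[]}, so K1 and K2 each come back twice.<br>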

<br>

Now the tricky part will be to efficiently collate these partial results into a proper result set to pass back to the user, including removing duplicates, proper ranking and ordering, etc.<br>
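<br>
As a starting point for discussion, a naive collation could look like the sketch below.  It assumes each node returns (key, score) pairs and that scores from different nodes are directly comparable, which may well not hold when every fragment has its own index statistics.  ResultCollator and Hit are hypothetical names, and the scores are invented.<br>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of collating partial results from several nodes.
final class ResultCollator {

    // One hit as reported by a single node; the score is an assumed relevance value.
    static final class Hit {
        final String key;
        final float score;

        Hit(String key, float score) {
            this.key = key;
            this.score = score;
        }
    }

    // Merges per-node hit lists into one ordered list of keys:
    // duplicates collapse to a single entry (highest score wins),
    // ordering is by descending score, then by key to keep the result stable.
    static List<String> collate(List<List<Hit>> partials) {
        Map<String, Float> best = new HashMap<>();
        for (List<Hit> nodeHits : partials) {
            for (Hit hit : nodeHits) {
                best.merge(hit.key, hit.score, Float::max);
            }
        }
        List<String> keys = new ArrayList<>(best.keySet());
        keys.sort((a, b) -> {
            int byScore = Float.compare(best.get(b), best.get(a)); // descending score
            return byScore != 0 ? byScore : a.compareTo(b);
        });
        return keys;
    }
}
```

This holds every hit in memory and ignores paging, so it only illustrates the de-duplication and ordering steps, not the efficiency concern.<br>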

<br>

Hope this helps!<br>

<div><div></div><div class="h5"><br>

Cheers<br>

Manik<br>

<br>

--<br>

Manik Surtani<br>

<a href="mailto:manik@jboss.org">manik@jboss.org</a><br>

Lead, Infinispan<br>

Lead, JBoss Cache<br>

<a href="http://www.infinispan.org" target="_blank">http://www.infinispan.org</a><br>

<a href="http://www.jbosscache.org" target="_blank">http://www.jbosscache.org</a><br>

<br>


</div></div></blockquote></div><br>