Well, I don't use SYNCHRONOUS_IGNORE_LEAVERS for commands that have fewer than quorumSize
destinations (so commands to a single destination keep their ResponseMode unchanged),
and after the command I recheck that I still have enough members in the cluster.
But I will add a test to check that the behaviour is correct. I am not sure about the
following sequence of events:
1) Node A locks key K with lock owner on node X
2) Node X dies
3) Lock ownership moves to node Y
4) Node B locks key K on Y
Best regards, Vitalii Tymchyshyn
________________________________
From: infinispan-dev-bounces(a)lists.jboss.org
[mailto:infinispan-dev-bounces@lists.jboss.org] On Behalf Of Dan Berindei
Sent: Thursday, June 06, 2013 5:09 AM
To: infinispan -Dev List
Subject: Re: [infinispan-dev] Using infinispan as quorum-based nosql
Say you have two transactions, tx1 and tx2. They both send a LockControlCommand(k1) to the
primary owner of k1 (let's call it B).
If the lock commands use SYNCHRONOUS_IGNORE_LEAVERS and B dies while processing the
commands, both tx1 and tx2 will think they have succeeded in locking k1.
So you're right, everything should be locked before prepare in pessimistic mode, but
LockControlCommands are also susceptible to SuspectExceptions. On the other hand, you can
use SYNCHRONOUS mode for LockControlCommands and you can just retry the transaction in
case of a SuspectException.
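A minimal sketch of that retry idea, not Infinispan code: NodeSuspectedException is a
hypothetical stand-in for SuspectException, and the supplier is assumed to begin the
transaction, take its locks, and roll back on its own before rethrowing.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for the exception raised when a target node is suspected.
class NodeSuspectedException extends RuntimeException {
    NodeSuspectedException(String message) { super(message); }
}

final class RetryOnSuspect {
    // Runs the whole transaction body again if a node was suspected,
    // giving up after maxAttempts tries and rethrowing the last failure.
    static <T> T run(Supplier<T> transaction, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        NodeSuspectedException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return transaction.get();
            } catch (NodeSuspectedException e) {
                last = e; // remember the failure and try the whole body again
            }
        }
        throw last;
    }
}
```

This only helps for failures raised before the commit phase starts; it does nothing for
exceptions the transaction manager swallows during commit.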
Unfortunately, you can't retry the transaction if the PrepareCommand fails (in
pessimistic mode; or the CommitCommand in optimistic mode), because it is executed in the
commit phase. The transaction manager swallows all the exceptions in the commit phase,
making it impossible to see if it failed because of a node leaving. I guess this means I
should increase the priority of
https://issues.jboss.org/browse/ISPN-2402 ...
On Thu, Jun 6, 2013 at 11:49 AM, <vitalii.tymchyshyn(a)ubs.com> wrote:
Hello.
We are using pessimistic transaction mode. In this case everything's already locked
by the time of prepare, isn't it?
As for merge, for quorum mode it's simple: take data from the quorum. I think I will try
to simply suppress sending data from non-quorum members on merge, because currently
everyone sends its data and this creates a complete mess of unsynchronized data after
merge (depending on the timing).
Best regards, Vitalii Tymchyshyn
________________________________
From: infinispan-dev-bounces(a)lists.jboss.org
[mailto:infinispan-dev-bounces@lists.jboss.org] On Behalf Of Dan Berindei
Sent: Wednesday, June 05, 2013 12:04 PM
To: infinispan -Dev List
Subject: Re: [infinispan-dev] Using infinispan as quorum-based nosql
On Mon, Jun 3, 2013 at 4:23 PM, <vitalii.tymchyshyn(a)ubs.com> wrote:
Hello.
Thanks for your information. I will subscribe and vote for the issues noted.
In the meantime I've implemented a hacky JgroupsTransport that "downgrades"
all SYNCHRONOUS invokeRemotely calls (except CacheViewControlCommand and
StateTransferControlCommand) to SYNCHRONOUS_IGNORE_LEAVERS and uses a filter to check
that the required number of answers was received (I tried to use the original
invokeRemotely return value, but it often returns something strange, like an empty map).
It seems to do the trick for me, but I am still not sure whether it has any side effects.
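The downgrade rule can be sketched roughly as below. This is an illustration only, not
the actual transport: Mode is a hypothetical stand-in for ResponseMode, the
isControlCommand flag stands for the two excluded command types, a null response is
assumed to model a node that left, and the destination-count guard is the one mentioned
in the newest reply at the top of this thread. Whether the local node counts towards the
quorum is left open.

```java
import java.util.Collection;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for ResponseMode; only the two modes discussed here.
enum Mode { SYNCHRONOUS, SYNCHRONOUS_IGNORE_LEAVERS }

final class QuorumDowngrade {
    private final int quorumSize; // assumed to be configured by the application

    QuorumDowngrade(int quorumSize) {
        this.quorumSize = quorumSize;
    }

    // Downgrade only SYNCHRONOUS calls that are not control commands and
    // that address at least quorumSize destinations.
    Mode effectiveMode(Mode requested, boolean isControlCommand,
                       Collection<String> destinations) {
        if (requested == Mode.SYNCHRONOUS
                && !isControlCommand
                && destinations.size() >= quorumSize) {
            return Mode.SYNCHRONOUS_IGNORE_LEAVERS;
        }
        return requested;
    }

    // Count only nodes that actually answered; null models a leaver.
    boolean hasQuorum(Map<String, Object> responses) {
        long answered = responses.values().stream()
                .filter(Objects::nonNull)
                .count();
        return answered >= quorumSize;
    }
}
```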
Indeed, I started working on a solution, but I over-engineered it and then I got
side-tracked with other stuff. Sorry about that.
The problem with using SYNCHRONOUS_IGNORE_LEAVERS everywhere, as I found out, is that you
don't want to ignore the primary owner of a key leaving during a prepare/lock command
(or the coordinator, in REPL mode prior to 5.3.0.CR1/ISPN-2772). If that happens, you have
to retry on the new primary owner, otherwise you can't know if the prepare command has
locked the key or not.
A similar problem appears in non-transactional caches with
supportsConcurrentUpdates=true: there the primary owner can ignore any of the backup
owners leaving, but the originator can't ignore the primary owner leaving.
For now I can see the merge problem in my test: different values are picked during merge.
I am going to dig a little deeper and follow up. But it's already a little strange to
me, since the test algorithm is:
1) Assign "old" value to the full cluster (it's REPL_SYNC mode)
2) Block the coordinator
3) Write "new" value to one of the two remaining nodes. It's synchronized to the
second remaining node
4) Unblock the coordinator
5) Wait (I could not find a good way to wait for state transfer other than a plain wait).
6) Check the value on the coordinator
And in my test I am randomly getting "old" or "new" in the assert. I am
now going to check why. Maybe I will need to "reinitialize" the smaller cluster
part to ensure data is taken from the quorum part of the cluster.
We don't handle merges properly. See
https://issues.jboss.org/browse/ISPN-263 and the
discussion at
http://markmail.org/message/meyczotzobuva7js
What happens right now is that after a merge, all the caches are assumed to have
up-to-date data, so there is no state transfer. We had several ideas floating around on
how we could force the smaller partition to receive data from the quorum partition, but I
think with the public API your best option is to stop all the caches in the smaller
partition after the split and start them back up after the merge.
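A rough sketch of that workaround, under stated assumptions: the class and its callbacks
are hypothetical (stopCaches and startCaches stand in for stopping and starting every
cache on the node), the expected cluster size is assumed to be known up front, and
"quorum" is taken to mean a strict majority. Something like this would be driven from a
cluster view listener.

```java
// Hypothetical helper; not part of Infinispan.
final class MinorityPartitionGuard {
    private final int expectedClusterSize; // assumed to be fixed and known
    private final Runnable stopCaches;     // stand-in for stopping all caches
    private final Runnable startCaches;    // stand-in for starting them again
    private boolean cachesRunning = true;

    MinorityPartitionGuard(int expectedClusterSize,
                           Runnable stopCaches, Runnable startCaches) {
        this.expectedClusterSize = expectedClusterSize;
        this.stopCaches = stopCaches;
        this.startCaches = startCaches;
    }

    // Call on every view change with the number of members in the new view.
    void onViewChange(int viewSize) {
        boolean hasQuorum = viewSize > expectedClusterSize / 2;
        if (!hasQuorum && cachesRunning) {
            cachesRunning = false;
            stopCaches.run();   // split: this node is in the smaller partition
        } else if (hasQuorum && !cachesRunning) {
            cachesRunning = true;
            startCaches.run();  // merge: the view has a majority again
        }
    }

    boolean cachesRunning() {
        return cachesRunning;
    }
}
```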
Cheers
Dan
Best regards, Vitalii Tymchyshyn
-----Original Message-----
From: infinispan-dev-bounces(a)lists.jboss.org
[mailto:infinispan-dev-bounces@lists.jboss.org] On Behalf Of Galder Zamarreno
Sent: Monday, June 03, 2013 9:04 AM
To: infinispan -Dev List
Subject: Re: [infinispan-dev] Using infinispan as quorum-based nosql
On May 30, 2013, at 5:10 PM, vitalii.tymchyshyn(a)ubs.com wrote:
Hello.
We are going to use Infinispan in our project as a NoSQL solution. It
performs quite well for us, but we have run into the following problem.
Note: We are using Infinispan 5.1.6 in SYNC_REPL mode in a small cluster.
The problem is that when any node fails, any running transactions wait
for JGroups to decide whether it has really failed and then roll back
because of a SuspectException. While we can live with the
delay, we'd really like to avoid the rollback. As for me, I actually
don't see a reason for the rollback, because transactions started after
the leave will succeed. So, as for me, previously running transactions
could do the same.
We're aware of the problem (
https://issues.jboss.org/browse/ISPN-2402).
@Dan, has there been any updates on this?
The question for me is whether a node that left will synchronize its state
after merge (even if the merge was done without an Infinispan restart). As
for me, it should, or it won't work correctly at all.
This is not in yet:
https://issues.jboss.org/browse/ISPN-263
So, I've found RpcManager's ResponseMode.SYNCHRONOUS_IGNORE_LEAVERS
and am thinking of switching to it for RpcManager calls that don't specify
a ResponseMode explicitly. As for me, it should do the trick. Also, I am
going to enforce a quorum number of responses, but that's another story.
So, what do you think, would it work?
^ Not sure if that'll work. @Dan?
P.S. Another question from me: how does it work now when a SuspectException is
thrown while broadcasting the CommitCommand? As far as I can see, the commit is
still done on some remote nodes (those that are still in the cluster), but
rolled back on the local node because of this exception. Am I correct?
^ How Infinispan reacts in these situations depends a lot on the type of communications
(synchronous or asynchronous) and the transaction configuration. Mircea can provide more
details on this.
Cheers,
This can cause inconsistencies, but we must live with something in a
peer-to-peer world :) The only other option is to switch from a
write-all, read-local scenario to a write-quorum, read-quorum one, which is
too complex a move for Infinispan, as for me.
Best regards, Vitalii Tymchyshyn
Please visit our website at
http://financialservicesinc.ubs.com/wealth/E-maildisclaimer.html
for important disclosures and information about our e-mail policies.
For your protection, please do not transmit orders or instructions by
e-mail or include account numbers, Social Security numbers, credit
card numbers, passwords, or other personal information.
_______________________________________________
infinispan-dev mailing list
infinispan-dev(a)lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev
--
Galder Zamarreño
galder(a)redhat.com
twitter.com/galderz
Project Lead, Escalante
http://escalante.io
Engineer, Infinispan
http://infinispan.org