[infinispan-issues] [JBoss JIRA] (ISPN-4131) Lock acquired forever with delayed PrepareCommand

Dan Berindei (JIRA) issues at jboss.org
Wed May 7 06:38:56 EDT 2014


    [ https://issues.jboss.org/browse/ISPN-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966067#comment-12966067 ] 

Dan Berindei edited comment on ISPN-4131 at 5/7/14 6:38 AM:
------------------------------------------------------------

I think I have a possible fix for this: during the completed transactions cleanup, TransactionTable should save the biggest transaction id from each member.
When checking whether a prepare should be rejected, it should also compare the transaction id against the last saved transaction id for that member and, if it's smaller, reject the prepare even if the transaction is not in the completed transactions table.
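
A rough sketch of that idea, not the actual TransactionTable code. The class and method names ({{CompletedTxTracker}}, {{markCompleted}}, {{cleanup}}, {{shouldRejectPrepare}}) are made up for illustration, members are identified by plain strings, and it assumes transaction ids from one member are increasing longs. It also rejects an id equal to the saved one, on the assumption that the saved id itself belonged to a completed transaction.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; names and types are assumptions, not Infinispan's API.
final class CompletedTxTracker {
    // member -> (transaction id -> time the transaction was marked completed)
    private final Map<String, Map<Long, Long>> completed = new ConcurrentHashMap<>();
    // member -> biggest transaction id already purged from the completed table
    private final Map<String, Long> maxPurgedId = new ConcurrentHashMap<>();

    void markCompleted(String member, long txId, long nowMillis) {
        completed.computeIfAbsent(member, m -> new ConcurrentHashMap<>())
                 .put(txId, nowMillis);
    }

    // Purge expired entries, remembering the biggest purged id per member.
    void cleanup(long nowMillis, long completedTxTimeoutMillis) {
        for (Map.Entry<String, Map<Long, Long>> e : completed.entrySet()) {
            String member = e.getKey();
            e.getValue().entrySet().removeIf(tx -> {
                boolean expired = nowMillis - tx.getValue() >= completedTxTimeoutMillis;
                if (expired) {
                    maxPurgedId.merge(member, tx.getKey(), Long::max);
                }
                return expired;
            });
        }
    }

    // A prepare is rejected if its transaction is still in the completed
    // table, or if its id is not bigger than the last saved id for the member.
    boolean shouldRejectPrepare(String member, long txId) {
        Map<Long, Long> perMember = completed.get(member);
        if (perMember != null && perMember.containsKey(txId)) {
            return true;
        }
        Long purged = maxPurgedId.get(member);
        return purged != null && txId <= purged;
    }
}
```

With this, a prepare for transaction 5 from member A that arrives after the cleanup has purged it is still rejected, while a newer transaction 6 from A is let through.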

It's not even a problem to fail prepare commands that have been delayed for more than {{SyncConfiguration.replTimeout()}}, so we just have to check during validation that {{TransactionConfiguration.completedTxTimeout()}} is bigger than that. We also have to make sure we don't do the check for asynchronous caches.
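
The validation check could look roughly like this. It is a standalone sketch: {{TimeoutValidation.validate}} is a made-up helper that takes the two timeouts as plain millisecond values instead of reading them from the real configuration objects.

```java
// Hypothetical sketch; the class, method and parameters are assumptions
// standing in for the replTimeout and completedTxTimeout settings.
final class TimeoutValidation {
    static void validate(boolean synchronousCache,
                         long replTimeoutMillis,
                         long completedTxTimeoutMillis) {
        // Asynchronous caches are skipped: the check does not apply to them.
        if (!synchronousCache) {
            return;
        }
        // The completed transactions timeout must be bigger than the
        // replication timeout.
        if (completedTxTimeoutMillis <= replTimeoutMillis) {
            throw new IllegalArgumentException(
                "completedTxTimeout (" + completedTxTimeoutMillis
                + " ms) must be bigger than replTimeout ("
                + replTimeoutMillis + " ms)");
        }
    }
}
```

For example, {{validate(true, 60000, 15000)}} throws, while {{validate(true, 15000, 60000)}} and {{validate(false, 60000, 15000)}} pass.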




was (Author: dan.berindei):
I think I have a possible fix for this: during the completed transactions cleanup, TransactionTable should save the biggest transaction id from each member.
When checking if a prepare should be rejected, it should also check the transaction id against the next-to-last saved transaction id, and if it's smaller reject it even if it's not in the completed transactions table.

It's not even a problem to fail prepare commands because they have been delayed for more than {{SyncConfiguration.replTimeout()}}, so we just have to check during validation that {{TransactionConfiguration.completedTxTimeout()}} is bigger. We just have to make sure we don't do the check for asynchronous caches.



> Lock acquired forever with delayed PrepareCommand
> -------------------------------------------------
>
>                 Key: ISPN-4131
>                 URL: https://issues.jboss.org/browse/ISPN-4131
>             Project: Infinispan
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 6.0.2.Final, 7.0.0.Alpha1
>            Reporter: Radim Vansa
>            Assignee: Dan Berindei
>            Priority: Critical
>              Labels: 630betablocker
>
> Distributed transactional cache:
> 1. A sends Prepare to B
> 2. B receives Prepare, but due to ongoing ST it is blocked
> 3. B replication timeout elapses
> 4. B sends Rollback, this does not find the TX as Prepare was not executed yet. The transaction is put into completedTransactions.
> 5. Completed transactions timeout elapses. This is by default 15 seconds, way shorter than ST timeout (due to which the Prepare was blocked)
> 6. Prepare is executed on B, acquiring lock on K
> Nobody will roll back the TX, as the originator thinks it was already rolled back.
> Result: key K will be locked forever, all attempts to update/remove it will fail.




