[JBoss JIRA] (JGRP-2234) Unlocked locks stay locked forever

Friday, 9 February 2018

    [ https://issues.jboss.org/browse/JGRP-2234?page=com.atlassian.jira.plugin.... ]

Bram Klein Gunnewiek edited comment on JGRP-2234 at 2/9/18 4:39 AM:
--------------------------------------------------------------------

[~belaban] Thanks for resolving this! Using the provided test case is fine, obviously.

A side note about the "multiple members holding the same lock in a split brain":
since JGroups can't prevent these scenarios, I would suggest that in case of
"multiple lock holders" the lock should be released again only when *all* lock
holders have released it. E.g., in a cluster [A,B,C,D] that splits into [A,B] and
[C,D], where A and C hold the same lock (lock X), and the partitions then merge back
into one cluster [A,B,C,D], I think lock X should be 'released' only after both A and C
have unlocked it. But that might be something to think about in JGRP-2249.
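
The suggestion above can be sketched as a lock table that tracks a set of holders per
lock instead of a single owner. This is only a minimal illustration in plain Java: the
class MergedLockTable and its methods are hypothetical names, not JGroups APIs, and the
sketch says nothing about how JGroups merges lock tables in practice.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model, not JGroups code: a lock table that tolerates
// several holders of the same lock after a split-brain merge.
public class MergedLockTable {
    // lock name -> members currently holding it
    private final Map<String, Set<String>> holders = new HashMap<>();

    // Record a holder, e.g. while combining the lock tables of two partitions.
    public void addHolder(String lock, String member) {
        holders.computeIfAbsent(lock, k -> new HashSet<>()).add(member);
    }

    // Returns true only if this call removed the last remaining holder,
    // i.e. the lock is now fully released.
    public boolean unlock(String lock, String member) {
        Set<String> h = holders.get(lock);
        if (h == null || !h.remove(member)) {
            return false; // unknown lock, or member was not a holder
        }
        if (h.isEmpty()) {
            holders.remove(lock);
            return true;
        }
        return false;
    }

    public boolean isLocked(String lock) {
        return holders.containsKey(lock);
    }

    public static void main(String[] args) {
        MergedLockTable table = new MergedLockTable();
        table.addHolder("X", "A"); // held in partition [A,B]
        table.addHolder("X", "C"); // held in partition [C,D]
        System.out.println(table.unlock("X", "A")); // false: C still holds X
        System.out.println(table.isLocked("X"));    // true
        System.out.println(table.unlock("X", "C")); // true: last holder released
        System.out.println(table.isLocked("X"));    // false
    }
}
```

With this shape, lock X from the [A,B][C,D] example only becomes available to other
members after both A and C have unlocked it.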

was (Author: bramklg):
[~belaban] Thanks for resolving this! Using the provided test case is fine, obviously.

A side note about the "multiple members holding the same lock in a split brain":
since JGroups can't prevent these scenarios, I would suggest that in case of
"multiple lock holders" the lock should be released again only when *all* lock
owners have released it. E.g., in a cluster [A,B,C,D] that splits into [A,B] and
[C,D], where A and C hold the same lock (lock X), and the partitions then merge back
into one cluster [A,B,C,D], I think lock X should be 'released' only after both A and C
have unlocked it. But that might be something to think about in JGRP-2249.

...
 Unlocked locks stay locked forever
 ----------------------------------

                 Key: JGRP-2234
                 URL: https://issues.jboss.org/browse/JGRP-2234
             Project: JGroups
          Issue Type: Bug
            Reporter: Bram Klein Gunnewiek
            Assignee: Bela Ban
             Fix For: 4.0.11

         Attachments: ClusterSplitLockTest.java, jg_clusterlock_output_testfail.txt

 As discussed on the mailing list, we have issues where locks from the central lock
protocol stay locked forever when the coordinator of the cluster disconnects. We can
reproduce this with the attached ClusterSplitLockTest.java. It's a race condition, and we
need to run the test many times (sometimes more than 20) before we encounter a failure.
 What we think is happening:
 In a three-node cluster (nodes A, B and C, where node A is the coordinator), unlock
requests from B and/or C can be missed when node A leaves and B and/or C don't have the
new view installed yet. When, for example, node B takes over coordination, it creates the
lock table based on the backups. Let's say node C has locked the lock named 'lockX'.
Node C performs an unlock of 'lockX' just after node A (gracefully) leaves, and sends
the unlock request to node A since node C doesn't have the correct view installed yet.
Node B has recreated the lock table, in which 'lockX' is locked by node C. Node C
doesn't resend the unlock request, so 'lockX' stays locked forever.
 Attached is the TestNG test we wrote and the output of a test failure.
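
 The sequence described above can be modelled in a few lines of plain Java. This is a
single-threaded sketch for illustration only, not the attached test and not JGroups code:
the class LostUnlockSketch, the Coordinator type and the resendOnViewChange flag are all
hypothetical, and the "resend" branch is just one conceivable mitigation, not
necessarily what the fix in 4.0.11 does.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, single-threaded model of the reported scenario.
// None of these names are JGroups APIs.
public class LostUnlockSketch {

    // A coordinator's lock table: lock name -> owner.
    static class Coordinator {
        final Map<String, String> locks = new HashMap<>();
        boolean alive = true;

        // An unlock sent to a coordinator that has already left is lost.
        boolean handleUnlock(String lock, String owner) {
            if (!alive) {
                return false;
            }
            return locks.remove(lock, owner);
        }
    }

    // Returns true if 'lockX' is still held at the new coordinator B
    // after node C has tried to unlock it.
    public static boolean run(boolean resendOnViewChange) {
        Coordinator a = new Coordinator();
        a.locks.put("lockX", "C");                              // C holds lockX
        Map<String, String> backupAtB = new HashMap<>(a.locks); // B's backup copy

        a.alive = false;                 // A leaves gracefully
        Coordinator b = new Coordinator();
        b.locks.putAll(backupAtB);       // B recreates the lock table from the backup

        // C has not installed the new view yet, so it still targets A.
        boolean delivered = a.handleUnlock("lockX", "C");

        // Assumed mitigation: once C sees the new view, it resends an
        // unlock that was never delivered.
        if (!delivered && resendOnViewChange) {
            b.handleUnlock("lockX", "C");
        }
        return b.locks.containsKey("lockX");
    }

    public static void main(String[] args) {
        System.out.println("stuck without resend: " + run(false));
        System.out.println("stuck with resend:    " + run(true));
    }
}
```

 In this model the lock stays held at B whenever C's unlock goes to the departed
coordinator and is never repeated, which matches the behaviour we observe.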

--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
