[JBoss JIRA] (ELY-47) NFKC normalization in StringPrep is not in accordance with RFC
by David Lloyd (JIRA)
[ https://issues.jboss.org/browse/ELY-47?page=com.atlassian.jira.plugin.sys... ]
David Lloyd commented on ELY-47:
--------------------------------
Are you saying that the test incorrectly expects case folding, and that the implementation incorrectly performs case folding?
If so then I agree this needs to be fixed - and it should be done before final release - but unfortunately this would mean we cannot use the JDK's Normalizer class; we'd have to implement normalization ourselves.
> NFKC normalization in StringPrep is not in accordance with RFC
> --------------------------------------------------------------
>
> Key: ELY-47
> URL: https://issues.jboss.org/browse/ELY-47
> Project: WildFly Elytron
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Reporter: Jan Kalina
> Assignee: Darran Lofthouse
>
> StringPrep from utils uses java.text.Normalizer for NFKC normalization, but this normalization is not in accordance with RFC 3454; see the mapping table:
> http://tools.ietf.org/html/rfc3454#appendix-B.2
> Relevant profile description:
> http://tools.ietf.org/html/rfc3454#section-3.2
> The full test is part of [pull request 13|https://github.com/wildfly-security/wildfly-sasl/pull/13], but this simple test can be used for basic testing:
> {code:java}
> @Test
> public void testNormalizationWithNFKC() {
>     ByteStringBuilder b = new ByteStringBuilder();
>     String before = "\u0041\u0042\u0043\u0044\u0045\u0046\u0047"; // "ABCDEFG"
>     String after = "\u0061\u0062\u0063\u0064\u0065\u0066\u0067"; // "abcdefg"
>     StringPrep.encode(before, b, StringPrep.NORMALIZE_KC);
>     assertEquals(after, new String(b.toArray()));
> }
> {code}
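For reference, a minimal sketch of what the JDK's Normalizer does on its own with this kind of input. The class and method names are made up for illustration and are not part of Elytron. NFKC applies compatibility mappings (for example the U+FB01 ligature becomes "fi") but leaves letter case alone. The lower-casing helper is only a rough stand-in for the RFC 3454 B.2 table and is an assumption, not the real mapping.

```java
import java.text.Normalizer;
import java.util.Locale;

public class NfkcCaseDemo {

    /** Plain NFKC as provided by the JDK; no case folding is involved. */
    public static String nfkc(String input) {
        return Normalizer.normalize(input, Normalizer.Form.NFKC);
    }

    /**
     * Hypothetical approximation of "case folding with NFKC":
     * lower-case first, then normalize. This is NOT the RFC 3454 B.2 table.
     */
    public static String lowerCaseThenNfkc(String input) {
        return Normalizer.normalize(input.toLowerCase(Locale.ROOT), Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        System.out.println(nfkc("ABCDEFG"));              // case is unchanged
        System.out.println(nfkc("\uFB01"));               // ligature is expanded
        System.out.println(lowerCaseThenNfkc("ABCDEFG")); // lower-cased by the extra step
    }
}
```

If the profile really requires the B.2 mapping, the extra step would have to follow the table rather than toLowerCase, since the two differ for some characters.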
--
This message was sent by Atlassian JIRA
(v6.3.1#6329)
11 years, 2 months
[JBoss JIRA] (WFLY-3269) XML parsing mandating the 'force' attribute on username-to-dn even though it has a default value.
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/WFLY-3269?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on WFLY-3269:
-----------------------------------------------
Darran Lofthouse <darran.lofthouse(a)redhat.com> changed the Status of [bug 1133961|https://bugzilla.redhat.com/show_bug.cgi?id=1133961] from NEW to ASSIGNED
> XML parsing mandating the 'force' attribute on username-to-dn even though it has a default value.
> -------------------------------------------------------------------------------------------------
>
> Key: WFLY-3269
> URL: https://issues.jboss.org/browse/WFLY-3269
> Project: WildFly
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: Domain Management
> Reporter: Darran Lofthouse
> Assignee: Emmanuel Hugonnet
> Fix For: 9.0.0.Beta1
>
>
> Trying this, I run into the following error when starting WildFly:
> {code}
> 10:28:29,674 ERROR [org.jboss.as.server] (Controller Boot Thread) JBAS015956: Caught exception during boot: org.jboss.as.controller.persistence.ConfigurationPersistenceException: JBAS014676: Failed to parse configuration
> at org.jboss.as.controller.persistence.XmlConfigurationPersister.load(XmlConfigurationPersister.java:112) [wildfly-controller-8.0.0.Final.jar:8.0.0.Final]
> at org.jboss.as.server.ServerService.boot(ServerService.java:331) [wildfly-server-8.0.0.Final.jar:8.0.0.Final]
> at org.jboss.as.controller.AbstractControllerService$1.run(AbstractControllerService.java:256) [wildfly-controller-8.0.0.Final.jar:8.0.0.Final]
> at java.lang.Thread.run(Thread.java:724) [rt.jar:1.7.0_40]
> Caused by: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[76,25]
> Message: JBAS014724: Missing required attribute(s): FORCE
> at org.jboss.as.controller.parsing.ParseUtils.missingRequired(ParseUtils.java:134) [wildfly-controller-8.0.0.Final.jar:8.0.0.Final]
> at org.jboss.as.domain.management.parsing.ManagementXml.parseUsernameToDn_2_0(ManagementXml.java:2118) [wildfly-domain-management-8.0.0.Final.jar:8.0.0.Final]
> {code}
> {code}
> <security-realm name="MgtRealm">
>     <authentication>
>         <ldap connection="ovodavLDAP" base-dn="ou=People,dc=hydrogenic,dc=local">
>             <!-- <advanced-filter filter="(&(cn=jboss-admin)(member=uid={0},ou=People,dc=hydrogenic,dc=local))" recursive="true"/> -->
>             <username-filter attribute="uid"/>
>         </ldap>
>     </authentication>
>     <authorization>
>         <ldap connection="ovodavLDAP">
>             <username-to-dn>
>                 <username-filter base-dn="ou=People,dc=hydrogenic,dc=local" recursive="false" attribute="uid" user-dn-attribute="dn" />
>             </username-to-dn>
>             <group-search group-name="SIMPLE" iterative="true" group-dn-attribute="dn" group-name-attribute="uid">
>                 <group-to-principal base-dn="ou=Groups,dc=hydrogenic,dc=local" recursive="true" search-by="DISTINGUISHED_NAME">
>                     <membership-filter principal-attribute="uniqueMember" />
>                 </group-to-principal>
>             </group-search>
>         </ldap>
>     </authorization>
> </security-realm>
> {code}
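A minimal sketch of how a StAX-based parser can treat an attribute as optional instead of required. It assumes, purely for illustration, that the default for 'force' is false; the report does not state the default. The class and method names are hypothetical and this is not the ManagementXml code.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class OptionalAttributeDemo {

    /** Assumed default, for illustration only. */
    static final boolean ASSUMED_DEFAULT_FORCE = false;

    /** Reads 'force' from the first username-to-dn element, falling back to the default. */
    public static boolean readForce(String xml) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && "username-to-dn".equals(reader.getLocalName())) {
                        // getAttributeValue returns null when the attribute is absent
                        String value = reader.getAttributeValue(null, "force");
                        return value == null ? ASSUMED_DEFAULT_FORCE : Boolean.parseBoolean(value);
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Failed to parse configuration", e);
        }
        throw new IllegalStateException("No username-to-dn element found");
    }

    public static void main(String[] args) {
        System.out.println(readForce("<username-to-dn/>"));
        System.out.println(readForce("<username-to-dn force=\"true\"/>"));
    }
}
```

The point is only that a missing attribute maps to the default rather than to a "missing required attribute" error.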
11 years, 2 months
[JBoss JIRA] (WFLY-3269) XML parsing mandating the 'force' attribute on username-to-dn even though it has a default value.
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/WFLY-3269?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration updated WFLY-3269:
------------------------------------------
Bugzilla Update: Perform
Bugzilla References: https://bugzilla.redhat.com/show_bug.cgi?id=1133961
> XML parsing mandating the 'force' attribute on username-to-dn even though it has a default value.
> -------------------------------------------------------------------------------------------------
>
> Key: WFLY-3269
> URL: https://issues.jboss.org/browse/WFLY-3269
> Project: WildFly
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: Domain Management
> Reporter: Darran Lofthouse
> Assignee: Emmanuel Hugonnet
> Fix For: 9.0.0.Beta1
>
>
11 years, 2 months
[JBoss JIRA] (WFLY-3756) Rename package name of whole Arquillian Wildfly adapter
by Stefan Miklosovic (JIRA)
[ https://issues.jboss.org/browse/WFLY-3756?page=com.atlassian.jira.plugin.... ]
Stefan Miklosovic commented on WFLY-3756:
-----------------------------------------
PR sent.
> Rename package name of whole Arquillian Wildfly adapter
> -------------------------------------------------------
>
> Key: WFLY-3756
> URL: https://issues.jboss.org/browse/WFLY-3756
> Project: WildFly
> Issue Type: Feature Request
> Security Level: Public(Everyone can see)
> Components: Test Suite
> Affects Versions: 8.1.0.Final
> Reporter: Stefan Miklosovic
> Assignee: Stefan Miklosovic
>
> Some time ago the Arquillian WildFly container adapter was, if I recall correctly, embedded directly in the WildFly repository on GitHub.
> It has since been removed from there and moved to https://github.com/wildfly/wildfly-arquillian
> The problem is that when you want to write a test which mixes two containers, concretely good old AS7 and the new WildFly, you cannot do so because their package names are the same, so you get a naming clash on your class path.
> I am the author of the multiple container extension (1) (2) under the Arquillian umbrella, which enables the use of two different container adapters in one test run; this is not normally possible. While the adapter was embedded in the WildFly repository itself, it was possible to distinguish JBoss AS 7 from WildFly because their package names were org.jboss.as and org.wildfly respectively. This is no longer possible.
> This affects e.g. the Infinispan project, whose developers are trying to cover the migration from JBoss AS to WildFly and are writing tests for it (you have old JBoss and WildFly servers, and Infinispan can migrate data from one server to another and drop the old ones).
> I suggest renaming the package to org.wildfly so that it no longer collides.
> Thanks a lot!
> (1) https://github.com/arquillian/arquillian-droidium/tree/master/droidium-co...
> (2) https://github.com/arquillian/arquillian-droidium/tree/master/droidium-co...
11 years, 2 months
[JBoss JIRA] (ELY-43) Clean up Base64
by David Lloyd (JIRA)
[ https://issues.jboss.org/browse/ELY-43?page=com.atlassian.jira.plugin.sys... ]
David Lloyd commented on ELY-43:
--------------------------------
Also maybe:
* Pluggable alphabet support
> Clean up Base64
> ---------------
>
> Key: ELY-43
> URL: https://issues.jboss.org/browse/ELY-43
> Project: WildFly Elytron
> Issue Type: Feature Request
> Security Level: Public(Everyone can see)
> Components: Utils
> Reporter: Darran Lofthouse
> Assignee: Farah Juma
> Fix For: 1.0.0.Beta1
>
>
> The Base64 implementation has been split out of PasswordUtils; some additional steps are needed to finish cleaning it up:
> - Look at switching to input and output streams instead of the custom iterators it is using.
> - Consider the ByteStringBuilder from SASL
> - As it is now potentially more visible, ensure clearer method names.
> - Ensure adequate javadoc and cross referencing of standards supported.
> e.g. If we implement an RFC ensure the number is referenced.
> - Testing of each variant
> - Consider optional support, e.g. decoding a padded String
> - Go beyond testing that we can decode what we encode, and ensure pre-encoded values can be handled adequately.
> Plus any other clean-up here that seems relevant.
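As a rough illustration of the "pre-encoded values" point, the sketch below checks an encoder and decoder against the test vectors published in RFC 4648 rather than only round-tripping. It uses the JDK's java.util.Base64 (Java 8+) as a stand-in, not the Elytron Base64 class, and the class and method names are made up for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class Base64KnownVectors {

    /** Test vectors from RFC 4648, section 10: {plain, encoded}. */
    static final String[][] VECTORS = {
        {"", ""},
        {"f", "Zg=="},
        {"fo", "Zm8="},
        {"foo", "Zm9v"},
        {"foob", "Zm9vYg=="},
        {"fooba", "Zm9vYmE="},
        {"foobar", "Zm9vYmFy"},
    };

    public static String encode(String plain) {
        return Base64.getEncoder().encodeToString(plain.getBytes(StandardCharsets.US_ASCII));
    }

    public static String decode(String encoded) {
        return new String(Base64.getDecoder().decode(encoded), StandardCharsets.US_ASCII);
    }

    /** True if every vector both encodes and decodes to the published value. */
    public static boolean allVectorsPass() {
        for (String[] vector : VECTORS) {
            if (!encode(vector[0]).equals(vector[1]) || !decode(vector[1]).equals(vector[0])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("RFC 4648 vectors pass: " + allVectorsPass());
        // The JDK basic decoder accepts input without padding as well.
        System.out.println(decode("Zm9vYg"));
    }
}
```

The same table-driven shape could be pointed at each variant (alphabet, padding on or off) once the method names are settled.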
11 years, 2 months
[JBoss JIRA] (ELY-41) Password Recognition and Parsing Framework
by David Lloyd (JIRA)
[ https://issues.jboss.org/browse/ELY-41?page=com.atlassian.jira.plugin.sys... ]
David Lloyd commented on ELY-41:
--------------------------------
I want to point out that PasswordUtils doesn't support *all* password types, only UNIX modular crypt style password types (with a couple of extensions). There is no single standard for password string representation, so it doesn't make sense to have a single class unless you want to enter into heuristic behavior, which is a definite change from the current PasswordUtils class, which is very deterministic.
> Password Recognition and Parsing Framework
> ------------------------------------------
>
> Key: ELY-41
> URL: https://issues.jboss.org/browse/ELY-41
> Project: WildFly Elytron
> Issue Type: Task
> Security Level: Public(Everyone can see)
> Components: API / SPI
> Reporter: Darran Lofthouse
> Assignee: Darran Lofthouse
> Fix For: 1.0.0.Beta1
>
>
> I don't think having a single PasswordUtils that recognises and parses all password types is going to be a good idea long term - I think a lot of the responsibility for what is supported needs to come from the realm.
> A scenario I am thinking of: an LDAP server is configured to support clear text passwords and verifies the strength of a password before letting a user set it. This could be circumvented by setting the password value to something we would parse as one of the other password types. The problem is that the user could just hash 'password', and the result would pass the LDAP server's dictionary attack check.
> The second issue is that different formats could be realm specific, e.g. LDAP supports trivial digests in formats slightly different to those we already support.
> One idea I am starting to think about is a password parser that a realm can build up with a set of supported password types. Working on LDAP, it is apparent that realms potentially need configuration for the credential types they will claim to support before the RealmIdentity is identified, so this is not a major deviation from the work I am already finding necessary.
> Looking at the current PasswordUtils.java, the following public utility methods are exposed:
> {code}
> org.wildfly.security.password.PasswordUtils
> org.wildfly.security.password.PasswordUtils.identifyAlgorithm(char[])
> org.wildfly.security.password.PasswordUtils.identifyAlgorithm(String)
> org.wildfly.security.password.PasswordUtils.getCryptStringChars(PasswordSpec)
> org.wildfly.security.password.PasswordUtils.getCryptString(PasswordSpec)
> org.wildfly.security.password.PasswordUtils.parseCryptString(String)
> org.wildfly.security.password.PasswordUtils.parseCryptString(char[])
> {code}
> From the perspective of a realm, the primary task I am trying to achieve is to take a password string and convert it to a PasswordSpec. Algorithm identification seems to be used primarily by tests; I am not convinced it is justified in an API.
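A minimal sketch of the "parser built up by the realm" idea, under stated assumptions: the class, the Kind enum and the classify method are all hypothetical and are not the Elytron API. Only the modular crypt prefixes ($1$, $5$, $6$) are taken from common UNIX usage. A realm that only claims clear text never interprets a stored value as a hash, which addresses the circumvention scenario described above.

```java
import java.util.EnumSet;
import java.util.Set;

public class RealmPasswordParserSketch {

    /** Hypothetical password kinds; the prefix is the modular crypt identifier, if any. */
    public enum Kind {
        CLEAR(null), MD5_CRYPT("$1$"), SHA256_CRYPT("$5$"), SHA512_CRYPT("$6$");

        final String prefix;

        Kind(String prefix) {
            this.prefix = prefix;
        }
    }

    private final Set<Kind> supported;

    /** The realm declares up front which kinds it is willing to recognise. */
    public RealmPasswordParserSketch(EnumSet<Kind> supported) {
        this.supported = EnumSet.copyOf(supported);
    }

    /** Classifies a stored value using only the kinds this realm supports. */
    public Kind classify(String stored) {
        for (Kind kind : supported) {
            if (kind.prefix != null && stored.startsWith(kind.prefix)) {
                return kind;
            }
        }
        if (supported.contains(Kind.CLEAR)) {
            return Kind.CLEAR; // never reinterpreted as a hash
        }
        throw new IllegalArgumentException("Unsupported password format for this realm");
    }

    public static void main(String[] args) {
        RealmPasswordParserSketch clearOnly =
                new RealmPasswordParserSketch(EnumSet.of(Kind.CLEAR));
        System.out.println(clearOnly.classify("$6$salt$hash"));
    }
}
```

This keeps the behaviour deterministic per realm: the same stored string may be classified differently by two realms, but never heuristically within one.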
11 years, 2 months
[JBoss JIRA] (JBJCA-1210) Add system property to disable delistResource call
by Jesper Pedersen (JIRA)
Jesper Pedersen created JBJCA-1210:
--------------------------------------
Summary: Add system property to disable delistResource call
Key: JBJCA-1210
URL: https://issues.jboss.org/browse/JBJCA-1210
Project: IronJacamar
Issue Type: Task
Security Level: Public (Everyone can see)
Components: Core
Reporter: Jesper Pedersen
Assignee: Jesper Pedersen
Fix For: 1.1.8.Final, 1.2.0.CR1
Provide
* ironjacamar.no_delist_resource for individual pools
* ironjacamar.no_delist_resource_all for all pools
11 years, 2 months
[JBoss JIRA] (WFLY-3724) Batch jobs don't receive partition-specific parameters
by Enrique González Martínez (JIRA)
[ https://issues.jboss.org/browse/WFLY-3724?page=com.atlassian.jira.plugin.... ]
Enrique González Martínez reassigned WFLY-3724:
-----------------------------------------------
Assignee: Enrique González Martínez (was: Jason Greene)
> Batch jobs don't receive partition-specific parameters
> ------------------------------------------------------
>
> Key: WFLY-3724
> URL: https://issues.jboss.org/browse/WFLY-3724
> Project: WildFly
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: Batch
> Affects Versions: 8.1.0.Final
> Environment: Windows 7 Home Premium Service Pack 1 64-bit + JDK8u11 + WildFly 8.1.0 Final
> Reporter: Ari Silvan
> Assignee: Enrique González Martínez
>
> When defining a batch job chunk step to run as partitions, ItemReader doesn't receive the partition-specific parameters specified by an implementation of the PartitionPlan interface. Parameters are null. See steps to reproduce for further details.
11 years, 2 months
[JBoss JIRA] (JGRP-1873) UNICAST2: unilateral connection close of receiver can lead to missing seqnos in sender
by Bela Ban (JIRA)
[ https://issues.jboss.org/browse/JGRP-1873?page=com.atlassian.jira.plugin.... ]
Bela Ban updated JGRP-1873:
---------------------------
Description:
In {{UNICAST2}}, if we have a connection between sender A and receiver B, and B closes the connection (but not A), then A can end up with missing messages in its send table.
Example:
* A sends messages to B
* A has an entry for B in its send-table: {{B: 10|20}} (lowest sent=10, highest sent=20)
* B has an entry for A in its recv-table: {{A: 10|20}} (lowest received=10, highest received=20)
* B now gets a view that doesn't contain A and closes its connection to A
** This results in B's connection to A getting removed
* A now sends message {{A::21}}
* B doesn't find an entry in its recv-table for A and sends {{GET-FIRST-SEQNO}} to A
* A receives the request and sends messages {{A::11 first}} - {{A::21}} to B. These messages are sent unreliably, so they can get dropped. Let's assume (for this example) that some of them are dropped.
* B does receive {{A::11 first}} and creates an entry for A in its recv-table: {{A: 11|21}} (next to be received is {{A::12}})
* Now a spurious {{STABLE(A::15)}} message by B is received by A
** This can happen when B sent the {{STABLE}} message *before* its connection to A was removed, but the message was delayed, e.g. by garbage collection
** Note that the connection ID ({{conn-id}}) is the same, so A will _not_ reject the {{STABLE}} message by B
* A receives the {{STABLE}} message and purges elements up to 15, so its new entry for B is: {{B: 15|21}}
* When B asks A for retransmission of messages {{A::12}} - {{A::21}}, A can only retransmit messages 16-21, but *not* {{A::12}} - {{A::15}}!
Depending on which messages from A (which it sent unreliably on reception of {{GET-FIRST-SEQNO}}) were received by B, there would be never-ending retransmission requests from B to A for all or some messages in {{A[12..15]}}, e.g.
{noformat}
WARN [org.jgroups.protocols.UNICAST2] A: (requester=B) message B::13 not found in retransmission table of B:
[15 | 15 | 22] (X elements, Y missing)
{noformat}
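The sequence above can be replayed with a toy model. This is only a sketch: the SendTable class is a made-up stand-in backed by a TreeMap, not the JGroups retransmission table, and it models just the purge-on-STABLE step and the later retransmission request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class StaleStableDemo {

    /** Toy sender-side table (seqno to payload); not the JGroups data structure. */
    static final class SendTable {
        private final TreeMap<Long, String> messages = new TreeMap<>();

        void add(long seqno) {
            messages.put(seqno, "msg-" + seqno);
        }

        /** STABLE(seqno): purge everything up to and including seqno. */
        void stable(long seqno) {
            messages.headMap(seqno, true).clear();
        }

        /** Seqnos in [from, to] that can no longer be retransmitted. */
        List<Long> missing(long from, long to) {
            List<Long> result = new ArrayList<>();
            for (long seqno = from; seqno <= to; seqno++) {
                if (!messages.containsKey(seqno)) {
                    result.add(seqno);
                }
            }
            return result;
        }
    }

    /** Replays the example: A holds 10..21 for B, then a delayed STABLE(15) arrives. */
    public static List<Long> runScenario() {
        SendTable aToB = new SendTable();
        for (long seqno = 10; seqno <= 21; seqno++) {
            aToB.add(seqno);
        }
        aToB.stable(15);             // stale STABLE from B's old connection
        return aToB.missing(12, 21); // B now asks for 12..21
    }

    public static void main(String[] args) {
        System.out.println(runScenario()); // prints [12, 13, 14, 15]
    }
}
```

The missing range is exactly the set of seqnos B keeps requesting without ever receiving them.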
h5. Reordering of STABLE messages
In the worst case, as {{STABLE}} messages are not sent reliably and can therefore get dropped or reordered, if A gets another {{STABLE(10)}} message after the {{STABLE(15)}} message, the error message above would look like this:
{noformat}
WARN [org.jgroups.protocols.UNICAST2] A: (requester=B) message B::13 not found in retransmission table of B:
[10 | 10 | 22] (X elements, Y missing)
{noformat}
Note that, with https://issues.jboss.org/browse/JGRP-1872 fixed, this cannot occur anymore.
h5. Solution
There's no real solution but to upgrade to {{UNICAST3}}: when {{UNICAST3}} receives a view, it doesn't _remove_ receive (and send) connections immediately, but merely marks them as _closed_. The connection will only be removed after {{conn_close_timeout}} ms. If B therefore gets further messages from A, it will simply mark the receive connection as _open_ and doesn't need to send a {{GET-FIRST-SEQNO}} message to A as it still has all of A's messages.
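The deferred removal described here can be sketched as follows. This is an illustration under assumptions, not the UNICAST3 code: the class, its methods and the explicit clock parameter are hypothetical, and only the idea of marking a connection closed and removing it after {{conn_close_timeout}} ms is taken from the text.

```java
import java.util.HashMap;
import java.util.Map;

public class DeferredCloseDemo {

    /** Toy receive-side connection entry; not the UNICAST3 data structure. */
    static final class Entry {
        boolean open = true;
        long closedAtMs;
    }

    private final Map<String, Entry> recvTable = new HashMap<>();
    private final long connCloseTimeoutMs;

    public DeferredCloseDemo(long connCloseTimeoutMs) {
        this.connCloseTimeoutMs = connCloseTimeoutMs;
    }

    /**
     * A message from sender arrives. Returns true if no entry existed,
     * i.e. the receiver would have to ask the sender for its first seqno.
     */
    public boolean onMessage(String sender) {
        Entry entry = recvTable.get(sender);
        if (entry == null) {
            recvTable.put(sender, new Entry());
            return true;
        }
        entry.open = true; // a closed entry is simply reopened
        return false;
    }

    /** A view without sender arrives: mark the entry closed, do not remove it. */
    public void onViewWithout(String sender, long nowMs) {
        Entry entry = recvTable.get(sender);
        if (entry != null && entry.open) {
            entry.open = false;
            entry.closedAtMs = nowMs;
        }
    }

    /** Removes entries that have stayed closed for at least the timeout. */
    public void reap(long nowMs) {
        recvTable.values().removeIf(e -> !e.open && nowMs - e.closedAtMs >= connCloseTimeoutMs);
    }

    public static void main(String[] args) {
        DeferredCloseDemo b = new DeferredCloseDemo(1000);
        b.onMessage("A");
        b.onViewWithout("A", 0);
        b.reap(500);
        System.out.println("needs first seqno: " + b.onMessage("A"));
    }
}
```

Passing the clock in as a parameter keeps the sketch deterministic; a real protocol would use its own timer.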
We could think of a connection establishment and teardown protocol used by all of the unicast protocols, which establishes connections similar to TCP. Senders would block until a connection is established etc and new conn-ids would be created, plus the current send- and receive- seqnos would be exchanged. This could also be used as a second line of defense, to re-establish the connection when a sender doesn't find messages requested for retransmission by a receiver. As an alternative, we could create a new protocol which syncs a receive table with a sender, e.g. https://issues.jboss.org/browse/JGRP-1875.
To mitigate the above issue, {{FD_ALL}} rather than {{FD}} should be used, so that members suspect each other more or less at the same time. This is not the case with {{FD}}, where multiple hung (or GC'ing) members take N * timeout time to suspect. With {{FD_ALL}}, chances are that A and B suspect each other and later, both establish a new connection.
was:
In {{UNICAST2}}, if we have a connection between sender A and receiver B, and B closes the connection (but not A), then A can end up with missing messages in its send table.
Example:
* A sends messages to B
* A has an entry for B in its send-table: {{B: 10|20}} (lowest sent=10, highest sent=20)
* B has an entry for A in its recv-table: {{A: 10|20}} (lowest received=10, highest received=20)
* B now gets a view that doesn't contain A and closes its connection to A
** This results in B's connection to A getting removed
* A now sends message {{A::21}}
* B doesn't find an entry in its recv-table for A and sends {{GET-FIRST-SEQNO}} to A
* A receives the request and sends messages {{A::11 first}} - {{A::21}} to B. These messages are sent unreliably, so they can get dropped. Let's assume (for this example) that some of them are dropped.
* B does receive {{A::11 first}} and creates an entry for A in its recv-table: {{A: 11|21}} (next to be received is {{A::12}})
* Now a spurious {{STABLE(A::15)}} message by B is received by A
** This can happen when B sent the {{STABLE}} message *before* its connection to A was removed, but the message was delayed, e.g. by garbage collection
** Note that the connection ID ({{conn-id}}) is the same, so A will _not_ reject the {{STABLE}} message by B
* A receives the {{STABLE}} message and purges elements up to 15, so its new entry for B is: {{B: 15|21}}
* When B asks A for retransmission of messages {{A::12}} - {{A::21}}, A can only retransmit messages 16-21, but *not* {{A::12}} - {{A::15}}!
Depending on which messages from A (which it sent unreliably on reception of {{GET-FIRST-SEQNO}}) were received by B, there would be never-ending retransmission requests from B to A for all or some messages in {{A[12..15]}}, e.g.
{noformat}
WARN [org.jgroups.protocols.UNICAST2] A: (requester=B) message B::13 not found in retransmission table of B:
[15 | 15 | 22] (X elements, Y missing)
{noformat}
h5. Reordering of STABLE messages
In the worst case, as {{STABLE}} messages are not sent reliably and can therefore get dropped or reordered, if A gets another {{STABLE(10)}} message after the {{STABLE(15)}} message, the error message above would look like this:
{noformat}
WARN [org.jgroups.protocols.UNICAST2] A: (requester=B) message B::13 not found in retransmission table of B:
[10 | 10 | 22] (X elements, Y missing)
{noformat}
Note that, with https://issues.jboss.org/browse/JGRP-1872 fixed, this cannot occur anymore.
h5. Solution
There's no real solution but to upgrade to {{UNICAST3}}: when {{UNICAST3}} receives a view, it doesn't _remove_ receive (and send) connections immediately, but merely marks them as _closed_. The connection will only be removed after {{conn_close_timeout}} ms. If B therefore gets further messages from A, it will simply mark the receive connection as _open_ and doesn't need to send a {{GET-FIRST-SEQNO}} message to A as it still has all of A's messages.
We could think of a connection establishment and teardown protocol used by all of the unicast protocols, which establishes connections similar to TCP. Senders would block until a connection is established etc and new conn-ids would be created, plus the current send- and receive- seqnos would be exchanged. This could also be used as a second line of defense, to re-establish the connection when a sender doesn't find messages requested for retransmission by a receiver.
To mitigate the above issue, {{FD_ALL}} rather than {{FD}} should be used, so that members suspect each other more or less at the same time. This is not the case with FD, where multiple hung (or GC'ing) members take N * timeout time to suspect. With {{FD_ALL}}, chances are that A and B suspect each other and later, both establish a new connection.
> UNICAST2: unilateral connection close of receiver can lead to missing seqnos in sender
> --------------------------------------------------------------------------------------
>
> Key: JGRP-1873
> URL: https://issues.jboss.org/browse/JGRP-1873
> Project: JGroups
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Reporter: Bela Ban
> Assignee: Bela Ban
> Fix For: 3.5
>
>
11 years, 2 months