[JBoss JIRA] (WFCORE-87) Display deployment timestamp
by Claudio Miranda (JIRA)
Claudio Miranda created WFCORE-87:
-------------------------------------
Summary: Display deployment timestamp
Key: WFCORE-87
URL: https://issues.jboss.org/browse/WFCORE-87
Project: WildFly Core
Issue Type: Enhancement
Components: Server
Affects Versions: 1.0.0.Alpha5
Reporter: Claudio Miranda
Assignee: Jason Greene
Priority: Minor
Display the deployment timestamp, i.e. the date the deployment was last modified. It would be useful for users to see the date and time of their deployments.
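As a rough illustration of what the enhancement asks for (this is not the actual WildFly implementation; the class and method names below are invented), the last-modified time of a deployment file can be read and exposed like this:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Instant;

// Hypothetical sketch: read a deployment's last-modified time for display.
// Not WildFly code; names are illustrative only.
public class DeploymentTimestamp {

    /** Returns the instant the deployment content was last modified. */
    public static Instant lastModified(Path deployment) throws IOException {
        return Files.getLastModifiedTime(deployment).toInstant();
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("demo", ".war");
        System.out.println("last-modified: " + lastModified(tmp));
    }
}
```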
--
This message was sent by Atlassian JIRA
(v6.3.1#6329)
[JBoss JIRA] (JGRP-1876) MERGE3 : Strange number and content of subgroups
by Karim AMMOUS (JIRA)
[ https://issues.jboss.org/browse/JGRP-1876?page=com.atlassian.jira.plugin.... ]
Karim AMMOUS edited comment on JGRP-1876 at 9/5/14 10:30 AM:
-------------------------------------------------------------
Below is a concrete example of a strange merge that occurred yesterday. It involves only 8 members, and the log level of the "Merger" class was TRACE.
At the beginning: {A, B, C, D, E, F} and {G, H}
Then {A, B, C, D} lost E and F => {A, B, C, D}, {A, B, C, D, E, F} and {G, H}
Views are:
View on A, B, C and D: [A|3] (4) {A, B, C, D}
View on E, F: [A|2] (6) {A, B, C, D, E, F}
View on G and H: [G|1] (2) {G, H}
Then a merge has been processed on 4 subgroups:
[A|3] (4) [A, B, C, D]
[A|2] (1) [G]
[A|2] (1) [H]
[G|1] (2) {G, H}
G was the merge leader and the new coordinator: MergeView::[G|4] (8) [G, A, B, C, D, E, F, H]
Please find enclosed all members' logs.
> MERGE3 : Strange number and content of subgroups
> ------------------------------------------------
>
> Key: JGRP-1876
> URL: https://issues.jboss.org/browse/JGRP-1876
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 3.4.2
> Reporter: Karim AMMOUS
> Assignee: Bela Ban
> Fix For: 3.6
>
> Attachments: 4Subgroups.zip, DkeJgrpAddress.java, MergeViewWith210Subgroups.log
>
>
> Using JGroups 3.4.2, a split occurred and a merge was processed successfully, but the number of subgroups is wrong (210 instead of 2).
> The final mergeView is correct and contains 210 members.
> Here is an extract of subviews:
> {code}
> INFO | Incoming-18,cluster,term-ETJ101697729-31726:host:192.168.56.6:1:CL(GROUP01)[F] | [MyMembershipListener.java:126] | (middleware) | MergeView view ID = [serv-ZM2BU35940-58033:vt-14:192.168.55.55:1:CL(GROUP01)[F]|172]
> 210 subgroups
> [....
> [term-ETJ100691812-36873:host:192.168.56.16:1:CL(GROUP01)[F]|170] (1) [term-ETJ104215245-11092:host:192.168.56.72:1:CL(GROUP01)[F]]
> [term-ETJ100691812-36873:host:192.168.56.16:1:CL(GROUP01)[F]|170] (1) [serv-ZM2BU38960-6907:asb:192.168.55.52:1:CL(GROUP01)[F]]
> [term-ETJ101697729-31726:host:192.168.56.6:1:CL(GROUP01)[F]|171] (1) [term-ETJ101697729-31726:host:192.168.56.6:1:CL(GROUP01)[F]]
> [term-ETJ100691812-36873:host:192.168.56.16:1:CL(GROUP01)[F]|170] (1) [serv-ZM2BU47533-55240:vt-14:192.168.55.57:1:CL(GROUP01)[F]]
> [term-ETJ100691812-36873:host:192.168.56.16:1:CL(GROUP01)[F]|170] (1) [serv-ZM2BU35943-49435:asb:192.168.55.51:1:CL(GROUP01)[F]]
> ....]
> {code}
> I wasn't able to reproduce this with a simple program, but I observed that the merge was preceded by an ifdown/ifup on host 192.168.56.6. That member lost all other members, but it was still present in their views.
> Example:
> {code}
> {A, B, C} => {A, B, C} and {C} => {A, B, C}
> {code}
[JBoss JIRA] (JGRP-1876) MERGE3 : Strange number and content of subgroups
by Karim AMMOUS (JIRA)
[ https://issues.jboss.org/browse/JGRP-1876?page=com.atlassian.jira.plugin.... ]
Karim AMMOUS updated JGRP-1876:
-------------------------------
Attachment: 4Subgroups.zip
4 subgroups logs
> MERGE3 : Strange number and content of subgroups
> ------------------------------------------------
>
> Key: JGRP-1876
> URL: https://issues.jboss.org/browse/JGRP-1876
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 3.4.2
> Reporter: Karim AMMOUS
> Assignee: Bela Ban
> Fix For: 3.6
>
> Attachments: 4Subgroups.zip, DkeJgrpAddress.java, MergeViewWith210Subgroups.log
>
>
> Using JGroups 3.4.2, a split occurred and a merge was processed successfully, but the number of subgroups is wrong (210 instead of 2).
> The final mergeView is correct and contains 210 members.
> Here is an extract of subviews:
> {code}
> INFO | Incoming-18,cluster,term-ETJ101697729-31726:host:192.168.56.6:1:CL(GROUP01)[F] | [MyMembershipListener.java:126] | (middleware) | MergeView view ID = [serv-ZM2BU35940-58033:vt-14:192.168.55.55:1:CL(GROUP01)[F]|172]
> 210 subgroups
> [....
> [term-ETJ100691812-36873:host:192.168.56.16:1:CL(GROUP01)[F]|170] (1) [term-ETJ104215245-11092:host:192.168.56.72:1:CL(GROUP01)[F]]
> [term-ETJ100691812-36873:host:192.168.56.16:1:CL(GROUP01)[F]|170] (1) [serv-ZM2BU38960-6907:asb:192.168.55.52:1:CL(GROUP01)[F]]
> [term-ETJ101697729-31726:host:192.168.56.6:1:CL(GROUP01)[F]|171] (1) [term-ETJ101697729-31726:host:192.168.56.6:1:CL(GROUP01)[F]]
> [term-ETJ100691812-36873:host:192.168.56.16:1:CL(GROUP01)[F]|170] (1) [serv-ZM2BU47533-55240:vt-14:192.168.55.57:1:CL(GROUP01)[F]]
> [term-ETJ100691812-36873:host:192.168.56.16:1:CL(GROUP01)[F]|170] (1) [serv-ZM2BU35943-49435:asb:192.168.55.51:1:CL(GROUP01)[F]]
> ....]
> {code}
> I wasn't able to reproduce this with a simple program, but I observed that the merge was preceded by an ifdown/ifup on host 192.168.56.6. That member lost all other members, but it was still present in their views.
> Example:
> {code}
> {A, B, C} => {A, B, C} and {C} => {A, B, C}
> {code}
[JBoss JIRA] (JGRP-1876) MERGE3 : Strange number and content of subgroups
by Karim AMMOUS (JIRA)
[ https://issues.jboss.org/browse/JGRP-1876?page=com.atlassian.jira.plugin.... ]
Karim AMMOUS commented on JGRP-1876:
------------------------------------
Below is a concrete example of a strange merge that occurred yesterday. It involves only 8 members, and the log level of the "Merger" class was TRACE.
At the beginning: {A, B, C, D, E, F} and {G, H}
Then {A, B, C, D} lost E and F => {A, B, C, D}, {A, B, C, D, E, F} and {G, H}
Views are:
View on A, B, C and D: [A|3] (4) {A, B, C, D}
View on E, F: [A|2] (6) {A, B, C, D, E, F}
View on G and H: [G|1] (2) {G, H}
Then a merge has been processed on 4 subgroups:
[A|3] (4) [A, B, C, D]
[A|2] (1) [G]
[A|2] (1) [H]
[G|1] (2) {G, H}
G was the merge leader and the new coordinator: MergeView::[G|4] (8) [G, A, B, C, D, E, F, H]
Please find enclosed all members' logs.
> MERGE3 : Strange number and content of subgroups
> ------------------------------------------------
>
> Key: JGRP-1876
> URL: https://issues.jboss.org/browse/JGRP-1876
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 3.4.2
> Reporter: Karim AMMOUS
> Assignee: Bela Ban
> Fix For: 3.6
>
> Attachments: DkeJgrpAddress.java, MergeViewWith210Subgroups.log
>
>
> Using JGroups 3.4.2, a split occurred and a merge was processed successfully, but the number of subgroups is wrong (210 instead of 2).
> The final mergeView is correct and contains 210 members.
> Here is an extract of subviews:
> {code}
> INFO | Incoming-18,cluster,term-ETJ101697729-31726:host:192.168.56.6:1:CL(GROUP01)[F] | [MyMembershipListener.java:126] | (middleware) | MergeView view ID = [serv-ZM2BU35940-58033:vt-14:192.168.55.55:1:CL(GROUP01)[F]|172]
> 210 subgroups
> [....
> [term-ETJ100691812-36873:host:192.168.56.16:1:CL(GROUP01)[F]|170] (1) [term-ETJ104215245-11092:host:192.168.56.72:1:CL(GROUP01)[F]]
> [term-ETJ100691812-36873:host:192.168.56.16:1:CL(GROUP01)[F]|170] (1) [serv-ZM2BU38960-6907:asb:192.168.55.52:1:CL(GROUP01)[F]]
> [term-ETJ101697729-31726:host:192.168.56.6:1:CL(GROUP01)[F]|171] (1) [term-ETJ101697729-31726:host:192.168.56.6:1:CL(GROUP01)[F]]
> [term-ETJ100691812-36873:host:192.168.56.16:1:CL(GROUP01)[F]|170] (1) [serv-ZM2BU47533-55240:vt-14:192.168.55.57:1:CL(GROUP01)[F]]
> [term-ETJ100691812-36873:host:192.168.56.16:1:CL(GROUP01)[F]|170] (1) [serv-ZM2BU35943-49435:asb:192.168.55.51:1:CL(GROUP01)[F]]
> ....]
> {code}
> I wasn't able to reproduce this with a simple program, but I observed that the merge was preceded by an ifdown/ifup on host 192.168.56.6. That member lost all other members, but it was still present in their views.
> Example:
> {code}
> {A, B, C} => {A, B, C} and {C} => {A, B, C}
> {code}
[JBoss JIRA] (WFCORE-86) Rare fail of org.jboss.as.test.integration.domain.management.cli.RolloutPlanTestCase due to TimeoutException
by Brian Stansberry (JIRA)
[ https://issues.jboss.org/browse/WFCORE-86?page=com.atlassian.jira.plugin.... ]
Brian Stansberry moved WFLY-3815 to WFCORE-86:
----------------------------------------------
Project: WildFly Core (was: WildFly)
Key: WFCORE-86 (was: WFLY-3815)
Affects Version/s: (was: 8.1.0.Final)
Component/s: Domain Management
(was: Domain Management)
> Rare fail of org.jboss.as.test.integration.domain.management.cli.RolloutPlanTestCase due to TimeoutException
> ------------------------------------------------------------------------------------------------------------
>
> Key: WFCORE-86
> URL: https://issues.jboss.org/browse/WFCORE-86
> Project: WildFly Core
> Issue Type: Bug
> Components: Domain Management
> Reporter: Dominik Pospisil
> Assignee: Dominik Pospisil
>
> java.lang.Exception: Http request failed.
> at org.jboss.as.test.integration.domain.management.cli.RolloutPlanTestCase.checkURL(RolloutPlanTestCase.java:423)
> at org.jboss.as.test.integration.domain.management.cli.RolloutPlanTestCase.testRollbackAcrossGroupsRolloutPlan(RolloutPlanTestCase.java:322)
> ...
> Caused by: java.util.concurrent.TimeoutException
> at java.util.concurrent.FutureTask.get(FutureTask.java:201)
> at org.jboss.as.test.integration.common.HttpRequest.execute(HttpRequest.java:50)
> at org.jboss.as.test.integration.common.HttpRequest.get(HttpRequest.java:80)
> at org.jboss.as.test.integration.domain.management.cli.RolloutPlanTestCase.checkURL(RolloutPlanTestCase.java:420)
> ... 43 more
> Standard Output
> [Host Controller] [0m[0m14:49:48,985 INFO [org.jboss.as.repository] (management-handler-thread - 3) JBAS014900: Content added at location /mnt/hudson_workspace/workspace/eap-6x-as-testsuite-RHEL-matrix-OracleJDK7/378ed68b/jboss-eap-6.3-src/testsuite/domain/target/domains/CLITestSuite/master/data/content/65/679323a365c7fcc3e6453a9fa3e114cfcd7ecb/content[0m
> [Server:main-two] 14:49:49,579 INFO [org.jboss.as.server.deployment] (MSC service thread 1-2) JBAS015876: Starting deployment of "RolloutPlanTestCase.war" (runtime-name: "RolloutPlanTestCase.war")
> [Server:test-one] 14:49:49,583 INFO [org.jboss.as.server.deployment] (MSC service thread 1-4) JBAS015876: Starting deployment of "RolloutPlanTestCase.war" (runtime-name: "RolloutPlanTestCase.war")
> [Server:main-one] 14:49:49,586 INFO [org.jboss.as.server.deployment] (MSC service thread 1-5) JBAS015876: Starting deployment of "RolloutPlanTestCase.war" (runtime-name: "RolloutPlanTestCase.war")
> [Server:main-three] 14:49:49,698 INFO [org.jboss.as.server.deployment] (MSC service thread 1-1) JBAS015876: Starting deployment of "RolloutPlanTestCase.war" (runtime-name: "RolloutPlanTestCase.war")
> [Server:main-one] 14:49:49,696 INFO [org.jboss.web] (ServerService Thread Pool -- 73) JBAS018210: Register web context: /RolloutPlanTestCase
> [Server:main-one] 14:49:49,710 INFO [org.jboss.as.test.integration.domain.management.cli.RolloutPlanTestServlet] (ServerService Thread Pool -- 73) RolloutServlet initialized: 1401389389710
> [Server:main-two] 14:49:49,762 INFO [org.jboss.web] (ServerService Thread Pool -- 58) JBAS018210: Register web context: /RolloutPlanTestCase
> [Server:main-three] 14:49:49,791 INFO [org.jboss.web] (ServerService Thread Pool -- 67) JBAS018210: Register web context: /RolloutPlanTestCase
> [Server:main-two] 14:49:49,788 INFO [org.jboss.as.test.integration.domain.management.cli.RolloutPlanTestServlet] (ServerService Thread Pool -- 58) RolloutServlet initialized: 1401389389786
> [Server:test-one] 14:49:49,784 INFO [org.jboss.web] (ServerService Thread Pool -- 21) JBAS018210: Register web context: /RolloutPlanTestCase
> [Server:main-three] 14:49:49,804 INFO [org.jboss.as.test.integration.domain.management.cli.RolloutPlanTestServlet] (ServerService Thread Pool -- 67) RolloutServlet initialized: 1401389389803
> [Server:test-one] 14:49:49,820 INFO [org.jboss.as.test.integration.domain.management.cli.RolloutPlanTestServlet] (ServerService Thread Pool -- 21) RolloutServlet initialized: 1401389389820
> [Server:other-two] 14:49:49,891 INFO [org.jboss.as.server.deployment] (MSC service thread 1-1) JBAS015876: Starting deployment of "RolloutPlanTestCase.war" (runtime-name: "RolloutPlanTestCase.war")
> [Server:other-two] 14:49:50,826 INFO [org.jboss.web] (ServerService Thread Pool -- 71) JBAS018210: Register web context: /RolloutPlanTestCase
> [Server:other-two] 14:49:50,845 INFO [org.jboss.as.test.integration.domain.management.cli.RolloutPlanTestServlet] (ServerService Thread Pool -- 71) RolloutServlet initialized: 1401389390845
> [Server:main-two] 14:49:51,172 INFO [org.jboss.as.server] (host-controller-connection-threads - 1) JBAS018559: Deployed "RolloutPlanTestCase.war" (runtime-name : "RolloutPlanTestCase.war")
> [Server:test-one] 14:49:51,174 INFO [org.jboss.as.server] (host-controller-connection-threads - 1) JBAS018559: Deployed "RolloutPlanTestCase.war" (runtime-name : "RolloutPlanTestCase.war")
> [Server:other-two] 14:49:51,180 INFO [org.jboss.as.server] (host-controller-connection-threads - 1) JBAS018559: Deployed "RolloutPlanTestCase.war" (runtime-name : "RolloutPlanTestCase.war")
> [Server:main-one] 14:49:51,169 INFO [org.jboss.as.server] (host-controller-connection-threads - 1) JBAS018559: Deployed "RolloutPlanTestCase.war" (runtime-name : "RolloutPlanTestCase.war")
> [Server:main-three] 14:49:51,180 INFO [org.jboss.as.server] (host-controller-connection-threads - 1) JBAS018559: Deployed "RolloutPlanTestCase.war" (runtime-name : "RolloutPlanTestCase.war")
> [Server:main-two] 14:50:11,333 INFO [org.jboss.web] (ServerService Thread Pool -- 66) JBAS018224: Unregister web context: /RolloutPlanTestCase
> [Server:test-one] 14:50:11,346 INFO [org.jboss.web] (ServerService Thread Pool -- 58) JBAS018224: Unregister web context: /RolloutPlanTestCase
> [Server:main-three] 14:50:11,376 INFO [org.jboss.web] (ServerService Thread Pool -- 69) JBAS018224: Unregister web context: /RolloutPlanTestCase
> [Server:main-two] 14:50:11,388 INFO [org.jboss.as.server.deployment] (MSC service thread 1-7) JBAS015877: Stopped deployment RolloutPlanTestCase.war (runtime-name: RolloutPlanTestCase.war) in 62ms
> [Server:test-one] 14:50:11,397 INFO [org.jboss.as.server.deployment] (MSC service thread 1-7) JBAS015877: Stopped deployment RolloutPlanTestCase.war (runtime-name: RolloutPlanTestCase.war) in 59ms
> [Server:main-three] 14:50:11,548 INFO [org.jboss.as.server.deployment] (MSC service thread 1-8) JBAS015877: Stopped deployment RolloutPlanTestCase.war (runtime-name: RolloutPlanTestCase.war) in 175ms
> [Server:other-two] 14:50:11,634 INFO [org.jboss.web] (ServerService Thread Pool -- 78) JBAS018224: Unregister web context: /RolloutPlanTestCase
> [Server:other-two] 14:50:11,704 INFO [org.jboss.as.server.deployment] (MSC service thread 1-3) JBAS015877: Stopped deployment RolloutPlanTestCase.war (runtime-name: RolloutPlanTestCase.war) in 132ms
> [Server:main-one] 14:50:11,748 INFO [org.jboss.web] (ServerService Thread Pool -- 78) JBAS018224: Unregister web context: /RolloutPlanTestCase
> [Server:main-one] 14:50:11,763 INFO [org.jboss.as.server.deployment] (MSC service thread 1-3) JBAS015877: Stopped deployment RolloutPlanTestCase.war (runtime-name: RolloutPlanTestCase.war) in 385ms
> [Server:main-three] 14:50:12,225 INFO [org.jboss.as.repository] (host-controller-connection-threads - 1) JBAS014901: Content removed from location /mnt/hudson_workspace/workspace/eap-6x-as-testsuite-RHEL-matrix-OracleJDK7/378ed68b/jboss-eap-6.3-src/testsuite/domain/target/domains/CLITestSuite/slave/data/servers/main-three/content/65/679323a365c7fcc3e6453a9fa3e114cfcd7ecb/content
> [Server:main-three] 14:50:12,226 INFO [org.jboss.as.server] (host-controller-connection-threads - 1) JBAS018558: Undeployed "RolloutPlanTestCase.war" (runtime-name: "RolloutPlanTestCase.war")
> [Server:test-one] 14:50:12,217 INFO [org.jboss.as.repository] (host-controller-connection-threads - 1) JBAS014901: Content removed from location /mnt/hudson_workspace/workspace/eap-6x-as-testsuite-RHEL-matrix-OracleJDK7/378ed68b/jboss-eap-6.3-src/testsuite/domain/target/domains/CLITestSuite/master/servers/test-one/data/content/65/679323a365c7fcc3e6453a9fa3e114cfcd7ecb/content
> [Server:test-one] 14:50:12,218 INFO [org.jboss.as.server] (host-controller-connection-threads - 1) JBAS018558: Undeployed "RolloutPlanTestCase.war" (runtime-name: "RolloutPlanTestCase.war")
> [Server:main-two] 14:50:12,218 INFO [org.jboss.as.repository] (host-controller-connection-threads - 1) JBAS014901: Content removed from location /mnt/hudson_workspace/workspace/eap-6x-as-testsuite-RHEL-matrix-OracleJDK7/378ed68b/jboss-eap-6.3-src/testsuite/domain/target/domains/CLITestSuite/master/servers/main-two/data/content/65/679323a365c7fcc3e6453a9fa3e114cfcd7ecb/content
> [Server:main-two] 14:50:12,219 INFO [org.jboss.as.server] (host-controller-connection-threads - 1) JBAS018558: Undeployed "RolloutPlanTestCase.war" (runtime-name: "RolloutPlanTestCase.war")
> [Server:main-one] 14:50:12,221 INFO [org.jboss.as.repository] (host-controller-connection-threads - 1) JBAS014901: Content removed from location /mnt/hudson_workspace/workspace/eap-6x-as-testsuite-RHEL-matrix-OracleJDK7/378ed68b/jboss-eap-6.3-src/testsuite/domain/target/domains/CLITestSuite/master/servers/main-one/data/content/65/679323a365c7fcc3e6453a9fa3e114cfcd7ecb/content
> [Server:main-one] 14:50:12,223 INFO [org.jboss.as.server] (host-controller-connection-threads - 1) JBAS018558: Undeployed "RolloutPlanTestCase.war" (runtime-name: "RolloutPlanTestCase.war")
> [Server:other-two] 14:50:12,237 INFO [org.jboss.as.repository] (host-controller-connection-threads - 1) JBAS014901: Content removed from location /mnt/hudson_workspace/workspace/eap-6x-as-testsuite-RHEL-matrix-OracleJDK7/378ed68b/jboss-eap-6.3-src/testsuite/domain/target/domains/CLITestSuite/slave/data/servers/other-two/content/65/679323a365c7fcc3e6453a9fa3e114cfcd7ecb/content
> [Server:other-two] 14:50:12,252 INFO [org.jboss.as.server] (host-controller-connection-threads - 1) JBAS018558: Undeployed "RolloutPlanTestCase.war" (runtime-name: "RolloutPlanTestCase.war")
> [Host Controller] [0m[0m14:50:12,261 INFO [org.jboss.as.repository] (management-handler-thread - 4) JBAS014901: Content removed from location /mnt/hudson_workspace/workspace/eap-6x-as-testsuite-RHEL-matrix-OracleJDK7/378ed68b/jboss-eap-6.3-src/testsuite/domain/target/domains/CLITestSuite/master/data/content/65/679323a365c7fcc3e6453a9fa3e114cfcd7ecb/content[0m
> ERROR [org.jboss.as.cli.CommandContext] {
> "outcome" => "failed",
> "failure-description" => {"domain-failure-description" => "JBAS014807: Management resource '[
> (\"socket-binding-group\" => \"standard-sockets\"),
> (\"socket-binding\" => \"test-binding\")
> ]' not found"},
> "rolled-back" => true
> }
[JBoss JIRA] (JGRP-1875) UNICAST3/UNICAST2: sync receiver table with sender table
by Bela Ban (JIRA)
[ https://issues.jboss.org/browse/JGRP-1875?page=com.atlassian.jira.plugin.... ]
Bela Ban commented on JGRP-1875:
--------------------------------
Perhaps we should drop this altogether and only implement JGRP-1874, which reduces traffic caused by {{GET-FIRST-SEQNO}} and sending of the missing messages...
> UNICAST3/UNICAST2: sync receiver table with sender table
> --------------------------------------------------------
>
> Key: JGRP-1875
> URL: https://issues.jboss.org/browse/JGRP-1875
> Project: JGroups
> Issue Type: Enhancement
> Reporter: Bela Ban
> Assignee: Bela Ban
> Fix For: 3.5.1, 3.6
>
>
> If a receiver B closes its recv-table and the sender A doesn't, then (when receiving msgs from the sender) the receiver engages in a protocol using {{GET-FIRST-SEQNO}} to sync itself with the sender. This has several problems, detailed in JGRP-1873 and JGRP-1874. (Note that the other way round (sender closing send-table), there is no issue, as the sender will create a new connection with a new {{conn-id}}).
> To prevent {{STABLE}} messages interfering with {{GET-FIRST-SEQNO}} messages (JGRP-1873), we could run an additional {{SYNC}} protocol round, e.g.
> * B needs to get the lowest and highest seqno sent from A
> * B sends a {{SYNC}} message to A (instead of a {{GET-FIRST-SEQNO}} message)
> * A, on reception of the {{SYNC}}, sets a flag that discards all {{STABLE}} or {{ACK}} messages
> * A replies with a {{SYNC-OK}} containing the _lowest_ and _highest_ sent seqnos
> * B creates an entry for A with the lowest and highest seqnos
> * B sends a {{SYNC-ACK}} to A
> * A resets the flag and starts accepting {{STABLE}} / {{ACK}} messages from B again
> * A and B now use the usual protocols to retransmit missing messages
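The sender-side part of the proposed round can be sketched as a simple flag that suppresses {{STABLE}}/{{ACK}} processing between {{SYNC}} and {{SYNC-ACK}}. This is a hypothetical illustration only; the class, method names, and message representation below are invented and are not the JGroups API:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of the sender (A) side of the proposed SYNC round.
// Names are invented; this is not JGroups code.
public class SyncRound {
    // Set while a SYNC round is in progress: STABLE/ACK messages are dropped.
    private final AtomicBoolean discardStableAndAck = new AtomicBoolean(false);
    private final long lowest = 5, highest = 42; // pretend A sent seqnos 5..42

    /** A receives SYNC from B: start discarding STABLE/ACK, reply with bounds. */
    public long[] onSync() {
        discardStableAndAck.set(true);
        return new long[] { lowest, highest }; // SYNC-OK payload
    }

    /** A receives SYNC-ACK from B: resume normal processing. */
    public void onSyncAck() {
        discardStableAndAck.set(false);
    }

    /** Would a message of the given type be processed right now? */
    public boolean accepts(String type) {
        boolean stableOrAck = type.equals("STABLE") || type.equals("ACK");
        return !(stableOrAck && discardStableAndAck.get());
    }
}
```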
[JBoss JIRA] (JGRP-1875) UNICAST3/UNICAST2: sync receiver table with sender table
by Bela Ban (JIRA)
[ https://issues.jboss.org/browse/JGRP-1875?page=com.atlassian.jira.plugin.... ]
Bela Ban commented on JGRP-1875:
--------------------------------
Hmm, there's a problem when a new connection is established:
* A (conn-id=1) sends A1(first)-A10 to B
* B receives A:3 first. Since this isn't tagged as {{first}}, B sends a SYNC to A
* A hasn't seen that SYNC yet, so it changes its {{conn-id}} to 3
* B now receives A:1(first) and creates the connection with {{conn-id}} == 2
* B also receives the other messages and adds them to the connection's recv-window
* Now B receives the {{SYNC-OK}} message with {{conn-id}} == 3
** B now removes the recv-window for A and creates a new one with {{conn-id}} == 3
This race between the first message and SYNC happens because we cannot guarantee that the first message from A is always received _first_ by B.
I'm starting to dislike this whole SYNC business: we're fixing an edge case that happens 0.001% of the time, but introducing complexity that's not warranted!
So what are the alternatives?
h5. Do nothing (status quo) / first message creates the recv-window
* With {{conn-close-timeout}} set to a reasonably large value, JGRP-1873 can almost never happen
* The first message creates the recv-window
** If we get some messages before the first, they're dropped on the ground, but as soon as the first message is received (or retransmitted), the recv-window is created. Not too bad, as the first message is usually a JOIN or JOIN-RSP, and only then does message traffic start
h5. Use SYNC altogether (replace first message above)
* This means that the first N messages would always get dropped until a SYNC and subsequent SYNC-OK have been received
** This would be completely new logic, replacing a proven algorithm
h5. Provide connection establishment a la TCP (SYN / SYN-ACK / ACK)
* Complex state model (including timeouts and dreaded states such as TIME-WAIT)
* Redundant if run over TCP as transport
* Blocking semantics: threads need to block until connection has been established
> UNICAST3/UNICAST2: sync receiver table with sender table
> --------------------------------------------------------
>
> Key: JGRP-1875
> URL: https://issues.jboss.org/browse/JGRP-1875
> Project: JGroups
> Issue Type: Enhancement
> Reporter: Bela Ban
> Assignee: Bela Ban
> Fix For: 3.5.1, 3.6
>
>
> If a receiver B closes its recv-table and the sender A doesn't, then (when receiving msgs from the sender) the receiver engages in a protocol using {{GET-FIRST-SEQNO}} to sync itself with the sender. This has several problems, detailed in JGRP-1873 and JGRP-1874. (Note that the other way round (sender closing send-table), there is no issue, as the sender will create a new connection with a new {{conn-id}}).
> To prevent {{STABLE}} messages interfering with {{GET-FIRST-SEQNO}} messages (JGRP-1873), we could run an additional {{SYNC}} protocol round, e.g.
> * B needs to get the lowest and highest seqno sent from A
> * B sends a {{SYNC}} message to A (instead of a {{GET-FIRST-SEQNO}} message)
> * A, on reception of the {{SYNC}}, sets a flag that discards all {{STABLE}} or {{ACK}} messages
> * A replies with a {{SYNC-OK}} containing the _lowest_ and _highest_ sent seqnos
> * B creates an entry for A with the lowest and highest seqnos
> * B sends a {{SYNC-ACK}} to A
> * A resets the flag and starts accepting {{STABLE}} / {{ACK}} messages from B again
> * A and B now use the usual protocols to retransmit missing messages
[JBoss JIRA] (JGRP-1877) System.nanoTime() can be negative
by Bela Ban (JIRA)
[ https://issues.jboss.org/browse/JGRP-1877?page=com.atlassian.jira.plugin.... ]
Bela Ban commented on JGRP-1877:
--------------------------------
Hmm... that would set the interrupt flag coming out of this method even if only 1 of 10 iterations ended up in the catch clause. I'm not sure what the right semantics should be, e.g. if a cond.await() terminated correctly, should {{intr}} be set back to false?
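For reference, the common idiom (following the quoted {{Responses.waitFor()}} code; the class name below is invented for illustration) is to remember that *any* await was interrupted and restore the thread's interrupt status exactly once on exit, regardless of how many later iterations completed normally. The flag is never set back to false, since an interrupt that arrived during the wait should remain visible to the caller:

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Sketch of the idiom under discussion (illustrative, not JGroups code):
// swallow InterruptedException inside the wait loop, but restore the
// thread's interrupt status once, on the way out.
public class WaitWithInterruptRestore {
    private final Lock lock = new ReentrantLock();
    private final Condition cond = lock.newCondition();
    private volatile boolean done;

    public boolean waitFor(long timeoutMs) {
        final long target = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        boolean intr = false; // remembers any interruption, never cleared
        lock.lock();
        try {
            long remaining;
            while (!done && (remaining = target - System.nanoTime()) > 0) {
                try {
                    cond.await(remaining, TimeUnit.NANOSECONDS);
                } catch (InterruptedException e) {
                    intr = true; // keep waiting, restore later
                }
            }
            return done;
        } finally {
            lock.unlock();
            if (intr)
                Thread.currentThread().interrupt(); // restore exactly once
        }
    }

    public void markDone() {
        lock.lock();
        try {
            done = true;
            cond.signalAll();
        } finally {
            lock.unlock();
        }
    }
}
```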
> System.nanoTime() can be negative
> ---------------------------------
>
> Key: JGRP-1877
> URL: https://issues.jboss.org/browse/JGRP-1877
> Project: JGroups
> Issue Type: Bug
> Reporter: Bela Ban
> Assignee: Bela Ban
> Fix For: 3.5.1, 3.6
>
>
> According to the javadoc, {{System.nanoTime()}} should only be used to measure _elapsed time_, not to compute a _target time in the future_, as {{nanoTime()}} might return a time in the future.
> Code like the one below might fail:
> {code:title=Responses.waitFor()|borderStyle=solid}
> public boolean waitFor(long timeout) {
>     long wait_time;
>     final long target_time=System.nanoTime() + TimeUnit.NANOSECONDS.convert(timeout, TimeUnit.MILLISECONDS); // ns
>     lock.lock();
>     try {
>         while(!done && (wait_time=target_time - System.nanoTime()) > 0) {
>             try {
>                 cond.await(wait_time,TimeUnit.NANOSECONDS);
>             }
>             catch(InterruptedException e) {
>             }
>         }
>         return done;
>     }
>     finally {
>         lock.unlock();
>     }
> }
> {code}
> When computing {{target_time}}, {{System.nanoTime()}} could return a negative value (numeric overflow) or a value in the future. In the former case, {{target_time}} could be negative, so the method would not block at all. In the latter case, {{target_time}} could be huge, so the method would block for a long time.
> Investigate all occurrences where we use {{nanoTime()}} to compute a time in the future, and see what impact a future value could have. Possibly replace with {{System.currentTimeMillis()}} or the _time service_.
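For context, the {{System.nanoTime()}} javadoc recommends comparing *differences* rather than absolute values (i.e. test {{t1 - t0 < 0}}, never {{t1 < t0}}), which stays correct across numerical wraparound. A deadline helper in that style might look like this (illustrative only; the class name is invented and this is not a proposed JGroups change):

```java
// Illustrative sketch of the overflow-safe deadline idiom from the
// System.nanoTime() javadoc: compare differences, not absolute values.
public class Deadlines {

    /** True if the deadline (a nanoTime()-based value) has passed. */
    public static boolean expired(long deadlineNanos) {
        // deadline - now wraps correctly even if either value is negative
        return deadlineNanos - System.nanoTime() <= 0;
    }

    public static void main(String[] args) {
        long deadline = System.nanoTime() + 1_000_000L; // 1 ms from now
        System.out.println("expired yet: " + expired(deadline));
    }
}
```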
[JBoss JIRA] (DROOLS-593) KieSession persistence failure when ProcessInstance is kept in-memory
by Mariano De Maio (JIRA)
[ https://issues.jboss.org/browse/DROOLS-593?page=com.atlassian.jira.plugin... ]
Mariano De Maio updated DROOLS-593:
-----------------------------------
Attachment: persistence-issue-on-6.0.1.Final.zip
Attached a project to reproduce the issue.
> KieSession persistence failure when ProcessInstance is kept in-memory
> ---------------------------------------------------------------------
>
> Key: DROOLS-593
> URL: https://issues.jboss.org/browse/DROOLS-593
> Project: Drools
> Issue Type: Bug
> Affects Versions: 6.0.1.Final
> Environment: Linux Ubuntu 13.10
> Core i5 processor
> 4 GB RAM
> Reporter: Mariano De Maio
> Assignee: Mark Proctor
> Priority: Minor
> Fix For: 6.1.0.Final
>
> Attachments: persistence-issue-on-6.0.1.Final.zip
>
>
> This issue is intermittent. However, on a moderately fast processor, it happens at least once in 1000 process instances.
> The issue presents itself in version 6.0.1.Final but not in 6.1.0.Final. It happens when a ProcessInstance object is referenced before signalEvent is invoked on a specific process instance.
> Any clarification on the reasons for this problem would be appreciated. As for suggested fixes, I would recommend moving to version 6.1.0.Final. I'm reporting it so that others with similar problems can try that workaround.
[JBoss JIRA] (DROOLS-593) KieSession persistence failure when ProcessInstance is kept in-memory
by Mariano De Maio (JIRA)
Mariano De Maio created DROOLS-593:
--------------------------------------
Summary: KieSession persistence failure when ProcessInstance is kept in-memory
Key: DROOLS-593
URL: https://issues.jboss.org/browse/DROOLS-593
Project: Drools
Issue Type: Bug
Affects Versions: 6.0.1.Final
Environment: Linux Ubuntu 13.10
Core i5 processor
4 GB RAM
Reporter: Mariano De Maio
Assignee: Mark Proctor
Priority: Minor
Fix For: 6.1.0.Final
This issue is intermittent. However, on a moderately fast processor, it happens at least once in 1000 process instances.
The issue presents itself in version 6.0.1.Final but not in 6.1.0.Final. It happens when a ProcessInstance object is referenced before signalEvent is invoked on a specific process instance.
Any clarification on the reasons for this problem would be appreciated. As for suggested fixes, I would recommend moving to version 6.1.0.Final. I'm reporting it so that others with similar problems can try that workaround.