[JBoss JIRA] Created: (JGRP-1006) Shared transport: shunned multiple channels can not reconnect
by Vladimir Blagojevic (JIRA)
Shared transport: shunned multiple channels can not reconnect
--------------------------------------------------------------
Key: JGRP-1006
URL: https://jira.jboss.org/jira/browse/JGRP-1006
Project: JGroups
Issue Type: Bug
Affects Versions: 2.6
Reporter: Vladimir Blagojevic
Assignee: Vladimir Blagojevic
Fix For: 2.6.11
Brian Stansberry said:
One of the EJB3/AS integration testsuite runs hit a JGroups problem last
night. *Looks like* some issue between the two AS nodes (127.0.0.1 and
127.0.0.2) in the test caused the channels in one node to be
suspected/reconnect. Which didn't go well -- which is the subject of
this thread.
I have the server logs from both, which record at INFO or above; if
anyone wants them, ping me. Here are the highlights:
1) Both nodes connect 4 channels as part of AS startup: 3 using a shared
transport UDP stack (for JBC, HAPartition, JBM) and one TCP one for JBM.
2) A bunch of test cases then execute, each deploys an EJB3 jar on each
node in turn, which in turn causes another JBC channel to be created and
connect at test start and disconnect and close at test end. This channel
uses the same shared transport as the other 3, so now there are four
channels using that shared transport.
3) This create/close cycle continues a while, until on one node there is
some issue that results in one of the 4 UDP channels suspecting its peer
and all 4 UDP channels getting a new view:
13:23:58,150 INFO [DefaultPartition] Suspected member: 127.0.0.1:40089
13:23:58,257 INFO [GroupMember]
org.jboss.messaging.core.impl.postoffice.GroupMember$ControlMembershipListener@4cfcb4de
got new view [127.0.0.2:42088|2] [127.0.0.2:42088], old view is
[127.0.0.1:40089|1] [127.0.0.1:40089, 127.0.0.2:42088]
13:23:58,257 INFO [GroupMember] I am (127.0.0.2:42088)
13:23:58,265 INFO [DefaultPartition] New cluster view for partition
DefaultPartition (id: 2, delta: -1) : [127.0.0.2:1099]
13:23:58,275 INFO [MessagingPostOffice] JBoss Messaging is failing over
for failed node 0. If there are many messages to reload this may take
some time...
13:23:58,442 INFO [DefaultPartition] I am (127.0.0.2:1099) received
membershipChanged event:
13:23:58,442 INFO [DefaultPartition] Dead members: 1 ([127.0.0.1:1099])
13:23:58,442 INFO [DefaultPartition] New Members : 0 ([])
13:23:58,442 INFO [DefaultPartition] All Members : 1 ([127.0.0.2:1099])
13:23:58,581 INFO [RPCManagerImpl] Received new cluster view:
[127.0.0.2:42088|2] [127.0.0.2:42088]
13:23:58,620 INFO [RPCManagerImpl] Received new cluster view:
[127.0.0.2:42088|2] [127.0.0.2:42088]
13:24:00,260 WARN [GMS] 127.0.0.2:42088 failed to collect all ACKs (1)
for mcasted view [127.0.0.2:42088|2] [127.0.0.2:42088] after 2000ms,
missing ACKs from [127.0.0.2:42088] (received=[]),
local_addr=127.0.0.2:42088
13:24:00,661 INFO [MessagingPostOffice] JBoss Messaging failover completed
13:24:00,662 INFO [GroupMember] Dead members: 1 ([127.0.0.1:40089])
13:24:00,662 INFO [GroupMember] All Members : 1 ([127.0.0.2:42088])
4) At some point the suspected node realizes it is suspected and the 4
UDP channels decide to reconnect, two at about the same time and the
other two w/in 55 ms of the first two:
13:24:39,294 WARN [FD] I was suspected by 127.0.0.2:42088; ignoring the
SUSPECT message and sending back a HEARTBEAT_ACK
13:24:39,295 WARN [FD] I was suspected by 127.0.0.2:42088; ignoring the
SUSPECT message and sending back a HEARTBEAT_ACK
13:24:39,315 ERROR [ConnectionTable] failed sending data to
127.0.0.2:7900: java.net.SocketException: Broken pipe
13:24:39,325 WARN [GMS] I (127.0.0.1:40089) am not a member of view
[127.0.0.2:42088|2] [127.0.0.2:42088], shunning myself and leaving the
group (prev_members are [127.0.0.1:40089, 127.0.0.2:42088], current view
is [127.0.0.1:40089|1] [127.0.0.1:40089, 127.0.0.2:42088])
13:24:39,325 WARN [GMS] I (127.0.0.1:40089) am not a member of view
[127.0.0.2:42088|2] [127.0.0.2:42088], shunning myself and leaving the
group (prev_members are [127.0.0.1:40089, 127.0.0.2:42088], current view
is [127.0.0.1:40089|1] [127.0.0.1:40089, 127.0.0.2:42088])
13:24:39,339 WARN [GMS] I (127.0.0.1:40089) am not a member of view
[127.0.0.2:42088|2] [127.0.0.2:42088], shunning myself and leaving the
group (prev_members are [127.0.0.1:40089, 127.0.0.2:42088], current view
is [127.0.0.1:40089|1] [127.0.0.1:40089, 127.0.0.2:42088])
13:24:39,359 WARN [FD] I was suspected by 127.0.0.2:42088; ignoring the
SUSPECT message and sending back a HEARTBEAT_ACK
13:24:39,360 WARN [FD] I was suspected by 127.0.0.2:42088; ignoring the
SUSPECT message and sending back a HEARTBEAT_ACK
13:24:39,380 WARN [GMS] I (127.0.0.1:40089) am not a member of view
[127.0.0.2:42088|2] [127.0.0.2:42088], shunning myself and leaving the
group (prev_members are [127.0.0.1:40089, 127.0.0.2:42088], current view
is [127.0.0.1:40089|1] [127.0.0.1:40089, 127.0.0.2:42088])
5) This is where things break down. It looks like one of the 4 channels
is able to get a local_addr, but the others get null. Perhaps a
concurrency bug in Configurator.startProtocolStack? (Note the first 2
lines below have to do with the independent TCP channel used by JBM; I
post them for completeness.)
13:24:39,568 ERROR [ConnectionTable] failed sending data to
127.0.0.2:7900: java.net.SocketException: Broken pipe
13:24:39,959 ERROR [ConnectionTable] failed sending data to
127.0.0.2:7900: java.net.SocketException: Broken pipe
13:24:40,833 WARN [GMS]
13:24:40,834 WARN [GMS]
13:24:41,362 INFO [STDOUT]
---------------------------------------------------------
GMS: address is 127.0.0.1:56000 (cluster=DefaultPartition-SFSBCache)
---------------------------------------------------------
13:24:41,362 FATAL [JChannel] local_addr is null; cannot connect
13:24:41,363 FATAL [JChannel] local_addr is null; cannot connect
13:24:41,365 ERROR [JChannel] failure reconnecting to channel, retrying
org.jgroups.ChannelException: local_addr is null
at org.jgroups.JChannel.startStack(JChannel.java:1631)
at org.jgroups.JChannel.connect(JChannel.java:366)
at org.jgroups.JChannel$CloserThread.run(JChannel.java:2046)
13:24:41,365 ERROR [JChannel] failure reconnecting to channel, retrying
org.jgroups.ChannelException: local_addr is null
at org.jgroups.JChannel.startStack(JChannel.java:1631)
at org.jgroups.JChannel.connect(JChannel.java:366)
at org.jgroups.JChannel$CloserThread.run(JChannel.java:2046)
13:24:41,488 INFO [RPCManagerImpl] Received new cluster view:
[127.0.0.2:42088|3] [127.0.0.2:42088, 127.0.0.1:56000]
13:24:41,564 FATAL [JChannel] local_addr is null; cannot connect
13:24:41,564 ERROR [JChannel] failure reconnecting to channel, retrying
org.jgroups.ChannelException: local_addr is null
at org.jgroups.JChannel.startStack(JChannel.java:1631)
at org.jgroups.JChannel.connect(JChannel.java:366)
at org.jgroups.JChannel$CloserThread.run(JChannel.java:2046)
13:24:41,927 INFO [LegacyStateTransferIntegrator] Using version 4096
13:24:45,941 WARN [GMS] Merge aborted. Merge leader did not get
MergeData from all subgroup coordinators [127.0.0.2:42088, 127.0.0.1:40089]
13:24:45,993 WARN [GMS] Merge aborted. Merge leader did not get
MergeData from all subgroup coordinators [127.0.0.2:7900, 127.0.0.1:7900]
13:24:45,995 WARN [GMS] merge was supposed to be cancelled at merge
participant 127.0.0.1:7900 (merge_id=[127.0.0.1:7900|1246037080987]),
but it is not since merge ids do not match
13:24:46,029 WARN [GMS] Merge aborted. Merge leader did not get
MergeData from all subgroup coordinators [127.0.0.2:42088, 127.0.0.1:40089]
13:24:46,029 WARN [GMS] resume([127.0.0.1:40089|1246037080833]) does
not match null
6) Thereafter, there is continual looping of the CloserThread as it
tries to connect a channel, fails due to local_addr == null, waits 1 sec
and tries again, which goes on for hours:
13:24:41,363 FATAL [JChannel] local_addr is null; cannot connect
13:24:41,365 ERROR [JChannel] failure reconnecting to channel, retrying
org.jgroups.ChannelException: local_addr is null
at org.jgroups.JChannel.startStack(JChannel.java:1631)
at org.jgroups.JChannel.connect(JChannel.java:366)
at org.jgroups.JChannel$CloserThread.run(JChannel.java:2046)
As I said, I suspect a concurrency issue that's resulting in one channel
getting a local_addr but thereafter others with the same shared
transport get null.
Hmm, just noticed: the looping in #6 above repeats every 30 secs, not
every second. Which is the timeout in JChannel.startStack() :
/* Wait LOCAL_ADDR_TIMEOUT milliseconds for local_addr to have a
non-null value (set by SET_LOCAL_ADDRESS) */
local_addr=local_addr_promise.getResult(LOCAL_ADDR_TIMEOUT);
if(local_addr == null) {
log.fatal("local_addr is null; cannot connect");
throw new ChannelException("local_addr is null");
}
So, seems the recurring problem is local_addr_promise isn't getting set,
not that it is getting set, but with a null value. *However* the *first*
set of FATAL logging occurs just a couple seconds after GMS logs it is
going to reconnect. So that first issue where local_addr == null is not
due to a timeout.
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://jira.jboss.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
14 years, 9 months
[JBoss JIRA] Created: (JBRULES-2155) Make parsing of bendpoints more error-safe
by Kris Verlaenen (JIRA)
Make parsing of bendpoints more error-safe
------------------------------------------
Key: JBRULES-2155
URL: https://jira.jboss.org/jira/browse/JBRULES-2155
Project: JBoss Drools
Issue Type: Bug
Security Level: Public (Everyone can see)
Components: drools-compiler (flow)
Reporter: Kris Verlaenen
Assignee: Kris Verlaenen
Fix For: 5.1.0.M1
In cases where the bendpoints are not specified as expected in the XML, the parser would throw out exceptions that don't contain a lot of information about the cause.
E.g.
java.lang.NumberFormatException: For input string: ""
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
at java.lang.Integer.parseInt(Integer.java:468)
at java.lang.Integer.<init>(Integer.java:620)
at org.drools.eclipse.flow.ruleflow.core.ConnectionWrapper.stringToBendpoints(ConnectionWrapper.java:94)
at org.drools.eclipse.flow.ruleflow.core.ConnectionWrapper.internalGetBendpoints(ConnectionWrapper.java:65)
at org.drools.eclipse.flow.common.editor.core.ElementConnection.getBendpoints(ElementConnection.java:123)
...
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://jira.jboss.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
14 years, 9 months
[JBoss JIRA] Created: (JBASPECT-32) Fix dependency upon JACC API
by Andrew Lee Rubinger (JIRA)
Fix dependency upon JACC API
----------------------------
Key: JBASPECT-32
URL: https://jira.jboss.org/jira/browse/JBASPECT-32
Project: JBoss Aspects
Issue Type: Task
Components: security
Reporter: Andrew Lee Rubinger
Assignee: Andrew Lee Rubinger
Fix For: 1.0.1.GA-security
>From AS Branch_5_x "build" module:
[INFO] [dependency:tree]
[INFO] org.jboss.jbossas:jboss-as-build:pom:5.2.0-SNAPSHOT
[INFO] \- org.jboss.jbossas:jboss-as-aspects:jar:5.2.0-SNAPSHOT:compile
[INFO] \- org.jboss.aspects:jboss-security-aspects:jar:1.0.0.GA:compile
[INFO] \- javax.security:jacc:jar:1.0:compil
Instead we need to be defining the dependency upon "org.jboss.javaee:jboss-javaee".
As a result this doesn't make it into the AS dependency tree correctly (though the build places the proper artifact into the distribution/runtime). Embedded gets the wrong artifact as a result.
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://jira.jboss.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
14 years, 9 months
[JBoss JIRA] Resolved: (JBAS-7073) ValueExpression is not working with JBOSS 5.1.0
by Stan Silvert (JIRA)
[ https://jira.jboss.org/jira/browse/JBAS-7073?page=com.atlassian.jira.plug... ]
Stan Silvert resolved JBAS-7073.
--------------------------------
Resolution: Rejected
> ValueExpression is not working with JBOSS 5.1.0
> -----------------------------------------------
>
> Key: JBAS-7073
> URL: https://jira.jboss.org/jira/browse/JBAS-7073
> Project: JBoss Application Server
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: JSF
> Environment: windows environment
> Reporter: anil kumar
> Assignee: Stan Silvert
> Priority: Critical
>
> sample code is here, its giving /welcomeJSF.jsp(26,12) According to TLD or attribute directive in tag file, attribute title does not accept any expressions
> welcomeJSF.jsp
> <%@page contentType="text/html"%>
> <%@page pageEncoding="UTF-8"%>
> <%@ page import="javax.el.ValueExpression" %>
> <%@ page import="javax.el.ExpressionFactory" %>
> <%@ page import ="com.sun.el.ExpressionFactoryImpl"%>
>
> <%@taglib prefix="f" uri="http://java.sun.com/jsf/core"%>
> <%@taglib prefix="h" uri="http://java.sun.com/jsf/html"%>
> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
> "http://www.w3.org/TR/html4/loose.dtd">
> <%--
> This file is an entry point for JavaServer Faces application.
> --%>
> <f:view>
> <html>
> <head>
> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
> <title>JSP Page</title>
> </head>
> <body>
>
> <%
> ExpressionFactory el = new ExpressionFactoryImpl();
> ValueExpression testExpression = el.createValueExpression( " Test" , new String().getClass());
> %>
>
> <h:form>
> <h:commandLink title="<%=testExpression%>" id="next" action="#">
> <h:outputText value="Next Page »" />
> </h:commandLink>
> <h1><h:outputText value="JavaServer Faces" /></h1>
>
> </h:form>
> </body>
> </html>
> </f:view>
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://jira.jboss.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
14 years, 9 months
[JBoss JIRA] Commented: (JBAS-7073) ValueExpression is not working with JBOSS 5.1.0
by Stan Silvert (JIRA)
[ https://jira.jboss.org/jira/browse/JBAS-7073?page=com.atlassian.jira.plug... ]
Stan Silvert commented on JBAS-7073:
------------------------------------
It's not talking about the ValueExpression. It's talking about the fact that you used <%=something%>. This is not allowed for the title attribute.
> ValueExpression is not working with JBOSS 5.1.0
> -----------------------------------------------
>
> Key: JBAS-7073
> URL: https://jira.jboss.org/jira/browse/JBAS-7073
> Project: JBoss Application Server
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: JSF
> Environment: windows environment
> Reporter: anil kumar
> Assignee: Stan Silvert
> Priority: Critical
>
> sample code is here, its giving /welcomeJSF.jsp(26,12) According to TLD or attribute directive in tag file, attribute title does not accept any expressions
> welcomeJSF.jsp
> <%@page contentType="text/html"%>
> <%@page pageEncoding="UTF-8"%>
> <%@ page import="javax.el.ValueExpression" %>
> <%@ page import="javax.el.ExpressionFactory" %>
> <%@ page import ="com.sun.el.ExpressionFactoryImpl"%>
>
> <%@taglib prefix="f" uri="http://java.sun.com/jsf/core"%>
> <%@taglib prefix="h" uri="http://java.sun.com/jsf/html"%>
> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
> "http://www.w3.org/TR/html4/loose.dtd">
> <%--
> This file is an entry point for JavaServer Faces application.
> --%>
> <f:view>
> <html>
> <head>
> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
> <title>JSP Page</title>
> </head>
> <body>
>
> <%
> ExpressionFactory el = new ExpressionFactoryImpl();
> ValueExpression testExpression = el.createValueExpression( " Test" , new String().getClass());
> %>
>
> <h:form>
> <h:commandLink title="<%=testExpression%>" id="next" action="#">
> <h:outputText value="Next Page »" />
> </h:commandLink>
> <h1><h:outputText value="JavaServer Faces" /></h1>
>
> </h:form>
> </body>
> </html>
> </f:view>
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://jira.jboss.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
14 years, 9 months
[JBoss JIRA] Moved: (JBAS-7073) ValueExpression is not working with JBOSS 5.1.0
by Stan Silvert (JIRA)
[ https://jira.jboss.org/jira/browse/JBAS-7073?page=com.atlassian.jira.plug... ]
Stan Silvert moved JSFUNIT-216 to JBAS-7073:
--------------------------------------------
Project: JBoss Application Server (was: JSFUnit)
Key: JBAS-7073 (was: JSFUNIT-216)
Component/s: JSF
(was: Core)
Security: Public
> ValueExpression is not working with JBOSS 5.1.0
> -----------------------------------------------
>
> Key: JBAS-7073
> URL: https://jira.jboss.org/jira/browse/JBAS-7073
> Project: JBoss Application Server
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: JSF
> Environment: windows environment
> Reporter: anil kumar
> Assignee: Stan Silvert
> Priority: Critical
>
> sample code is here, its giving /welcomeJSF.jsp(26,12) According to TLD or attribute directive in tag file, attribute title does not accept any expressions
> welcomeJSF.jsp
> <%@page contentType="text/html"%>
> <%@page pageEncoding="UTF-8"%>
> <%@ page import="javax.el.ValueExpression" %>
> <%@ page import="javax.el.ExpressionFactory" %>
> <%@ page import ="com.sun.el.ExpressionFactoryImpl"%>
>
> <%@taglib prefix="f" uri="http://java.sun.com/jsf/core"%>
> <%@taglib prefix="h" uri="http://java.sun.com/jsf/html"%>
> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
> "http://www.w3.org/TR/html4/loose.dtd">
> <%--
> This file is an entry point for JavaServer Faces application.
> --%>
> <f:view>
> <html>
> <head>
> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
> <title>JSP Page</title>
> </head>
> <body>
>
> <%
> ExpressionFactory el = new ExpressionFactoryImpl();
> ValueExpression testExpression = el.createValueExpression( " Test" , new String().getClass());
> %>
>
> <h:form>
> <h:commandLink title="<%=testExpression%>" id="next" action="#">
> <h:outputText value="Next Page »" />
> </h:commandLink>
> <h1><h:outputText value="JavaServer Faces" /></h1>
>
> </h:form>
> </body>
> </html>
> </f:view>
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://jira.jboss.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
14 years, 9 months