[Clustering/JBoss] - Two-Node Cluster UDP OutOfMemoryError
by jowizzle
Hello,
First, and perhaps completely unrelated: Is it normal to see messages such as "additional data: 19 bytes" throughout the logs?
Moving on...
I have a two-node cluster of stock 4.0.5GA servers. After roughly 4 hours of operation one node will fail with an OutOfMemoryError stemming from org.jgroups.protocols.UDP. Both servers have two eth interfaces, so I set bind_addr on the UDP element accordingly in cluster-service.xml and jboss-service.xml in the tc5-cluster sar.
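As a rough illustration of the bind_addr setting described above, the UDP element inside the JGroups protocol stack would look something like the fragment below. This is only a sketch: the IP address is a placeholder, the multicast values are illustrative assumptions rather than copies of the real files, and the rest of the protocol stack is omitted.

```xml
<!-- Sketch only: 192.168.0.10 is a hypothetical address for the interface
     that cluster traffic should use; the multicast values are illustrative. -->
<Config>
    <UDP bind_addr="192.168.0.10"
         mcast_addr="228.1.2.3"
         mcast_port="45566"
         ip_ttl="8"
         ip_mcast="true" />
    <!-- remaining protocols (PING, FD, NAKACK, GMS, ...) omitted -->
</Config>
```

The same bind_addr value would need to appear in each file that defines a JGroups stack, so that every channel on the node uses the same interface.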
I enabled DEBUG logging for JGroups, and it gets pretty messy. First, node2 stops acking are-you-alive messages. Then node1 gets suspected, for no apparent reason. If I understand correctly, node1 is the coordinator, so node2 can't remove it, and node1 will refuse to remove itself from the view. It may, however, opt to leave and rejoin.
Below is an excerpt from the cluster log file from around the time things begin to go awry. Any hints are greatly appreciated.
| 2007-04-25 15:33:07,237 DEBUG [org.jgroups.protocols.FD] sending are-you-alive msg to node2:32802 (own address=node1:32839)
| 2007-04-25 15:33:07,269 DEBUG [org.jgroups.protocols.UDP]
| sending msgs:
| node2:32802: 1 msgs
|
| 2007-04-25 15:33:07,284 DEBUG [org.jgroups.protocols.FD] received ack from node2:32802
| 2007-04-25 15:33:07,316 DEBUG [org.jgroups.protocols.UDP]
| sending msgs:
| node2:32802: 1 msgs
|
| 2007-04-25 15:34:51,762 DEBUG [org.jgroups.protocols.FD] sending are-you-alive msg to node2:32802 (own address=node1:32839)
| 2007-04-25 15:34:51,762 DEBUG [org.jgroups.protocols.FD] heartbeat missing from node2:32802 (number=0)
| 2007-04-25 15:34:51,762 DEBUG [org.jgroups.protocols.FD] sending are-you-alive msg to node2:32805 (additional data: 19 bytes) (own address=node1:32842 (additional data: 19 bytes))
| 2007-04-25 15:34:51,762 DEBUG [org.jgroups.protocols.FD] heartbeat missing from node2:32805 (additional data: 19 bytes) (number=0)
| 2007-04-25 15:34:51,767 DEBUG [org.jgroups.protocols.FD] [SUSPECT] suspect hdr is [FD: SUSPECT (suspected_mbrs=[node1:32842 (additional data: 19 bytes)], from=node2:32805 (additional data: 19 bytes))]
| 2007-04-25 15:34:51,767 WARN [org.jgroups.protocols.FD] I was suspected, but will not remove myself from membership (waiting for EXIT message)
| 2007-04-25 15:34:51,768 DEBUG [org.jgroups.protocols.pbcast.STABLE] stable task started; num_gossip_runs=3, max_gossip_runs=3
| 2007-04-25 15:34:51,768 DEBUG [org.jgroups.protocols.pbcast.CoordGmsImpl] view=[node2:32805 (additional data: 19 bytes)|2] [node2:32805 (additional data: 19 bytes)]
| 2007-04-25 15:34:51,768 DEBUG [org.jgroups.protocols.pbcast.GMS] [local_addr=node1:32842 (additional data: 19 bytes)] view is [node2:32805 (additional data: 19 bytes)|2] [node2:32805 (additional data: 19 bytes)]
| 2007-04-25 15:34:51,780 WARN [org.jgroups.protocols.pbcast.GMS] checkSelfInclusion() failed, node1:32842 (additional data: 19 bytes) is not a member of view [node2:32805 (additional data: 19 bytes)|2] [node2:32805 (additional data: 19 bytes)]; discarding view
| 2007-04-25 15:34:51,781 WARN [org.jgroups.protocols.pbcast.GMS] I (node1:32842 (additional data: 19 bytes)) am being shunned, will leave and rejoin group (prev_members are [node1:32842 (additional data: 19 bytes) node2:32805 (additional data: 19 bytes) ])
| 2007-04-25 15:34:51,781 INFO [org.jgroups.JChannel] received an EXIT event, will leave the channel
| 2007-04-25 15:34:51,783 INFO [org.jgroups.JChannel] closing the channel
| 2007-04-25 15:34:51,786 ERROR [org.jgroups.protocols.UDP] [node1:32842 (additional data: 19 bytes)] exception=java.lang.OutOfMemoryError: heap allocation failed, stack trace=java.lang.OutOfMemoryError: heap allocation failed
| at java.net.PlainDatagramSocketImpl.receive0(Native Method)
| at java.net.PlainDatagramSocketImpl.receive(PlainDatagramSocketImpl.java:181)
| at java.net.DatagramSocket.receive(DatagramSocket.java:724)
| at org.jgroups.protocols.UDP$UcastReceiver.run(UDP.java:1264)
| at java.lang.Thread.run(Thread.java:799)
|
| 2007-04-25 15:34:51,790 ERROR [org.jgroups.protocols.UDP] [node1:32839] exception=java.lang.OutOfMemoryError: heap allocation failed, stack trace=java.lang.OutOfMemoryError: heap allocation failed
| at java.net.PlainDatagramSocketImpl.receive0(Native Method)
| at java.net.PlainDatagramSocketImpl.receive(PlainDatagramSocketImpl.java:181)
| at java.net.DatagramSocket.receive(DatagramSocket.java:724)
| at org.jgroups.protocols.UDP$UcastReceiver.run(UDP.java:1264)
| at java.lang.Thread.run(Thread.java:799)
|
| 2007-04-25 15:34:51,795 DEBUG [org.jgroups.protocols.pbcast.NAKACK] contents for node1:32842 (additional data: 19 bytes):
|
| sent_msgs: [6837 - 6890]
| received_msgs:
| node2:32805 (additional data: 19 bytes): received_msgs: [], delivered_msgs: [276 - 328]
| node1:32842 (additional data: 19 bytes): received_msgs: [], delivered_msgs: [6838 - 6890]
|
| 2007-04-25 15:34:51,796 DEBUG [org.jgroups.protocols.FD_SOCK] socket to node2:32805 (additional data: 19 bytes) was reset
| 2007-04-25 15:34:51,796 DEBUG [org.jgroups.protocols.FD_SOCK] pinger thread terminated
| 2007-04-25 15:34:51,825 DEBUG [org.jgroups.protocols.UDP]
| sending msgs:
| node1:32839: 1 msgs
|
| 2007-04-25 15:34:52,092 ERROR [org.jgroups.protocols.UDP] [node1:32842 (additional data: 19 bytes)] exception=java.lang.OutOfMemoryError: heap allocation failed, stack trace=java.lang.OutOfMemoryError: heap allocation failed
| at java.net.PlainDatagramSocketImpl.receive0(Native Method)
| at java.net.PlainDatagramSocketImpl.receive(PlainDatagramSocketImpl.java:181)
| at java.net.DatagramSocket.receive(DatagramSocket.java:724)
| at org.jgroups.protocols.UDP$UcastReceiver.run(UDP.java:1264)
| at java.lang.Thread.run(Thread.java:799)
|
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4041136#4041136
Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=4041136
[Installation, Configuration & Deployment] - Re: The JSF library situation in JBoss explained (it's calle
by gDarius
Hello Stan,
Thanks for writing back so quickly. It was not clear from the wiki that the forced JSF was due to JEE5 compliance efforts, which is why I took such an incredulous tone. Regardless, you are correct about JEE5: I read the spec and, sure enough, it requires JSF 1.2. Wow, I'm floored; that's a pretty gutsy move on their part ... And they are making your life difficult, because they then say some seemingly contradictory things...
Section EE.11.1 states:
anonymous wrote : Compatibility is a core value of the Java EE platform. A Java EE product is required to support portable applications written to previous versions of the platform.
... but then at the end of that section, they get a bit dodgy with this:
anonymous wrote : Portable applications depend only on the APIs and behavior required by the Java EE specifications. In general, portable applications written to a previous version of the platform will continue to work without change and with identical behavior on the current version of the platform.
|
So, what does "in general" mean? Does that mean you are required or you are not required to host older J2EE applications? What percentage of the time is "in general"? 50%? 99.9%?
And from section EE.6.1.2, where it discusses that you must include JSF 1.2, it states:
anonymous wrote : All classes and interfaces required by the specifications for the APIs must be provided by the Java EE containers. In some cases, a Java EE product is not required to provide objects that implement interfaces intended to be implemented by an application server, nevertheless, the definitions of such interfaces must be included in the Java EE platform.
So, our app is a J2EE 1.4 app, not a 1.5 app. And by this spec, you're supposed to support apps written to those older versions WHILE loading JSF 1.2. I think they must mean for you to dynamically load JSF for web applications if the deployment descriptor identifies the app as JEE5, and to NOT load JSF otherwise (since it wasn't required for J2EE 1.4 and below.) Is this your interpretation as well?
I can tell you right now, it's going to be a long time before we upgrade to JEE5. So, for now, we are bundling our own JSF libraries and deploying to other vendors' platforms without trouble. We can't deploy our app to JBoss 4.2+, though: our deployment descriptor indicates J2EE 1.4 (Servlet 2.4 indicated in web.xml), yet JSF 1.2 still gets pre-loaded for us. For my situation, if you could give me a way to override or mask off your JSF libraries from being loaded using the jboss-web.xml, that would be fine.
This is a difficult situation for JBoss. Sorry for being so harsh in my original message; I had no idea what was happening in JEE5 and how that was affecting you.
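To make the Servlet 2.4 declaration concrete, a J2EE 1.4 web.xml typically opens like the sketch below. The display-name is a placeholder, and the rest of the descriptor is omitted.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Servlet 2.4 (J2EE 1.4) descriptor header; display-name is a placeholder -->
<web-app xmlns="http://java.sun.com/xml/ns/j2ee"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://java.sun.com/xml/ns/j2ee
                             http://java.sun.com/xml/ns/j2ee/web-app_2_4.xsd"
         version="2.4">
    <display-name>example-app</display-name>
    <!-- servlets, filters, and the bundled JSF configuration would follow -->
</web-app>
```

The version attribute and the web-app_2_4.xsd schema location are what identify the application as written to the older platform level.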
I will try to deploy to Glassfish and see how it does. We do not consider Glassfish to be a "force" in the market yet like JBoss is, but it will be an interesting experiment to see how well it does deploying our app. Geronimo was our initial platform only because its rich classloader controls made prototyping easy for other parts of our app.
Thanks,
gDarius
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4041127#4041127
Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=4041127
[JBoss Seam] - <s:cache /> and <h:selectOneListbox />'s
by MikeDougherty
I'm having a bit of trouble using the s:cache JSF component in combination with my h:selectOneListbox components.
If I place the <s:cache /> tags just inside the <ui:define /> tags (just like the Blog example shows), like so:
| <ui:define name="body">
| <s:cache key="index" region="pageFragments">
| ...
| <h:form id="foo">
| <h:selectOneListbox id="foo" value="#{myFoo}">
| <si:selectItems var="foo" value="#{fooList.resultList}" />
| </h:selectOneListbox>
| <h:commandButton id="select" action="#{select}" value="Select Foo" />
| </h:form>
| ...
| <h:form id="bar">
| <h:selectOneListbox id="bar" value="#{myBar}">
| <si:selectItems var="bar" value="#{barList.resultList}" />
| </h:selectOneListbox>
| <h:commandButton id="select" action="#{select}" value="Select Bar" />
| </h:form>
| ...
| </s:cache>
| </ui:define>
|
I can select an item from one of the lists and submit the form once. But when I come back to the page I have "" text in the page, and can no longer submit any of the forms on the page.
If I put the <s:cache /> tag directly around each <h:selectOneListbox />:
| <ui:define name="body">
| ...
| <h:form id="foo">
| <s:cache key="index-foo" region="pageFragments">
| <h:selectOneListbox id="foo" value="#{myFoo}">
| <si:selectItems var="foo" value="#{fooList.resultList}" />
| </h:selectOneListbox>
| <h:commandButton id="select" action="#{select}" value="Select Foo" />
| </s:cache>
| </h:form>
| ...
| <h:form id="bar">
| <s:cache key="index-bar" region="pageFragments">
| <h:selectOneListbox id="bar" value="#{myBar}">
| <si:selectItems var="bar" value="#{barList.resultList}" />
| </h:selectOneListbox>
| <h:commandButton id="select" action="#{select}" value="Select Bar" />
| </s:cache>
| </h:form>
| ...
| </ui:define>
|
I can submit the forms, but I have to click the submit button twice. The first time I get "value is not valid" (as if there was no selected item). Pressing the button again submits the form.
Is there something special I have to do in order to get the <s:cache /> to work with my <h:selectOneListbox /> tags?
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4041123#4041123
Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=4041123