There is no multicast; it's not supported in EC2. It's all point-to-point
TCP, thus TCPPING. This was working in AS7.0.x. Maybe there's an iptables
issue that's preventing the 2 nodes from seeing each other. Looking into
other causes.
Thanks -Bill
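
One way to look into other causes is to raise JGroups logging, since the
ping logging mentioned further down in this thread only shows up at debug
level. A minimal sketch, assuming the stock logging subsystem in
standalone-ha.xml; the category "org.jgroups" is simply the JGroups package
name, and where exactly the element goes in your file is an assumption:

```xml
<!-- Hypothetical fragment: place inside the existing logging subsystem
     element of standalone-ha.xml, next to the other logger entries. -->
<logger category="org.jgroups">
    <!-- DEBUG should surface discovery (TCPPING) activity; TRACE is noisier. -->
    <level name="DEBUG"/>
</logger>
```

If nothing extra appears on the console, check server.log as well, since the
console handler may be filtering at INFO.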
On 3/14/12 5:25 PM, Richard Achmatowicz wrote:
Are you sure your multicast messages are getting through? Bela has
utilities called McastSenderTest and McastReceiverTest in the JGroups
distro which allow checking whether the network is set up OK for multicast.
On 03/14/2012 08:21 PM, William DeCoste wrote:
> OK, that's what I thought - the 2 nodes aren't seeing each other.
> Thanks for confirming the config is ok. I never see any logging with
> a membership of more than one. And I never saw any change if I killed
> a node.
>
> On 3/14/12 5:18 PM, Richard Achmatowicz wrote:
>> On 03/14/2012 08:15 PM, William DeCoste wrote:
>>> Hi Richard,
>>>
>>> I'm also deploying a clustered web app which causes JGroups and the
>>> rest of clustering to load.
>>>
>>> Maybe this is just a change in logging. I am seeing the following
>>> with AS7.1.0:
>>> 2012/03/14 19:53:21,710 INFO
>>> [org.infinispan.remoting.transport.jgroups.JGroupsTransport]
>>> (pool-17-thread-1) ISPN000094: Received new cluster view:
>>> [ip-10-190-239-128/ejb|0] [ip-10-190-239-128/ejb]
>>>
>>> Does "[ip-10-190-239-128/ejb|0] [ip-10-190-239-128/ejb]" indicate 2
>>> nodes? This is on Express thus the strange node names.
>> One node. IIRC, the first element [ip-10-190-239-128/ejb|0]
>> indicates the host which triggered the view change, and the second
>> element [ip-10-190-239-128/ejb] is the new group membership.
>>>
>>> I'm used to seeing JGroups debug logging of the pings, which isn't
>>> happening. I was definitely seeing different logging in 7.0.x.
>>>
>>> Thanks -Bill
>>>
>>> On 3/14/12 4:55 PM, Richard Achmatowicz wrote:
>>>> Hi Bill
>>>>
>>>> The configuration is valid. I tried it out on my local machine. I
>>>> set up two AS 7.1.1 instances, configured as you describe but
>>>> using a different set of IPv4 addresses (I used 192.168.0.103,
>>>> 192.168.0.104 and used "iptables -F" to let multicast messages
>>>> through).
>>>>
>>>> Because clustering services are installed on-demand, they need
>>>> some event to start them, so I also deployed a small clustered
>>>> web application.
>>>>
>>>> When I start both instances, no channels are started, as expected:
>>>> $ ./standalone.sh --server-config standalone-ha.xml
>>>> -Djboss.bind.address=192.168.0.103
>>>> -Djboss.bind.address.management=192.168.0.103 -Djboss.node.name=A
>>>> $ ./standalone.sh --server-config standalone-ha.xml
>>>> -Djboss.bind.address=192.168.0.104
>>>> -Djboss.bind.address.management=192.168.0.104 -Djboss.node.name=B
>>>>
>>>> When I deployed the clustered web app on each application server
>>>> instance to trigger Infinispan web cache container and JGroups tcp
>>>> channel startup, I saw for example, on the first host A:
>>>>
>>>> 18:58:41,086 INFO [org.jboss.as.clustering.infinispan] (MSC
>>>> service thread 1-8) JBAS010281: Started
>>>> //default-host//my-clustered-webapp cache from web container
>>>> 18:58:41,096 INFO
>>>> [org.infinispan.configuration.cache.EvictionConfigurationBuilder]
>>>> (MSC service thread 1-8) ISPN000152: Passivation configured
>>>> without an eviction policy being selected. Only manually evicted
>>>> entities will be pasivated.
>>>> 18:58:41,099 INFO
>>>> [org.infinispan.configuration.cache.EvictionConfigurationBuilder]
>>>> (MSC service thread 1-8) ISPN000152: Passivation configured
>>>> without an eviction policy being selected. Only manually evicted
>>>> entities will be pasivated.
>>>> 18:58:41,220 INFO [org.jboss.web] (MSC service thread 1-8)
>>>> JBAS018210: Registering web context: /my-clustered-webapp
>>>> 18:58:41,230 INFO [org.jboss.as.server] (Controller Boot Thread)
>>>> JBAS018559: Deployed "my-clustered-webapp.war"
>>>> 18:58:41,333 INFO [org.jboss.as] (Controller Boot Thread)
>>>> JBAS015951: Admin console listening on http://192.168.0.103:9990
>>>> 18:58:41,334 INFO [org.jboss.as] (Controller Boot Thread)
>>>> JBAS015874: JBoss AS 7.1.1.Final-SNAPSHOT "Thunder" started in
>>>> 6719ms - Started 175 of 305 services (129 services are passive or
>>>> on-demand)
>>>> 18:58:50,673 INFO
>>>> [org.jboss.as.clustering.impl.CoreGroupCommunicationService.lifecycle.web]
>>>> (Incoming-1,null) JBAS010247: New cluster view for partition web
>>>> (id: 1, delta: 1, merge: false) : [A/web, B/web]
>>>> 18:58:50,687 INFO
>>>> [org.infinispan.remoting.transport.jgroups.JGroupsTransport]
>>>> (Incoming-1,null) ISPN000094: Received new cluster view: [A/web|1]
>>>> [A/web, B/web]
>>>>
>>>> So, it seems as though things work as expected.
>>>>
>>>> I also tried to achieve the same effect by not deploying the web
>>>> app and setting the attribute start="EAGER" on a clustered
>>>> Infinispan cache (in this case, cache "repl" in the "web" cache
>>>> container) to force them to start as active services. When the
>>>> application servers were started this time, same result:
>>>> clustering between the two nodes was visible.
>>>>
>>>> If I instead set the start="EAGER" attribute on a cache container,
>>>> however, the channels do not get started automatically and I need
>>>> to deploy a web app as before to trigger starting of the transport.
>>>>
>>>> So, are you using start=EAGER to start the cache instances, or
>>>> deploying an app to trigger them? The organization of Infinispan
>>>> services has changed a lot since 7.1.0. In particular, a cache
>>>> container needs to specify the <transport/> element in order for a
>>>> transport to be used.
>>>>
>>>> Richard
>>>>
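
To make the two points in the message above concrete, here is a rough sketch
of a cache container that declares a transport and eagerly starts one cache.
The names "web" and "repl" come from the message above; mode="ASYNC" and the
omission of all other attributes and child elements are assumptions for
illustration only, so compare against the standalone-ha.xml shipped with
your version:

```xml
<!-- Sketch only: belongs inside the Infinispan subsystem element. -->
<cache-container name="web" default-cache="repl">
    <!-- Per the message above, the container must declare a transport
         element for a JGroups transport to be used at all. -->
    <transport/>
    <!-- start="EAGER" is set on the cache, not on the container, so the
         cache and its channel start without waiting for a deployment. -->
    <replicated-cache name="repl" mode="ASYNC" start="EAGER"/>
</cache-container>
```

Per the message above, putting start="EAGER" on the cache-container element
instead did not start the channels.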
>>>> On 03/14/2012 05:47 PM, Brian Stansberry wrote:
>>>>> Hi Paul, Richard:
>>>>>
>>>>> Do you guys see anything wrong here?
>>>>>
>>>>> - Brian
>>>>>
>>>>> On 3/14/12 4:44 PM, William DeCoste wrote:
>>>>>> Hi Brian,
>>>>>>
>>>>>> Has anything changed in the config of AS7.1 from 7.0.x? I am not
>>>>>> seeing any jgroups traffic or discovery. This configuration worked
>>>>>> fine in 7.0.x. When JGroups is loaded it just creates 2 1-node
>>>>>> clusters. They don't seem to see each other or even be trying.
>>>>>>
>>>>>> Thanks -Bill
>>>>>>
>>>>>> <subsystem xmlns="urn:jboss:domain:jgroups:1.1" default-stack="tcp">
>>>>>>   <stack name="tcp">
>>>>>>     <transport type="TCP" socket-binding="jgroups-tcp"/>
>>>>>>     <protocol type="TCPPING">
>>>>>>       <property name="timeout">3000</property>
>>>>>>       <property name="initial_hosts">127.0.250.1[7600],127.0.251.1[7600]</property>
>>>>>>       <property name="port_range">1</property>
>>>>>>       <property name="num_initial_members">2</property>
>>>>>>     </protocol>
>>>>>>     <protocol type="MERGE2"/>
>>>>>>     <protocol type="FD_SOCK" socket-binding="jgroups-tcp-fd"/>
>>>>>>     <protocol type="FD"/>
>>>>>>     <protocol type="VERIFY_SUSPECT"/>
>>>>>>     <protocol type="BARRIER"/>
>>>>>>     <protocol type="pbcast.NAKACK"/>
>>>>>>     <protocol type="UNICAST2"/>
>>>>>>     <protocol type="pbcast.STABLE"/>
>>>>>>     <protocol type="pbcast.GMS"/>
>>>>>>     <protocol type="UFC"/>
>>>>>>     <protocol type="MFC"/>
>>>>>>     <protocol type="FRAG2"/>
>>>>>>   </stack>
>>>>>> </subsystem>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
--
Bill DeCoste
Principal Software Engineer, Red Hat
978-204-0920
wdecoste(a)redhat.com