[infinispan-dev] JGroups Error: JGRP000029 observed with "JGroups 3.6.8 + Infinispan 8.2.2":
Manohar SL
manohar.sl at ericsson.com
Mon Aug 22 14:22:23 EDT 2016
Yes, we need some quick help from the Infinispan folks on the changes needed in infinispan-config.xml and jgroups.xml to
get the combination of "Infinispan 8.2.2 with JGroups 3.6.7" working in Infinispan's replication mode.
The contents of both files are pasted below:
================================================================================================
infinispan-config.xml
<infinispan>
<jgroups>
<stack-file name="configurationFile" path="config/jgroups.xml"/>
</jgroups>
<cache-container>
<transport cluster="x-cluster" stack="configurationFile" />
<replicated-cache name="transactional-type" mode="SYNC">
<transaction mode="NON_XA" locking="OPTIMISTIC" transaction-manager-lookup="org.infinispan.transaction.lookup.GenericTransactionManagerLookup" auto-commit="true" />
<locking acquire-timeout="60000"/>
<expiration lifespan="43200000"/>
<state-transfer enabled="true" timeout="240000" chunk-size="10000" />
<locking isolation="READ_COMMITTED" acquire-timeout="20000" write-skew="false" concurrency-level="5000" striping="false" />
</replicated-cache>
</cache-container>
</infinispan>
================================================================================================
jgroups.xml
<!--
TCP based stack, with flow control and message bundling. This is usually used when IP
multicasting cannot be used in a network, e.g. because it is disabled (routers discard multicast).
Note that TCP.bind_addr and TCPPING.initial_hosts should be set, possibly via system properties, e.g.
-Djgroups.bind_addr=192.168.5.2 and -Djgroups.tcpping.initial_hosts=192.168.5.2[7800]".
author: Bela Ban
-->
<config xmlns="urn:org:jgroups"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="urn:org:jgroups http://www.jgroups.org/schema/jgroups-3.6.xsd">
<TCP loopback="true"
bind_addr="${jgroups.tcp.address:127.0.0.1}"
bind_port="${jgroups.tcp.port:7800}"
recv_buf_size="${tcp.recv_buf_size:130k}"
send_buf_size="${tcp.send_buf_size:130k}"
discard_incompatible_packets="true"
max_bundle_size="64K"
max_bundle_timeout="30"
enable_bundling="true"
use_send_queues="true"
sock_conn_timeout="300"
timer_type="new"
timer.min_threads="4"
timer.max_threads="10"
timer.keep_alive_time="3000"
timer.queue_max_size="500"
thread_pool.enabled="true"
thread_pool.min_threads="2"
thread_pool.max_threads="30"
thread_pool.keep_alive_time="60000"
thread_pool.queue_enabled="false"
thread_pool.queue_max_size="100"
thread_pool.rejection_policy="discard"
oob_thread_pool.enabled="true"
oob_thread_pool.min_threads="2"
oob_thread_pool.max_threads="30"
oob_thread_pool.keep_alive_time="60000"
oob_thread_pool.queue_enabled="false"
oob_thread_pool.queue_max_size="100"
oob_thread_pool.rejection_policy="discard"
/>
<!-- <TCP_NIO -->
<!-- bind_port="7800" -->
<!-- bind_interface="${jgroups.tcp_nio.bind_interface:bond0}" -->
<!-- use_send_queues="true" -->
<!-- sock_conn_timeout="300" -->
<!-- reader_threads="3" -->
<!-- writer_threads="3" -->
<!-- processor_threads="0" -->
<!-- processor_minThreads="0" -->
<!-- processor_maxThreads="0" -->
<!-- processor_queueSize="100" -->
<!-- processor_keepAliveTime="9223372036854775807"/> -->
<TCPGOSSIP initial_hosts="${jgroups.tcpgossip.initial_hosts}"/>
<!-- <TCPPING async_discovery="true" initial_hosts="${jgroups.tcpping.initial_hosts}"
port_range="2" timeout="3000" /> -->
<MERGE2 max_interval="30000" min_interval="10000"/>
<FD_SOCK/>
<FD timeout="3000" max_tries="3"/>
<VERIFY_SUSPECT timeout="1500"/>
<pbcast.NAKACK
use_mcast_xmit="false"
retransmit_timeout="300,600,1200,2400,4800"
discard_delivered_msgs="true"/>
<UNICAST2 timeout="300,600,1200"
stable_interval="5000"
log_not_found_msgs="true"
max_bytes="400000"/>
<pbcast.STABLE stability_delay="500" desired_avg_gossip="5000" max_bytes="400000"/>
<pbcast.GMS print_local_addr="true" join_timeout="5000" merge_timeout="5000" log_collect_msgs="true" log_view_warnings="true" max_join_attempts="0" view_bundling="true"/>
<UFC max_credits="200k" min_threshold="0.20"/>
<MFC max_credits="200k" min_threshold="0.20"/>
<FRAG2 frag_size="35000"/>
<RSVP timeout="60000" resend_interval="500" ack_on_delivery="false" throw_exception_on_timeout="true" />
</config>
================================================================================================
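One thing visible in the jgroups.xml above is that TCPGOSSIP.initial_hosts is read from ${jgroups.tcpgossip.initial_hosts} with no default value, while bind_addr falls back to 127.0.0.1 when jgroups.tcp.address is not set. Whether this is related to JGRP000029 is not established in this thread. As a sanity check only, a minimal sketch of verifying such properties before the cache manager starts could look like the following. The class and method names are hypothetical and are not part of JGroups or Infinispan.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, not part of JGroups or Infinispan.
final class JGroupsPropertyCheck {

    // Returns the names of the given properties that are unset or blank.
    static List<String> missing(Properties props, String... names) {
        List<String> result = new ArrayList<>();
        for (String name : names) {
            String value = props.getProperty(name);
            if (value == null || value.trim().isEmpty()) {
                result.add(name);
            }
        }
        return result;
    }
}
```

It could be called as JGroupsPropertyCheck.missing(System.getProperties(), "jgroups.tcp.address", "jgroups.tcpgossip.initial_hosts") on each node, logging whatever names come back.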
@ Bela Ban:
We could not add the call to print the stack trace.
We do not build the JGroups code ourselves; we use the JGroups library as shipped with Infinispan, so we could not attempt this.
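For reference, the change suggested in the quoted reply amounts to one extra line in the catch clause. The sketch below only illustrates that pattern: the Connection and ConnectionLookup types are hypothetical stand-ins, not the JGroups classes, and the lookup returning null is an assumption used to provoke a NullPointerException.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Illustration only; these types are hypothetical stand-ins, not JGroups classes.
final class SendWithTrace {

    interface Connection {
        void send(byte[] data, int offset, int length) throws Exception;
    }

    interface ConnectionLookup {
        Connection getConnection(String dest) throws Exception;
    }

    // Renders a stack trace as a String, e.g. for writing it to a log file.
    static String stackTraceOf(Throwable t) {
        StringWriter sw = new StringWriter();
        t.printStackTrace(new PrintWriter(sw, true));
        return sw.toString();
    }

    static void send(ConnectionLookup lookup, String dest,
                     byte[] data, int offset, int length) throws Exception {
        Connection conn = null;
        try {
            conn = lookup.getConnection(dest);
            conn.send(data, offset, length); // throws NPE if the lookup returned null
        } catch (Exception ex) {
            ex.printStackTrace(); // the added line: shows where the exception originated
            throw ex;
        }
    }
}
```

With a lookup that returns null, the printed trace names the line in send() where the NullPointerException was raised, which is the information asked for.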
Regs,
Manohar.
-----Original Message-----
From: infinispan-dev-bounces at lists.jboss.org [mailto:infinispan-dev-bounces at lists.jboss.org] On Behalf Of Bela Ban
Sent: Monday, August 22, 2016 4:57 PM
To: Manohar SL <manohar.sl at ericsson.com>; infinispan -Dev List <infinispan-dev at lists.jboss.org>
Subject: Re: [infinispan-dev] JGroups Error: JGRP000029 observed with "JGroups 3.6.8 + Infinispan 8.2.2":
On 22/08/16 12:56, Manohar SL wrote:
> Hi Bela Ban,
>
> We have just retested the combination of "Infinispan 8.2.2 + JGroups
> 3.6.7" and we face the same issue: data replication across nodes does not go through.
If this is a supported configuration, then I suggest posting this to the infinispan-dev list. Oops, this *is* the list :-) Any takers?
Did you print the stack trace?
> Highlighted below are configurations used (derived from the Base configurations):
> Infinispan-config.xml:
> ----------------------------------------------------------------------
> <infinispan>
> <jgroups>
> <stack-file name="configurationFile" path="config/jgroups.xml"/>
> </jgroups>
> <cache-container>
> <transport cluster="x-cluster" stack="configurationFile" />
> <replicated-cache name="transactional-type" mode="SYNC">
> <transaction mode="NON_XA" locking="OPTIMISTIC" transaction-manager-lookup="org.infinispan.transaction.lookup.GenericTransactionManagerLookup" auto-commit="true" />
> <locking acquire-timeout="60000"/>
> <expiration lifespan="43200000"/>
> <state-transfer enabled="true" timeout="240000" chunk-size="10000" />
> <locking isolation="READ_COMMITTED" acquire-timeout="20000" write-skew="false" concurrency-level="5000" striping="false" />
> </replicated-cache>
> </cache-container>
> </infinispan>
> ----------------------------------------------------------------------
>
> JGroups.xml
> ----------------------------------------------------------------------
>
> <!--
> TCP based stack, with flow control and message bundling. This is usually used when IP
> multicasting cannot be used in a network, e.g. because it is disabled (routers discard multicast).
> Note that TCP.bind_addr and TCPPING.initial_hosts should be set, possibly via system properties, e.g.
> -Djgroups.bind_addr=192.168.5.2 and -Djgroups.tcpping.initial_hosts=192.168.5.2[7800]".
> author: Bela Ban
> -->
> <config xmlns="urn:org:jgroups"
> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> xsi:schemaLocation="urn:org:jgroups
> http://www.jgroups.org/schema/jgroups-3.6.xsd">
>
> <TCP loopback="true"
> bind_addr="${jgroups.tcp.address:127.0.0.1}"
> bind_port="${jgroups.tcp.port:7800}"
> recv_buf_size="${tcp.recv_buf_size:130k}"
> send_buf_size="${tcp.send_buf_size:130k}"
> discard_incompatible_packets="true"
> max_bundle_size="64K"
> max_bundle_timeout="30"
> enable_bundling="true"
> use_send_queues="true"
> sock_conn_timeout="300"
> timer_type="new"
> timer.min_threads="4"
> timer.max_threads="10"
> timer.keep_alive_time="3000"
> timer.queue_max_size="500"
> thread_pool.enabled="true"
> thread_pool.min_threads="2"
> thread_pool.max_threads="30"
> thread_pool.keep_alive_time="60000"
> thread_pool.queue_enabled="false"
> thread_pool.queue_max_size="100"
> thread_pool.rejection_policy="discard"
> oob_thread_pool.enabled="true"
> oob_thread_pool.min_threads="2"
> oob_thread_pool.max_threads="30"
> oob_thread_pool.keep_alive_time="60000"
> oob_thread_pool.queue_enabled="false"
> oob_thread_pool.queue_max_size="100"
> oob_thread_pool.rejection_policy="discard"
> />
>
> <!-- <TCP_NIO -->
> <!-- bind_port="7800" -->
> <!-- bind_interface="${jgroups.tcp_nio.bind_interface:bond0}" -->
> <!-- use_send_queues="true" -->
> <!-- sock_conn_timeout="300" -->
> <!-- reader_threads="3" -->
> <!-- writer_threads="3" -->
> <!-- processor_threads="0" -->
> <!-- processor_minThreads="0" -->
> <!-- processor_maxThreads="0" -->
> <!-- processor_queueSize="100" -->
> <!-- processor_keepAliveTime="9223372036854775807"/> -->
> <TCPGOSSIP initial_hosts="${jgroups.tcpgossip.initial_hosts}"/>
> <!-- <TCPPING async_discovery="true" initial_hosts="${jgroups.tcpping.initial_hosts}"
> port_range="2" timeout="3000" /> -->
> <MERGE2 max_interval="30000" min_interval="10000"/>
> <FD_SOCK/>
> <FD timeout="3000" max_tries="3"/>
> <VERIFY_SUSPECT timeout="1500"/>
> <pbcast.NAKACK
> use_mcast_xmit="false"
> retransmit_timeout="300,600,1200,2400,4800"
> discard_delivered_msgs="true"/>
> <UNICAST2 timeout="300,600,1200"
> stable_interval="5000"
> log_not_found_msgs="true"
> max_bytes="400000"/>
> <pbcast.STABLE stability_delay="500" desired_avg_gossip="5000" max_bytes="400000"/>
> <pbcast.GMS print_local_addr="true" join_timeout="5000" merge_timeout="5000" log_collect_msgs="true" log_view_warnings="true" max_join_attempts="0" view_bundling="true"/>
> <UFC max_credits="200k" min_threshold="0.20"/>
> <MFC max_credits="200k" min_threshold="0.20"/>
> <FRAG2 frag_size="35000"/>
> <RSVP timeout="60000" resend_interval="500"
> ack_on_delivery="false" throw_exception_on_timeout="true" />
>
> </config>
> ----------------------------------------------------------------------
>
> It would be very helpful if you could kindly provide us with a working
> combination of XML configs for "Infinispan 8.2.2 + JGroups 3.6.7", or let us know what needs to be changed in the configuration above.
>
> Regs,
> Manohar.
>
> -----Original Message-----
> From: Bela Ban [mailto:bban at redhat.com]
> Sent: Monday, August 22, 2016 1:58 PM
> To: Manohar SL <manohar.sl at ericsson.com>; infinispan -Dev List
> <infinispan-dev at lists.jboss.org>
> Cc: Sathish Kumar Tippeshappa <sathish.kumar.tippeshappa at ericsson.com>
> Subject: Re: JGroups Error: JGRP000029 observed with "JGroups 3.6.8 + Infinispan 8.2.2":
>
>
>
> On 22/08/16 10:23, Manohar SL wrote:
>> Hi Bela Ban,
>>
>> Thanks for the input points.
>>
>> Answers are highlighted below:
>>
>> 1: What does the NPE cause? Incorrect behavior? This is in the discovery protocol, so I don't think it is a severe error.
>> We need Infinispan's replication mode, but after upgrading to Infinispan 8.2.2, replication is not working.
>> Data updated in Infinispan is written only to the local node and is not replicated to the other nodes in the cluster (we have deployed 3 nodes).
>> Since we saw the NPE in JGroups, we suspected it could be the reason replication is not functioning as expected, as it also complains of not
>> being able to send data to the other nodes in the cluster.
>
> OK
>
>> 2: If you change the JGroups code to print the exception (or set a breakpoint and debug) we'd know what the cause is.
>> Our understanding, based on our analysis of the JGroups code in the context of this issue, is:
>> => In BaseServer.java, the method below tries to send data to the other nodes:
>> public void send(Address dest, byte[] data, int offset, int length) throws Exception
>> The logic below is not able to retrieve the Connection object:
>> ------------------------------------------------------------------------------------------------------------------------
>> // Get a connection (or create one if not yet existent) and send the data
>> Connection conn=null;
>> try {
>> conn=getConnection(dest);
>> conn.send(data, offset, length);
>> }
>> catch(Exception ex) {
>> removeConnectionIfPresent(dest, conn);
>> throw ex;
>> }
>> ------------------------------------------------------------------------------------------------------------------------
>> This eventually leads to the NPE.
>
>
> Insert an ex.printStackTrace() statement in the catch clause.
>
>
>> 3: Does Infinispan 8.2.2 use JGroups 3.6.8, or did you upgrade JGroups?
>> Infinispan 8.2.2 by default uses JGroups 3.6.7, but we tried upgrading to JGroups 3.6.8.
>
> Then 3.6.8 may not be supported. I'm sure REPL mode was tested before releasing 8.2.2, so perhaps 3.6.8 doesn't work with 8.2.2...
>
>> 4: Is this reproducible? If yes, go to step #2
>> Yes, this is consistently reproducible.
>>
>> It would be really helpful if you could kindly mail us a JGroups configuration that works with Infinispan 8.2.2.
>
> Well, I suggest taking the TCP based config shipped with 8.2.2 and using it as a base, i.e. make your changes to it.
>
>> Also, if you see any issues with our current set of configurations, highlighted in the below mail, please let us know the needed changes to this.
>>
>> Thanks again for all the help.
>>
>> Regs,
>> Manohar
>>
>>
>> -----Original Message-----
>> From: Bela Ban [mailto:bban at redhat.com]
>> Sent: Monday, August 22, 2016 1:31 PM
>> To: Manohar SL <manohar.sl at ericsson.com>; infinispan -Dev List
>> <infinispan-dev at lists.jboss.org>
>> Cc: Sathish Kumar Tippeshappa
>> <sathish.kumar.tippeshappa at ericsson.com>
>> Subject: Re: JGroups Error: JGRP000029 observed with "JGroups 3.6.8 + Infinispan 8.2.2":
>>
>> 1: What does the NPE cause? Incorrect behavior? This is in the discovery protocol, so I don't think it is a severe error.
>>
>> 2: If you change the JGroups code to print the exception (or set a breakpoint and debug) we'd know what the cause is.
>>
>> 3: Does Infinispan 8.2.2 use JGroups 3.6.8, or did you upgrade JGroups?
>>
>> 4: Is this reproducible? If yes, go to step #2
>>
>>
>>
>> On 20/08/16 09:38, Manohar SL wrote:
>>> Hi Bela Ban,
>>>
>>> We have been trying to use JGroups 3.6.8 with Infinispan 8.2.2. In this context we are observing an issue with Infinispan's replication mode.
>>> We see the exception below from JGroups:
>>> JGRP000029: failed sending message to <>:7800 (100 bytes):
>>> java.lang.NullPointerException, headers: TCPGOSSIP:
>>> [type=GET_MBRS_REQ, cluster=x-cluster], TP: [cluster_name=x-cluster]
>>>
>>> The configurations used are highlighted below:
>>>
>>> Infinispan config
>>>
>>> <infinispan>
>>> <jgroups>
>>> <stack-file name="configurationFile" path="config/jgroups.xml"/>
>>> </jgroups>
>>> <cache-container>
>>> <transport cluster="x-cluster" stack="configurationFile" />
>>> <replicated-cache name="transactional-type" mode="SYNC">
>>> <transaction mode="NON_XA" locking="OPTIMISTIC" transaction-manager-lookup="org.infinispan.transaction.lookup.JBossStandaloneJTAManagerLookup" auto-commit="true" />
>>> <locking acquire-timeout="60000"/>
>>> <expiration lifespan="43200000"/>
>>> </replicated-cache>
>>> </cache-container>
>>> </infinispan>
>>>
>>>
>>>
>>> JGroups configuration
>>>
>>> <!--
>>> TCP based stack, with flow control and message bundling. This is usually used when IP
>>> multicasting cannot be used in a network, e.g. because it is disabled (routers discard multicast).
>>> Note that TCP.bind_addr and TCPPING.initial_hosts should be set, possibly via system properties, e.g.
>>> -Djgroups.bind_addr=192.168.5.2 and -Djgroups.tcpping.initial_hosts=192.168.5.2[7800]".
>>> author: Bela Ban
>>> -->
>>> <config xmlns="urn:org:jgroups"
>>> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>>> xsi:schemaLocation="urn:org:jgroups http://www.jgroups.org/schema/jgroups-3.6.xsd">
>>> <TCP loopback="true"
>>> bind_addr="${jgroups.tcp.address:127.0.0.1}"
>>> bind_port="${jgroups.tcp.port:7800}"
>>> recv_buf_size="${tcp.recv_buf_size:20M}"
>>> send_buf_size="${tcp.send_buf_size:640K}"
>>> discard_incompatible_packets="true"
>>> max_bundle_size="64K"
>>> max_bundle_timeout="30"
>>> enable_bundling="true"
>>> use_send_queues="true"
>>> sock_conn_timeout="300"
>>> timer_type="new"
>>> timer.min_threads="4"
>>> timer.max_threads="10"
>>> timer.keep_alive_time="3000"
>>> timer.queue_max_size="500"
>>> thread_pool.enabled="true"
>>> thread_pool.min_threads="2"
>>> thread_pool.max_threads="30"
>>> thread_pool.keep_alive_time="60000"
>>> thread_pool.queue_enabled="false"
>>> thread_pool.queue_max_size="100"
>>> thread_pool.rejection_policy="discard"
>>> oob_thread_pool.enabled="true"
>>> oob_thread_pool.min_threads="2"
>>> oob_thread_pool.max_threads="30"
>>> oob_thread_pool.keep_alive_time="60000"
>>> oob_thread_pool.queue_enabled="false"
>>> oob_thread_pool.queue_max_size="100"
>>> oob_thread_pool.rejection_policy="discard"/>
>>> <!-- <TCP_NIO -->
>>> <!-- bind_port="7800" -->
>>> <!-- bind_interface="${jgroups.tcp_nio.bind_interface:bond0}" -->
>>> <!-- use_send_queues="true" -->
>>> <!-- sock_conn_timeout="300" -->
>>> <!-- reader_threads="3" -->
>>> <!-- writer_threads="3" -->
>>> <!-- processor_threads="0" -->
>>> <!-- processor_minThreads="0" -->
>>> <!-- processor_maxThreads="0" -->
>>> <!-- processor_queueSize="100" -->
>>> <!-- processor_keepAliveTime="9223372036854775807"/> -->
>>> <TCPGOSSIP initial_hosts="${jgroups.tcpgossip.initial_hosts}"/>
>>> <!-- <TCPPING async_discovery="true" initial_hosts="${jgroups.tcpping.initial_hosts}"
>>> port_range="2" timeout="3000" /> -->
>>> <MERGE2 max_interval="30000" min_interval="10000"/>
>>> <FD_SOCK/>
>>> <FD timeout="3000" max_tries="3"/>
>>> <VERIFY_SUSPECT timeout="1500"/>
>>> <pbcast.NAKACK
>>> use_mcast_xmit="false"
>>> retransmit_timeout="300,600,1200,2400,4800"
>>> discard_delivered_msgs="false"/>
>>> <UNICAST2 timeout="300,600,1200"
>>> stable_interval="5000"
>>> max_bytes="1m"/>
>>> <pbcast.STABLE stability_delay="500" desired_avg_gossip="5000" max_bytes="1m"/>
>>> <pbcast.GMS print_local_addr="false" join_timeout="3000" view_bundling="true"/>
>>> <UFC max_credits="200k" min_threshold="0.20"/>
>>> <MFC max_credits="200k" min_threshold="0.20"/>
>>> <FRAG2 frag_size="60000"/>
>>> <RSVP timeout="60000" resend_interval="500" ack_on_delivery="false" />
>>> </config>
>>>
>>> Any help on this would be really great.
>>> Kindly let us know if you would need any further information on this.
>>>
>>> Regs,
>>> Manohar.
>>>
>>
>> --
>> Bela Ban, JGroups lead (http://www.jgroups.org)
>>
>
> --
> Bela Ban, JGroups lead (http://www.jgroups.org)
>
--
Bela Ban, JGroups lead (http://www.jgroups.org)
_______________________________________________
infinispan-dev mailing list
infinispan-dev at lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev