[jboss-user] [JBossCache] - getCacheFromCoordinator received null cache

Thu May 10 10:00:06 EDT 2007

Hi All,

I am trying to set up a Tomcat cluster with 5 servers, and my application uses JBoss PojoCache. Some of my servers (let's call them web5, web8 and web10) had problems finding each other in the cluster, and we found that multicast packets were not reaching the servers. The servers are all multi-homed, so we decided to use GossipRouter. We started it on one of the nodes and used all the configuration mentioned in the JGroups manual (http://www.jgroups.org/javagroupsnew/docs/manual/html/user-advanced.html).

Now all the servers have started talking to each other, but session replication is still not working on web5, web8 and web10. When I start the server, I get the following console output:

-------------------------------------------------------
GMS: address is 10.5.108.78:36970
-------------------------------------------------------
INFO : [2007 05 10, 08-37:09(880)] : org.jboss.cache.TreeCache.viewAccepted(TreeCache.java:5342)- viewAccepted(): [10.5.108.80:33011|1] [10.5.108.80:33011, 10.5.108.78:36970]
INFO : [2007 05 10, 08-37:09(889)] : org.jboss.cache.TreeCache.startService(TreeCache.java:1426)- TreeCache local address is 10.5.108.78:36970
ERROR: [2007 05 10, 08-37:12(882)] : org.jgroups.protocols.FD_SOCK.getCacheFromCoordinator(FD_SOCK.java:684)- received null cache; retrying
org.jboss.cache.CacheException: Initial state transfer failed: Channel.getState() returned false
        at org.jboss.cache.TreeCache.fetchStateOnStartup(TreeCache.java:3191)
        at org.jboss.cache.TreeCache.startService(TreeCache.java:1429)
        at org.jboss.cache.aop.PojoCache.startService(PojoCache.java:94)
        at com.xminds.SessionTracker.createCache(SessionTracker.java:42)
        at com.xminds.SessionTracker.StartCache(SessionTracker.java:27)
        at com.xminds.servlets.BaseServlet.<init>(BaseServlet.java:20)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:494)
        at java.lang.Class.newInstance0(Class.java:350)
        at java.lang.Class.newInstance(Class.java:303)
        at org.apache.catalina.core.StandardWrapper.loadServlet(StandardWrapper.java:1055)
        at org.apache.catalina.core.StandardWrapper.load(StandardWrapper.java:932)
        at org.apache.catalina.core.StandardContext.loadOnStartup(StandardContext.java:3951)
        at org.apache.catalina.core.StandardContext.start(StandardContext.java:4225)
        at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:759)
        at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:739)
        at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:524)
        at org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:809)
        at org.apache.catalina.startup.HostConfig.deployWARs(HostConfig.java:698)
        at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:472)
        at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1122)
        at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:310)
        at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:119)
        at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1021)
        at org.apache.catalina.core.StandardHost.start(StandardHost.java:718)
        at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1013)
        at org.apache.catalina.core.StandardEngine.start(StandardEngine.java:442)
        at org.apache.catalina.core.StandardService.start(StandardService.java:450)
        at org.apache.catalina.core.StandardServer.start(StandardServer.java:709)
        at org.apache.catalina.startup.Catalina.start(Catalina.java:551)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:294)
        at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:432)
May 10, 2007 8:37:15 AM org.apache.catalina.cluster.session.DeltaManager start
INFO: Register manager /SessionTest to cluster element Host with name localhost
May 10, 2007 8:37:15 AM org.apache.catalina.cluster.session.DeltaManager start
INFO: Starting clustering manager at /SessionTest
May 10, 2007 8:37:15 AM org.apache.catalina.cluster.tcp.SimpleTcpCluster logSendMessage
INFO: SEND May 10, 2007:8:37:15 AM 1 10.5.108.80:4,010 GET-ALL-/SessionTest
May 10, 2007 8:37:15 AM org.apache.catalina.cluster.session.DeltaManager getAllClusterSessions
WARNING: Manager [/SessionTest], requesting session state from org.apache.catalina.cluster.mcast.McastMember[tcp://10.5.108.80:4010,TreeCache-Cluster,10.5.108.80,4010, alive=11440]. This operation will timeout if no session state has been received within 60 seconds.
ERROR: [2007 05 10, 08-37:16(390)] : org.jgroups.protocols.FD_SOCK.getCacheFromCoordinator(FD_SOCK.java:684)- received null cache; retrying
ERROR: [2007 05 10, 08-37:19(899)] : org.jgroups.protocols.FD_SOCK.getCacheFromCoordinator(FD_SOCK.java:684)- received null cache; retrying
INFO : [2007 05 10, 08-37:20(426)] : org.jboss.cache.TreeCache._setState(TreeCache.java:2622)- received the state (size=1024 bytes)
May 10, 2007 8:38:15 AM org.apache.catalina.cluster.session.DeltaManager waitForSendAllSessions
SEVERE: Manager [/SessionTest]: No session state send at 5/10/07 8:37 AM received, timing out after 60,025 ms.
May 10, 2007 8:38:15 AM org.apache.coyote.http11.Http11BaseProtocol start
INFO: Starting Coyote HTTP/1.1 on http-8080
May 10, 2007 8:38:15 AM org.apache.coyote.http11.Http11BaseProtocol start
INFO: Starting Coyote HTTP/1.1 on http-8443
May 10, 2007 8:38:15 AM org.apache.jk.common.ChannelSocket init
INFO: JK: ajp13 listening on /0.0.0.0:8009
May 10, 2007 8:38:15 AM org.apache.jk.server.JkMain start
INFO: Jk running ID=0 time=0/18  config=null
May 10, 2007 8:38:15 AM org.apache.catalina.storeconfig.StoreLoader load
INFO: Find registry server-registry.xml at classpath resource
May 10, 2007 8:38:15 AM org.apache.catalina.startup.Catalina start
INFO: Server startup in 69672 ms
May 10, 2007 8:38:20 AM org.apache.catalina.cluster.tcp.SimpleTcpCluster logSendMessage
INFO: SEND May 10, 2007:8:38:20 AM 0 - 445B819C79A10F527B0A419D2D276B85.node3-1178804300523
May 10, 2007 8:38:20 AM org.apache.catalina.cluster.tcp.SimpleTcpCluster logSendMessage
INFO: SEND May 10, 2007:8:38:20 AM 2 - 445B819C79A10F527B0A419D2D276B85.node3-1178804300580
INFO : [2007 05 10, 08-38:30(036)] : com.xminds.servlets.AddPersonServlet.doService(AddPersonServlet.java:31)- Receiving add person request from : 61.17.42.35
INFO : [2007 05 10, 08-38:30(154)] : com.xminds.servlets.AddPersonServlet.doService(AddPersonServlet.java:78)- Adding person : 123, 123 [123<1111> : 123 ] to cache.
Adding person : 123, 123 [123<1111> : 123 ] to cache against key : 123
ERROR: [2007 05 10, 08-38:30(156)] : org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:260)- Servlet.service() for servlet addperson threw exception
java.lang.NullPointerException
        at com.xminds.SessionTracker.put(SessionTracker.java:64)
        at com.xminds.servlets.AddPersonServlet.doService(AddPersonServlet.java:80)
        at com.xminds.servlets.AddPersonServlet.doPost(AddPersonServlet.java:27)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:709)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:252)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
        at org.apache.catalina.cluster.session.JvmRouteBinderValve.invoke(JvmRouteBinderValve.java:209)
        at org.apache.catalina.cluster.tcp.ReplicationValve.invoke(ReplicationValve.java:346)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
        at org.apache.jk.server.JkCoyoteHandler.invoke(JkCoyoteHandler.java:199)
        at org.apache.jk.common.HandlerRequest.invoke(HandlerRequest.java:282)
        at org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:767)
        at org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:697)
        at org.apache.jk.common.ChannelSocket$SocketConnection.runIt(ChannelSocket.java:889)
        at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:684)
        at java.lang.Thread.run(Thread.java:595)
May 10, 2007 8:38:35 AM org.apache.catalina.cluster.deploy.WarWatcher check
INFO: check cluster wars at /cluster/apache-tomcat-5.5.20/war-listen

The NullPointerException occurs because the cache was never started, owing to the first exception.
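
A minimal sketch of that failure mode, using hypothetical stand-ins (SimpleCache, MapCache and Tracker are illustrative names, not the real SessionTracker code or the JBoss Cache API): if start-up throws, the cache field is never assigned, and a later put() would dereference null unless it is guarded. The guard here only makes the failure explicit; it does not address the state transfer problem itself.

```java
import java.util.HashMap;
import java.util.Map;

public class CacheStartSketch {

    /** Hypothetical stand-in for the cache; NOT the JBoss Cache API. */
    interface SimpleCache {
        void start() throws Exception;
        void put(String key, Object value);
        Object get(String key);
    }

    /** In-memory cache whose start() can be made to fail, as the state transfer did. */
    static class MapCache implements SimpleCache {
        private final Map<String, Object> data = new HashMap<>();
        private final boolean failOnStart;

        MapCache(boolean failOnStart) {
            this.failOnStart = failOnStart;
        }

        @Override
        public void start() throws Exception {
            if (failOnStart) {
                throw new Exception("Initial state transfer failed");
            }
        }

        @Override
        public void put(String key, Object value) {
            data.put(key, value);
        }

        @Override
        public Object get(String key) {
            return data.get(key);
        }
    }

    /** Hypothetical tracker: keeps the cache reference only if start() succeeded. */
    static class Tracker {
        private SimpleCache cache; // stays null when start() throws

        boolean startCache(SimpleCache candidate) {
            try {
                candidate.start();
                cache = candidate;
                return true;
            } catch (Exception e) {
                System.err.println("cache start failed: " + e.getMessage());
                return false;
            }
        }

        void put(String key, Object value) {
            // Fail with a descriptive error instead of a bare NullPointerException.
            if (cache == null) {
                throw new IllegalStateException("cache not started; check the start-up error");
            }
            cache.put(key, value);
        }

        Object get(String key) {
            return cache == null ? null : cache.get(key);
        }
    }

    public static void main(String[] args) {
        Tracker broken = new Tracker();
        System.out.println(broken.startCache(new MapCache(true))); // start fails
        try {
            broken.put("123", "person");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }

        Tracker ok = new Tracker();
        ok.startCache(new MapCache(false));
        ok.put("123", "person");
        System.out.println(ok.get("123"));
    }
}
```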

Please find my service XML file below:

<?xml version="1.0" encoding="UTF-8" ?>

                jboss:service=TransactionManager

                <!--         Configure the TransactionManager -->

                        org.jboss.cache.DummyTransactionManagerLookup

                <!--             Isolation level : SERIALIZABLE
                        REPEATABLE_READ (default)
                        READ_COMMITTED
                        READ_UNCOMMITTED
                        NONE
                -->
                REPEATABLE_READ

                <!--              Valid modes are LOCAL, REPL_ASYNC and REPL_SYNC -->
                REPL_SYNC

                <!--         Just used for async repl: use a replication queue -->
                false

                <!--             Replication interval for replication queue (in ms) -->
                0

                <!--             Max number of elements which trigger replication -->
                0

                <!--  Name of cluster. Needs to be the same for all clusters, in order
                        to find each other
                -->
                Sample-Cache

                <!--  JGroups protocol stack properties. Can also be a URL,
                        e.g. file:/home/bela/default.xml

                -->

                <!--bind_addr="75.126.68.196" -->

                                <!--  UDP: if you have a multihomed machine,
                                        set the bind_addr attribute to the appropriate NIC IP address, e.g bind_addr="192.168.0.2"
                                -->
                                <!--  UDP: On Windows machines, because of the media sense feature
                                        being broken with multicast (even after disabling media sense)
                                        set the loopback attribute to true
                                -->
                                <UDP mcast_addr="228.1.2.3" mcast_port="48866" bind_addr="10.5.108.80"
                                        ip_ttl="64" ip_mcast="true" mcast_send_buf_size="150000"
                                        mcast_recv_buf_size="80000" ucast_send_buf_size="150000"
                                        ucast_recv_buf_size="80000" loopback="false" />
                                <PING up_thread="false" down_thread="false" gossip_host="75.126.68.195" gossip_port="5555" gossip_refresh="15000" timeout="2000" num_initial_members="3"/>
                                <MERGE2 min_interval="10000" max_interval="20000" />
                                <FD_SOCK />
                                <VERIFY_SUSPECT timeout="1500" up_thread="false"
                                        down_thread="false" />
                                <pbcast.NAKACK gc_lag="50"
                                        retransmit_timeout="600,1200,2400,4800" max_xmit_size="8192"
                                        up_thread="false" down_thread="false" />
                                <UNICAST timeout="600,1200,2400" window_size="100"
                                        min_threshold="10" down_thread="false" />
                                <pbcast.STABLE desired_avg_gossip="20000"
                                        up_thread="false" down_thread="false" />
                                <FRAG frag_size="8192" down_thread="false"
                                        up_thread="false" />
                                <pbcast.GMS join_timeout="5000"
                                        join_retry_timeout="2000" shun="true" print_local_addr="true" />
                                <pbcast.STATE_TRANSFER up_thread="true"
                                        down_thread="true" />

                <!--         Whether or not to fetch state on joining a cluster -->
                true

                <!--             The max amount of time (in milliseconds) we wait until the
                        initial state (ie. the contents of the cache) are retrieved from
                        existing members in a clustered environment

                -->
                5000

                <!--             Number of milliseconds to wait until all responses for a
                        synchronous call have been received.
                -->
                15000

                <!--  Max number of milliseconds to wait for a lock acquisition -->
                10000

                <!--  Name of the eviction policy class. -->

Any idea why this is happening on these 3 servers? The application works on web6 and web9 without any issues, and session replication is also working fine there.

Any help will be greatly appreciated.

Thanks
Jugs

View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4044678#4044678
