[JBoss JIRA] (ISPN-5208) Avoid invalid topology
by Galder Zamarreño (JIRA)
[ https://issues.jboss.org/browse/ISPN-5208?page=com.atlassian.jira.plugin.... ]
Work on ISPN-5208 started by Galder Zamarreño.
----------------------------------------------
> Avoid invalid topology
> ----------------------
>
> Key: ISPN-5208
> URL: https://issues.jboss.org/browse/ISPN-5208
> Project: Infinispan
> Issue Type: Enhancement
> Components: Server
> Reporter: Takayoshi Kimura
> Assignee: Galder Zamarreño
> Labels: jdg641
> Fix For: 7.2.0.Final
>
>
> We've seen an invalid topology propagated to the client, which causes an ArrayIndexOutOfBoundsException:
> {noformat}
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
> at org.infinispan.client.hotrod.impl.transport.tcp.RoundRobinBalancingStrategy.getServerByIndex(RoundRobinBalancingStrategy.java:68) [infinispan-client-hotrod-6.1.0.Final-redhat-4.jar:6.1.0.Final-redhat-4]
> at org.infinispan.client.hotrod.impl.transport.tcp.RoundRobinBalancingStrategy.nextServer(RoundRobinBalancingStrategy.java:44) [infinispan-client-hotrod-6.1.0.Final-redhat-4.jar:6.1.0.Final-redhat-4]
> at org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory.nextServer(TcpTransportFactory.java:220) [infinispan-client-hotrod-6.1.0.Final-redhat-4.jar:6.1.0.Final-redhat-4]
> at org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory.getTransport(TcpTransportFactory.java:194) [infinispan-client-hotrod-6.1.0.Final-redhat-4.jar:6.1.0.Final-redhat-4]
> at org.infinispan.client.hotrod.impl.operations.FaultTolerantPingOperation.getTransport(FaultTolerantPingOperation.java:27) [infinispan-client-hotrod-6.1.0.Final-redhat-4.jar:6.1.0.Final-redhat-4]
> at org.infinispan.client.hotrod.impl.operations.RetryOnFailureOperation.execute(RetryOnFailureOperation.java:48) [infinispan-client-hotrod-6.1.0.Final-redhat-4.jar:6.1.0.Final-redhat-4]
> at org.infinispan.client.hotrod.impl.RemoteCacheImpl.ping(RemoteCacheImpl.java:535) [infinispan-client-hotrod-6.1.0.Final-redhat-4.jar:6.1.0.Final-redhat-4]
> at org.infinispan.client.hotrod.RemoteCacheManager.ping(RemoteCacheManager.java:635) [infinispan-client-hotrod-6.1.0.Final-redhat-4.jar:6.1.0.Final-redhat-4]
> at org.infinispan.client.hotrod.RemoteCacheManager.createRemoteCache(RemoteCacheManager.java:616) [infinispan-client-hotrod-6.1.0.Final-redhat-4.jar:6.1.0.Final-redhat-4]
> at org.infinispan.client.hotrod.RemoteCacheManager.getCache(RemoteCacheManager.java:527) [infinispan-client-hotrod-6.1.0.Final-redhat-4.jar:6.1.0.Final-redhat-4]
> at org.infinispan.client.hotrod.RemoteCacheManager.getCache(RemoteCacheManager.java:523) [infinispan-client-hotrod-6.1.0.Final-redhat-4.jar:6.1.0.Final-redhat-4]
> {noformat}
> {noformat}
> Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0
> at org.infinispan.client.hotrod.impl.consistenthash.SegmentConsistentHash.getServer(SegmentConsistentHash.java:33)
> at org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory.getTransport(TcpTransportFactory.java:204)
> at org.infinispan.client.hotrod.impl.operations.AbstractKeyOperation.getTransport(AbstractKeyOperation.java:40)
> at org.infinispan.client.hotrod.impl.operations.RetryOnFailureOperation.execute(RetryOnFailureOperation.java:48)
> at org.infinispan.client.hotrod.impl.RemoteCacheImpl.put(RemoteCacheImpl.java:237)
> at org.infinispan.client.hotrod.impl.RemoteCacheSupport.put(RemoteCacheSupport.java:79)
> at sample.Main.main(Main.java:16)
> {noformat}
> It happens on both Hot Rod 2 and 1.3 clients.
> It's really hard to reproduce this state and we don't have a consistent way to reproduce it. However, whenever it happens a view change is in progress, so it appears to be related to view changes.
> Judging from the stack trace, the client receives a topology with numOwners=0 or numSegments=0 from the server.
> Also, we have been unable to find a way to recover from this situation. Rebooting random nodes doesn't help, and we keep getting these exceptions on the client side.
> Until we can find the root cause, I think it's better to add a guard that prevents this kind of invalid topology from being stored on the server side and propagated to the clients.
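The guard suggested above could be sketched roughly as follows. This is only an illustration: `TopologyGuard`, `isValid` and `accept` are hypothetical names that do not come from the Infinispan code base, and the empty-server-list check is an assumption added on top of the numOwners/numSegments conditions named in the report. The idea is simply to keep the last known topology whenever an update looks invalid.

```java
import java.util.List;

public class TopologyGuard {
    // Hypothetical validity check: a topology with no servers (assumption),
    // numOwners == 0 or numSegments == 0 (from the report) is rejected.
    static boolean isValid(List<String> servers, int numOwners, int numSegments) {
        return servers != null && !servers.isEmpty()
                && numOwners > 0 && numSegments > 0;
    }

    // Returns the candidate topology if it is valid, otherwise keeps the current one.
    static List<String> accept(List<String> current, List<String> candidate,
                               int numOwners, int numSegments) {
        return isValid(candidate, numOwners, numSegments) ? candidate : current;
    }
}
```

With such a check in place, an update carrying numOwners=0 would leave the previous server list untouched instead of being stored and propagated.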
--
This message was sent by Atlassian JIRA
(v6.3.11#6341)
[JBoss JIRA] (ISPN-5295) Remote cache entries with sub-second lifetime never expire
by Ion Savin (JIRA)
[ https://issues.jboss.org/browse/ISPN-5295?page=com.atlassian.jira.plugin.... ]
Ion Savin updated ISPN-5295:
----------------------------
Status: Open (was: New)
> Remote cache entries with sub-second lifetime never expire
> ----------------------------------------------------------
>
> Key: ISPN-5295
> URL: https://issues.jboss.org/browse/ISPN-5295
> Project: Infinispan
> Issue Type: Bug
> Components: Remote Protocols
> Affects Versions: 7.2.0.Alpha1
> Reporter: Ion Savin
>
> remoteCache.put("key", "value", 999, TimeUnit.MILLISECONDS)
> Expected: the entry expires after 999 millis
> Actual: the entry never expires
> HotRod lifespan is specified with second granularity but during conversion the subsecond part is truncated:
> remoteCache.put("key", "value", 1999, TimeUnit.MILLISECONDS)
> Expected: after 1998 millis the entry is still present
> Actual: after 1000 millis the entry is expired
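Both observations above are consistent with a truncating millisecond-to-second conversion, which the following sketch reproduces with the standard `java.util.concurrent.TimeUnit` API. `LifespanConversion` and its two methods are hypothetical helpers written for illustration; the round-up variant is just one possible alternative for non-negative lifespans, not the project's actual fix.

```java
import java.util.concurrent.TimeUnit;

public class LifespanConversion {
    // Truncating conversion as described in the report: the sub-second part is dropped.
    static long truncatedSeconds(long lifespan, TimeUnit unit) {
        return unit.toSeconds(lifespan);
    }

    // One possible alternative (an assumption): round non-negative lifespans up
    // to the next whole second so that an entry never expires early.
    static long roundedUpSeconds(long lifespan, TimeUnit unit) {
        long millis = unit.toMillis(lifespan);
        return (millis + 999) / 1000;
    }
}
```

Here `truncatedSeconds(999, TimeUnit.MILLISECONDS)` yields 0 and `truncatedSeconds(1999, TimeUnit.MILLISECONDS)` yields 1, matching the two cases in the report, while the round-up variant yields 1 and 2 respectively.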
--
[JBoss JIRA] (ISPN-5254) Server not always stopped properly with the IBM JDK
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-5254?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-5254:
-------------------------------
Status: Resolved (was: Pull Request Sent)
Resolution: Done
> Server not always stopped properly with the IBM JDK
> ---------------------------------------------------
>
> Key: ISPN-5254
> URL: https://issues.jboss.org/browse/ISPN-5254
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Server
> Affects Versions: 7.2.0.Alpha1
> Reporter: Dan Berindei
> Assignee: Vitalii Chepeliuk
> Fix For: 7.2.0.Beta1
>
>
> Because of WFLY-3549, the Infinispan/Wildfly server doesn't always stop properly and it needs to be killed with {{kill -9}}.
> AFAICT Arquillian doesn't always handle this when there is a startup problem, because it uses {{Process.destroy()}} instead of {{Process.destroyForcibly()}}, and I believe it doesn't go through {{InfinispanServerKillProcessor}}. However, the server seems to be properly started in this case.
> We have two Ant scripts that kill any running server: {{kill-jbossas.xml}} in {{server/integration/testsuite}} and {{build.xml}} in {{integrationtests/as-integration-client}}. However, the IBM JDK installed on the CI agent machines doesn't have a {{jps}} command, so the script doesn't work:
> {noformat}
> [04:47:58]E: [org.infinispan:infinispan-as-module-client-integrationtests] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.8:run (infinispan-server-shutdown) on project infinispan-as-module-client-integrationtests: An Ant BuildException has occured: The following error occurred while executing this line:
> /mnt/persistent_storage/cloud-user/ispn/buildAgent/work/64255532d1f9a010/integrationtests/as-integration-client/build.xml:55: Execute failed: java.io.IOException: Cannot run program "/opt/ibm/java-x86_64-71/bin/jps" (in directory "/mnt/persistent_storage/cloud-user/ispn/buildAgent/work/64255532d1f9a010/integrationtests/as-integration-client"): error=2, No such file or directory
> around Ant part ...<ant antfile="build.xml" target="kill_server"/>... @ 4:50 in /mnt/persistent_storage/cloud-user/ispn/buildAgent/work/64255532d1f9a010/integrationtests/as-integration-client/target/antrun/build-main.xml
> {noformat}
> I suggest using {{ps -o pid=,cmd= -Cjava}} to check for running processes instead of {{jps}}.
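As a rough illustration of how the output of the suggested `ps -o pid=,cmd= -Cjava` call could be consumed without `jps`, the sketch below parses lines of the form "PID COMMAND" and returns the PIDs whose command line contains a marker string. `PsOutputParser`, `matchingPids` and the marker value are hypothetical; the lines would have to be obtained separately, for example by running the `ps` command through `ProcessBuilder`.

```java
import java.util.ArrayList;
import java.util.List;

public class PsOutputParser {
    // Returns the PIDs of all "PID COMMAND" lines whose command contains the marker.
    // Blank lines and lines without a command part are skipped.
    static List<Long> matchingPids(List<String> psLines, String marker) {
        List<Long> pids = new ArrayList<>();
        for (String line : psLines) {
            String trimmed = line.trim();
            int space = trimmed.indexOf(' ');
            if (space < 0) {
                continue; // no command column on this line
            }
            String command = trimmed.substring(space + 1);
            if (command.contains(marker)) {
                pids.add(Long.parseLong(trimmed.substring(0, space)));
            }
        }
        return pids;
    }
}
```

The marker (for instance a jar name that identifies the server process) is an assumption of this sketch; the existing Ant scripts may identify the server differently.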
--
[JBoss JIRA] (ISPN-5174) Transaction cannot be recommitted after ownership changes
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-5174?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-5174:
-------------------------------
Status: Pull Request Sent (was: Coding In Progress)
Git Pull Request: https://github.com/infinispan/infinispan/pull/3305
> Transaction cannot be recommitted after ownership changes
> ---------------------------------------------------------
>
> Key: ISPN-5174
> URL: https://issues.jboss.org/browse/ISPN-5174
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 7.1.0.CR2
> Reporter: Radim Vansa
> Assignee: Dan Berindei
> Priority: Critical
>
> Once a transaction is completed on a node, it cannot be committed there again. If the node should commit more keys because it has become an owner of additional keys modified in this transaction, it just ignores the further commit.
> There is a race with state transfer, which can bring an old value (with the StateResponseCommand sent before the new value is committed), but the value is not set by the ongoing transaction either.
> This results in a stale value being stored on one node.
> In my case, the problematic part is transaction <edg-perf01-62141>:15066 (consisting of 10 modifications), which got prepared and committed on edg-perf04 in topology 25. Before the originator finishes, the topology changes and 04 requests ongoing transactions:
> {code}
> 11:06:11,369 TRACE [org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher] (transport-thread-17) Replication task sending StateRequestCommand{cache=testCache, origin=edg-perf04-35097, type=GET_TRANSACTIONS, topologyId=28, segments=[275, 1, 278, 9, 282, 286, 17, 259, 25, 267, 171, 169, 33, 306, 175, 173, 310, 172, 314, 41, 167, 165, 318, 187, 290, 49, 185, 191, 294, 189, 179, 298, 57, 177, 183, 302, 181, 343, 205, 201, 338, 203, 336, 351, 197, 349, 199, 347, 193, 345, 195, 326, 85, 87, 322, 93, 332, 95, 330, 89, 91, 103, 101, 99, 506, 97, 105, 357, 359, 353, 355, 361]} to single recipient edg-perf01-62141 with response mode GET_ALL
> 11:06:11,495 DEBUG [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread-17) Applying 6 transactions for cache testCache transferred from node edg-perf01-62141
> {code}
> However, I don't see how these are applied, since the PrepareCommand is not created again - from the code I see only that backup locks are added. I'm not sure whether the transaction is registered at all, since it was already completed on this node (but at that time it did not own key_00000000000002EB).
> After the originator stores the entry, it sends one more CommitCommand with topology 28:
> {code}
> 11:06:11,619 TRACE [org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher] (DefaultStressor-2) Replication task sending CommitCommand {gtx=GlobalTransaction:<edg-perf01-62141>:15066:local, cacheName='testCache', topologyId=28} to addresses [edg-perf03-20530, edg-perf04-35097] with response mode GET_ALL
> {code}
> 04 receives several CommitCommands (both from the originator and forwarded ones), but all of them are ignored because the transaction is already completed.
> I don't see the logs where the state transfer is assembled, but it probably happens before the entry is stored on the originator, as the state transfer contains the old entry:
> {code}
> 11:06:13,449 TRACE [org.infinispan.statetransfer.StateConsumerImpl] (remote-thread-91) Received chunk with keys [key_000000000000065B, key_00000000000006BE, key_FFFFFFFFFFFFE62F, key_0000000000001F42, key_000000000000027B, key_000000000000159D, key_00000000000002EB, key_00000000000002BB] for segment 343 of cache testCache from node edg-perf01-62141
> 11:06:13,454 TRACE [org.infinispan.container.DefaultDataContainer] (remote-thread-91) Store ImmortalCacheEntry{key=key_00000000000002EB, value=[2 #7: 366, 544, 576, 804, 1061, 1181, 1290, ]} in container
> {code}
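The sequence described above can be modelled with a deliberately simplified toy sketch. Nothing here is Infinispan code: `RecommitRaceSketch`, `Node`, the transaction id string and the `key_A` entry are hypothetical, and the model only captures the two behaviours named in the report, namely that a commit for an already completed transaction is ignored and that a state transfer chunk may carry the old value.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class RecommitRaceSketch {
    static final class Node {
        final Map<String, String> data = new HashMap<>();
        final Set<String> completedTx = new HashSet<>();

        // Applies the writes unless the transaction was already completed here.
        boolean commit(String txId, Map<String, String> writes) {
            if (completedTx.contains(txId)) {
                return false; // further commits are ignored
            }
            data.putAll(writes);
            completedTx.add(txId);
            return true;
        }

        // State transfer stores whatever value the sender had when the chunk was built.
        void applyStateTransfer(String key, String value) {
            data.put(key, value);
        }
    }

    // Replays the reported order of events on a single node and returns the final value.
    static String run() {
        Node node04 = new Node();
        Map<String, String> firstCommit = new HashMap<>();
        firstCommit.put("key_A", "new"); // key owned in the old topology (hypothetical)
        node04.commit("tx-15066", firstCommit);

        Map<String, String> secondCommit = new HashMap<>();
        secondCommit.put("key_00000000000002EB", "new"); // key owned only after the topology change
        node04.commit("tx-15066", secondCommit); // ignored: transaction already completed

        node04.applyStateTransfer("key_00000000000002EB", "old"); // chunk built before the commit
        return node04.data.get("key_00000000000002EB");
    }
}
```

In this model `run()` ends with the old value for key_00000000000002EB, which is the stale state the report describes.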
--