[JBoss JIRA] (ISPN-6402) Default GMS.join_timeout is too long
by Sebastian Łaskawiec (JIRA)
[ https://issues.jboss.org/browse/ISPN-6402?page=com.atlassian.jira.plugin.... ]
Sebastian Łaskawiec closed ISPN-6402.
-------------------------------------
> Default GMS.join_timeout is too long
> ------------------------------------
>
> Key: ISPN-6402
> URL: https://issues.jboss.org/browse/ISPN-6402
> Project: Infinispan
> Issue Type: Task
> Components: Core, Server, Test Suite - Server
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Minor
> Fix For: 8.2.1.Final, 9.0.0.Alpha1, 9.0.0.Final
>
>
> {{GMS.join_timeout}} is used by JGroups for two purposes:
> # Wait for {{FIND_INITIAL_MBRS}} responses. If other nodes are running, but they don't answer within {{join_timeout}} ms, the node will start a new partition by itself.
> #* If no other nodes are running when the request is sent, but another node starts and sends its own discovery request within {{join_timeout}}, the initial cluster view will contain both nodes, but this isn't really useful in Infinispan (we have {{gcb.transport().initialClusterSize()}} instead).
> # Once a coordinator is located, the node sends a join request and waits for a response for {{join_timeout}} ms. After a timeout, the node re-sends the join request (up to a maximum of {{max_join_attempts}}, which defaults to 10).
> The default {{GMS.join_timeout}} in Infinispan is 15000, vs. 2000 in JGroups (actually 3000 in {{GMS}} itself, but 2000 in the example configurations).
> The higher timeout will only help us when a node is running, but it's inaccessible (e.g. because of a long GC) at the exact time a node is joining. I'd argue that applications that can tolerate multi-second pauses would be better served by {{gcb.transport().initialClusterSize(2)}} and/or an external discovery mechanism (e.g. {{FILE_PING}}, or something based on the WildFly domain controller). For most applications, the current default means just a 15s delay every time the cluster is (re)started.
> In particular, because our integration tests use the default configuration, it means a delay of 15s for every test that starts a cluster.
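The two waits described in the issue can be put into numbers with a small sketch. It is only an illustration: the class and method names are hypothetical and not part of JGroups or Infinispan, and it assumes that {{max_join_attempts}} counts the total number of join requests sent, which the issue text does not state explicitly.

```java
import java.time.Duration;

/** Hypothetical helper for illustration; not part of JGroups or Infinispan. */
final class JoinTimeoutSketch {

    /** Time a node waits for discovery responses before starting a partition by itself. */
    static Duration discoveryWait(long joinTimeoutMs) {
        return Duration.ofMillis(joinTimeoutMs);
    }

    /**
     * Rough upper bound on the time spent re-sending join requests to a
     * coordinator that never answers. Assumes maxJoinAttempts is the total
     * number of requests sent (an assumption, see the lead-in).
     */
    static Duration maxJoinWait(long joinTimeoutMs, int maxJoinAttempts) {
        return Duration.ofMillis(joinTimeoutMs).multipliedBy(maxJoinAttempts);
    }

    public static void main(String[] args) {
        long infinispanDefaultMs = 15_000; // Infinispan default quoted in the issue
        long jgroupsExampleMs = 2_000;     // JGroups example configurations, per the issue
        int maxJoinAttempts = 10;          // default quoted in the issue

        System.out.println("Solo start delay (Infinispan default): "
                + discoveryWait(infinispanDefaultMs).toMillis() + " ms");
        System.out.println("Solo start delay (JGroups examples): "
                + discoveryWait(jgroupsExampleMs).toMillis() + " ms");
        System.out.println("Join retry bound (Infinispan default): "
                + maxJoinWait(infinispanDefaultMs, maxJoinAttempts).toMillis() + " ms");
    }
}
```

The first figure is the delay the issue attributes to every test that starts a cluster with the default configuration.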
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
8 years, 8 months
[JBoss JIRA] (ISPN-6409) NPE in ChannelMetric for non-master nodes
by Sebastian Łaskawiec (JIRA)
[ https://issues.jboss.org/browse/ISPN-6409?page=com.atlassian.jira.plugin.... ]
Sebastian Łaskawiec closed ISPN-6409.
-------------------------------------
> NPE in ChannelMetric for non-master nodes
> -----------------------------------------
>
> Key: ISPN-6409
> URL: https://issues.jboss.org/browse/ISPN-6409
> Project: Infinispan
> Issue Type: Bug
> Components: Server
> Affects Versions: 8.2.0.Final
> Reporter: Tristan Tarrant
> Assignee: Tristan Tarrant
> Fix For: 8.2.1.Final, 9.0.0.Alpha1, 9.0.0.Final
>
>
> Attempting to retrieve the jgroups subsystem attributes of a non-master node on a RELAY channel causes an NPE.
> [Server:earth-one] 18:36:34,055 ERROR [org.jboss.as.controller.management-operation] (ServerService Thread Pool -- 35) WFLYCTL0013: Operation ("read-attribute") failed - address: ([
> [Server:earth-one] ("subsystem" => "datagrid-jgroups"),
> [Server:earth-one] ("channel" => "xsite")
> [Server:earth-one] ]): java.lang.IllegalArgumentException: value is null
> [Server:earth-one] at org.jboss.dmr.ModelNode.<init>(ModelNode.java:162)
> [Server:earth-one] at org.infinispan.server.jgroups.subsystem.ChannelMetric$2.execute(ChannelMetric.java:46)
> [Server:earth-one] at org.infinispan.server.jgroups.subsystem.ChannelMetric$2.execute(ChannelMetric.java:43)
> [Server:earth-one] at org.infinispan.server.jgroups.subsystem.ChannelMetricExecutor.execute(ChannelMetricExecutor.java:47)
> [Server:earth-one] at org.infinispan.server.commons.controller.MetricHandler.executeRuntimeStep(MetricHandler.java:70)
> [Server:earth-one] at org.jboss.as.controller.AbstractRuntimeOnlyHandler$1.execute(AbstractRuntimeOnlyHandler.java:53)
> [Server:earth-one] at org.jboss.as.controller.AbstractOperationContext.executeStep(AbstractOperationContext.java:890)
> [Server:earth-one] at org.jboss.as.controller.AbstractOperationContext.processStages(AbstractOperationContext.java:659)
> [Server:earth-one] at org.jboss.as.controller.AbstractOperationContext.executeOperation(AbstractOperationContext.java:370)
> [Server:earth-one] at org.jboss.as.controller.OperationContextImpl.executeOperation(OperationContextImpl.java:1344)
> [Server:earth-one] at org.jboss.as.controller.ModelControllerImpl.internalExecute(ModelControllerImpl.java:392)
> [Server:earth-one] at org.jboss.as.controller.ModelControllerImpl.execute(ModelControllerImpl.java:217)
> [Server:earth-one] at org.jboss.as.controller.remote.TransactionalProtocolOperationHandler.internalExecute(TransactionalProtocolOperationHandler.java:247)
> [Server:earth-one] at org.jboss.as.controller.remote.TransactionalProtocolOperationHandler$ExecuteRequestHandler.doExecute(TransactionalProtocolOperationHandler.java:185)
> [Server:earth-one] at org.jboss.as.controller.remote.TransactionalProtocolOperationHandler$ExecuteRequestHandler$1.run(TransactionalProtocolOperationHandler.java:138)
> [Server:earth-one] at org.jboss.as.controller.remote.TransactionalProtocolOperationHandler$ExecuteRequestHandler$1.run(TransactionalProtocolOperationHandler.java:134)
> [Server:earth-one] at java.security.AccessController.doPrivileged(Native Method)
> [Server:earth-one] at javax.security.auth.Subject.doAs(Subject.java:360)
> [Server:earth-one] at org.jboss.as.controller.AccessAuditContext.doAs(AccessAuditContext.java:81)
> [Server:earth-one] at org.jboss.as.controller.remote.TransactionalProtocolOperationHandler$ExecuteRequestHandler$2$1.run(TransactionalProtocolOperationHandler.java:157)
> [Server:earth-one] at org.jboss.as.controller.remote.TransactionalProtocolOperationHandler$ExecuteRequestHandler$2$1.run(TransactionalProtocolOperationHandler.java:153)
> [Server:earth-one] at java.security.AccessController.doPrivileged(Native Method)
> [Server:earth-one] at org.jboss.as.controller.remote.TransactionalProtocolOperationHandler$ExecuteRequestHandler$2.execute(TransactionalProtocolOperationHandler.java:153)
> [Server:earth-one] at org.jboss.as.protocol.mgmt.AbstractMessageHandler$ManagementRequestContextImpl$1.doExecute(AbstractMessageHandler.java:363)
> [Server:earth-one] at org.jboss.as.protocol.mgmt.AbstractMessageHandler$AsyncTaskRunner.run(AbstractMessageHandler.java:472)
> [Server:earth-one] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> [Server:earth-one] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> [Server:earth-one] at java.lang.Thread.run(Thread.java:745)
> [Server:earth-one] at org.jboss.threads.JBossThread.run(JBossThread.java:320)
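The trace shows a null value reaching a constructor that rejects null ({{ModelNode.<init>}}, called from {{ChannelMetric$2.execute}}). The sketch below illustrates that failure mode and one possible guard. It is not the fix applied for this issue: {{Value}} and {{read}} are hypothetical stand-ins, and neither the real {{ModelNode}} nor {{ChannelMetric}} is used.

```java
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical illustration; does not use the real ModelNode or ChannelMetric. */
final class NullSafeMetricSketch {

    /** Stand-in for a value wrapper whose constructor rejects null, as in the trace. */
    static final class Value {
        final String text;

        Value(String text) {
            if (text == null) {
                throw new IllegalArgumentException("value is null");
            }
            this.text = text;
        }
    }

    /** Wraps the metric only when it has a value; a null metric yields an empty Optional. */
    static Optional<Value> read(Supplier<String> metric) {
        return Optional.ofNullable(metric.get()).map(Value::new);
    }

    public static void main(String[] args) {
        // A metric that has no value on this node (simulated with null).
        System.out.println(read(() -> null).isPresent());
        // A metric that has a value.
        System.out.println(read(() -> "some-address").map(v -> v.text).orElse("undefined"));
    }
}
```

With a guard of this kind, a caller can report the attribute as undefined instead of failing the whole {{read-attribute}} operation.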
[JBoss JIRA] (ISPN-6093) When infinispan-remote and infinispan-embedded are deployed together we get an error
by Sebastian Łaskawiec (JIRA)
[ https://issues.jboss.org/browse/ISPN-6093?page=com.atlassian.jira.plugin.... ]
Sebastian Łaskawiec closed ISPN-6093.
-------------------------------------
> When infinispan-remote and infinispan-embedded are deployed together we get an error
> ------------------------------------------------------------------------------------
>
> Key: ISPN-6093
> URL: https://issues.jboss.org/browse/ISPN-6093
> Project: Infinispan
> Issue Type: Bug
> Affects Versions: 8.1.0.Final, 8.2.0.Final
> Reporter: Sebastian Łaskawiec
> Assignee: Tristan Tarrant
> Priority: Minor
> Fix For: 8.1.3.Final, 8.2.1.Final, 9.0.0.Alpha1
>
>
> {code}
> Exception in thread "main" java.lang.NoSuchMethodError: org.infinispan.commons.logging.BasicLogFactory.getLog(Ljava/lang/Class;)Lorg/jboss/logging/BasicLogger;
> at org.infinispan.client.hotrod.impl.operations.PingOperation.<clinit>(PingOperation.java:25)
> at org.infinispan.client.hotrod.impl.transport.tcp.TransportObjectFactory.ping(TransportObjectFactory.java:51)
> at org.infinispan.client.hotrod.impl.transport.tcp.TransportObjectFactory.makeObject(TransportObjectFactory.java:45)
> at org.infinispan.client.hotrod.impl.transport.tcp.TransportObjectFactory.makeObject(TransportObjectFactory.java:16)
> at infinispan.org.apache.commons.pool.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:1220)
> at org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory.pingServersIgnoreException(TcpTransportFactory.java:177)
> at org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory.start(TcpTransportFactory.java:148)
> at org.infinispan.client.hotrod.RemoteCacheManager.start(RemoteCacheManager.java:579)
> at org.infinispan.client.hotrod.RemoteCacheManager.<init>(RemoteCacheManager.java:380)
> at org.infinispan.client.hotrod.RemoteCacheManager.<init>(RemoteCacheManager.java:387)
> at org.infinispan.data.RemoteWordCount.main(RemoteWordCount.java:25)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
> {code}
> The main cause is that {{BasicLogger}} is relocated in infinispan-embedded, whereas in infinispan-remote it is not.
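Why relocation produces a {{NoSuchMethodError}} can be shown with a short sketch. The JVM links a call by method name and descriptor, and the descriptor includes the return type, so a {{getLog}} that returns a relocated {{BasicLogger}} does not match a call site compiled against the original one. The two nested interfaces below are hypothetical stand-ins for the original and the relocated logger type; the real classes are not used.

```java
import java.lang.invoke.MethodType;

/** Hypothetical illustration of descriptor mismatch after class relocation. */
final class DescriptorSketch {

    /** Stand-in for the logger type the client was compiled against. */
    interface OriginalLogger {}

    /** Stand-in for the same type after relocation to another package. */
    interface RelocatedLogger {}

    /** Descriptor of a method shaped like getLog(Class) with the given return type. */
    static String descriptor(Class<?> returnType) {
        return MethodType.methodType(returnType, Class.class).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        String expectedByCaller = descriptor(OriginalLogger.class);
        String providedByJar = descriptor(RelocatedLogger.class);

        System.out.println("caller expects: getLog" + expectedByCaller);
        System.out.println("jar provides:   getLog" + providedByJar);
        // Same name and same parameter, but the descriptors differ in the return type.
        System.out.println("descriptors match: " + expectedByCaller.equals(providedByJar));
    }
}
```

The descriptor in the error message, {{(Ljava/lang/Class;)Lorg/jboss/logging/BasicLogger;}}, is the one the caller expects; the class found on the classpath offers the same method name with a different return type.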
[JBoss JIRA] (ISPN-6357) Deadlock during server start
by Sebastian Łaskawiec (JIRA)
[ https://issues.jboss.org/browse/ISPN-6357?page=com.atlassian.jira.plugin.... ]
Sebastian Łaskawiec closed ISPN-6357.
-------------------------------------
> Deadlock during server start
> ----------------------------
>
> Key: ISPN-6357
> URL: https://issues.jboss.org/browse/ISPN-6357
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Server
> Affects Versions: 8.2.0.Final
> Reporter: Gustavo Fernandes
> Assignee: Dan Berindei
> Priority: Critical
> Fix For: 8.2.1.Final, 9.0.0.Alpha1
>
> Attachments: s0.txt, s1.txt, server1.txt, server2.txt
>
>
> This happens frequently when starting servers in parallel; the more servers, the easier it is to reproduce.
> Attached are the stack traces of server1 and server2 after the hang.