[JBoss JIRA] (ISPN-9541) Module initialization is not thread-safe
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-9541?page=com.atlassian.jira.plugin.... ]
Dan Berindei commented on ISPN-9541:
------------------------------------
These server startup errors are probably related:
{noformat}
16:57:41,261 ERROR [org.jboss.msc.service.fail] (MSC service thread 1-2) MSC000001: Failed to start service jboss.datagrid-infinispan-endpoint.hotrod.hotrod-connector: org.jboss.msc.service.StartException in service jboss.datagrid-infinispan-endpoint.hotrod.hotrod-connector: DGENDPT10004: Failed to start HotRodServer
at org.infinispan.server.endpoint.subsystem.ProtocolServerService.start(ProtocolServerService.java:160)
at org.jboss.msc.service.ServiceControllerImpl$StartTask.startService(ServiceControllerImpl.java:1736)
at org.jboss.msc.service.ServiceControllerImpl$StartTask.execute(ServiceControllerImpl.java:1698)
at org.jboss.msc.service.ServiceControllerImpl$ControllerTask.run(ServiceControllerImpl.java:1556)
at org.jboss.threads.ContextClassLoaderSavingRunnable.run(ContextClassLoaderSavingRunnable.java:35)
at org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1985)
at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.doRunTask(EnhancedQueueExecutor.java:1487)
at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1378)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.infinispan.commons.CacheConfigurationException: ISPN005029: No task manager available to register the admin operations handler
at org.infinispan.server.core.AbstractProtocolServer.registerAdminOperationsHandler(AbstractProtocolServer.java:70)
at org.infinispan.server.core.AbstractProtocolServer.startInternal(AbstractProtocolServer.java:55)
at org.infinispan.server.hotrod.HotRodServer.startInternal(HotRodServer.java:250)
at org.infinispan.server.hotrod.HotRodServer.startInternal(HotRodServer.java:106)
at org.infinispan.server.core.AbstractProtocolServer.start(AbstractProtocolServer.java:79)
at org.infinispan.server.endpoint.subsystem.SecurityActions$6.run(SecurityActions.java:136)
at org.infinispan.server.endpoint.subsystem.SecurityActions$6.run(SecurityActions.java:133)
at org.infinispan.security.Security.doPrivileged(Security.java:44)
at org.infinispan.server.endpoint.subsystem.SecurityActions.doPrivileged(SecurityActions.java:42)
at org.infinispan.server.endpoint.subsystem.SecurityActions.startProtocolServer(SecurityActions.java:140)
at org.infinispan.server.endpoint.subsystem.ProtocolServerService.startProtocolServer(ProtocolServerService.java:194)
at org.infinispan.server.endpoint.subsystem.ProtocolServerService.start(ProtocolServerService.java:152)
... 8 more
{noformat}
{noformat}
16:57:42,424 ERROR [org.jboss.msc.service.fail] (MSC service thread 1-1) MSC000001: Failed to start service jboss.datagrid-infinispan-endpoint.rest.rest-connector: org.jboss.msc.service.StartException in service jboss.datagrid-infinispan-endpoint.rest.rest-connector: DGENDPT10016: Could not start the web context for the REST Server
at org.infinispan.server.endpoint.subsystem.RestService.start(RestService.java:158)
at org.jboss.msc.service.ServiceControllerImpl$StartTask.startService(ServiceControllerImpl.java:1736)
at org.jboss.msc.service.ServiceControllerImpl$StartTask.execute(ServiceControllerImpl.java:1698)
at org.jboss.msc.service.ServiceControllerImpl$ControllerTask.run(ServiceControllerImpl.java:1556)
at org.jboss.threads.ContextClassLoaderSavingRunnable.run(ContextClassLoaderSavingRunnable.java:35)
at org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1985)
at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.doRunTask(EnhancedQueueExecutor.java:1487)
at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1378)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException
at org.infinispan.server.core.AbstractProtocolServer.registerTransportMBean(AbstractProtocolServer.java:115)
at org.infinispan.server.core.AbstractProtocolServer.startTransport(AbstractProtocolServer.java:92)
at org.infinispan.server.core.AbstractProtocolServer.startInternal(AbstractProtocolServer.java:61)
at org.infinispan.rest.RestServer.startInternal(RestServer.java:85)
at org.infinispan.rest.RestServer.startInternal(RestServer.java:23)
at org.infinispan.server.core.AbstractProtocolServer.start(AbstractProtocolServer.java:79)
at org.infinispan.server.endpoint.subsystem.RestService.start(RestService.java:155)
... 8 more
{noformat}
https://ci.infinispan.org/job/Infinispan/job/PR-6247/1/testReport/junit/o...
> Module initialization is not thread-safe
> ----------------------------------------
>
> Key: ISPN-9541
> URL: https://issues.jboss.org/browse/ISPN-9541
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Server
> Affects Versions: 9.4.0.CR3
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Fix For: 9.4.0.Final
>
>
> In my ISPN-9127 fix I created a {{BasicComponentRegistry}} interface that represents a mostly-read-only collection of components. It has a {{replaceComponent()}} method and a {{rewire()}} method for testing purposes, but it turns out modules were also relying on the ability to replace existing components in order to work.
> Replacing global components is normally safe during {{ModuleLifecycle.cacheManagerStarting()}}, because none of the components are started yet, so when a component starts later we can still start its dependencies first. But because some modules start some global components, e.g. by calling {{manager.getCache(name)}}, that assumption breaks.
> The {{infinispan-server-event-logger}} module is a bit more sneaky: it doesn't replace a component; instead, it replaces the actual implementation of the event logger in the {{EventLogManager}} component. Events that happen before the module's {{cacheManagerStarting()}} or after {{cacheManagerStopping()}} will be silently dropped from the persistent event log.
> I am investigating making the module a factory of factories. Instead of having a monolithic {{cacheManagerStarting()}} method, it could define a set of components that it can create, and a set of components that should be started before the cache manager is "running". We probably need a way to depend on other modules as well, maybe reusing the {{@Inject}} and {{@ComponentName}} annotations.
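One way to picture the "factory of factories" idea in the description is the sketch below. It is only an illustration under assumptions: {{ModuleDescriptorSketch}}, {{RegistrySketch}} and {{EventLoggerModuleSketch}} are hypothetical names, not Infinispan APIs, and the eventual design may look quite different. The point is that a module only declares factories plus a list of components to start eagerly, so nothing has to be replaced after it was registered.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical: a module declares what it can create instead of mutating the registry.
interface ModuleDescriptorSketch {
    Map<String, Supplier<Object>> componentFactories();
    List<String> startBeforeRunning();
}

// Hypothetical registry: collects all factories first, creates instances later.
final class RegistrySketch {
    private final Map<String, Supplier<Object>> factories = new LinkedHashMap<>();
    private final Map<String, Object> instances = new LinkedHashMap<>();
    private final List<String> eager = new ArrayList<>();

    // Called for every module before any component instance exists.
    synchronized void addModule(ModuleDescriptorSketch module) {
        factories.putAll(module.componentFactories());
        eager.addAll(module.startBeforeRunning());
    }

    // Creates the component on first access; later calls return the same instance.
    synchronized Object getComponent(String name) {
        Object existing = instances.get(name);
        if (existing != null) {
            return existing;
        }
        Supplier<Object> factory = factories.get(name);
        if (factory == null) {
            throw new IllegalStateException("No factory registered for " + name);
        }
        Object created = factory.get();
        instances.put(name, created);
        return created;
    }

    // Creates the eagerly declared components and returns their names in order.
    synchronized List<String> start() {
        List<String> started = new ArrayList<>();
        for (String name : eager) {
            getComponent(name);
            started.add(name);
        }
        return started;
    }
}

// Hypothetical module contributing a persistent event logger (a StringBuilder stands in for it).
final class EventLoggerModuleSketch implements ModuleDescriptorSketch {
    @Override
    public Map<String, Supplier<Object>> componentFactories() {
        Supplier<Object> factory = () -> new StringBuilder("persistent-log");
        return Collections.singletonMap("eventLogger", factory);
    }

    @Override
    public List<String> startBeforeRunning() {
        return Collections.singletonList("eventLogger");
    }
}
```

Because all modules are added before {{start()}} is called, the event logger in this sketch exists before anything else can log, which is the property the description says is currently missing.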
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
[JBoss JIRA] (ISPN-9127) Remote commands can access components before they are started
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-9127?page=com.atlassian.jira.plugin.... ]
Dan Berindei commented on ISPN-9127:
------------------------------------
https://issues.jboss.org/browse/ISPN-9127
I extracted the main functionality of the component registries in a {{BasicComponentRegistry}} interface, implemented by {{BasicComponentRegistryImpl}}.
* It has no cache-specific logic or references to specific components.
* All the dependencies are wired before being injected in another component. {{ComponentRef<T>}} fields or injection method parameters can be used as lazy dependencies to break dependency cycles. Lazy dependencies are only created and wired when the {{wired()}} method is called.
* Start/stop priorities only have meaning within a single component, i.e. when a component has multiple start/stop methods. A component's start methods are always invoked after the start methods of its non-lazy dependencies.
* {{registerComponent()}} doesn't overwrite existing components.
* {{replaceComponent()}} replaces existing components, but it is intended only for tests.
* {{getComponent()}} and {{registerComponent()}} return a {{ComponentRef<T>}}, and the component is always at least wired (all the dependencies have been injected). Use {{running()}} to get the actual instance, or {{wired()}} iff {{running()}} would create a cycle.
I also changed the behaviour of {{[Global]ComponentRegistry}} a bit:
* {{getComponent()}} always tries to create and start the component (skips the start if the registry is only INSTANTIATED)
* {{registerComponent()}} doesn't overwrite existing components
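The difference between {{wired()}} and {{running()}} on a lazy reference can be sketched roughly as follows. This is a simplified stand-in, not the actual {{ComponentRef<T>}} implementation: {{LazyRefSketch}}, {{StartableSketch}}, {{PingSketch}}, {{PongSketch}} and {{CycleDemoSketch}} are hypothetical names, and real wiring also injects dependencies, which is reduced here to a constructor call.

```java
import java.util.function.Supplier;

// Hypothetical minimal lifecycle interface.
interface StartableSketch {
    void start();
}

// Simplified stand-in for a lazy component reference; not the Infinispan class.
final class LazyRefSketch<T extends StartableSketch> {
    private final Supplier<T> factory;
    private T instance;
    private boolean started;

    LazyRefSketch(Supplier<T> factory) {
        this.factory = factory;
    }

    // Creates (and, in this sketch, "wires") the instance on first use without starting it.
    synchronized T wired() {
        if (instance == null) {
            instance = factory.get();
        }
        return instance;
    }

    // Makes sure the instance has been started before handing it out.
    synchronized T running() {
        T component = wired();
        if (!started) {
            component.start();
            started = true;
        }
        return component;
    }
}

// Two components that depend on each other; the lazy references break the cycle.
final class PingSketch implements StartableSketch {
    final LazyRefSketch<PongSketch> pong;
    boolean started;

    PingSketch(LazyRefSketch<PongSketch> pong) {
        this.pong = pong;
    }

    @Override
    public void start() {
        started = true;
    }
}

final class PongSketch implements StartableSketch {
    final LazyRefSketch<PingSketch> ping;
    boolean started;

    PongSketch(LazyRefSketch<PingSketch> ping) {
        this.ping = ping;
    }

    @Override
    public void start() {
        started = true;
    }
}

final class CycleDemoSketch {
    static LazyRefSketch<PingSketch> pingRef;
    static LazyRefSketch<PongSketch> pongRef;

    // Starting PingSketch does not create PongSketch until someone asks for it.
    static PingSketch build() {
        pingRef = new LazyRefSketch<PingSketch>(() -> new PingSketch(pongRef));
        pongRef = new LazyRefSketch<PongSketch>(() -> new PongSketch(pingRef));
        return pingRef.running();
    }
}
```

If {{PingSketch.start()}} called {{pong.running()}} and {{PongSketch.start()}} called {{ping.running()}}, the two start methods would recurse into each other; that is the kind of cycle where {{wired()}} is the safer call.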
> Remote commands can access components before they are started
> -------------------------------------------------------------
>
> Key: ISPN-9127
> URL: https://issues.jboss.org/browse/ISPN-9127
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 9.2.2.Final, 9.3.0.Alpha1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Labels: testsuite_stability
> Attachments: server0, server1, server2, trace.tar.gz
>
>
> {{PerCacheInboundInvocationHandler.handle()}} may be called before the component was started, because {{GlobalInboundInvocationHandler}} fetches it from the component registry without any checks. {{CommandsFactoryImpl.initializeReplicableCommand()}} doesn't wait for the components that it injects into remote commands to be started, either.
> This started causing random test failures in {{ConcurrentStartForkChannelTest}} after ISPN-8515, which moved most initialization work from {{init()}} methods to {{start()}} methods. Because {{StateProviderImpl}} starts after {{StateTransferManagerImpl}}, it's possible for a node to receive a {{StateRequestCommand}} before {{StateProviderImpl}} has initialized:
> {noformat}
> 16:15:09,549 TRACE (remote-thread-Test-NodeB-p51957-t2:[org.infinispan.CONFIG]) [StateProviderImpl] Starting outbound transfer to node Test-NodeA for cache null, topology id 2, segments {0-255}
> 16:15:09,551 WARN (remote-thread-Test-NodeB-p51957-t2:[]) [NonTotalOrderPerCacheInboundInvocationHandler] ISPN000071: Caught exception when handling command StateRequestCommand{cache=org.infinispan.CONFIG, origin=Test-NodeA, type=START_STATE_TRANSFER, topologyId=2, segments={0-255}}
> java.lang.IllegalArgumentException: chunkSize must be greater than 0
> at org.infinispan.statetransfer.OutboundTransferTask.<init>(OutboundTransferTask.java:114) ~[classes/:?]
> at org.infinispan.statetransfer.StateProviderImpl.startOutboundTransfer(StateProviderImpl.java:273) ~[classes/:?]
> at org.infinispan.statetransfer.StateRequestCommand.invokeAsync(StateRequestCommand.java:101) ~[classes/:?]
> at org.infinispan.remoting.inboundhandler.BasePerCacheInboundInvocationHandler.invokeCommand(BasePerCacheInboundInvocationHandler.java:94) ~[classes/:?]
> {noformat}
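The failure mode in the log can be reproduced in miniature, together with one possible guard. This is a sketch under assumptions, not the fix from the pull requests: {{ProviderSketch}} and its methods are hypothetical, and the value 512 is an arbitrary placeholder for a configured chunk size.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical component whose configuration is only read in start().
final class ProviderSketch {
    private int chunkSize; // stays 0 until start() runs
    private final CompletableFuture<Void> started = new CompletableFuture<>();

    void start() {
        chunkSize = 512; // placeholder for a value read from the configuration
        started.complete(null);
    }

    // Unguarded path: fails if a remote command arrives before start().
    int startTransferUnchecked() {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be greater than 0");
        }
        return chunkSize;
    }

    // Guarded path: the work is deferred until start() has completed.
    CompletableFuture<Integer> startTransferWhenStarted() {
        return started.thenApply(ignored -> startTransferUnchecked());
    }
}
```

Calling {{startTransferUnchecked()}} before {{start()}} throws the same {{IllegalArgumentException}} message as in the log, while the future returned by {{startTransferWhenStarted()}} only completes once {{start()}} has run.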
[JBoss JIRA] (ISPN-9127) Remote commands can access components before they are started
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-9127?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-9127:
-------------------------------
Status: Pull Request Sent (was: Coding In Progress)
Git Pull Request: https://github.com/infinispan/infinispan/pull/5965, https://github.com/infinispan/infinispan/pull/6232 (was: https://github.com/infinispan/infinispan/pull/5965)
[JBoss JIRA] (ISPN-9517) State transfer times out if initiated with yet to be verified suspected member and reincarnated member
by Bela Ban (JIRA)
[ https://issues.jboss.org/browse/ISPN-9517?page=com.atlassian.jira.plugin.... ]
Bela Ban commented on ISPN-9517:
--------------------------------
[~pferraro] Can you resolve this issue? It flags JGRP-2294 as a warning...
> State transfer times out if initiated with yet to be verified suspected member and reincarnated member
> ------------------------------------------------------------------------------------------------------
>
> Key: ISPN-9517
> URL: https://issues.jboss.org/browse/ISPN-9517
> Project: Infinispan
> Issue Type: Bug
> Components: State Transfer
> Affects Versions: 9.3.3.Final
> Reporter: Paul Ferraro
> Assignee: Paul Ferraro
> Attachments: Test.java, node-1.zip, node-2.zip
>
>
> Here's the scenario:
> 1. Cluster contains caches on 2 members, node-1 and node-2
> 2. node-2 is killed
> 3. node-2 is restarted (using same physical address)
> 4. State transfer initiates, view contains node-1, suspected node-2, and reincarnated node-2
> 5. State transfer times out
> Log of node-1 includes:
> {noformat}
> 12:09:51,882 WARN [org.infinispan.topology.ClusterTopologyManagerImpl] (transport-thread--p14-t4) ISPN000197: Error updating cluster member list: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 3 from node-2
> at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167)
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87)
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_181]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [rt.jar:1.8.0_181]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [rt.jar:1.8.0_181]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [rt.jar:1.8.0_181]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [rt.jar:1.8.0_181]
> at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_181]
> Suppressed: org.infinispan.util.logging.TraceException
> at org.infinispan.remoting.transport.Transport.invokeRemotely(Transport.java:75)
> at org.infinispan.topology.ClusterTopologyManagerImpl.confirmMembersAvailable(ClusterTopologyManagerImpl.java:525)
> at org.infinispan.topology.ClusterTopologyManagerImpl.updateCacheMembers(ClusterTopologyManagerImpl.java:508)
> at org.infinispan.topology.ClusterTopologyManagerImpl.handleClusterView(ClusterTopologyManagerImpl.java:321)
> at org.infinispan.topology.ClusterTopologyManagerImpl.access$500(ClusterTopologyManagerImpl.java:87)
> at org.infinispan.topology.ClusterTopologyManagerImpl$ClusterViewListener.lambda$handleViewChange$0(ClusterTopologyManagerImpl.java:731)
> at org.infinispan.executors.LimitedExecutor.runTasks(LimitedExecutor.java:175)
> at org.infinispan.executors.LimitedExecutor.access$100(LimitedExecutor.java:37)
> at org.infinispan.executors.LimitedExecutor$Runner.run(LimitedExecutor.java:227)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [rt.jar:1.8.0_181]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [rt.jar:1.8.0_181]
> at org.wildfly.clustering.service.concurrent.ClassLoaderThreadFactory.lambda$newThread$0(ClassLoaderThreadFactory.java:47)
> ... 1 more
> {noformat}
> I've attached trace logs from node-1 and node-2.
> Changing ClusterTopologyManagerImpl.confirmMembersAvailable() to use ResponseMode.SYNCHRONOUS_IGNORE_LEAVERS instead of ResponseMode.SYNCHRONOUS allows state transfer to complete successfully.
[JBoss JIRA] (ISPN-4075) State transfer should preserve the creation timestamp of entries
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-4075?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-4075:
-------------------------------
Fix Version/s: 10.0.0.Final
(was: 9.4.0.Final)
> State transfer should preserve the creation timestamp of entries
> ----------------------------------------------------------------
>
> Key: ISPN-4075
> URL: https://issues.jboss.org/browse/ISPN-4075
> Project: Infinispan
> Issue Type: Feature Request
> Components: Core
> Affects Versions: 6.0.1.Final
> Reporter: Dan Berindei
> Fix For: 10.0.0.Final
>
>
> State transfer inserts values with the current time as the creation time. Since the entries store the expected lifespan and not the expected expiration time, entries on the receiving node could expire much later than intended.
> The argument probably doesn't apply to the timestamp of the last usage. Since the state transfer process could be interpreted as a reader, it should be fine to update the time of the last usage both on the sending node and on the receiving node.
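A small worked example of the lifespan arithmetic, assuming an entry that stores its creation time and lifespan in milliseconds. {{EntrySketch}} and its methods are hypothetical and only model the two behaviours discussed in the description; they are not Infinispan classes.

```java
// Hypothetical entry that stores a lifespan rather than an absolute expiration time.
final class EntrySketch {
    final long createdMillis;
    final long lifespanMillis;

    EntrySketch(long createdMillis, long lifespanMillis) {
        this.createdMillis = createdMillis;
        this.lifespanMillis = lifespanMillis;
    }

    long expiresAtMillis() {
        return createdMillis + lifespanMillis;
    }

    // Behaviour described in the issue: the receiver stamps the entry with its own current time.
    EntrySketch transferResettingCreation(long nowMillis) {
        return new EntrySketch(nowMillis, lifespanMillis);
    }

    // Requested behaviour: the original creation timestamp travels with the entry.
    EntrySketch transferPreservingCreation() {
        return new EntrySketch(createdMillis, lifespanMillis);
    }
}
```

For an entry created at t = 0 with a 60,000 ms lifespan and transferred at t = 45,000, resetting the creation time moves the expiration to t = 105,000, whereas preserving it keeps the intended t = 60,000.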
[JBoss JIRA] (ISPN-9541) Module initialization is not thread-safe
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-9541?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-9541:
-------------------------------
Summary: Module initialization is not thread-safe (was: Modules should not replace components after they were registered)