[JBoss JIRA] (ISPN-12224) Cluster in a confusing state after restarted from graceful shutdown - no hint for waiting on complete restarted
by Wolf-Dieter Fink (Jira)
[ https://issues.redhat.com/browse/ISPN-12224?page=com.atlassian.jira.plugi... ]
Wolf-Dieter Fink updated ISPN-12224:
------------------------------------
Priority: Critical (was: Major)
> Cluster in a confusing state after restarted from graceful shutdown - no hint for waiting on complete restarted
> ---------------------------------------------------------------------------------------------------------------
>
> Key: ISPN-12224
> URL: https://issues.redhat.com/browse/ISPN-12224
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 12.0.0.Dev01, 11.0.3.Final
> Reporter: Wolf-Dieter Fink
> Priority: Critical
>
> After a cluster is stopped with "shutdown cluster" and only partially restarted, no WARN or INFO message indicates that the cluster is in an incomplete state because not all nodes are back.
> If only a single node is started, it is still possible to add new entries!!
> Entries can be read as well.
> But the server throws exceptions.
> The expectation is to have log messages stating that the cluster of (a, b, ...) is incompletely started after a graceful shutdown and that the missing nodes are (x, y, ...).
> It should not be possible to access caches.
> There should be a CLI/JMX option to interrupt the graceful start and set the cluster to a working state, even if data may be lost in this case.
>
>
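To make the requested log message concrete, here is a minimal sketch of the check, assuming the set of members recorded at graceful shutdown and the set of members that have rejoined are both known. The class `RestartStatus` and its methods are hypothetical names for illustration only and are not part of the Infinispan API.

```java
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not Infinispan code: compares the members recorded at
// graceful shutdown with the members that have rejoined so far.
final class RestartStatus {

    // Returns the expected members that have not rejoined yet, sorted by name.
    static Set<String> missingNodes(Set<String> expected, Set<String> present) {
        Set<String> missing = new TreeSet<>(expected);
        missing.removeAll(present);
        return missing;
    }

    // Builds the warning text asked for in the report; empty when nobody is missing.
    static Optional<String> describe(Set<String> expected, Set<String> present) {
        Set<String> missing = missingNodes(expected, present);
        if (missing.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of("Cluster of " + new TreeSet<>(present)
                + " is incompletely started after graceful shutdown, missing nodes: " + missing);
    }

    public static void main(String[] args) {
        // Three nodes were shut down together, only node "a" has been restarted.
        describe(Set.of("a", "b", "c"), Set.of("a")).ifPresent(System.out::println);
    }
}
```

A server could log the returned text at WARN level on every view change until the optional is empty.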
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
4 years, 4 months
[JBoss JIRA] (ISPN-12262) After a "cluster shutdown" there is no way to bring nodes simple down
by Wolf-Dieter Fink (Jira)
[ https://issues.redhat.com/browse/ISPN-12262?page=com.atlassian.jira.plugi... ]
Wolf-Dieter Fink updated ISPN-12262:
------------------------------------
Priority: Critical (was: Major)
> After a "cluster shutdown" there is no way to bring nodes simple down
> ---------------------------------------------------------------------
>
> Key: ISPN-12262
> URL: https://issues.redhat.com/browse/ISPN-12262
> Project: Infinispan
> Issue Type: Bug
> Reporter: Wolf-Dieter Fink
> Priority: Critical
>
> If a 3-node cluster is brought down with "shutdown cluster", the state files are created in <node>/data.
> After restarting it successfully and then shutting down all nodes individually, the expectation is that starting one node will work.
> But as the files are still there, there is the
> Caused by: java.lang.IllegalArgumentException: Command does not have a topology id
> error until all nodes are back.
> Also, if a NEW node is started, the state is messed up as well.
> The expected behavior here, as the cluster has been scaled down to 0 (one by one), is that it must be possible to start one node as the 'new' cluster, in the same way as if there had been no "shutdown cluster" before.
> Note that if two nodes are stopped, the last node is consistent, contains all entries (as expected), and works properly. After restarting it (which can be a long time after the others were stopped), the behavior is similar to a cluster shutdown,
> which is completely unexpected (as the other nodes might have been deleted completely on purpose).
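When reproducing this, it can help to check which state files are left in a node's data directory before starting it. The following is a minimal read-only sketch, assuming the files follow the `data/*.state` naming mentioned in ISPN-12265. The class name `StateFiles` and the default directory `data` are illustrative assumptions, and nothing is deleted.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic helper, not Infinispan code.
final class StateFiles {

    // Lists the files matching *.state directly under dataDir, sorted by path.
    static List<Path> list(Path dataDir) throws IOException {
        List<Path> found = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dataDir, "*.state")) {
            for (Path p : stream) {
                found.add(p);
            }
        }
        Collections.sort(found);
        return found;
    }

    public static void main(String[] args) throws IOException {
        // The directory is taken from the first argument; "data" is only a placeholder.
        Path dataDir = Paths.get(args.length > 0 ? args[0] : "data");
        if (!Files.isDirectory(dataDir)) {
            System.out.println("No such directory: " + dataDir);
            return;
        }
        for (Path p : list(dataDir)) {
            System.out.println(p);
        }
    }
}
```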
[JBoss JIRA] (ISPN-12256) InfinispanServerExtensionContainerTest fails to start
by Katia Aresti (Jira)
[ https://issues.redhat.com/browse/ISPN-12256?page=com.atlassian.jira.plugi... ]
Katia Aresti updated ISPN-12256:
--------------------------------
Status: Resolved (was: Pull Request Sent)
Resolution: Done
> InfinispanServerExtensionContainerTest fails to start
> -----------------------------------------------------
>
> Key: ISPN-12256
> URL: https://issues.redhat.com/browse/ISPN-12256
> Project: Infinispan
> Issue Type: Bug
> Components: Server, Test Suite
> Affects Versions: 12.0.0.Dev02
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Labels: testsuite_stability
> Fix For: 12.0.0.Dev03
>
>
> The server fails to start with {{ISPN080028: Infinispan Server failed to start org.infinispan.commons.CacheConfigurationException: ISPN000540: No such JGroups stack 'test-udp'}}, and the test times out after 45 seconds.
> The failure is hidden in Jenkins with surefire 3.0.0-M4 (Maven only says that 0 tests ran), but appears (without any description) after the upgrade to 3.0.0-M5.
[JBoss JIRA] (ISPN-12255) spring-boot-starter tests do not run
by Katia Aresti (Jira)
[ https://issues.redhat.com/browse/ISPN-12255?page=com.atlassian.jira.plugi... ]
Katia Aresti updated ISPN-12255:
--------------------------------
Status: Resolved (was: Pull Request Sent)
Resolution: Done
> spring-boot-starter tests do not run
> ------------------------------------
>
> Key: ISPN-12255
> URL: https://issues.redhat.com/browse/ISPN-12255
> Project: Infinispan
> Issue Type: Bug
> Components: Spring Integration, Test Suite
> Affects Versions: 12.0.0.Dev02
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Labels: testsuite_stability
> Fix For: 12.0.0.Dev03
>
>
> The tests fail to start because log4j2 logging is redirected to slf4j and slf4j is redirected back to log4j2:
> {noformat}
> [ERROR] test.infinispan.autoconfigure.CacheManagerTest Time elapsed: 0.316 s <<< ERROR!
> java.lang.ExceptionInInitializerError
> at org.junit.jupiter.engine.execution.ExtensionValuesStore.lambda$getOrComputeIfAbsent$0(ExtensionValuesStore.java:81)
> at org.junit.jupiter.engine.execution.ExtensionValuesStore$MemoizingSupplier.get(ExtensionValuesStore.java:182)
> at org.junit.jupiter.engine.execution.ExtensionValuesStore.getOrComputeIfAbsent(ExtensionValuesStore.java:84)
> at org.junit.jupiter.engine.execution.ExtensionValuesStore.getOrComputeIfAbsent(ExtensionValuesStore.java:88)
> at org.junit.jupiter.engine.execution.NamespaceAwareStore.getOrComputeIfAbsent(NamespaceAwareStore.java:61)
> at org.springframework.test.context.junit.jupiter.SpringExtension.getTestContextManager(SpringExtension.java:213)
> at org.springframework.test.context.junit.jupiter.SpringExtension.beforeAll(SpringExtension.java:77)
> at org.junit.jupiter.engine.descriptor.ClassBasedTestDescriptor.lambda$invokeBeforeAllCallbacks$7(ClassBasedTestDescriptor.java:359)
> at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
> at org.junit.jupiter.engine.descriptor.ClassBasedTestDescriptor.invokeBeforeAllCallbacks(ClassBasedTestDescriptor.java:359)
> ...
> Caused by: org.apache.logging.log4j.LoggingException: log4j-slf4j-impl cannot be present with log4j-to-slf4j
> at org.apache.logging.slf4j.Log4jLoggerFactory.validateContext(Log4jLoggerFactory.java:49)
> at org.apache.logging.slf4j.Log4jLoggerFactory.newLogger(Log4jLoggerFactory.java:39)
> at org.apache.logging.slf4j.Log4jLoggerFactory.newLogger(Log4jLoggerFactory.java:30)
> at org.apache.logging.log4j.spi.AbstractLoggerAdapter.getLogger(AbstractLoggerAdapter.java:54)
> at org.apache.logging.slf4j.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:30)
> at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:358)
> at org.apache.commons.logging.LogAdapter$Slf4jAdapter.createLocationAwareLog(LogAdapter.java:130)
> at org.apache.commons.logging.LogAdapter.createLog(LogAdapter.java:91)
> at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:67)
> at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:59)
> at org.springframework.test.context.TestContextManager.<clinit>(TestContextManager.java:92)
> {noformat}
> The failure is hidden in Jenkins with surefire 3.0.0-M4 (Maven only says that 0 tests ran), but appears (without any description) after the upgrade to 3.0.0-M5.
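A quick way to confirm this kind of redirect loop is to probe the test classpath for both bridges. The sketch below is an illustration, not part of the fix: `org.apache.logging.slf4j.Log4jLoggerFactory` is taken from the stack trace above, while the assumption that log4j-to-slf4j ships `org.apache.logging.slf4j.SLF4JLoggerContext` should be checked against the Log4j version in use. The class name `LoggingBridgeCheck` is hypothetical.

```java
// Hypothetical diagnostic helper, not Infinispan or Log4j code.
final class LoggingBridgeCheck {

    // Class from log4j-slf4j-impl (SLF4J -> Log4j 2); it appears in the stack trace.
    static final String SLF4J_TO_LOG4J = "org.apache.logging.slf4j.Log4jLoggerFactory";

    // Class assumed to be shipped by log4j-to-slf4j (Log4j 2 -> SLF4J).
    static final String LOG4J_TO_SLF4J = "org.apache.logging.slf4j.SLF4JLoggerContext";

    // Returns true if the class can be loaded; the class is not initialized.
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, LoggingBridgeCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Both bridges on the same classpath would redirect logging in a circle.
    static boolean hasRedirectLoop() {
        return isPresent(SLF4J_TO_LOG4J) && isPresent(LOG4J_TO_SLF4J);
    }

    public static void main(String[] args) {
        System.out.println("log4j-slf4j-impl present: " + isPresent(SLF4J_TO_LOG4J));
        System.out.println("log4j-to-slf4j present:   " + isPresent(LOG4J_TO_SLF4J));
        System.out.println("redirect loop:            " + hasRedirectLoop());
    }
}
```

Running it with the same classpath as the failing tests shows which of the two artifacts needs to be excluded.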
[JBoss JIRA] (ISPN-12265) State after cluster shutdown should be cleared by CLI/REST/JMX command
by Wolf-Dieter Fink (Jira)
Wolf-Dieter Fink created ISPN-12265:
---------------------------------------
Summary: State after cluster shutdown should be cleared by CLI/REST/JMX command
Key: ISPN-12265
URL: https://issues.redhat.com/browse/ISPN-12265
Project: Infinispan
Issue Type: Feature Request
Affects Versions: 11.0.3.Final, 12.0.0.Dev03
Reporter: Wolf-Dieter Fink
After a cluster is brought down with the "shutdown cluster" command, the restart needs all the known nodes to be back to prevent any data loss.
If there is an issue bringing the nodes back, it should be possible to trigger the state transfer and bring the cluster back online.
It should have two options:
* keep the data
* clear the cache and start empty
This would cause a loss of data if not all segments are available for a cache.
In this case a WARN message should be logged to notify the user that data is lost.
It will be possible to reset the state by removing the data/*.state files, but in this case another restart is needed and more data is lost, as the first node will use its (incomplete) store and other local stores are not used to recover.
Resetting the state while all remaining nodes are up will keep more data, as some segments are still complete or have at least the primary or one backup owner.
The best case is if fewer nodes than numOwner are missing; in that case the data is completely available and can be successfully rebalanced.
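The reasoning about segments can be illustrated with a small sketch, assuming the owners of each segment at shutdown time are known: a segment is lost only if every one of its owners is missing. The class `SegmentLoss`, the node names, and the ownership table are made up for illustration and do not reflect Infinispan internals.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model, not Infinispan code.
final class SegmentLoss {

    // Returns the segments whose owners are all missing, sorted by segment id.
    // A segment with an empty owner list is reported as lost.
    static Set<Integer> lostSegments(Map<Integer, List<String>> owners, Set<String> missing) {
        Set<Integer> lost = new TreeSet<>();
        for (Map.Entry<Integer, List<String>> entry : owners.entrySet()) {
            if (missing.containsAll(entry.getValue())) {
                lost.add(entry.getKey());
            }
        }
        return lost;
    }

    public static void main(String[] args) {
        // Three segments with two owners each (primary first, then backup).
        Map<Integer, List<String>> owners = Map.of(
                0, List.of("a", "b"),
                1, List.of("b", "c"),
                2, List.of("c", "a"));

        // One node missing, fewer than the two owners: nothing is lost.
        System.out.println(lostSegments(owners, Set.of("c")));      // prints []
        // Two nodes missing: the segment owned only by b and c is lost.
        System.out.println(lostSegments(owners, Set.of("b", "c"))); // prints [1]
    }
}
```

A non-empty result is the situation in which the proposed WARN message about data loss would apply.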
[JBoss JIRA] (ISPN-12262) After a "cluster shutdown" there is no way to bring nodes simple down
by Wolf-Dieter Fink (Jira)
[ https://issues.redhat.com/browse/ISPN-12262?page=com.atlassian.jira.plugi... ]
Wolf-Dieter Fink updated ISPN-12262:
------------------------------------
Security: (was: Red Hat Internal)
> After a "cluster shutdown" there is no way to bring nodes simple down
> ---------------------------------------------------------------------
>
> Key: ISPN-12262
> URL: https://issues.redhat.com/browse/ISPN-12262
> Project: Infinispan
> Issue Type: Bug
> Reporter: Wolf-Dieter Fink
> Priority: Major
>
> If a 3-node cluster is brought down with "shutdown cluster", the state files are created in <node>/data.
> After restarting it successfully and then shutting down all nodes individually, the expectation is that starting one node will work.
> But as the files are still there, there is the
> Caused by: java.lang.IllegalArgumentException: Command does not have a topology id
> error until all nodes are back.
> Also, if a NEW node is started, the state is messed up as well.
> The expected behavior here, as the cluster has been scaled down to 0 (one by one), is that it must be possible to start one node as the 'new' cluster, in the same way as if there had been no "shutdown cluster" before.
> Note that if two nodes are stopped, the last node is consistent, contains all entries (as expected), and works properly. After restarting it (which can be a long time after the others were stopped), the behavior is similar to a cluster shutdown,
> which is completely unexpected (as the other nodes might have been deleted completely on purpose).
[JBoss JIRA] (ISPN-12264) Template or custom configuration should be required to create a cache
by Dan Berindei (Jira)
Dan Berindei created ISPN-12264:
-----------------------------------
Summary: Template or custom configuration should be required to create a cache
Key: ISPN-12264
URL: https://issues.redhat.com/browse/ISPN-12264
Project: Infinispan
Issue Type: Bug
Components: Console
Affects Versions: 12.0.0.Dev02
Reporter: Dan Berindei
Assignee: Katia Aresti
If I only write a cache name and click the {{Create}} button, a {{LOCAL}} cache is created, which is not at all useful on the server. I should be required to either select a template or supply a custom configuration in order to create the cache.
[JBoss JIRA] (ISPN-12263) RemoteCacheMetricBinderTest fails
by Dan Berindei (Jira)
[ https://issues.redhat.com/browse/ISPN-12263?page=com.atlassian.jira.plugi... ]
Dan Berindei updated ISPN-12263:
--------------------------------
Status: Open (was: New)
> RemoteCacheMetricBinderTest fails
> ---------------------------------
>
> Key: ISPN-12263
> URL: https://issues.redhat.com/browse/ISPN-12263
> Project: Infinispan
> Issue Type: Bug
> Components: Spring Integration
> Affects Versions: 12.0.0.Dev02
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Labels: testsuite_stability
> Fix For: 12.0.0.Dev03
>
>
> {{RemoteCacheMetricBinderTest}} currently doesn't run at all, because of ISPN-12255. After fixing ISPN-12255, it still doesn't pass, but with Surefire the setup failure is reported just as
> {noformat}
> [INFO] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 70.094 s - in test.org.infinispan.spring.starter.remote.actuator.RemoteCacheMetricBinderTest
> {noformat}
> When run in the IDE, multiple problems become apparent:
> * {{InfinispanServerExtension.beforeTestExecution()}} runs too late, so the {{TestClient}} is not available in {{InfinispanCacheMetricBinderTest.binder()}}
> * {{InfinispanServerExtension.afterTestExecution()}} runs too early, so the {{TestClient}} is no longer available in {{InfinispanCacheMetricBinderTest.cleanCache()}}
> * The {{mycache}} cache is not defined on the server
> * The {{RemoteCacheManager}} does not have statistics enabled
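The first two points follow from the order in which JUnit 5 invokes extension callbacks and lifecycle methods. The sketch below models that order in plain Java, without a JUnit dependency. It assumes that {{binder()}} and {{cleanCache()}} run in the {{@BeforeEach}} / {{@AfterEach}} phase, which the description does not state, and the class name `LifecycleOrder` is hypothetical.

```java
// Plain-Java model of the JUnit 5 callback order; no JUnit classes are used.
final class LifecycleOrder {

    // Phases in the order JUnit 5 runs them for a test class with a single test.
    enum Phase {
        BEFORE_ALL_CALLBACK,
        BEFORE_ALL_METHOD,
        BEFORE_EACH_CALLBACK,
        BEFORE_EACH_METHOD,
        BEFORE_TEST_EXECUTION_CALLBACK,
        TEST_METHOD,
        AFTER_TEST_EXECUTION_CALLBACK,
        AFTER_EACH_METHOD,
        AFTER_EACH_CALLBACK,
        AFTER_ALL_METHOD,
        AFTER_ALL_CALLBACK
    }

    // True if a resource created in setUp and released in tearDown exists during use.
    static boolean available(Phase setUp, Phase tearDown, Phase use) {
        return setUp.compareTo(use) < 0 && use.compareTo(tearDown) < 0;
    }

    public static void main(String[] args) {
        // Client managed in beforeTestExecution/afterTestExecution, as described above.
        System.out.println(available(Phase.BEFORE_TEST_EXECUTION_CALLBACK,
                Phase.AFTER_TEST_EXECUTION_CALLBACK, Phase.BEFORE_EACH_METHOD)); // prints false
        System.out.println(available(Phase.BEFORE_TEST_EXECUTION_CALLBACK,
                Phase.AFTER_TEST_EXECUTION_CALLBACK, Phase.AFTER_EACH_METHOD));  // prints false
        // Client managed in beforeEach/afterEach callbacks instead.
        System.out.println(available(Phase.BEFORE_EACH_CALLBACK,
                Phase.AFTER_EACH_CALLBACK, Phase.BEFORE_EACH_METHOD));           // prints true
    }
}
```

Under that assumption, managing the client in the earlier and later callbacks would make it available to both methods; whether the actual fix does this is not stated here.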
[JBoss JIRA] (ISPN-12263) RemoteCacheMetricBinderTest fails
by Dan Berindei (Jira)
[ https://issues.redhat.com/browse/ISPN-12263?page=com.atlassian.jira.plugi... ]
Dan Berindei updated ISPN-12263:
--------------------------------
Status: Pull Request Sent (was: Open)
Git Pull Request: https://github.com/infinispan/infinispan/pull/8655
> RemoteCacheMetricBinderTest fails
> ---------------------------------
>
> Key: ISPN-12263
> URL: https://issues.redhat.com/browse/ISPN-12263
> Project: Infinispan
> Issue Type: Bug
> Components: Spring Integration
> Affects Versions: 12.0.0.Dev02
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Labels: testsuite_stability
> Fix For: 12.0.0.Dev03
>
>
> {{RemoteCacheMetricBinderTest}} currently doesn't run at all, because of ISPN-12255. After fixing ISPN-12255, it still doesn't pass, but with Surefire the setup failure is reported just as
> {noformat}
> [INFO] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 70.094 s - in test.org.infinispan.spring.starter.remote.actuator.RemoteCacheMetricBinderTest
> {noformat}
> When run in the IDE, multiple problems become apparent:
> * {{InfinispanServerExtension.beforeTestExecution()}} runs too late, so the {{TestClient}} is not available in {{InfinispanCacheMetricBinderTest.binder()}}
> * {{InfinispanServerExtension.afterTestExecution()}} runs too early, so the {{TestClient}} is no longer available in {{InfinispanCacheMetricBinderTest.cleanCache()}}
> * The {{mycache}} cache is not defined on the server
> * The {{RemoteCacheManager}} does not have statistics enabled