[JBoss JIRA] (ISPN-9291) BasePartitionHandlingTest.Partition.installMergeView() doesn't compute the merge digest
by Katia Aresti (JIRA)
[ https://issues.jboss.org/browse/ISPN-9291?page=com.atlassian.jira.plugin.... ]
Katia Aresti updated ISPN-9291:
-------------------------------
Fix Version/s: 9.4.0.Final
(was: 9.4.0.CR1)
> BasePartitionHandlingTest.Partition.installMergeView() doesn't compute the merge digest
> ---------------------------------------------------------------------------------------
>
> Key: ISPN-9291
> URL: https://issues.jboss.org/browse/ISPN-9291
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Core
> Affects Versions: 9.3.0.CR1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Labels: testsuite_stability
> Fix For: 9.4.0.Final
>
>
> The partition handling tests use {{BasePartitionHandlingTest.Partition.installMergeView(view1, view2)}} to install the merge view without waiting for {{MERGE3}} to run, making them much faster. Unfortunately, the implementation is incorrect: {{GMS.installView(view)}} only works for regular views; merge views need to be installed with {{GMS.installView(mergeView, digest)}}.
> The result is that the nodes that got isolated from the coordinator request the retransmission of all the {{NAKACK2}} messages (including view updates) since the cluster first started. The isolated nodes cannot install the merge view until they deliver all the older messages (even without knowing whether they're OOB or not). But if {{STABLE}} ran and cleared a range of messages already, the retransmission request cannot be satisfied, so the view updates will never be delivered.
> This is easily reproducible in {{CrashedNodeDuringConflictResolutionTest}} if we add a delay before updating the topology in {{StateConsumerImpl}}. The test installs the merge view manually, but then kills NodeC and expects the cluster to install the new view automatically. NodeD can't install the new view because it's waiting for earlier messages from NodeA:
> {noformat}
> 18:27:13,054 INFO (testng-test:[]) [TestSuiteProgress] Test starting: org.infinispan.conflict.impl.CrashedNodeDuringConflictResolutionTest.testPartitionMergePolicy[DIST_SYNC]
> 18:27:13,640 DEBUG (testng-test:[]) [GMS] test-NodeA-39513: installing view MergeView::[test-NodeA-39513|10] (4) [test-NodeA-39513, test-NodeB-9439, test-NodeC-43706, test-NodeD-59078], 2 subgroups: [test-NodeA-39513|8] (2) [test-NodeA-39513, test-NodeB-9439], [test-NodeC-43706|9] (2) [test-NodeC-43706, test-NodeD-59078]
> 18:27:13,674 DEBUG (testng-test:[]) [GMS] test-NodeD-59078: installing view MergeView::[test-NodeA-39513|10] (4) [test-NodeA-39513, test-NodeB-9439, test-NodeC-43706, test-NodeD-59078], 2 subgroups: [test-NodeA-39513|8] (2) [test-NodeA-39513, test-NodeB-9439], [test-NodeC-43706|9] (2) [test-NodeC-43706, test-NodeD-59078]
> 18:27:13,828 TRACE (jgroups-7,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((1): {50}) to test-NodeA-39513
> 18:27:13,966 TRACE (Timer runner-1,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((49): {1-49}) to test-NodeA-39513
> 18:27:14,067 TRACE (Timer runner-1,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((45): {1-45}) to test-NodeA-39513
> 18:27:14,504 DEBUG (testng-test:[]) [DefaultCacheManager] Stopping cache manager ISPN on test-NodeC-43706
> 18:27:18,642 TRACE (VERIFY_SUSPECT.TimerThread-89,test-NodeA-39513:[]) [GMS] test-NodeA-39513: joiners=[], suspected=[test-NodeC-43706], leaving=[], new view: [test-NodeA-39513|11] (3) [test-NodeA-39513, test-NodeB-9439, test-NodeD-59078]
> 18:27:18,643 TRACE (VERIFY_SUSPECT.TimerThread-89,test-NodeA-39513:[]) [GMS] test-NodeA-39513: mcasting view [test-NodeA-39513|11] (3) [test-NodeA-39513, test-NodeB-9439, test-NodeD-59078]
> 18:27:18,646 DEBUG (VERIFY_SUSPECT.TimerThread-89,test-NodeA-39513:[]) [GMS] test-NodeA-39513: installing view [test-NodeA-39513|11] (3) [test-NodeA-39513, test-NodeB-9439, test-NodeD-59078]
> 18:27:18,652 TRACE (VERIFY_SUSPECT.TimerThread-89,test-NodeA-39513:[]) [TCP_NIO2] test-NodeA-39513: sending msg to null, src=test-NodeA-39513, headers are GMS: GmsHeader[VIEW], NAKACK2: [MSG, seqno=63], TP: [cluster_name=ISPN]
> 18:27:18,656 TRACE (jgroups-20,test-NodeA-39513:[]) [TCP_NIO2] test-NodeA-39513: received [dst: test-NodeA-39513, src: test-NodeB-9439 (3 headers), size=0 bytes, flags=OOB|INTERNAL], headers are GMS: GmsHeader[VIEW_ACK], UNICAST3: DATA, seqno=100, TP: [cluster_name=ISPN]
> 18:27:20,554 TRACE (Timer runner-1,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((45): {1-45}) to test-NodeA-39513
> 18:27:20,653 WARN (VERIFY_SUSPECT.TimerThread-89,test-NodeA-39513:[]) [GMS] test-NodeA-39513: failed to collect all ACKs (expected=2) for view [test-NodeA-39513|11] after 2000ms, missing 1 ACKs from (1) test-NodeD-59078
> 18:27:20,656 TRACE (Timer runner-1,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((45): {1-45}) to test-NodeA-39513
> 18:27:20,756 TRACE (Timer runner-1,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((45): {1-45}) to test-NodeA-39513
> ...
> 18:28:14,412 TRACE (Timer runner-1,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((45): {1-45}) to test-NodeA-39513
> 18:28:14,513 TRACE (Timer runner-1,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((45): {1-45}) to test-NodeA-39513
> 18:28:14,589 ERROR (testng-test:[]) [TestSuiteProgress] Test failed: org.infinispan.conflict.impl.CrashedNodeDuringConflictResolutionTest.testPartitionMergePolicy[DIST_SYNC]
> java.lang.RuntimeException: Cache ___defaultcache timed out waiting for rebalancing to complete on node test-NodeA-39513, current topology is CacheTopology{id=21, phase=CONFLICT_RESOLUTION, rebalanceId=7, currentCH=PartitionerConsistentHash:DefaultConsistentHash{ns=256, owners = (3)[test-NodeD-59078: 256+0, test-NodeA-39513: 0+256, test-NodeB-9439: 0+256]}, pendingCH=null, unionCH=null, actualMembers=[test-NodeD-59078, test-NodeA-39513, test-NodeB-9439], persistentUUIDs=[828108c4-4251-49fc-9481-ff6392bea9fb, 1d4b6f07-b71b-41a1-adfb-abbe68944a9f, 3a1ece05-c282-433e-9eb5-7b3e0f1932aa]}. rebalanceInProgress=true, currentChIsBalanced=true
> at org.infinispan.test.TestingUtil.waitForNoRebalance(TestingUtil.java:392) ~[test-classes/:?]
> at org.infinispan.conflict.impl.CrashedNodeDuringConflictResolutionTest.performMerge(CrashedNodeDuringConflictResolutionTest.java:113) ~[test-classes/:?]
> at org.infinispan.conflict.impl.BaseMergePolicyTest.testPartitionMergePolicy(BaseMergePolicyTest.java:137) ~[test-classes/:?]
> {noformat}
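To illustrate why the missing digest matters, here is a toy model in plain Java. It does not use the JGroups API. It assumes a digest is simply a map from each sender to the highest sequence number already delivered from that sender, that a merge digest takes the per-sender maximum across the partitions, and that a receiver with no entry for a sender starts asking from seqno 1. The names MergeDigestSketch, mergeDigests and firstMissingSeqno are hypothetical, and the sequence numbers are made up to resemble the log above.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MergeDigestSketch {

    // Toy digest: sender name -> highest seqno delivered from that sender.
    // Combines the digests of all partitions by taking the per-sender maximum.
    static Map<String, Long> mergeDigests(List<Map<String, Long>> partitionDigests) {
        Map<String, Long> merged = new HashMap<>();
        for (Map<String, Long> digest : partitionDigests) {
            digest.forEach((sender, seqno) -> merged.merge(sender, seqno, Long::max));
        }
        return merged;
    }

    // First seqno a receiver would ask the sender to retransmit.
    // Assumption of this model: with no entry for the sender, start from 1.
    static long firstMissingSeqno(Map<String, Long> receiverDigest, String sender) {
        return receiverDigest.getOrDefault(sender, 0L) + 1;
    }

    public static void main(String[] args) {
        Map<String, Long> partitionAB = Map.of("NodeA", 49L, "NodeB", 30L);
        Map<String, Long> partitionCD = Map.of("NodeC", 20L, "NodeD", 12L);

        // No digest installed with the view: NodeD asks NodeA for everything.
        System.out.println(firstMissingSeqno(partitionCD, "NodeA")); // prints 1

        // Digest installed with the merge view: only newer messages are needed.
        Map<String, Long> merged = mergeDigests(List.of(partitionAB, partitionCD));
        System.out.println(firstMissingSeqno(merged, "NodeA")); // prints 50
    }
}
```

In this model the first case corresponds to the repeated XMIT_REQ for {1-49}, which can never be satisfied once the sender has purged those messages.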
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
5 years, 8 months
[JBoss JIRA] (ISPN-9072) Document Protobuf annotated collection null set callbacks
by Katia Aresti (JIRA)
[ https://issues.jboss.org/browse/ISPN-9072?page=com.atlassian.jira.plugin.... ]
Katia Aresti updated ISPN-9072:
-------------------------------
Fix Version/s: (was: 9.4.0.CR1)
> Document Protobuf annotated collection null set callbacks
> ---------------------------------------------------------
>
> Key: ISPN-9072
> URL: https://issues.jboss.org/browse/ISPN-9072
> Project: Infinispan
> Issue Type: Task
> Components: Documentation-Query
> Affects Versions: 9.2.1.Final
> Reporter: Galder Zamarreño
> Assignee: Adrian Nistor
> Fix For: 9.4.0.Final
>
>
> When using collection fields in Protobuf annotated classes, empty collections are marshalled into the same value as {{null}}, because Protobuf only has repeated fields, and a repeated field with no elements is represented as {{null}}.
> This means that if you have an entity with an empty collection, when it's deserialized the collection will be null. This can be confusing for users and should be documented.
> [~anistor] had some ideas on how to improve this:
> {code}
> <anistor> I'm thinking of a way to make this easier for users that
> would prefer an empty collection being set instead of a
> null. would be possible by adding a new attribute for this in
> @ProtoField annotation
> > that'd be more predictable
> <anistor> would still not give you at deserialization what was written
> during serialization. we do not have a null marker
> <anistor> it would just give you an empty collection if you prefer
> <anistor> instead of null
> > that option should be enabled by default
> {code}
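Until the behaviour is documented or made configurable, one possible workaround is to normalize the field in the entity itself. The sketch below is plain Java with no ProtoStream dependency: the Author class and its books field are hypothetical, and in a real entity the accessors would carry the @ProtoField annotation. It assumes the unmarshaller either calls the setter with null or does not call it at all when the collection was empty.

```java
import java.util.ArrayList;
import java.util.List;

public class Author {

    // Start with an empty list so the getter never returns null,
    // even if the setter is never invoked during unmarshalling.
    private List<String> books = new ArrayList<>();

    public List<String> getBooks() {
        return books;
    }

    // Treat a null argument as "no elements" instead of storing null.
    public void setBooks(List<String> books) {
        this.books = (books == null) ? new ArrayList<>() : books;
    }
}
```

As noted in the chat excerpt, this cannot restore the distinction between a null and an empty collection, since no null marker is written; it only makes the result predictable.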
--
[JBoss JIRA] (ISPN-9031) Add CodedInputStream.setsize functionality
by Katia Aresti (JIRA)
[ https://issues.jboss.org/browse/ISPN-9031?page=com.atlassian.jira.plugin.... ]
Katia Aresti updated ISPN-9031:
-------------------------------
Fix Version/s: 9.4.0.Final
(was: 9.4.0.CR1)
> Add CodedInputStream.setsize functionality
> ------------------------------------------
>
> Key: ISPN-9031
> URL: https://issues.jboss.org/browse/ISPN-9031
> Project: Infinispan
> Issue Type: Feature Request
> Reporter: Lena Herrmann
> Fix For: 9.4.0.Final
>
>
> If you want to use the Protobuf marshaller for larger objects, the following exception occurs:
> {code:java}
> at org.infinispan.client.hotrod.marshall.MarshallerUtil.bytes2obj(MarshallerUtil.java:49)
> at org.infinispan.client.hotrod.impl.protocol.CodecUtils.readUnmarshallByteArray(CodecUtils.java:38)
> at org.infinispan.client.hotrod.impl.protocol.Codec20.readUnmarshallByteArray(Codec20.java:54)
> at org.infinispan.client.hotrod.impl.operations.GetOperation.executeOperation(GetOperation.java:36)
> at org.infinispan.client.hotrod.impl.operations.RetryOnFailureOperation.execute(RetryOnFailureOperation.java:56)
> at org.infinispan.client.hotrod.impl.RemoteCacheImpl.get(RemoteCacheImpl.java:367)
> at com.channelpilot.api.tabledata.RemoteCacheConnector.get(RemoteCacheConnector.java:49)
> at com.channelpilot.api.tabledata.TableDataService.get(TableDataService.java:91)
> at com.channelpilot.api.tabledata.build.ControlDataBuilder.build(ControlDataBuilder.java:48)
> at com.channelpilot.api.frontend.jobs.threads.BuildTableDataJob.execute(BuildTableDataJob.java:72)
> at com.channelpilot.api.frontend.jobs.threads.BuildTableDataJob.execute(BuildTableDataJob.java:31)
> at com.channelpilot.utils.concurrent.CPCallable.call(CPCallable.java:105)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: protostream.com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit.
> at protostream.com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:122)
> at protostream.com.google.protobuf.CodedInputStream.readRawBytesSlowPath(CodedInputStream.java:1166)
> at protostream.com.google.protobuf.CodedInputStream.readByteArray(CodedInputStream.java:535)
> at org.infinispan.protostream.impl.RawProtoStreamReaderImpl.readByteArray(RawProtoStreamReaderImpl.java:105)
> at org.infinispan.protostream.WrappedMessage.readMessage(WrappedMessage.java:232)
> at org.infinispan.protostream.ProtobufUtil.fromWrappedByteArray(ProtobufUtil.java:122)
> at org.infinispan.query.remote.client.BaseProtoStreamMarshaller.objectFromByteBuffer(BaseProtoStreamMarshaller.java:32)
> at org.infinispan.commons.marshall.AbstractMarshaller.objectFromByteBuffer(AbstractMarshaller.java:82)
> at org.infinispan.client.hotrod.marshall.MarshallerUtil.bytes2obj(MarshallerUtil.java:33)
> {code}
> According to the official Protobuf documentation, the size limit should be increased when this exception is thrown. Infinispan currently offers no way to change this limit.
> Adding such a setting to, for example, a configuration builder would be a good improvement.
--
[JBoss JIRA] (ISPN-8813) org.infinispan.server.test.query.RemoteQueryStringIT.testFullTextTermRightOperandAnalyzed fails randomly
by Katia Aresti (JIRA)
[ https://issues.jboss.org/browse/ISPN-8813?page=com.atlassian.jira.plugin.... ]
Katia Aresti updated ISPN-8813:
-------------------------------
Fix Version/s: 9.4.0.Final
(was: 9.4.0.CR1)
> org.infinispan.server.test.query.RemoteQueryStringIT.testFullTextTermRightOperandAnalyzed fails randomly
> --------------------------------------------------------------------------------------------------------
>
> Key: ISPN-8813
> URL: https://issues.jboss.org/browse/ISPN-8813
> Project: Infinispan
> Issue Type: Bug
> Components: Remote Querying
> Reporter: Adrian Nistor
> Assignee: Adrian Nistor
> Fix For: 9.4.0.Final
>
>
> To be fair, this failure happens almost all the time. But I've seen some rare builds where it did not fail, so I'm marking it as a random failure.
> {code}
> 10:10:13,714 WARN [org.infinispan.factories.ComponentRegistry] (MSC service thread 1-2) ISPN000189: While stopping a cache or cache manager, one of its components failed to stop: org.infinispan.commons.CacheException: Unable to invoke method public void org.infinispan.notifications.cachelistener.CacheNotifierImpl.stop() on object of type CacheNotifierImpl
> at org.infinispan.commons.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:172)
> at org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:878)
> at org.infinispan.factories.AbstractComponentRegistry.internalStop(AbstractComponentRegistry.java:679)
> at org.infinispan.factories.AbstractComponentRegistry.stop(AbstractComponentRegistry.java:581)
> at org.infinispan.factories.ComponentRegistry.stop(ComponentRegistry.java:256)
> at org.infinispan.cache.impl.CacheImpl.performImmediateShutdown(CacheImpl.java:937)
> at org.infinispan.cache.impl.CacheImpl.stop(CacheImpl.java:901)
> at org.infinispan.cache.impl.AbstractDelegatingCache.stop(AbstractDelegatingCache.java:420)
> at org.infinispan.server.infinispan.SecurityActions$6.run(SecurityActions.java:148)
> at org.infinispan.server.infinispan.SecurityActions$6.run(SecurityActions.java:145)
> at org.infinispan.security.Security.doPrivileged(Security.java:76)
> at org.infinispan.server.infinispan.SecurityActions.doPrivileged(SecurityActions.java:69)
> at org.infinispan.server.infinispan.SecurityActions.stopCache(SecurityActions.java:152)
> at org.jboss.as.clustering.infinispan.subsystem.CacheService.stop(CacheService.java:103)
> at org.jboss.msc.service.ServiceControllerImpl$StopTask.stopService(ServiceControllerImpl.java:2150)
> at org.jboss.msc.service.ServiceControllerImpl$StopTask.run(ServiceControllerImpl.java:2101)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
> at org.infinispan.query.dsl.embedded.impl.BaseJPAFilterIndexingServiceProvider.stop(BaseJPAFilterIndexingServiceProvider.java:68)
> at org.infinispan.notifications.cachelistener.CacheNotifierImpl.stop(CacheNotifierImpl.java:275)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at org.infinispan.commons.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:168)
> ... 18 more
> 10:10:13,728 INFO [org.jboss.as.clustering.infinispan] (MSC service thread 1-2) DGISPN0002: Stopped default cache from local container
> 10:10:13,748 WARN [org.infinispan.factories.ComponentRegistry] (MSC service thread 1-2) ISPN000189: While stopping a cache or cache manager, one of its components failed to stop: org.infinispan.commons.CacheException: Unable to invoke method public void org.infinispan.notifications.cachelistener.CacheNotifierImpl.stop() on object of type CacheNotifierImpl
> at org.infinispan.commons.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:172)
> at org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:878)
> at org.infinispan.factories.AbstractComponentRegistry.internalStop(AbstractComponentRegistry.java:679)
> at org.infinispan.factories.AbstractComponentRegistry.stop(AbstractComponentRegistry.java:581)
> at org.infinispan.factories.ComponentRegistry.stop(ComponentRegistry.java:256)
> at org.infinispan.cache.impl.CacheImpl.performImmediateShutdown(CacheImpl.java:937)
> at org.infinispan.cache.impl.CacheImpl.stop(CacheImpl.java:901)
> at org.infinispan.cache.impl.AbstractDelegatingCache.stop(AbstractDelegatingCache.java:420)
> at org.infinispan.manager.DefaultCacheManager.terminate(DefaultCacheManager.java:679)
> at org.infinispan.manager.DefaultCacheManager.stopCaches(DefaultCacheManager.java:719)
> at org.infinispan.manager.DefaultCacheManager.stop(DefaultCacheManager.java:696)
> at org.infinispan.manager.impl.AbstractDelegatingEmbeddedCacheManager.stop(AbstractDelegatingEmbeddedCacheManager.java:190)
> at org.infinispan.server.infinispan.SecurityActions$2.run(SecurityActions.java:98)
> at org.infinispan.server.infinispan.SecurityActions$2.run(SecurityActions.java:94)
> at org.infinispan.security.Security.doPrivileged(Security.java:76)
> at org.infinispan.server.infinispan.SecurityActions.doPrivileged(SecurityActions.java:69)
> at org.infinispan.server.infinispan.SecurityActions.stopAndUnregisterContainer(SecurityActions.java:106)
> at org.jboss.as.clustering.infinispan.subsystem.CacheContainerBuilder.stop(CacheContainerBuilder.java:106)
> at org.jboss.msc.service.ServiceControllerImpl$StopTask.stopService(ServiceControllerImpl.java:2150)
> at org.jboss.msc.service.ServiceControllerImpl$StopTask.run(ServiceControllerImpl.java:2101)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
> at org.infinispan.query.dsl.embedded.impl.BaseJPAFilterIndexingServiceProvider.stop(BaseJPAFilterIndexingServiceProvider.java:68)
> at org.infinispan.notifications.cachelistener.CacheNotifierImpl.stop(CacheNotifierImpl.java:275)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at org.infinispan.commons.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:168)
> ... 22 more
> {code}
--
[JBoss JIRA] (ISPN-9415) Client topology is not updated after cache becomes degraded
by Katia Aresti (JIRA)
[ https://issues.jboss.org/browse/ISPN-9415?page=com.atlassian.jira.plugin.... ]
Katia Aresti updated ISPN-9415:
-------------------------------
Fix Version/s: 9.4.0.Final
(was: 9.4.0.CR1)
> Client topology is not updated after cache becomes degraded
> -----------------------------------------------------------
>
> Key: ISPN-9415
> URL: https://issues.jboss.org/browse/ISPN-9415
> Project: Infinispan
> Issue Type: Bug
> Components: Server
> Affects Versions: 9.4.0.Beta1, 9.3.1.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Fix For: 9.4.0.Final
>
>
> When a new server is started, or after a merge, the other servers may see it as an owner in the consistent hash before they see its server address in the address cache ({{___hotRodTopologyCache}}). When a server needs to send a topology update but some of the servers are missing from the address cache, it can't send the full topology update, so it tries to send a "partial update" that excludes the missing servers from the segment owners. In order to still send the full topology update once the address cache is populated, the partial topology update has to be sent with a smaller topology id, and that means it is only sent if {{serverTopologyId >= clientTopologyId + 2}}.
> When the cluster splits and the cache becomes degraded, the servers in the other partition are removed from the address cache, but the list of segment owners is not updated, and the topology id is only incremented by 1. The address cache is incomplete, but a partial update cannot be sent, so the client keeps the old topology and keeps trying to connect to the servers in the other partition.
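The decision described above can be sketched as follows. This is a simplified model, not the server's actual code: the class TopologyUpdateSketch, the method chooseUpdate and the Update enum are hypothetical names, and the idea that a partial update carries serverTopologyId - 1 is an assumption inferred from the "+ 2" condition.

```java
public class TopologyUpdateSketch {

    enum Update { NONE, PARTIAL, FULL }

    static Update chooseUpdate(int clientTopologyId, int serverTopologyId,
                               boolean addressCacheComplete) {
        if (clientTopologyId >= serverTopologyId) {
            return Update.NONE;      // client is already up to date
        }
        if (addressCacheComplete) {
            return Update.FULL;      // sent with serverTopologyId
        }
        // A partial update carries a smaller id (assumed: serverTopologyId - 1)
        // so that the full update can still follow later. That smaller id must
        // still be newer than what the client has.
        if (serverTopologyId >= clientTopologyId + 2) {
            return Update.PARTIAL;
        }
        return Update.NONE;          // address cache incomplete, id gap too small
    }

    public static void main(String[] args) {
        System.out.println(chooseUpdate(5, 7, false)); // prints PARTIAL
        System.out.println(chooseUpdate(5, 6, true));  // prints FULL
        // Degraded case from this issue: id incremented by only 1 while the
        // address cache is incomplete, so the client gets no update at all.
        System.out.println(chooseUpdate(5, 6, false)); // prints NONE
    }
}
```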
--