December 2010 - infinispan-issues

[JBoss JIRA] Created: (ISPN-425) Stale data read when L1 invalidation happens while UnionConsistentHash is in use

by Galder Zamarreno (JIRA)

Stale data read when L1 invalidation happens while UnionConsistentHash is in use
--------------------------------------------------------------------------------

Key: ISPN-425
URL: https://jira.jboss.org/jira/browse/ISPN-425
Project: Infinispan
Issue Type: Bug
Components: Distributed Cache
Affects Versions: 4.1.0.BETA1
Reporter: Galder Zamarreno
Assignee: Galder Zamarreno
Fix For: 4.1.0.CR1

See below:

----- "Manik Surtani" <manik(a)jboss.org> wrote:

> On 3 May 2010, at 08:51, Galder Zamarreno wrote:
>
> > Resending without log until the message is approved.
> >
> > --
> > Galder Zamarreño
> > Sr. Software Engineer
> > Infinispan, JBoss Cache
> >
> > ----- Forwarded Message -----
> > From: galder(a)redhat.com
> > To: "infinispan -Dev List" <infinispan-dev(a)lists.jboss.org>
> > Sent: Friday, April 30, 2010 6:30:05 PM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
> > Subject: Stale data read when L1 invalidation happens while UnionConsistentHash is in use
> >
> > Hi,
> >
> > I've spent all day chasing down a random Hot Rod testsuite failure related to distribution. This is the last hurdle to close https://jira.jboss.org/jira/browse/ISPN-411. In HotRodDistributionTest, which is still to be committed, I test adding a new node, doing a put on this node, and then doing a get in a different node and making sure that I get what was put. The test randomly fails saying that the get returns the old value. The failure has nothing to do with Hot Rod itself but is rather a race condition that occurs while the union consistent hash is in use. Let me explain:
> >
> > 1. An earlier operation had set "k-testDistributedPutWithTopologyChanges" key to "v5-testDistributedPutWithTopologyChanges".
> > 2. Start a new hot rod server in eq-7969.
> > 3. eq-7969 node calls a put on that key with "v6-testDistributedPutWithTopologyChanges". Recipients for the put are: eq-7969 and eq-61332.
> > 4. eq-7969 sends an invalidate L1 to all, including eq-13415.
> > 5. eq-13415 should invalidate "k-testDistributedPutWithTopologyChanges" but it doesn't, since it considers that "k-testDistributedPutWithTopologyChanges" is local to eq-13415:
> >
> > 2010-04-30 18:02:19,907 6046 TRACE [org.infinispan.distribution.DefaultConsistentHash] (OOB-2,Infinispan-Cluster,eq-13415:) Hash code for key CacheKey{data=ByteArray{size=39, hashCode=17b1683, array=[107, 45, 116, 101, 115, 116, 68, 105, 115, 116, ..]}} is 344897059
> > 2010-04-30 18:02:19,907 6046 TRACE [org.infinispan.distribution.DefaultConsistentHash] (OOB-2,Infinispan-Cluster,eq-13415:) Candidates for key CacheKey{data=ByteArray{size=39, hashCode=17b1683, array=[107, 45, 116, 101, 115, 116, 68, 105, 115, 116, ..]}} are {5458=eq-7969, 6831=eq-61332}
> > 2010-04-30 18:02:19,907 6046 TRACE [org.infinispan.distribution.DistributionManagerImpl] (OOB-2,Infinispan-Cluster,eq-13415:) Is local CacheKey{data=ByteArray{size=39, hashCode=17b1683, array=[107, 45, 116, 101, 115, 116, 68, 105, 115, 116, ..]}} to eq-13415 query returns true and consistentHash is org.infinispan.distribution.UnionConsistentHash@10747b4
> >
> > This is a log with log messages that I added to debug it. The key factor here is that UnionConsistentHash is in use, probably due to rehashing not having fully finished.
> >
> > 6. The end result is that a read of "k-testDistributedPutWithTopologyChanges" in eq-13415 returns "v5-testDistributedPutWithTopologyChanges".
> >
> > I thought that maybe we could be more conservative here and, if rehashing is in progress (or UnionConsistentHash is in use), invalidate regardless. Assuming that a put always follows an invalidation in distribution and not vice versa, that would be fine. The only downside is that you'd be invalidating too much, but the put would replace the data in the node where invalidation should not have happened but did, so not a problem.
> >
> > Thoughts? Alternatively, maybe I need to shape my test so that I wait for rehashing to finish, but the problem would still be there.
>
> Yes, this seems to be a bug with concurrent rehashing and invalidation rather than HotRod.
>
> Could you modify your test to do the following:
>
> 1. start 2 caches C1 and C2.
> 2. put a key K such that K maps on to C1 and C2
> 3. add a new node, C3. K should now map to C1 and C3.
> 4. Modify the value on C1 *before* rehashing completes.
> 5. See if we see the stale value on C2.
>
> To do this you would need a custom object for K that hashes the way you would expect (this could be hardcoded) and a value which blocks when serializing so we can control how long rehashing takes.

Since logical addresses are used underneath and these change from one run to the other, I'm not sure how I can generate such a key programmatically. It's even more complicated to figure out a key that will later, when C3 starts, map to it. Without having these addresses locked somehow, or their hash codes, I can't see how this is doable. IOW, to be able to do this, I need to mock these addresses into giving fixed hash codes. I'll dig further into this.

> I never promised the test would be simple! :)
>
> Cheers
> Manik
> --
> Manik Surtani
> manik(a)jboss.org
> Lead, Infinispan
> Lead, JBoss Cache
> http://www.infinispan.org
> http://www.jbosscache.org

_______________________________________________
infinispan-dev mailing list
infinispan-dev(a)lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev

--
This message is automatically generated by JIRA.
- If you think it was sent incorrectly contact one of the administrators: https://jira.jboss.org/jira/secure/Administrators.jspa
- For more information on JIRA, see: http://www.atlassian.com/software/jira
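
The conservative option floated in the thread (invalidate regardless while rehashing is in progress) can be illustrated with a small toy model. This is only a sketch under stated assumptions: ConservativeL1Sketch, onInvalidateL1 and the looksLocal parameter are hypothetical names made up for illustration and are not Infinispan classes or methods.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of one node's entries; none of these names are Infinispan classes. */
final class ConservativeL1Sketch {
    private final Map<String, String> entries = new HashMap<>();
    private boolean rehashInProgress;

    void put(String key, String value) { entries.put(key, value); }

    String get(String key) { return entries.get(key); }

    void setRehashInProgress(boolean inProgress) { rehashInProgress = inProgress; }

    /**
     * Handles an incoming L1 invalidation. looksLocal stands for the answer
     * the (possibly union) consistent hash gives to "is this key local?".
     */
    void onInvalidateL1(String key, boolean looksLocal) {
        if (looksLocal && !rehashInProgress) {
            return; // stable hash: trust ownership and keep the entry
        }
        // Rehash in progress (or key not local): drop the entry regardless.
        entries.remove(key);
    }
}
```

In this model a node in the position of eq-13415 would drop its "v5" entry even though the union hash reports the key as local, at the cost of occasionally invalidating an entry it really owns.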


[JBoss JIRA] Created: (ISPN-799) JoinTask as it invalidates L1 entries should be given precedence in acquiring locks

by Vladimir Blagojevic (JIRA)

JoinTask as it invalidates L1 entries should be given precedence in acquiring locks
------------------------------------------------------------------------------------

Key: ISPN-799
URL: https://jira.jboss.org/browse/ISPN-799
Project: Infinispan
Issue Type: Bug
Components: Locking and Concurrency
Affects Versions: 4.2.0.CR1
Reporter: Vladimir Blagojevic
Assignee: Manik Surtani
Fix For: 4.2.0.Final, 5.0.0.Final

The SingleJoinTest transaction test failure itself is intermittent due to the way addresses are organised in the hash wheel, so you are correct that it is a timing issue. Anyway, it still is a very real problem. Just to re-iterate and to make sure we are talking about the same thing:

1. View is {A, B, C}
2. K is mapped to {A, B}
3. A tx starts to update K, and is prepared. Locks now held for K on {A, B}
4. D joins. D is placed on the hash wheel between A and B. So the new view is {A, D, B, C}
5. As per the test (artificial, I know, but could still happen), the tx waits for a long time before committing. In the case of the test, it artificially waits until D has finished joining before committing, by use of a latch.
6. D never joins: even though it receives the prepare for the tx and could potentially commit itself (as a new owner), it fails because it is unable to invalidate K on B.

There are a few solutions here:

1) This is pretty easy to detect. Attempt to acquire the lock with a smaller lock acquisition timeout and, if the transaction is still stuck, abort the transaction and proceed with the join.
2) If the blocking node is *not* the transaction originator (as in this case: the tx was started on A), then just force lock removal and tx rollback on B *only*. Let the tx complete on A, since the new joiner will receive the transactional event and will be able to apply it as a new owner.

My vote is to go for solution 1 - a bit more crude, but 2 would be very complex to implement. And even then, 2 would only solve the case where the invalidation is blocked on a node that did not originate the transaction. E.g., the tx originated on A but the lock issue was on B. If, however, the tx originated on B, *and* B no longer owns the entry in question, then 2 is no longer a solution and the only solution would be 1.
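
Solution 1 above can be sketched roughly as follows. This is a toy illustration, not the JoinTask code: JoinLockSketch, acquireForJoin and abortBlockingTx are hypothetical names, and a binary java.util.concurrent.Semaphore stands in for the per-key lock so that the rollback can release it from another thread.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of "short timeout, then abort the stuck tx and proceed". */
final class JoinLockSketch {

    /**
     * Returns true if the lock was acquired within the short timeout,
     * false if the blocking transaction had to be aborted first.
     */
    static boolean acquireForJoin(Semaphore keyLock, long shortTimeoutMs, Runnable abortBlockingTx) {
        try {
            if (keyLock.tryAcquire(shortTimeoutMs, TimeUnit.MILLISECONDS)) {
                return true;
            }
            // Still stuck: abort the transaction holding the lock (assumed to release it).
            abortBlockingTx.run();
            if (!keyLock.tryAcquire(shortTimeoutMs, TimeUnit.MILLISECONDS)) {
                throw new IllegalStateException("lock still held after aborting the transaction");
            }
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while acquiring lock for join", e);
        }
    }
}
```

The short timeout keeps the join from waiting out the full lock acquisition timeout; how the stuck transaction is actually rolled back is left entirely to the callback in this sketch.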


[JBoss JIRA] Created: (ISPN-823) Create a jgroups-ec2.xml config file to be shipped with Infinispan

by Manik Surtani (JIRA)

Create a jgroups-ec2.xml config file to be shipped with Infinispan
------------------------------------------------------------------

Key: ISPN-823
URL: https://jira.jboss.org/browse/ISPN-823
Project: Infinispan
Issue Type: Feature Request
Components: Configuration
Affects Versions: 4.2.0.CR3
Reporter: Manik Surtani
Assignee: Manik Surtani
Fix For: 4.2.0.CR4, 4.2.0.Final

As per the subject.


[JBoss JIRA] Created: (ISPN-244) Enable external user/framework defined Externalizers

by Galder Zamarreno (JIRA)

Enable external user/framework defined Externalizers
----------------------------------------------------

Key: ISPN-244
URL: https://jira.jboss.org/jira/browse/ISPN-244
Project: Infinispan
Issue Type: Feature Request
Components: RPC
Reporter: Galder Zamarreno
Fix For: 4.1.0.BETA1

Create an internal magic number (e.g. -1 or 255) for user defined externalizers. This is done to avoid users using our number space. So,

internally: <magic_number> <stream>
Users: <magic_number><user defined magic number (int)> <stream>

Mandate unsigned ints so that we can optimise by sending them as variable length.

Internal frameworks could use high enough numbers, for example up to 2 bytes: 5000, 7000, 20000

1 byte: 128
2 bytes: 32767
3 bytes: ...

GlobalConfiguration.registerMarshallable(Class type, Externalizer ext, int id);

Maybe CacheManager better?

CacheManager.registerMarshallable(Class type, Externalizer ext, int id);

Future improvement, maybe generate ids automatically for user defined classes?
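
As an illustration of the "<magic_number><user defined magic number (int)>" framing with a variable-length unsigned id, here is a minimal sketch. Everything in it is an assumption: the class and method names are hypothetical, 255 is just one of the magic values floated above, and the 7-bits-per-byte encoding is one common variable-length scheme whose byte boundaries (127, 16383, ...) differ from the rough figures listed in the issue.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical header codec for user-defined externalizer ids. */
final class UserExternalizerIdSketch {
    static final int USER_MAGIC = 255; // assumed marker for "user-defined externalizer follows"

    /** Writes the magic byte followed by the id, 7 bits per byte, low bits first. */
    static byte[] encodeHeader(int userId) {
        if (userId < 0) {
            throw new IllegalArgumentException("id must be unsigned: " + userId);
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(USER_MAGIC);
        int rest = userId;
        while ((rest & ~0x7F) != 0) {
            out.write((rest & 0x7F) | 0x80); // high bit set: more bytes follow
            rest >>>= 7;
        }
        out.write(rest);
        return out.toByteArray();
    }

    /** Reads back the id written by encodeHeader. */
    static int decodeUserId(byte[] header) {
        if ((header[0] & 0xFF) != USER_MAGIC) {
            throw new IllegalArgumentException("not a user-defined externalizer header");
        }
        int id = 0;
        int shift = 0;
        int pos = 1;
        int b;
        do {
            b = header[pos++] & 0xFF;
            id |= (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return id;
    }
}
```

With this particular encoding, small ids cost a single byte after the magic number, while ids such as 5000 or 7000 take two bytes and 20000 takes three.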


[JBoss JIRA] Created: (ISPN-829) MemcachedValue externalizer should send byte array length as int

by Galder Zamarreño (JIRA)

MemcachedValue externalizer should send byte array length as int
----------------------------------------------------------------

Key: ISPN-829
URL: https://issues.jboss.org/browse/ISPN-829
Project: Infinispan
Issue Type: Bug
Components: Cache Server, Marshalling
Affects Versions: 4.2.0.CR3, 4.1.0.Final
Reporter: Galder Zamarreño
Assignee: Galder Zamarreño
Fix For: 4.2.0.CR4, 4.2.0.Final

Use writeInt()/readInt() instead of write()/read() in the MemcachedValue externalizer.
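
Why this matters can be shown with plain java.io streams: write(int) emits only the low-order byte of its argument, so any length above 255 is truncated, whereas writeInt(int) emits all four bytes. The sketch below uses DataOutputStream/DataInputStream as stand-ins for the externalizer's object streams; LengthPrefixSketch and its methods are hypothetical names for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical demo of a length prefix sent with write()/read() vs writeInt()/readInt(). */
final class LengthPrefixSketch {

    /** Sends the length with write(int) and reads it back with read(). */
    static int roundTripWithWrite(int length) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.write(length); // only the low 8 bits are written
            out.flush();
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
            return in.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Sends the length with writeInt(int) and reads it back with readInt(). */
    static int roundTripWithWriteInt(int length) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(length); // all 4 bytes are written
            out.flush();
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
            return in.readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For a 300-byte value, roundTripWithWrite(300) yields 44 (300 modulo 256), while roundTripWithWriteInt(300) yields 300.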


[JBoss JIRA] Created: (ISPN-826) RemoteLockCleanupStressTest throws EOFException when reading state

by Galder Zamarreño (JIRA)

RemoteLockCleanupStressTest throws EOFException when reading state
------------------------------------------------------------------

Key: ISPN-826
URL: https://jira.jboss.org/browse/ISPN-826
Project: Infinispan
Issue Type: Bug
Components: Marshalling, State transfer
Affects Versions: 4.2.0.CR3, 4.1.0.Final
Reporter: Galder Zamarreño
Assignee: Galder Zamarreño
Priority: Blocker
Fix For: 4.2.0.CR4, 4.2.0.Final

While doing some work to verify ISPN-244, I've spotted that RemoteLockCleanupStressTest throws an EOFException while reading state. Note that this is not related to ISPN-244, because the issue is present in 4.2.x as well, where ISPN-244 has no bearing:

2010-12-08 18:16:25,945 13951 ERROR [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (Incoming-2,Infinispan-Cluster,NodeC-2057:) Caught while requesting or applying state
org.infinispan.statetransfer.StateTransferException: java.io.EOFException: Read past end of file
    at org.infinispan.statetransfer.StateTransferManagerImpl.applyState(StateTransferManagerImpl.java:331)
    at org.infinispan.remoting.InboundInvocationHandlerImpl.applyState(InboundInvocationHandlerImpl.java:102)
    at org.infinispan.remoting.transport.jgroups.JGroupsTransport.setState(JGroupsTransport.java:598)
    at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.handleUpEvent(MessageDispatcher.java:712)
    at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:772)
    at org.jgroups.JChannel.up(JChannel.java:1422)
    at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:954)
    at org.jgroups.protocols.pbcast.FLUSH.up(FLUSH.java:478)
    at org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER.connectToStateProvider(STREAMING_STATE_TRANSFER.java:525)
    at org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER.handleStateRsp(STREAMING_STATE_TRANSFER.java:464)
    at org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER.up(STREAMING_STATE_TRANSFER.java:225)
    at org.jgroups.protocols.FRAG2.up(FRAG2.java:190)
    at org.jgroups.protocols.FC.up(FC.java:483)
    at org.jgroups.protocols.pbcast.GMS.up(GMS.java:888)
    at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:234)
    at org.jgroups.protocols.UNICAST.handleDataReceived(UNICAST.java:615)
    at org.jgroups.protocols.UNICAST.up(UNICAST.java:295)
    at org.jgroups.protocols.pbcast.NAKACK.up(NAKACK.java:707)
    at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:132)
    at org.jgroups.protocols.FD.up(FD.java:266)
    at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:269)
    at org.jgroups.protocols.MERGE2.up(MERGE2.java:210)
    at org.jgroups.protocols.Discovery.up(Discovery.java:292)
    at org.jgroups.protocols.TP.passMessageUp(TP.java:1093)
    at org.jgroups.protocols.TP.access$100(TP.java:56)
    at org.jgroups.protocols.TP$IncomingPacket.handleMyMessage(TP.java:1633)
    at org.jgroups.protocols.TP$IncomingPacket.run(TP.java:1615)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:680)
Caused by: java.io.EOFException: Read past end of file
    at org.jboss.marshalling.AbstractUnmarshaller.eofOnRead(AbstractUnmarshaller.java:184)
    at org.jboss.marshalling.AbstractUnmarshaller.readUnsignedByteDirect(AbstractUnmarshaller.java:319)
    at org.jboss.marshalling.AbstractUnmarshaller.readUnsignedByte(AbstractUnmarshaller.java:280)
    at org.jboss.marshalling.river.RiverUnmarshaller.doReadObject(RiverUnmarshaller.java:207)
    at org.jboss.marshalling.AbstractUnmarshaller.readObject(AbstractUnmarshaller.java:85)
    at org.infinispan.marshall.jboss.GenericJBossMarshaller.objectFromObjectStream(GenericJBossMarshaller.java:162)
    at org.infinispan.marshall.VersionAwareMarshaller.objectFromObjectStream(VersionAwareMarshaller.java:184)
    at org.infinispan.statetransfer.StateTransferManagerImpl.processCommitLog(StateTransferManagerImpl.java:228)
    at org.infinispan.statetransfer.StateTransferManagerImpl.applyTransactionLog(StateTransferManagerImpl.java:250)
    at org.infinispan.statetransfer.StateTransferManagerImpl.applyState(StateTransferManagerImpl.java:320)
    ... 29 more

Now, in spite of this issue the test passes!!! (WTF?). So, first things first, the test needs improving so that these failures are propagated back up to the test. That probably means changing to a Callable rather than a Runnable, and then seeing which exceptions we want to swallow and which ones we want to propagate. Then I'll get going to figure out the cause.
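
The Callable idea mentioned above might look roughly like this. It is only a sketch of the general java.util.concurrent pattern, not the actual test: PropagatingWorkerSketch and failureOf are hypothetical names. A Callable may throw checked exceptions, and Future.get() rethrows whatever the worker threw wrapped in an ExecutionException, so the test thread can inspect it instead of losing it in a worker thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical helper that surfaces a worker's failure to the calling test thread. */
final class PropagatingWorkerSketch {

    /** Runs the worker on another thread; returns the exception it threw, or null on success. */
    static Throwable failureOf(Callable<?> worker) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<?> result = pool.submit(worker);
            result.get(); // throws ExecutionException if the worker failed
            return null;
        } catch (ExecutionException e) {
            return e.getCause(); // the original exception from the worker
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return e;
        } finally {
            pool.shutdownNow();
        }
    }
}
```

A test could then assert that failureOf(...) is null for every worker, or filter out the exception types it deliberately wants to swallow.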


[JBoss JIRA] Created: (ISPN-827) DefaultCacheManager don't fallback gracefully to file system based configuration when initiated from within glassfish.

by Jonas Lasson (JIRA)

DefaultCacheManager don't fallback gracefully to file system based configuration when initiated from within glassfish.
----------------------------------------------------------------------------------------------------------------------

Key: ISPN-827
URL: https://jira.jboss.org/browse/ISPN-827
Project: Infinispan
Issue Type: Bug
Components: Configuration
Affects Versions: 4.2.0.CR3, 4.1.0.Final
Environment: Windows 7 64 bit.
Reporter: Jonas Lasson
Assignee: Manik Surtani

When initiating the DefaultCacheManager with an absolute Windows path from within Glassfish, an IllegalArgumentException is thrown before any attempt is made to read the configuration with FileInputStream.

SEVERE: org.infinispan.config.ConfigurationException: java.lang.IllegalArgumentException: name
    at org.infinispan.manager.DefaultCacheManager.<init>(DefaultCacheManager.java:256)

The problem is easily reproducible by doing:

new DefaultCacheManager("c:\\cachetest.xml");

from within a servlet.

The reason for this problem is found in FileLookup.java line 68, and happens because the classloader throws an IllegalArgumentException for the path provided.

A simple solution would be to catch runtime exceptions when we try to load a resource from the class loader, and then fall back to reading the config file with FileInputStream.
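
The proposed fallback could be sketched as below. This is not the actual FileLookup code: FileLookupSketch and its lookup method are hypothetical, and wrapping a missing file in UncheckedIOException is a choice made for this example only.

```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical lookup: try the class loader first, then the file system. */
final class FileLookupSketch {

    static InputStream lookup(String name, ClassLoader loader) {
        InputStream in = null;
        try {
            in = loader.getResourceAsStream(name);
        } catch (RuntimeException e) {
            // Some container class loaders reject names such as absolute
            // Windows paths; treat that as "not found on the classpath".
        }
        if (in != null) {
            return in;
        }
        try {
            return new FileInputStream(name); // fall back to the file system
        } catch (FileNotFoundException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this shape, a class loader that throws for "c:\\cachetest.xml" no longer prevents the file from being opened directly; a genuinely missing file still surfaces as an I/O error rather than an IllegalArgumentException.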


[JBoss JIRA] Created: (ISPN-824) TestingUtil#clearContent creates caches during test cleanup

by Vladimir Blagojevic (JIRA)

TestingUtil#clearContent creates caches during test cleanup
-----------------------------------------------------------

Key: ISPN-824
URL: https://jira.jboss.org/browse/ISPN-824
Project: Infinispan
Issue Type: Bug
Affects Versions: 4.2.0.BETA1, 5.0.0.ALPHA1
Reporter: Vladimir Blagojevic
Assignee: Manik Surtani
Fix For: 4.2.0.CR4, 5.0.0.ALPHA1

getRunningCaches invoked from TestingUtil#clearContent which is in turn invoked during each test cleanup unnecessarily creates default caches. Impact is probably minimal, nonetheless this should be corrected!


[JBoss JIRA] Created: (ISPN-828) Puts during preloading should not perform any remote lookups

by Manik Surtani (JIRA)

Puts during preloading should not perform any remote lookups
------------------------------------------------------------

Key: ISPN-828
URL: https://issues.jboss.org/browse/ISPN-828
Project: Infinispan
Issue Type: Bug
Components: Distributed Cache, Loaders and Stores
Affects Versions: 4.1.0.Final
Reporter: Manik Surtani
Assignee: Manik Surtani
Fix For: 4.2.0.CR4, 4.2.0.Final

When entries are stored in the cache (using a cache.put()) during preloading, the SKIP_REMOTE_LOOKUP flag should be used to prevent an unnecessary remote GET in the case of a distributed cache.

https://github.com/infinispan/infinispan/blob/master/core/src/main/java/o...


[JBoss JIRA] Created: (ISPN-558) XAResource implementation in ISPN to become fully stateless

by Mircea Markus (JIRA)

XAResource implementation in ISPN to become fully stateless
-----------------------------------------------------------

Key: ISPN-558
URL: https://jira.jboss.org/browse/ISPN-558
Project: Infinispan
Issue Type: Feature Request
Reporter: Mircea Markus
Assignee: Mircea Markus
Fix For: 5.0.0.Final

This is needed as XAResource.prepare/commit/rollback might be called even if XAResource is currently associated with a transaction.

