[JBoss JIRA] (ISPN-3613) Stored entries are deleted from table in rebalance
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-3613?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-3613:
--------------------------------
Labels: 620 nbst (was: 620)
> Stored entries are deleted from table in rebalance
> --------------------------------------------------
>
> Key: ISPN-3613
> URL: https://issues.jboss.org/browse/ISPN-3613
> Project: Infinispan
> Issue Type: Bug
> Reporter: Mircea Markus
> Assignee: William Burns
> Labels: 620, nbst
> Fix For: 6.0.0.Final
>
>
> Description of problem:
> When passivation is set to false, stored entries are deleted from the database table during rebalance.
> clustered.xml
> ------------
> <distributed-cache name="myCache" mode="SYNC" start="EAGER">
> <locking isolation="READ_COMMITTED" acquire-timeout="30000" concurrency-level="1000" striping="false"/>
> <transaction mode="NONE"/>
> <eviction strategy="LIRS" max-entries="10000"/>
> <string-keyed-jdbc-store datasource="java:jboss/datasources/InfinispanDS" passivation="false" preload="true" purge="false" shared="true" fetch-state="false">
> ...
> Version-Release number of selected component (if applicable):
> JDG 6.1
> How reproducible:
> I will attach the clustered.xml and trace logs.
> Steps to Reproduce:
> 1. Start node1.
> 2. Put 300 entries.
> 3. Start node2.
> Check entries:
> select count(*) from table;
> 300
> 4. Start node3.
> Check entries:
> select count(*) from table;
> 0
> Actual results:
> In step 4, the number of entries in the DB table is 0.
> Expected results:
> In step 4, the number of entries in the DB table is 300.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[JBoss JIRA] (ISPN-3645) StateTransferLargeObjectTest hangs randomly
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-3645?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-3645:
--------------------------------
Labels: 620 nbst (was: 620)
> StateTransferLargeObjectTest hangs randomly
> -------------------------------------------
>
> Key: ISPN-3645
> URL: https://issues.jboss.org/browse/ISPN-3645
> Project: Infinispan
> Issue Type: Bug
> Components: RPC
> Affects Versions: 6.0.0.CR1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Labels: 620, nbst
> Fix For: 6.0.0.Final
>
> Attachments: stlot.stack
>
>
> StateTransferLargeObjectTest sometimes hangs in the second part of the test, when it checks that all the nodes in the cluster can read the inserted values. I was able to make it hang reliably when run separately, by increasing the number of keys from 1000 to 5000.
> The cause is probably JGRP-1675, as many OOB threads appear to be stuck in FlowControl.decrementIfEnoughCredits.
[JBoss JIRA] (ISPN-3051) Allow configuring the number of segments per node
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-3051?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-3051:
--------------------------------
Labels: (was: nbst)
> Allow configuring the number of segments per node
> -------------------------------------------------
>
> Key: ISPN-3051
> URL: https://issues.jboss.org/browse/ISPN-3051
> Project: Infinispan
> Issue Type: Feature Request
> Components: Distributed Cache
> Reporter: Guillermo GARCIA OCHOA
> Assignee: Dan Berindei
> Priority: Critical
> Fix For: 6.0.0.Beta1, 6.0.0.Final
>
>
> This should allow for the following use cases:
> - a node to take more load
> - a node to take no load
> A simple way to specify this would be to configure a load factor per node, e.g. a more powerful machine would be 2*x, meaning it would take twice the load of an "ordinary" machine in the cluster.
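A minimal sketch of the load-factor idea, assuming segments are split in proportion to each node's factor. The class and method names (`LoadFactorSketch`, `assignSegments`) are hypothetical and are not Infinispan API; a real consistent hash also has to place backup owners and limit segment movement, which this ignores.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: not the Infinispan consistent hash implementation.
class LoadFactorSketch {

    // Splits numSegments across nodes in proportion to each node's load factor.
    // A factor of 0 means the node takes no load.
    static Map<String, Integer> assignSegments(Map<String, Double> factors, int numSegments) {
        double total = 0;
        for (double f : factors.values()) {
            if (f < 0) {
                throw new IllegalArgumentException("Load factor must not be negative");
            }
            total += f;
        }
        if (total <= 0) {
            throw new IllegalArgumentException("At least one node needs a positive load factor");
        }

        // First pass: every node gets the rounded-down proportional share.
        Map<String, Integer> result = new LinkedHashMap<>();
        int assigned = 0;
        for (Map.Entry<String, Double> e : factors.entrySet()) {
            int share = (int) Math.floor(numSegments * e.getValue() / total);
            result.put(e.getKey(), share);
            assigned += share;
        }

        // Second pass: hand out the rounding remainder, one segment at a time,
        // to nodes that are allowed to take load.
        int remaining = numSegments - assigned;
        while (remaining > 0) {
            for (Map.Entry<String, Double> e : factors.entrySet()) {
                if (remaining == 0) {
                    break;
                }
                if (e.getValue() > 0) {
                    result.merge(e.getKey(), 1, Integer::sum);
                    remaining--;
                }
            }
        }
        return result;
    }
}
```

With factors 1, 2 and 0 for three nodes and 60 segments, the second node ends up with twice as many segments as the first, and the third node gets none.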
[JBoss JIRA] (ISPN-3140) JMX operation to suppress state transfer
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-3140?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-3140:
--------------------------------
Labels: (was: nbst)
> JMX operation to suppress state transfer
> ----------------------------------------
>
> Key: ISPN-3140
> URL: https://issues.jboss.org/browse/ISPN-3140
> Project: Infinispan
> Issue Type: Feature Request
> Components: Distributed Cache, State transfer
> Affects Versions: 5.2.6.Final
> Reporter: Manik Surtani
> Assignee: Dan Berindei
> Fix For: 5.2.7.Final, 5.3.0.CR2, 5.3.0.Final
>
>
> This feature request is to expose a JMX operation on each node, to suppress state transfer for a period of time. This flag would be {{false}} by default.
> The use case of this flag would be to ease bringing down (and up) a cluster for maintenance work. A typical workflow would be:
> 1) Shut down application requests to the data grid
> 2) Suppress state transfer on all nodes via JMX
> 3) Bring down all nodes
> 4) Perform maintenance work
> 5) Bring up nodes, one at a time. As each node comes up, disable state transfer for the node via JMX.
> 6) Once all nodes are up, enable state transfer for each node again via JMX
> 7) Allow application requests to reach the grid again.
> The purpose of this is to allow smooth and fast shutdown and startup, and to remove the risk of OOM errors (when bringing a grid down).
> This is a small but useful subset of full manual state transfer as defined in ISPN-1394.
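A rough sketch of how such a flag could be exposed and toggled over JMX, using only the standard `javax.management` API. The MBean interface, its attribute name (`StateTransferSuppressed`) and the object name are assumptions made up for this example; they are not the names Infinispan uses. The flag is modelled here as a writable attribute rather than an operation.

```java
import java.lang.management.ManagementFactory;
import javax.management.Attribute;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical illustration only: names below are not Infinispan's.
class SuppressStateTransferSketch {

    // Standard MBean interface; it exposes one read/write boolean attribute.
    public interface StateTransferControlMBean {
        boolean isStateTransferSuppressed();
        void setStateTransferSuppressed(boolean suppressed);
    }

    public static class StateTransferControl implements StateTransferControlMBean {
        // false by default, as proposed in the feature request
        private volatile boolean suppressed = false;

        @Override
        public boolean isStateTransferSuppressed() {
            return suppressed;
        }

        @Override
        public void setStateTransferSuppressed(boolean suppressed) {
            this.suppressed = suppressed;
        }
    }

    // Registers the MBean on first use, sets the flag, and reads it back.
    static boolean setSuppressed(boolean value) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName("example.sketch:type=StateTransferControl");
            if (!server.isRegistered(name)) {
                server.registerMBean(new StateTransferControl(), name);
            }
            server.setAttribute(name, new Attribute("StateTransferSuppressed", value));
            return (Boolean) server.getAttribute(name, "StateTransferSuppressed");
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In the maintenance workflow above, step 2 would correspond to `setSuppressed(true)` on every node and step 6 to `setSuppressed(false)`; a remote caller would go through an `MBeanServerConnection` instead of the platform server.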
[JBoss JIRA] (ISPN-3422) In non-tx caches, write operations may not be atomic during rebalance
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-3422?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-3422:
--------------------------------
Labels: 620 nbst (was: 620)
> In non-tx caches, write operations may not be atomic during rebalance
> ---------------------------------------------------------------------
>
> Key: ISPN-3422
> URL: https://issues.jboss.org/browse/ISPN-3422
> Project: Infinispan
> Issue Type: Bug
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Critical
> Labels: 620, nbst
> Fix For: 6.1.0.Final
>
>
> If the cache topology changes while a write command is running and before it has actually committed the entry to the data container, we retry the command (see ISPN-3366 and ISPN-3357). But before we detect the topology change, one or more of the backup owners may have already applied the modification.
> Retrying the command re-acquires the key lock on the primary owner (even if the primary owner didn't change). That means another command could have modified the same key in the meantime, but the retried command is going to ignore any changes and is going to return the value before the first attempt. Obviously, the command is not retried if the first attempt is not successful, but scenarios like this are possible:
> {code}
> thread 1: putIfAbsent(k, v1) -> null
> thread 2: putIfAbsent(k, v2) -> null
> {code}
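For comparison, a small sketch of the contract that the scenario above violates, shown with a plain `java.util.concurrent.ConcurrentHashMap` rather than an Infinispan cache (the class name `PutIfAbsentContract` is made up for this example). When two threads race on `putIfAbsent` for the same key, exactly one of them should see `null`.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustration of the expected putIfAbsent semantics on a local map.
class PutIfAbsentContract {

    // Runs two racing putIfAbsent calls on the same key and counts
    // how many of them returned null (i.e. believe they inserted the value).
    static int nullReturns() {
        ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Callable<String> first = () -> cache.putIfAbsent("k", "v1");
            Callable<String> second = () -> cache.putIfAbsent("k", "v2");
            Future<String> f1 = pool.submit(first);
            Future<String> f2 = pool.submit(second);

            int nulls = 0;
            if (f1.get() == null) {
                nulls++;
            }
            if (f2.get() == null) {
                nulls++;
            }
            return nulls;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

An atomic implementation always yields one `null` here; the retry behaviour described in this issue allows both callers to receive `null`.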
[JBoss JIRA] (ISPN-3773) State transfer thread can stop even though there are pending transfer tasks
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-3773?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-3773:
--------------------------------
Labels: nbst (was: )
> State transfer thread can stop even though there are pending transfer tasks
> ---------------------------------------------------------------------------
>
> Key: ISPN-3773
> URL: https://issues.jboss.org/browse/ISPN-3773
> Project: Infinispan
> Issue Type: Bug
> Components: State transfer
> Affects Versions: 6.0.0.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Critical
> Labels: nbst
> Fix For: 7.0.0.Final
>
>
> Noticed in NonTxOriginatorBecomingPrimaryOwnerTest. The state transfer thread finished the last inbound transfer task, but just before it stops, another task is started. The new task doesn't prevent the state transfer thread from stopping, and the node will never request those segments (thus blocking the rebalance from ending).
> {noformat}
> 15:28:31,033 TRACE (asyncTransportThread-1,NodeC:) [InboundTransferTask] Successfully requested segments [33, 6, 7, 8, 9, 11, 13, 50, 54, 20, 52, 22, 59, 25, 24, 27, 26, 29, 28, 31] of cache ___defaultcache from node NodeA-49040
> 15:28:31,264 TRACE (remote-thread-1,NodeC:___defaultcache) [StateConsumerImpl] Adding transfer from NodeA-49040 for segments [32, 5, 6, 7, 8, 10, 12, 51, 49, 19, 21, 53, 23, 59, 25, 24, 27, 26, 28, 30]
> 15:28:31,264 TRACE (remote-thread-1,NodeC:___defaultcache) [StateConsumerImpl] Starting transfer thread: false
> 15:28:31,264 DEBUG (remote-thread-1,NodeC:___defaultcache) [StateConsumerImpl] Finished adding inbound state transfer for segments [5, 6, 7, 8, 10, 12, 19, 21, 23, 25, 24, 27, 26, 28, 30, 32, 51, 49, 53, 59] of cache ___defaultcache
> 15:28:31,264 TRACE (remote-thread-1,NodeC:___defaultcache) [StateTransferLockImpl] Signalling transaction data received for topology 41
> 15:28:31,264 TRACE (asyncTransportThread-1,NodeC:) [StateConsumerImpl] Stopping state transfer thread
> {noformat}
[JBoss JIRA] (ISPN-3366) Data loss when entry forwarding to primary owner and primary owner shutdown
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-3366?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-3366:
--------------------------------
Labels: nbst (was: )
> Data loss when entry forwarding to primary owner and primary owner shutdown
> ---------------------------------------------------------------------------
>
> Key: ISPN-3366
> URL: https://issues.jboss.org/browse/ISPN-3366
> Project: Infinispan
> Issue Type: Bug
> Components: Distributed Cache
> Affects Versions: 5.2.4.Final, 6.0.0.Alpha1
> Reporter: Takayoshi Kimura
> Assignee: Dan Berindei
> Priority: Critical
> Labels: nbst
> Fix For: 5.2.8.Final, 6.0.0.Alpha3, 6.0.0.CR1
>
> Attachments: ISPN-3366-full-logs-3rd.zip, ISPN-3366-full-logs-4th.zip, ISPN-3366-logs.zip
>
>
> Looks like a problem in entry forwarding.
> Here is the test scenario:
> * DIST numOwners=2, start with a 4-node cluster, then shut down 1 node normally during load
> * HotRod putIfAbsent accesses from 40 threads (1 process, 1 remote cache instance), 40000 entries total
> After the test run, the numberOfEntries on each node are:
> * node1: 26608
> * node2: 26622
> * node3: 26746
> * node4: 0
> Total is 79976 and the HotRod client received 11 errors, so 79976 + (11 * 2) = 79998. With numOwners=2, 40000 entries should account for 80000 copies, so 1 entry (both copies) is completely missing.
> Let's take a look at the missing entry, hash(thread16key59) = 574ff563.
> Current CH: owners(574ff563) are [node4, node1]
> The sequence of events is:
> * hotrod -> node1
> * node1 forwards it to the primary owner, node4
> * node4 shuts down without processing the forwarded entry
> Result: owners(7c29bccb) is [] (empty). This entry is completely lost without any errors.
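The entry accounting above can be written out as a short calculation. This is only a sketch of the arithmetic in the report; the class and method names are made up, and it assumes, as the report does, that each client error stands for one entry whose copies were not stored.

```java
// Worked version of the entry accounting in the report.
class MissingEntryCount {

    // Returns how many entries are unaccounted for, given the per-node entry
    // counts, the number of owners per entry, and the number of client errors.
    static int missingEntries(int totalEntries, int numOwners, int[] perNodeCounts, int clientErrors) {
        int expectedCopies = totalEntries * numOwners;   // 40000 * 2 = 80000

        int storedCopies = 0;
        for (int count : perNodeCounts) {
            storedCopies += count;                       // 79976 in the report
        }

        // Each failed put accounts for numOwners copies that were never stored.
        int accountedCopies = storedCopies + clientErrors * numOwners;   // 79998

        return (expectedCopies - accountedCopies) / numOwners;
    }
}
```

Calling `missingEntries(40000, 2, new int[] {26608, 26622, 26746, 0}, 11)` reproduces the figure from the report: one entry is missing.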
[JBoss JIRA] (ISPN-3287) Possible inconsistency with concurrent transactions during state transfer
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-3287?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-3287:
--------------------------------
Labels: 620 nbst (was: 620)
> Possible inconsistency with concurrent transactions during state transfer
> -------------------------------------------------------------------------
>
> Key: ISPN-3287
> URL: https://issues.jboss.org/browse/ISPN-3287
> Project: Infinispan
> Issue Type: Bug
> Components: Locking and Concurrency, State transfer
> Affects Versions: 5.3.0.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Critical
> Labels: 620, nbst
> Fix For: 6.0.0.CR1, 6.0.0.Final
>
>
> It looks like there is a data race between the state transfer thread and a concurrent transaction in EntryWrappingInterceptor.commitEntryIfNeeded:
> tx: commitContextEntry()
> ST: stateConsumer.isKeyUpdated(k)? false
> tx: stateConsumer.addUpdatedKey(k)
> ST: commitContextEntry()
> We probably need some synchronization here, maybe using EquivalentConcurrentHashMapV8.computeIfAbsent().
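One possible shape for that synchronization, sketched with the JDK's `ConcurrentHashMap.compute()` instead of `EquivalentConcurrentHashMapV8`: the "was this key updated?" check and the commit happen inside a single per-key atomic step, so the interleaving above cannot occur. The class `UpdatedKeysSketch` and its methods are hypothetical and much simpler than `EntryWrappingInterceptor`; this is not the fix that was actually applied.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only: a toy data container with an updated-keys set.
class UpdatedKeysSketch {
    private final ConcurrentHashMap<String, String> data = new ConcurrentHashMap<>();
    private final Set<String> updatedKeys = ConcurrentHashMap.newKeySet();

    // Write coming from a transaction (or any user command).
    // A null value stands for a removal. Marking the key and committing
    // the value happen atomically with respect to other compute() calls on the key.
    void commitFromTransaction(String key, String value) {
        data.compute(key, (k, old) -> {
            updatedKeys.add(k);
            return value; // returning null removes the mapping
        });
    }

    // Write coming from state transfer. It is applied only if no
    // user write has touched the key since state transfer began.
    void commitFromStateTransfer(String key, String value) {
        data.compute(key, (k, old) -> updatedKeys.contains(k) ? old : value);
    }

    String get(String key) {
        return data.get(key);
    }
}
```

Because both paths go through `compute()` on the same key, state transfer can no longer observe "not updated" and then overwrite a value committed by a concurrent transaction.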
[JBoss JIRA] (ISPN-3389) Forwarded transactions can remain stale after state transfer
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-3389?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-3389:
--------------------------------
Labels: 5.2.x nbst (was: 5.2.x)
> Forwarded transactions can remain stale after state transfer
> ------------------------------------------------------------
>
> Key: ISPN-3389
> URL: https://issues.jboss.org/browse/ISPN-3389
> Project: Infinispan
> Issue Type: Bug
> Components: State transfer
> Affects Versions: 5.2.7.Final
> Reporter: Erik Salter
> Assignee: Dan Berindei
> Priority: Critical
> Labels: 5.2.x, nbst
> Fix For: 6.0.0.Alpha4, 6.0.0.CR1
>
>
> There is a scenario where a tx started on one node, moved during state transfer, and committed on the originating node won't be removed from the new owner's tx table.
> The chain of events is as follows:
> 1. New topology comes in as part of a view change.
> 2. Local transaction started with the new topology ID. This transaction was started due to a LockControlCommand and has no modifications. Also important, it only has local locks.
> 3. Tx forwarded to new owner before the local lock is acquired and registered with the transaction.
> 4. Since the tx has only local locks and no modifications, it is only removed locally. No TxCompletion or Rollback are broadcast to the new owners.
> This key becomes unusable not due to stale locks, but because the waitForTransaction() code will see that the old tx can "potentially" lock the key.
> This easily happens with pessimistic caches, though I have seen it happen with optimistic caches (there is a delta between the transaction being created and the lock registration).
[JBoss JIRA] (ISPN-3443) WriteCommand may be ignored during state transfer
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-3443?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-3443:
--------------------------------
Labels: 620 nbst (was: 620)
> WriteCommand may be ignored during state transfer
> -------------------------------------------------
>
> Key: ISPN-3443
> URL: https://issues.jboss.org/browse/ISPN-3443
> Project: Infinispan
> Issue Type: Bug
> Components: Locking and Concurrency, State transfer
> Affects Versions: 6.0.0.Alpha3
> Reporter: Radim Vansa
> Assignee: Dan Berindei
> Priority: Blocker
> Labels: 620, nbst
> Fix For: 6.0.0.CR1
>
>
> Distributed sync non-tx cache.
> Situation:
> 1) A node is joining the cluster, requesting some segment
> 2) RemoveCommand is sent to backup owner with ignorePreviousValue=true
> 3) It looks up the entry and finds null
> 4) State transfer invokes the PutKeyValueCommand and sets the value for the removed entry (updateKeys does not contain the key yet)
> 5) RemoveCommand adds its key to the updateKeys set, but it does not remove the value as it is already null (in its context)
> Result: the value is removed on the primary owner but is still present on the backup owner