Slight change to NBST
by Vladimir Blagojevic
Guys,
Since we had some problems with state transfer and UDP exposed recently
[1], I've been experimenting with various solutions. The problem is
essentially rooted in the fact that we are not respecting the JGroups
state transfer callback contract: we invoke RPCs from setState. The RPC
in question flips the gate in JGroupsDistSync (see
StateTransferManagerImpl#mimicPartialFlushViaRPC). I think we can safely
move this RPC so that it is invoked just before the state request is sent
to the state provider, and again right after state transfer is completed [2].
I wanted to check with the people who designed NBST whether this change
would cause a problem. With this change I am getting clean runs in both
the UDP and TCP test suites.
Let me know,
Vladimir
[1] https://issues.jboss.org/browse/ISPN-1160
[2]
https://github.com/vblagoje/infinispan/commit/c88fb6a12379b809c49d59fcdec...
13 years, 6 months
frequent stack in testsuite
by Sanne Grinovero
Hello all,
if I happen to look at the console while the tests are running, I see
this exception pop up very often:
2011-06-09 15:32:18,092 ERROR [JGroupsTransport]
(Incoming-1,Infinispan-Cluster,NodeB-32230) ISPN00096: Caught while
requesting or applying state
org.infinispan.statetransfer.StateTransferException:
java.io.EOFException: Read past end of file
at org.infinispan.statetransfer.StateTransferManagerImpl.applyState(StateTransferManagerImpl.java:333)
at org.infinispan.remoting.InboundInvocationHandlerImpl.applyState(InboundInvocationHandlerImpl.java:230)
at org.infinispan.remoting.transport.jgroups.JGroupsTransport.setState(JGroupsTransport.java:602)
at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.handleUpEvent(MessageDispatcher.java:711)
at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:771)
at org.jgroups.JChannel.up(JChannel.java:1441)
at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1074)
at org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER.connectToStateProvider(STREAMING_STATE_TRANSFER.java:523)
at org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER.handleStateRsp(STREAMING_STATE_TRANSFER.java:462)
at org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER.up(STREAMING_STATE_TRANSFER.java:223)
at org.jgroups.protocols.FRAG2.up(FRAG2.java:189)
at org.jgroups.protocols.FC.up(FC.java:479)
at org.jgroups.protocols.pbcast.GMS.up(GMS.java:891)
at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:246)
at org.jgroups.protocols.UNICAST.handleDataReceived(UNICAST.java:613)
at org.jgroups.protocols.UNICAST.up(UNICAST.java:294)
at org.jgroups.protocols.pbcast.NAKACK.up(NAKACK.java:703)
at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:133)
at org.jgroups.protocols.FD.up(FD.java:275)
at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:275)
at org.jgroups.protocols.MERGE2.up(MERGE2.java:209)
at org.jgroups.protocols.Discovery.up(Discovery.java:291)
at org.jgroups.protocols.TP.passMessageUp(TP.java:1102)
at org.jgroups.protocols.TP$IncomingPacket.handleMyMessage(TP.java:1658)
at org.jgroups.protocols.TP$IncomingPacket.run(TP.java:1640)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.EOFException: Read past end of file
at org.jboss.marshalling.SimpleDataInput.eofOnRead(SimpleDataInput.java:126)
at org.jboss.marshalling.SimpleDataInput.readUnsignedByteDirect(SimpleDataInput.java:263)
at org.jboss.marshalling.SimpleDataInput.readUnsignedByte(SimpleDataInput.java:224)
at org.jboss.marshalling.river.RiverUnmarshaller.doReadObject(RiverUnmarshaller.java:209)
at org.jboss.marshalling.AbstractObjectInput.readObject(AbstractObjectInput.java:37)
at org.infinispan.marshall.jboss.GenericJBossMarshaller.objectFromObjectStream(GenericJBossMarshaller.java:192)
at org.infinispan.marshall.VersionAwareMarshaller.objectFromObjectStream(VersionAwareMarshaller.java:190)
at org.infinispan.statetransfer.StateTransferManagerImpl.processCommitLog(StateTransferManagerImpl.java:230)
at org.infinispan.statetransfer.StateTransferManagerImpl.applyTransactionLog(StateTransferManagerImpl.java:252)
at org.infinispan.statetransfer.StateTransferManagerImpl.applyState(StateTransferManagerImpl.java:322)
... 27 more
But I'm not sure if it's an issue, as it seems tests are not failing.
I consider a "Read past end of file" quite suspicious; could it be
that some internal Externalizer is writing fewer bytes than its
reading side attempts to read?
Is there something clever I could do to understand which object the
marshaller is trying to read when something like this is happening?
I've found debugging this quite hard.
Also, it doesn't look like our externalizers have good test
coverage; they are likely implicitly tested, as I assume that nothing
would work if they were broken, but still it looks like we have no
explicit tests for them?
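A minimal sketch of the kind of mismatch suspected above, together with the sort of explicit round-trip check that would catch it. It uses plain java.io streams rather than JBoss Marshalling or Infinispan's Externalizer API, and Point and PointCodec are hypothetical names made up for the illustration:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;

// Hypothetical value type, used only for this illustration.
final class Point {
    final int x;
    final int y;
    Point(int x, int y) { this.x = x; this.y = y; }
}

// Hypothetical hand-written writer/reader pair; not an Infinispan class.
final class PointCodec {
    // Correct writer: emits both fields (8 bytes).
    static byte[] write(Point p) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(p.x);
            out.writeInt(p.y);
        }
        return bytes.toByteArray();
    }

    // Buggy writer: forgets the second field, so only 4 bytes are written.
    static byte[] writeBuggy(Point p) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(p.x);
        }
        return bytes.toByteArray();
    }

    // Reader: always expects two ints.
    static Point read(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return new Point(in.readInt(), in.readInt());
        }
    }

    // Round-trip check: true if the value survives write + read,
    // false if the reader runs off the end of the stream.
    static boolean roundTrips(Point p, boolean buggy) {
        try {
            Point back = read(buggy ? writeBuggy(p) : write(p));
            return back.x == p.x && back.y == p.y;
        } catch (EOFException e) {
            return false; // reader asked for more bytes than the writer produced
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A check of this shape, one per externalizer, would flag a writer/reader mismatch directly instead of leaving it to show up as an EOFException during state transfer.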
Cheers,
Sanne
13 years, 6 months
Adaptive marshaller buffer sizes - ISPN-1102
by Galder Zamarreño
Hi all,
Re: https://issues.jboss.org/browse/ISPN-1102
First of all, thanks to Dan for his suggestion on reservoir sampling + percentiles, very good suggestion :). So, I'm looking into this and Trustin's http://docs.jboss.org/netty/3.2/api/org/jboss/netty/channel/AdaptiveRecei... but in this email I wanted to discuss the reservoir sampling mechanism (http://en.wikipedia.org/wiki/Reservoir_sampling).
So, the way I look at it, to implement this you'd keep track of the N buffer sizes used so far, choose K samples out of those based on reservoir sampling, and then take the 90th percentile of those K samples.
Calculating the percentile is easy with those K samples stored in an ordered collection. Now, my problem with this is that reservoir sampling is an O(n) operation and you would not want to be doing that for each request for a buffer that comes in.
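A minimal sketch of the mechanism described above, assuming the classic "Algorithm R" form of reservoir sampling and a nearest-rank percentile. BufferSizeReservoir and its method names are hypothetical, not existing Infinispan code. In this form each observed size costs constant work, and the sort happens only when a percentile is requested:

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical helper: keeps K uniformly chosen samples of the buffer sizes seen so far.
final class BufferSizeReservoir {
    private final int[] samples;   // the K retained samples
    private final Random random;
    private long seen;             // N: number of sizes offered so far

    BufferSizeReservoir(int k, Random random) {
        if (k <= 0) throw new IllegalArgumentException("k must be positive");
        this.samples = new int[k];
        this.random = random;
    }

    // Algorithm R: fill the reservoir first, then replace a random slot
    // with probability K / N for the N-th observation.
    void offer(int size) {
        seen++;
        if (seen <= samples.length) {
            samples[(int) (seen - 1)] = size;
        } else {
            long j = (long) (random.nextDouble() * seen);   // uniform in [0, seen)
            if (j < samples.length) samples[(int) j] = size;
        }
    }

    // Nearest-rank percentile over the retained samples; p in (0, 100].
    int percentile(double p) {
        int count = (int) Math.min(seen, samples.length);
        if (count == 0) throw new IllegalStateException("no samples yet");
        int[] sorted = Arrays.copyOf(samples, count);
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * count / 100.0);      // 1-based rank
        return sorted[Math.max(rank, 1) - 1];
    }
}
```

This version is not thread-safe; how to share it between threads, or whether to share it at all, is a separate question.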
One option I can think of is that, instead of ever letting a user thread calculate this, the user thread could just feed the buffer size collection (a concurrent collection), and a background thread could calculate the reservoir sample + percentile periodically or once some threshold is reached; the result is then used as the next buffer size. My biggest problem here is the concurrent collection in the middle. You could have a priority queue ordered by buffer sizes, but it's unbounded. The concurrent collection does not need to be ordered though (the reservoir sampling could do that), but you want the collection bounded. And if it is bounded and the limit is hit, you would not want it to block but instead to overwrite values: remove the last element and insert again. You only care about the last N relevant buffer sizes...
Another option that would avoid the use of a concurrent collection would be to calculate this per thread and store it in a thread local. The calculation could be done every X requests, still in the client thread, or it could be sent to a separate thread by wrapping it in a callable and keeping the future as a thread local, which you could query the next time the thread wants to marshall something.
I feel a bit more inclined towards the latter option, although it limits the calculation to be per-thread, for several reasons:
- We already have org.jboss.marshalling.Marshaller and org.jboss.marshalling.Unmarshaller instances as thread local which have proven to perform well.
- So we could tap into this set up to maintain not only the marshaller, but the size of the buffer too.
- It could offer the possibility of being extended further to avoid creating buffers all the time and instead reuse them as long as the size is constant. After a size recalculation we'd ditch it and create a new one.
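A rough sketch of the per-thread option, assuming the simplest variant: each thread remembers only its last N used sizes in a fixed array (overwriting the oldest) and recalculates the 90th percentile every X requests in the calling thread. ThreadLocalBufferSizer and the constants WINDOW, RECALC_EVERY and DEFAULT_SIZE are hypothetical, and their values are arbitrary:

```java
import java.util.Arrays;

// Hypothetical per-thread estimator; names and constants are illustrative only.
final class ThreadLocalBufferSizer {
    private static final int WINDOW = 64;          // last N sizes remembered (assumed value)
    private static final int RECALC_EVERY = 16;    // X requests between recalculations (assumed value)
    private static final int DEFAULT_SIZE = 512;   // estimate used before any data exists (assumed value)

    private static final ThreadLocal<ThreadLocalBufferSizer> LOCAL =
            ThreadLocal.withInitial(ThreadLocalBufferSizer::new);

    private final int[] window = new int[WINDOW];
    private long count;                            // sizes recorded by this thread so far
    private int estimate = DEFAULT_SIZE;

    // Each thread gets its own instance, so no synchronization is needed.
    static ThreadLocalBufferSizer current() {
        return LOCAL.get();
    }

    // Size to allocate for the next marshalling call on this thread.
    int nextBufferSize() {
        return estimate;
    }

    // Record the size actually used; once the window is full the oldest slot is overwritten.
    void record(int usedSize) {
        window[(int) (count % WINDOW)] = usedSize;
        count++;
        if (count % RECALC_EVERY == 0) {
            estimate = percentile90();
        }
    }

    // Nearest-rank 90th percentile over the sizes currently in the window.
    private int percentile90() {
        int n = (int) Math.min(count, WINDOW);
        int[] sorted = Arrays.copyOf(window, n);
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(90 * n / 100.0); // 1-based rank
        return sorted[rank - 1];
    }
}
```

A marshalling call would ask current().nextBufferSize() before allocating and call record(...) with the number of bytes actually written afterwards. Reusing the buffer itself while the estimate stays constant would be an extension of the same per-thread object.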
Thoughts?
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache
13 years, 6 months
Write skew check implementation
by Pedro Ruivo
Hi,
I am progressing with my implementation of the total-order-based commit
protocol for the partial replication mode, and I would be grateful if
you could help me understand how the write skew check is implemented
in ISPN.
Assume the scenario in which a node, say n1, that owns a data item d
receives a prepare message for a transaction, say T, originated on a
different replica, n2.
Of course, the write skew check is enabled.
After n1 acquires the lock on d, it should check whether the value of d
read during T's execution is still the most recently committed one.
I may be missing something here, but looking at [1] (lines 160 to 168)
and [2] at the copyForUpdate method, n1 reads the locally stored value
of d twice (once in line 160 ([1]) to create the RepeatableReadEntry
and a second time in line 52 ([2]) to do the write skew check).
I would have expected that the node that originated the transaction
would piggyback the values that it read (before writing them) on the
prepare message, in order to allow the "owners" to perform the
write skew check.
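A minimal sketch of the expected behaviour, under two assumptions that are not taken from the ISPN code: entries carry a version number, and the originator piggybacks the version it read (rather than the value itself) on the prepare message. VersionedStore and its methods are hypothetical names, and the synchronized methods merely stand in for holding the lock on d:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical versioned store on the owner node; not Infinispan's actual classes.
final class VersionedStore {
    static final class Versioned {
        final String value;
        final long version;
        Versioned(String value, long version) {
            this.value = value;
            this.version = version;
        }
    }

    private final Map<String, Versioned> committed = new HashMap<>();

    // What a transaction sees when it reads the key; null if the key is absent.
    synchronized Versioned read(String key) {
        return committed.get(key);
    }

    // Prepare-time write skew check on the owner. readVersion is the version the
    // originating node saw during the transaction (0 means "key was absent").
    // The write is applied only if that version is still the latest committed one.
    synchronized boolean commitIfUnchanged(String key, long readVersion, String newValue) {
        Versioned current = committed.get(key);
        long currentVersion = current == null ? 0 : current.version;
        if (currentVersion != readVersion) {
            return false;   // someone committed in between: write skew detected
        }
        committed.put(key, new Versioned(newValue, currentVersion + 1));
        return true;
    }
}
```

With two transactions that both read version 1 of d, the first prepare to reach the owner succeeds and bumps the version to 2; the second one carries the stale version 1 and is rejected. Reading d locally on n1 at prepare time cannot make that distinction, because n1 only sees its own current state.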
Thanks
Pedro
[1]
https://github.com/infinispan/infinispan/blob/master/core/src/main/java/o...
[2]
https://github.com/infinispan/infinispan/blob/master/core/src/main/java/o...
--
INESC-ID Lisboa, sala 511
gsd.inesc-id.pt/~pruivo
13 years, 6 months
Re: [infinispan-dev] some thoughts on mutual dependencies in Query/Search
by Sanne Grinovero
Actually, I have a mild preference for solving the contingency with an
externalizer, as it won't need an urgent release of Search.
What bothers me about using a temporary solution is that people won't be
able to perform a rolling upgrade.
On 14 Jun 2011 12:07, "Manik Surtani" <manik(a)jboss.org> wrote:
13 years, 6 months
Infinispan Shell Console
by Michal Linhard
Hi all!
Announcing my little project Infinispan Console:
https://github.com/mlinhard/ispncon
I wanted to hold off publishing this until I'd done some more testing,
but I don't know when I'll get to that, and it already helps me with
some testing and I thought that it might be useful to some of you too,
so I don't wanna keep it to myself any longer...
It basically enables you to do stuff like
ispncon put key value
ispncon get key
from your bash shell, using client access (hotrod, memcached, rest), i.e.
you have to have these server modules running/deployed in a container.
Feedback is appreciated.
m.
--
Michal Linhard
Quality Assurance Engineer
Red Hat Czech s.r.o.
Purkynova 99 612 45 Brno, Czech Republic
phone: +420 532 294 320 ext. 62320
mobile: +420 728 626 363
13 years, 6 months
Timing related tests
by Manik Surtani
Some tests - like ExpiryTest - rely on certain timings for the test to run, and due to thread scheduling on our parallel test suite, tend to occasionally fail on certain environments such as CloudBees:
https://infinispan.ci.cloudbees.com/job/Infinispan-master-JDK6-tcp/org.in...
In this example, entries are placed in the data container with a 30-second lifespan and tested for existence; the test then waits 30s and tests for non-existence. The failure here is that the first test for existence fails, since the thread is de-scheduled for a period of time between storing the entry and the first test.
Upping the lifespan just moves the problem - and makes the test suite run slower (got to wait for that lifespan before testing again).
How about we group such tests into a new group, "timeSensitiveTests", and *don't* run these on CI environments (but *do* run them on local environments where response times are more reasonable/predictable)?
Thoughts?
Cheers
Manik
--
Manik Surtani
manik(a)jboss.org
twitter.com/maniksurtani
Lead, Infinispan
http://www.infinispan.org
13 years, 6 months
discussion about impact of using TransactionSynchronizationRegistry in AS7...
by Scott Marlow
I posted a message on the as7-dev ml
(http://lists.jboss.org/pipermail/jboss-as7-dev/2011-May/002254.html),
about switching to use the TransactionSynchronizationRegistry.
Does Infinispan currently register Transaction synchronization objects?
Does Infinispan currently register synchronizations via
TransactionSynchronizationRegistry (TSR)?
I'm trying to get a sense of what would happen if container-managed
(AS7) session beans were registered with the active JTA transaction via
the TSR.
If AS7 switches to use the TSR, I think that Infinispan might need to
ensure that it doesn't attempt to register with the TX too late.
See http://pastie.org/1836698 for an example of what would happen if a
TSR synchronization object is already present and someone tries to
register a TX synchronization after tx.commit has been started.
13 years, 7 months
Ideas for locking improvements
by Sanne Grinovero
Hello all,
as some of us met recently at events, we've been discussing some
details about the current locking implementations and possible
improvements.
Some interesting ideas came out and we've been writing them down on
the wiki so that everyone can be involved:
http://community.jboss.org/wiki/PossibleLockingImprovements
as always, more feedback, comments and more suggestions are welcome.
Please bear with us if something is not very clear, as they are drafts
of unimplemented ideas; if you see something that you'd like to know
more about, please start a discussion or comment on it and we'll try
to polish the explanation and refine the ideas.
Cheers,
Sanne
13 years, 7 months