Keeping track of locked nodes
by Sanne Grinovero
I just noticed that org.infinispan.transaction.LocalTransaction is
keeping track of the Addresses on which locks were acquired.
That surprises me... why should it ever be interested in the
specific Address? I'd expect it to be able to figure that out when
needed, especially since the Address owning the lock might change over
time, so I don't understand why it tracks a specific node.
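To illustrate the alternative I have in mind, something along these lines (a rough sketch; DistributionManager.locate() is real, the class and method names are made up):

import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

import org.infinispan.distribution.DistributionManager;
import org.infinispan.remoting.transport.Address;

// Rough sketch only: recompute the owners of the locked keys from the
// consistent hash when they are actually needed, instead of recording the
// Address at lock-acquisition time. Class and method names are made up.
class LockOwnerLookup {
   Collection<Address> currentLockOwners(Collection<Object> lockedKeys,
                                         DistributionManager dm) {
      Set<Address> owners = new HashSet<Address>();
      for (Object key : lockedKeys) {
         // locate(key) reflects the current hash wheel, so a topology change
         // between lock acquisition and commit is automatically accounted for
         owners.addAll(dm.locate(key));
      }
      return owners;
   }
}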
Cheers,
Sanne
Proposal: ISPN-1394 Manual rehashing in 5.2
by Sanne Grinovero
I think this is an important feature to have soon;
My understanding of it:
We default with the feature off, and newly discovered nodes are
added/removed as usual. With a JMX-operable switch, one can disable
this automatic behaviour:
If a remote node is joining the JGroups view but rehash is off, it
will be added to a to-be-installed view, but this won't be installed
until rehash is enabled again. This gives time to accumulate more view
changes before starting the rehash, and would help a lot when starting
larger clusters.
If the [self] node is booting and joining a cluster where rehash is
off, the start process and any getCache() invocation should block and
wait for rehash to be enabled again. This would of course need to
override the usually low timeouts.
When a node is suspected it's a bit of a different story, as we need to
make sure no data is lost. The principle is the same, but maybe we
should have two flags: one which is a "soft request" to avoid rehashes
when fewer than N members are lost (and refuse N >= numOwners?), and
one which just disables rehashing entirely for when we don't care: data
might be in a cache store, or the data might not be important. Which
reminds me, we should also consider a JMX command to flush the
container to the CacheLoader.
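A rough sketch of what such a switch and its flags could look like (all names here are purely hypothetical, none of this exists in Infinispan today):

import javax.management.MXBean;

// Hypothetical MBean, just to illustrate the proposal above; none of these
// operations exist in Infinispan today.
@MXBean
public interface ManualRehashControl {
   // Re-enable automatic rehashing: install any queued view changes and
   // trigger a single rehash covering all of them at once.
   void enableRehash();

   // Queue further view changes instead of rehashing on each join/leave.
   void disableRehash();

   // "Soft" flag: skip rehashes triggered by fewer than minMembersLost
   // leavers; values >= numOwners should be refused to avoid data loss.
   void setMinMembersForLeaveRehash(int minMembersLost);

   // Flush the data container to the configured CacheLoader, e.g. before
   // disabling rehash while nodes are being suspected.
   void flushToCacheStore();
}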
--Sanne
JBoss Libra
by Galder Zamarreño
Just saw this: https://github.com/wolfc/jboss-libra
We should investigate the possibility of adding this to Infinispan to provide memory-size-based eviction, WDYT?
The performance impact would need to be measured too.
EhCache has apparently done something similar, but from what I heard it's full of hacks to work on different platforms...
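Roughly the idea, assuming Libra builds on java.lang.instrument like other agent-based sizing libraries (this is an illustration, not Libra's actual API):

import java.lang.instrument.Instrumentation;

// Sketch of agent-based object sizing: a premain hook captures the
// Instrumentation instance, and eviction could then weigh entries by size.
public class SizeAgent {
   private static volatile Instrumentation instrumentation;

   public static void premain(String agentArgs, Instrumentation inst) {
      instrumentation = inst;
   }

   // Shallow size of a single object; a deep size would need to walk references.
   public static long shallowSizeOf(Object o) {
      return instrumentation.getObjectSize(o);
   }
}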
Cheers,
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache
DIST.retrieveFromRemoteSource
by Sanne Grinovero
Hello,
in the method:
org.infinispan.distribution.DistributionManagerImpl.retrieveFromRemoteSource(Object,
InvocationContext, boolean)
we have:
List<Address> targets = locate(key);
// if any of the recipients has left the cluster since the command
// was issued, just don't wait for its response
targets.retainAll(rpcManager.getTransport().getMembers());
But then we use ResponseMode.WAIT_FOR_VALID_RESPONSE, which means
we're not going to wait for all responses anyway, and I think we can
assume we'll get a reply from a node which actually is in the cluster.
So is the retainAll call unneeded, and can it be removed? I'm wondering
because it's not safe anyway: it seems very unlikely that the view
changes exactly between the locate(key) and the retainAll, so it's not
something we should be relying on.
I'd rather assume that such a get command is checked, and possibly
dropped, by the receiver.
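If that assumption holds, this sketch is all that would remain of those lines (the same code as above, minus the retainAll):

List<Address> targets = locate(key);
// no targets.retainAll(rpcManager.getTransport().getMembers()) here:
// WAIT_FOR_VALID_RESPONSE returns on the first valid reply, the view can
// change between locate() and the RPC anyway, and a receiver that no
// longer owns the key can check and drop the remote get itself.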
Cheers,
Sanne
again: "no physical address"
by Sanne Grinovero
Hi Bela,
this is the same error we were having in Boston when preparing the
Infinispan nodes for some of the demos. I hadn't seen it for a long
time, but today it returned, just to add a special twist to my
performance tests.
Dan,
when this happened it looked like I had a deadlock: the benchmark
stopped making progress, and it looks like all threads are waiting for
answers. JConsole didn't detect a deadlock, and unfortunately I don't
have any more logs than this from either JGroups or Infinispan (since
it was supposed to be a performance test!).
I'm attaching a thread dump in case it interests you, but I hope it
won't: this is a DIST test with 12 nodes (all in the same VM in this
dump). I didn't have time to inspect it myself as I have to run, and I
think the interesting news here is the "no physical address" messages.
Ideas?
[org.jboss.logging] Logging Provider: org.jboss.logging.Log4jLoggerProvider
[org.jgroups.protocols.UDP] sanne-55119: no physical address for
sanne-53650, dropping message
[org.jgroups.protocols.pbcast.GMS] JOIN(sanne-55119) sent to
sanne-53650 timed out (after 3000 ms), retrying
[org.jgroups.protocols.pbcast.GMS] sanne-55119 already present;
returning existing view [sanne-53650|5] [sanne-53650, sanne-49978,
sanne-27401, sanne-4741, sanne-29196, sanne-55119]
[org.jgroups.protocols.UDP] sanne-39563: no physical address for
sanne-53650, dropping message
[org.jgroups.protocols.pbcast.GMS] JOIN(sanne-39563) sent to
sanne-53650 timed out (after 3000 ms), retrying
[org.jgroups.protocols.pbcast.GMS] sanne-39563 already present;
returning existing view [sanne-53650|6] [sanne-53650, sanne-49978,
sanne-27401, sanne-4741, sanne-29196, sanne-55119, sanne-39563]
[org.jgroups.protocols.UDP] sanne-18071: no physical address for
sanne-39563, dropping message
[org.jgroups.protocols.UDP] sanne-18071: no physical address for
sanne-55119, dropping message
Don't forget to update the XSD when adding configuration
by Pete Muir
This is not done automatically; you'll need to do it yourself. Make sure to add docs too.
Please also remember to update src/test/resources/configs/all.xml with your new elements or attributes. A test validates this file against the schema.
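For anyone wondering what that validation amounts to, a standalone sketch of the same kind of check (paths are assumptions, the real test lives in the test suite):

import java.io.File;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;

// Hypothetical standalone version of the check: validate all.xml against the
// Infinispan XSD; an element or attribute missing from the schema fails it.
public class AllXmlValidation {
   public static void main(String[] args) throws Exception {
      SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
      // schema path is an assumption for illustration
      Schema schema = factory.newSchema(new File("src/main/resources/schema/infinispan-config.xsd"));
      Validator validator = schema.newValidator();
      // throws SAXException if the config file doesn't match the schema
      validator.validate(new StreamSource(new File("src/test/resources/configs/all.xml")));
      System.out.println("all.xml is valid against the schema");
   }
}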
The need for a 5.1.1
by Manik Surtani
I really didn't want to do this, but it looks like a 5.1.1 will be necessary. The biggest (critical, IMO, for 5.1.1) issues I see are:
1. https://issues.jboss.org/browse/ISPN-1786 - I presume this has to do with a bug Mircea spotted: virtual nodes were not being enabled by the config parser. That meant that even when tests enabled virtual nodes, we still saw uneven distribution and hence poor performance (well spotted, Mircea).
2. Related to 1, I don't think there is a JIRA for this yet: change the default number of virtual nodes from 1 to 100 or so, after we profile and analyse the impact of enabling this by default. I'm particularly concerned about (a) memory footprint and (b) the effect on Hot Rod relaying topology information back to clients. Maybe 10 is a saner default as a result.
3. https://issues.jboss.org/browse/ISPN-1788 - config parser out of sync with XSD!
4. https://issues.jboss.org/browse/ISPN-1798 - forceReturnValues parameter in the RemoteCacheManager.getCache() method is ignored!
In addition, we may as well include these "nice to have"s:
https://issues.jboss.org/browse/ISPN-1787
https://issues.jboss.org/browse/ISPN-1793
https://issues.jboss.org/browse/ISPN-1795
https://issues.jboss.org/browse/ISPN-1789
https://issues.jboss.org/browse/ISPN-1784
What do you think? Anything else you feel is crucial for a 5.1.1? I'd like to do this sooner rather than later, so we can still focus on 5.2.0. So please respond asap.
Paul, I'd also like your thoughts on this from an AS7 perspective.
Cheers
Manik
--
Manik Surtani
manik(a)jboss.org
twitter.com/maniksurtani
Lead, Infinispan
http://www.infinispan.org
default value for virtualNodes
by Mircea Markus
Hi,
ATM the default value for virtualNodes is 1. This means that the wheel share each node gets can be very uneven[1] for small clusters (up to ~15 nodes).
Increasing this value even to a small number (10-30) would significantly improve each node's share of the wheel and the chances of a well-balanced data distribution over the cluster.
So I think that increasing the default value would make sense. What are the drawbacks though? I'm thinking performance and Hot Rod wise...
[1] a random example of uneven distribution obtained with radargun
Cluster size: 4 -> ( 15505 13698 5918 4482)
Cluster size: 6 -> ( 8761 7820 17145 8188 12827 4183)
Cluster size: 8 -> ( 8391 6302 10773 22068 3589 200 3050 25211)
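To illustrate why even a handful of virtual nodes helps, here's a quick standalone toy simulation (not Infinispan's ConsistentHash, just a plain hash wheel): 4 nodes with 1, 10 and 30 positions each, counting where 50000 random keys land.

import java.util.Arrays;
import java.util.Random;
import java.util.TreeMap;

// Toy simulation: hash 4 nodes onto a wheel with 1, 10 or 30 positions each,
// then count how many of 50000 random keys land on each node's arc. More
// positions per node gives a much more even share.
public class VirtualNodesDemo {
   public static void main(String[] args) {
      for (int virtualNodes : new int[] { 1, 10, 30 }) {
         Random r = new Random(42);
         int numNodes = 4;
         TreeMap<Integer, Integer> wheel = new TreeMap<Integer, Integer>();
         for (int node = 0; node < numNodes; node++) {
            for (int v = 0; v < virtualNodes; v++) {
               wheel.put(r.nextInt(Integer.MAX_VALUE), node);
            }
         }
         int[] counts = new int[numNodes];
         for (int i = 0; i < 50000; i++) {
            // the owner is the first wheel position at or after the key's
            // hash, wrapping around to the lowest position
            Integer pos = wheel.ceilingKey(r.nextInt(Integer.MAX_VALUE));
            counts[wheel.get(pos != null ? pos : wheel.firstKey())]++;
         }
         System.out.println(virtualNodes + " virtual node(s): " + Arrays.toString(counts));
      }
   }
}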