again: "no physical address"

DIST.retrieveFromRemoteSource

Write skew doubt

Sanne Grinovero

Tuesday, 31 January 2012 Tue, 31 Jan '12

2:45 a.m.

Hi Bela, this is the same error we where having in Boston when preparing the Infinispan nodes for some of the demos. So I didn't see it for a long time, but today it returned especially to add a special twist to my performance tests. Dan, when this happened it looked like I had a deadlock: the benchmark is not making any more progress, it looks like they are all waiting for answers. JConsole didn't detect a deadlock, and unfortunately I'm not having more logs than this from nor JGroups nor Infinispan (since it was supposed to be a performance test!). I'm attaching a threaddump in case it interests you, but I hope not: this is a DIST test with 12 nodes (in the same VM from this dump). I didn't have time to inspect it myself as I have to run, and I think the interesting news here is with the "no physical address" ideas? [org.jboss.logging] Logging Provider: org.jboss.logging.Log4jLoggerProvider [org.jgroups.protocols.UDP] sanne-55119: no physical address for sanne-53650, dropping message [org.jgroups.protocols.pbcast.GMS] JOIN(sanne-55119) sent to sanne-53650 timed out (after 3000 ms), retrying [org.jgroups.protocols.pbcast.GMS] sanne-55119 already present; returning existing view [sanne-53650|5] [sanne-53650, sanne-49978, sanne-27401, sanne-4741, sanne-29196, sanne-55119] [org.jgroups.protocols.UDP] sanne-39563: no physical address for sanne-53650, dropping message [org.jgroups.protocols.pbcast.GMS] JOIN(sanne-39563) sent to sanne-53650 timed out (after 3000 ms), retrying [org.jgroups.protocols.pbcast.GMS] sanne-39563 already present; returning existing view [sanne-53650|6] [sanne-53650, sanne-49978, sanne-27401, sanne-4741, sanne-29196, sanne-55119, sanne-39563] [org.jgroups.protocols.UDP] sanne-18071: no physical address for sanne-39563, dropping message [org.jgroups.protocols.UDP] sanne-18071: no physical address for sanne-55119, dropping message

Attachments:

threadDump.txt (text/plain — 244.8 KB)

Show replies by date

Manik Surtani

Tuesday, 31 January Tue, 31 Jan

7:29 p.m.

I have sporadically seen this before when running some perf tests as well … curious to know what's up. On 30 Jan 2012, at 17:45, Sanne Grinovero wrote:

...

-- Manik Surtani manik(a)jboss.org twitter.com/maniksurtani Lead, Infinispan http://www.infinispan.org

Bela Ban

11:31 p.m.

This happens every now and then, when multiple nodes join at the same time, on the same host and PING has a small num_initial_mbrs. Since 2.8, the identity of a member is not an IP address:port anymore, but a UUID. The UUID has to be mapped to an IP address (and port), and every member maintains a table of UUIDs/IP addresses. This table is populated at startup, but the shipping of the IP address/UUID association is unreliable (in the case of UDP), so packets do get dropped when there are traffic spikes, like concurrent startup, or when the high CPU usage slows down things. If we need to send a unicast message to P, and the table doesn't have a mapping for P, PING multicasts a discovery request, and drops the message. Every member responds with the IP address of P, which is then added to the table. The next time the message is sent (through retransmission), P's IP address will be available, and the unicast send should succeed. Of course, if the multicast or unicast response is dropped too, we'll run this protocol again... and again ... and again, until we finally have a valid IP address for P. On 1/31/12 11:29 AM, Manik Surtani wrote:

...

I have sporadically seen this before when running some perf tests as well … curious to know what's up. On 30 Jan 2012, at 17:45, Sanne Grinovero wrote: > Hi Bela, > this is the same error we where having in Boston when preparing the > Infinispan nodes for some of the demos. So I didn't see it for a long > time, but today it returned especially to add a special twist to my > performance tests. > > Dan, > when this happened it looked like I had a deadlock: the benchmark is > not making any more progress, it looks like they are all waiting for > answers. JConsole didn't detect a deadlock, and unfortunately I'm not > having more logs than this from nor JGroups nor Infinispan (since it > was supposed to be a performance test!). > > I'm attaching a threaddump in case it interests you, but I hope not: > this is a DIST test with 12 nodes (in the same VM from this dump). I > didn't have time to inspect it myself as I have to run, and I think > the interesting news here is with the "no physical address" > > ideas? > > [org.jboss.logging] Logging Provider: org.jboss.logging.Log4jLoggerProvider > [org.jgroups.protocols.UDP] sanne-55119: no physical address for > sanne-53650, dropping message > [org.jgroups.protocols.pbcast.GMS] JOIN(sanne-55119) sent to > sanne-53650 timed out (after 3000 ms), retrying > [org.jgroups.protocols.pbcast.GMS] sanne-55119 already present; > returning existing view [sanne-53650|5] [sanne-53650, sanne-49978, > sanne-27401, sanne-4741, sanne-29196, sanne-55119] > [org.jgroups.protocols.UDP] sanne-39563: no physical address for > sanne-53650, dropping message > [org.jgroups.protocols.pbcast.GMS] JOIN(sanne-39563) sent to > sanne-53650 timed out (after 3000 ms), retrying > [org.jgroups.protocols.pbcast.GMS] sanne-39563 already present; > returning existing view [sanne-53650|6] [sanne-53650, sanne-49978, > sanne-27401, sanne-4741, sanne-29196, sanne-55119, sanne-39563] > [org.jgroups.protocols.UDP] sanne-18071: no physical address for > sanne-39563, dropping message > [org.jgroups.protocols.UDP] sanne-18071: no physical address for > sanne-55119, dropping message > <threadDump.txt>_______________________________________________ > infinispan-dev mailing list > infinispan-dev(a)lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev

-- Bela Ban Lead JGroups (http://www.jgroups.org) JBoss / Red Hat

Dan Berindei

Wednesday, 1 February Wed, 1 Feb

6:55 a.m.

Hi Bela I guess it's pretty clear now... In Sanne's thread dump the main thread is blocked in a cache.put() call after the cluster has supposedly already formed: "org.infinispan.benchmark.Transactional.main()" prio=10 tid=0x00007ff4045de000 nid=0x7c92 in Object.wait() [0x00007ff40919d000] java.lang.Thread.State: TIMED_WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x00000007f61997d0> (a org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher$FutureCollator) at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher$FutureCollator.getResponseList(CommandAwareRpcDispatcher.java:372) ... at org.infinispan.distribution.DistributionManagerImpl.retrieveFromRemoteSource(DistributionManagerImpl.java:169) ... at org.infinispan.CacheSupport.put(CacheSupport.java:52) at org.infinispan.benchmark.Transactional.start(Transactional.java:110) at org.infinispan.benchmark.Transactional.main(Transactional.java:70) State transfer was disabled, so during the cluster startup the nodes only had to communicate with the coordinator and not between them. The put command had to get the old value from another node, so it needed the physical address and had to block until PING would retrieve it. Does PING use RSVP or does it wait for the normal STABLE timeout for retransmission? Note that everything is blocked at this point, we won't send another message in the entire cluster until we got the physical address. I'm sure you've already considered it before, but why not make the physical addresses a part of the view installation message? This should ensure that every node can communicate with every other node by the time the view is installed. I'm also not sure what to make of these lines:

...

>> [org.jgroups.protocols.UDP] sanne-55119: no physical address for >> sanne-53650, dropping message >> [org.jgroups.protocols.pbcast.GMS] JOIN(sanne-55119) sent to >> sanne-53650 timed out (after 3000 ms), retrying

It appears that sanne-55119 knows the logical name of sanne-53650, and the fact that it's coordinator, but not its physical address. Shouldn't all of this information have arrived at the same time? Cheers Dan On Tue, Jan 31, 2012 at 4:31 PM, Bela Ban <bban(a)redhat.com> wrote: > This happens every now and then, when multiple nodes join at the same > time, on the same host and PING has a small num_initial_mbrs. > > Since 2.8, the identity of a member is not an IP address:port anymore, > but a UUID. The UUID has to be mapped to an IP address (and port), and > every member maintains a table of UUIDs/IP addresses. This table is > populated at startup, but the shipping of the IP address/UUID > association is unreliable (in the case of UDP), so packets do get > dropped when there are traffic spikes, like concurrent startup, or when > the high CPU usage slows down things. > > If we need to send a unicast message to P, and the table doesn't have a > mapping for P, PING multicasts a discovery request, and drops the > message. Every member responds with the IP address of P, which is then > added to the table. The next time the message is sent (through > retransmission), P's IP address will be available, and the unicast send > should succeed. > > Of course, if the multicast or unicast response is dropped too, we'll > run this protocol again... and again ... and again, until we finally > have a valid IP address for P. > > > On 1/31/12 11:29 AM, Manik Surtani wrote: >> I have sporadically seen this before when running some perf tests as well … curious to know what's up. >> >> On 30 Jan 2012, at 17:45, Sanne Grinovero wrote: >> >>> Hi Bela, >>> this is the same error we where having in Boston when preparing the >>> Infinispan nodes for some of the demos. So I didn't see it for a long >>> time, but today it returned especially to add a special twist to my >>> performance tests. >>> >>> Dan, >>> when this happened it looked like I had a deadlock: the benchmark is >>> not making any more progress, it looks like they are all waiting for >>> answers. JConsole didn't detect a deadlock, and unfortunately I'm not >>> having more logs than this from nor JGroups nor Infinispan (since it >>> was supposed to be a performance test!). >>> >>> I'm attaching a threaddump in case it interests you, but I hope not: >>> this is a DIST test with 12 nodes (in the same VM from this dump). I >>> didn't have time to inspect it myself as I have to run, and I think >>> the interesting news here is with the "no physical address" >>> >>> ideas? >>> >>> [org.jboss.logging] Logging Provider: org.jboss.logging.Log4jLoggerProvider

...

>>> [org.jgroups.protocols.pbcast.GMS] sanne-55119 already present; >>> returning existing view [sanne-53650|5] [sanne-53650, sanne-49978, >>> sanne-27401, sanne-4741, sanne-29196, sanne-55119] >>> [org.jgroups.protocols.UDP] sanne-39563: no physical address for >>> sanne-53650, dropping message >>> [org.jgroups.protocols.pbcast.GMS] JOIN(sanne-39563) sent to >>> sanne-53650 timed out (after 3000 ms), retrying >>> [org.jgroups.protocols.pbcast.GMS] sanne-39563 already present; >>> returning existing view [sanne-53650|6] [sanne-53650, sanne-49978, >>> sanne-27401, sanne-4741, sanne-29196, sanne-55119, sanne-39563] >>> [org.jgroups.protocols.UDP] sanne-18071: no physical address for >>> sanne-39563, dropping message >>> [org.jgroups.protocols.UDP] sanne-18071: no physical address for >>> sanne-55119, dropping message >>> <threadDump.txt>_______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev(a)lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > -- > Bela Ban > Lead JGroups (http://www.jgroups.org) > JBoss / Red Hat > _______________________________________________ > infinispan-dev mailing list > infinispan-dev(a)lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev

Bela Ban

4:48 p.m.

On 1/31/12 10:55 PM, Dan Berindei wrote:

...

Hi Bela I guess it's pretty clear now... In Sanne's thread dump the main thread is blocked in a cache.put() call after the cluster has supposedly already formed: "org.infinispan.benchmark.Transactional.main()" prio=10 tid=0x00007ff4045de000 nid=0x7c92 in Object.wait() [0x00007ff40919d000] java.lang.Thread.State: TIMED_WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on<0x00000007f61997d0> (a org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher$FutureCollator) at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher$FutureCollator.getResponseList(CommandAwareRpcDispatcher.java:372) ... at org.infinispan.distribution.DistributionManagerImpl.retrieveFromRemoteSource(DistributionManagerImpl.java:169) ... at org.infinispan.CacheSupport.put(CacheSupport.java:52) at org.infinispan.benchmark.Transactional.start(Transactional.java:110) at org.infinispan.benchmark.Transactional.main(Transactional.java:70) State transfer was disabled, so during the cluster startup the nodes only had to communicate with the coordinator and not between them. The put command had to get the old value from another node, so it needed the physical address and had to block until PING would retrieve it.

That's not the way it works; at startup of F, it sends its IP address with the discovery request. Everybody returns its IP address with the discovery response, so even though we have F only talking to A (the coordinator) initially, F will also know the IP addresses of A,B,C,D and E.

...

Does PING use RSVP

No: (1) I don;'t want a dependency of Discovery on RSVP and (2) the discovery is unreliable; discovery requests or responses can get dropped.

...

or does it wait for the normal STABLE timeout for retransmission?

...

Note that everything is blocked at this point, we won't send another message in the entire cluster until we got the physical address.

As I said; this is an exceptional case, probably caused by Sanne starting 12 channels inside the same JVM, at the same time, therefore causing a traffic spike, which results in dropped discovery requests or responses. After than, when F wants to talk to C, it asks the cluster for C's IP address, and that should be a few ms at most.

...

I'm sure you've already considered it before, but why not make the physical addresses a part of the view installation message? This should ensure that every node can communicate with every other node by the time the view is installed.

There's a few reasons: - I don't want to make GMS dependent on logical addresses. GMS is completely independent and shouldn't know about physical addresses - At the time GMS kicks in, it's already too late. Remember, F needs to send a unicast JOIN request to A, but at this point it doesn't yet know A's address - MERGE{2,3} also use discovery to detect sub-partitions to be merged, so discovery needs to be a separate piece of functionality - A View is already big as it is, and I've managed to reduce its size even more, but adding physical addresses would blow up the size of View even more, especially in large clusters

...

I'm also not sure what to make of these lines: >>> [org.jgroups.protocols.UDP] sanne-55119: no physical address for >>> sanne-53650, dropping message >>> [org.jgroups.protocols.pbcast.GMS] JOIN(sanne-55119) sent to >>> sanne-53650 timed out (after 3000 ms), retrying It appears that sanne-55119 knows the logical name of sanne-53650, and the fact that it's coordinator, but not its physical address. Shouldn't all of this information have arrived at the same time?

Hmm, correct. However, the logical names are kept in (a static) UUID.cache and the IP addresses in TP.logical_addr_cache. I suggest to do the following when this happens (can you reproduce this ?): - Before: set enable_diagnostics=true in UDP - probe.sh op=UDP.printLogicalAddressCache // you can replace probe.sh with java -jar jgroups.jar org.jgroups.tests.Probe Here you can dump the logical caches, to see whether this information is absent. You could also enable tracing for PING: probe.sh op=PING.setLevel["trace"] -- Bela Ban Lead JGroups (http://www.jgroups.org) JBoss / Red Hat

Dan Berindei

6:25 p.m.

On Wed, Feb 1, 2012 at 9:48 AM, Bela Ban <bban(a)redhat.com> wrote:

...

On 1/31/12 10:55 PM, Dan Berindei wrote: > Hi Bela > > I guess it's pretty clear now... In Sanne's thread dump the main > thread is blocked in a cache.put() call after the cluster has > supposedly already formed: > > "org.infinispan.benchmark.Transactional.main()" prio=10 > tid=0x00007ff4045de000 nid=0x7c92 in Object.wait() > [0x00007ff40919d000] > java.lang.Thread.State: TIMED_WAITING (on object monitor) > at java.lang.Object.wait(Native Method) > - waiting on<0x00000007f61997d0> (a > org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher$FutureCollator) > at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher$FutureCollator.getResponseList(CommandAwareRpcDispatcher.java:372) > ... > at org.infinispan.distribution.DistributionManagerImpl.retrieveFromRemoteSource(DistributionManagerImpl.java:169) > ... > at org.infinispan.CacheSupport.put(CacheSupport.java:52) > at org.infinispan.benchmark.Transactional.start(Transactional.java:110) > at org.infinispan.benchmark.Transactional.main(Transactional.java:70) > > State transfer was disabled, so during the cluster startup the nodes > only had to communicate with the coordinator and not between them. The > put command had to get the old value from another node, so it needed > the physical address and had to block until PING would retrieve it. That's not the way it works; at startup of F, it sends its IP address with the discovery request. Everybody returns its IP address with the discovery response, so even though we have F only talking to A (the coordinator) initially, F will also know the IP addresses of A,B,C,D and E.

Ok, I stand corrected... since we start all the nodes on the same thread, each of them should reply to the discovery request of the next nodes. However, num_initial_members was set to 3 (the Infinispan default). Could that make PING not wait for all the responses? If it's like that, then I suggest we set a (much) higher num_initial_members and a lower timeout in the default configuration.

...

> Does PING use RSVP No: (1) I don;'t want a dependency of Discovery on RSVP and (2) the discovery is unreliable; discovery requests or responses can get dropped.

Right, I keep forgetting that every protocol is optional!

...

> or does it wait for the normal STABLE timeout for retransmission? > Note that everything is blocked at this point, we > won't send another message in the entire cluster until we got the physical address. As I said; this is an exceptional case, probably caused by Sanne starting 12 channels inside the same JVM, at the same time, therefore causing a traffic spike, which results in dropped discovery requests or responses.

Bela, we create the caches on a single thread, so we never have more than one node joining at the same time. At most we could have some extra activity if one node can't join the existing cluster and starts a separate partition, but hardly enough to cause congestion.

...

After than, when F wants to talk to C, it asks the cluster for C's IP address, and that should be a few ms at most.

Ok, so when F wanted to send the ClusteredGetCommand request to C, PING got the physical address right away. But the ClusteredGetCommand had to wait for STABLE to kick in and for C to ask for retransmission (because we didn't send any other messages). Maybe *we* should use RSVP for our ClusteredGetCommands, since those can never block... Actually, we don't want to retransmit the request if we already got a response from another node, so it would be best if we could ask for retransmission of a particular request explicitly ;-) I wonder if we could also decrease desired_avg_gossip and stability_delay in STABLE. After all, an extra STABLE round can't slow us when we're not doing anything, and when we are busy we're going to hit the max_bytes limit much sooner than the desired_avg_gossip time limit anyway.

...

> I'm sure you've already considered it before, but why not make the > physical addresses a part of the view installation message? This > should ensure that every node can communicate with every other node by > the time the view is installed. There's a few reasons: - I don't want to make GMS dependent on logical addresses. GMS is completely independent and shouldn't know about physical addresses - At the time GMS kicks in, it's already too late. Remember, F needs to send a unicast JOIN request to A, but at this point it doesn't yet know A's address - MERGE{2,3} also use discovery to detect sub-partitions to be merged, so discovery needs to be a separate piece of functionality - A View is already big as it is, and I've managed to reduce its size even more, but adding physical addresses would blow up the size of View even more, especially in large clusters

Thanks for the explanation.

...

> I'm also not sure what to make of these lines: > >>>> [org.jgroups.protocols.UDP] sanne-55119: no physical address for >>>> sanne-53650, dropping message >>>> [org.jgroups.protocols.pbcast.GMS] JOIN(sanne-55119) sent to >>>> sanne-53650 timed out (after 3000 ms), retrying > > It appears that sanne-55119 knows the logical name of sanne-53650, and > the fact that it's coordinator, but not its physical address. > Shouldn't all of this information have arrived at the same time? Hmm, correct. However, the logical names are kept in (a static) UUID.cache and the IP addresses in TP.logical_addr_cache.

Ah, so if we have 12 nodes in the same VM they automatically know each other's logical name - they don't need PING at all! Does the logical cache get cleared on channel stop? I think that would explain another weird thing I was seeing in the test suite logs, sometimes everyone in a cluster would suddenly forget everyone else's logical name and start logging UUIDs.

...

I suggest to do the following when this happens (can you reproduce this ?): - Before: set enable_diagnostics=true in UDP - probe.sh op=UDP.printLogicalAddressCache // you can replace probe.sh with java -jar jgroups.jar org.jgroups.tests.Probe Here you can dump the logical caches, to see whether this information is absent. You could also enable tracing for PING: probe.sh op=PING.setLevel["trace"]

This is running the Transactional benchmark, so it would be simpler if we enabled PING trace in the configuration and disabled it before the actual benchmark starts. I'm going to try it myself :) Cheers Dan

Bela Ban

Thursday, 2 February Thu, 2 Feb

12:18 a.m.

On 2/1/12 10:25 AM, Dan Berindei wrote:

...

> That's not the way it works; at startup of F, it sends its IP address > with the discovery request. Everybody returns its IP address with the > discovery response, so even though we have F only talking to A (the > coordinator) initially, F will also know the IP addresses of A,B,C,D and E. > Ok, I stand corrected... since we start all the nodes on the same thread, each of them should reply to the discovery request of the next nodes.

Hmm, can you reproduce this every time ? If so, can you send me the program so I can run it here ?

...

However, num_initial_members was set to 3 (the Infinispan default). Could that make PING not wait for all the responses? If it's like that, then I suggest we set a (much) higher num_initial_members and a lower timeout in the default configuration.

Yes, the discovery could return quickly, but the responses would even be processed if they were received later, so I don't think that's the issue. The initial discovery should discover *all* IP addresses, later triggering a discovery because an IP address wasn't found should always be the exceptional case ! If you start members in turn, then they should easily form a cluster and not even merge. Here's what can happen on a merge: - The view is A|1={A,B}, both A and B have IP addresses for A and B - The view splits into A|2={A} and B|2={B} - A now marks B's IP address as removable and B marks A's IP address as removable - If the cache grows to over 500 entries (TP.logical_addr_cache_max_size) or TP.logical_addr_cache_expiration milliseconds elapse (whichever comes first), the entries marked as removable are removed - If, *before that* the merge view A|3={A,B} is installed, A unmarks B and B unmarks A, so the entries won't get removed So a hypothesis of how those IP addresses get removed could be that the cluster had a couple of merges, that didn't heal for 2 minutes (?) hard to believe though... We have to get to the bottom of this, so it would be great if you had a program that reproduced this, that I could send myself. The main question is why the IP address for the target is gone and/or why the IP address wasn't received in the first place. In any case, replacing MERGE2 with MERGE3 might help a bit, as MERGE3 [1] periodically broadcasts IP address/logical name and logical address: "An INFO message carries the logical name and physical address of a member. Compared to MERGE2, this allows us to immediately send messages to newly merged members, and not have to solicit this information first. " (copied from the documentation)

...

>> Note that everything is blocked at this point, we >> won't send another message in the entire cluster until we got the physical address.

Understood. Let me see if I can block sending of the message for a max time (say 2 seconds) until I get the IP address. Not very nice, and I prefer a different approach (plus we need to see why this happens in the first place anyway)...

...

> As I said; this is an exceptional case, probably caused by Sanne > starting 12 channels inside the same JVM, at the same time, therefore > causing a traffic spike, which results in dropped discovery requests or > responses. > Bela, we create the caches on a single thread, so we never have more than one node joining at the same time. At most we could have some extra activity if one node can't join the existing cluster and starts a separate partition, but hardly enough to cause congestion.

Hmm, does indeed not sound like an issue...

...

> After than, when F wants to talk to C, it asks the cluster for C's IP > address, and that should be a few ms at most. > Ok, so when F wanted to send the ClusteredGetCommand request to C, PING got the physical address right away. But the ClusteredGetCommand had to wait for STABLE to kick in and for C to ask for retransmission (because we didn't send any other messages).

Yep. Before I implement some blocking until we have the IP address, or a timeout elapses, I'd like to try to get to the bottom of this problem first !

...

Maybe *we* should use RSVP for our ClusteredGetCommands, since those can never block... Actually, we don't want to retransmit the request if we already got a response from another node, so it would be best if we could ask for retransmission of a particular request explicitly ;-)

I'd rather implement the blocking approach above ! :-)

...

I wonder if we could also decrease desired_avg_gossip and stability_delay in STABLE. After all, an extra STABLE round can't slow us when we're not doing anything, and when we are busy we're going to hit the max_bytes limit much sooner than the desired_avg_gossip time limit anyway.

I don't think this is a good idea as it will generate more traffic. The stable task is not skipped when we have a lot of traffic, so this will compound the issue.

...

>> I'm also not sure what to make of these lines: >> >>>>> [org.jgroups.protocols.UDP] sanne-55119: no physical address for >>>>> sanne-53650, dropping message >>>>> [org.jgroups.protocols.pbcast.GMS] JOIN(sanne-55119) sent to >>>>> sanne-53650 timed out (after 3000 ms), retrying >> >> It appears that sanne-55119 knows the logical name of sanne-53650, and >> the fact that it's coordinator, but not its physical address. >> Shouldn't all of this information have arrived at the same time? > > Hmm, correct. However, the logical names are kept in (a static) > UUID.cache and the IP addresses in TP.logical_addr_cache. > Ah, so if we have 12 nodes in the same VM they automatically know each other's logical name - they don't need PING at all!

Yes. Note that logical names are not the problem; even if we evict some logical name from the cache (and we do this only for removed members), JGroups will still work as it only needs UUIDs and IP addresses.

...

Does the logical cache get cleared on channel stop? I think that would explain another weird thing I was seeing in the test suite logs, sometimes everyone in a cluster would suddenly forget everyone else's logical name and start logging UUIDs.

On a view change, we remove all entries which are *not* in the new view. However, 'removing' is again simply marking those members as 'removable', and only if the cache grows beyond 500 (-Djgroups.uuid_cache.max_entries=500) entries will all entries older than 5 seconds (-Djgroups.uuid_cache.max_age=5000) be removed. (There is no separate reaper task running for this). So, yes, this can happen, but on the next discovery round, we'll have the correct values. Again, as I said, UUID.cache is not as important as TP.logical_addr_cache.

...

This is running the Transactional benchmark, so it would be simpler if we enabled PING trace in the configuration and disabled it before the actual benchmark starts. I'm going to try it myself :)

How do you run 12 instances ? Did you change something in the config ? I'd be interested in trying the *exact* same config you're running, to see what's going on ! [1] http://www.jgroups.org/manual-3.x/html/protlist.html#MERGE3 -- Bela Ban Lead JGroups (http://www.jgroups.org) JBoss / Red Hat

Sanne Grinovero

12:52 a.m.

On 1 February 2012 15:18, Bela Ban <bban(a)redhat.com> wrote:

...

On 2/1/12 10:25 AM, Dan Berindei wrote: >> That's not the way it works; at startup of F, it sends its IP address >> with the discovery request. Everybody returns its IP address with the >> discovery response, so even though we have F only talking to A (the >> coordinator) initially, F will also know the IP addresses of A,B,C,D and E. >> > > Ok, I stand corrected... since we start all the nodes on the same > thread, each of them should reply to the discovery request of the next nodes. Hmm, can you reproduce this every time ? If so, can you send me the program so I can run it here ? > However, num_initial_members was set to 3 (the Infinispan default). > Could that make PING not wait for all the responses? If it's like > that, then I suggest we set a (much) higher num_initial_members and a > lower timeout in the default configuration. Yes, the discovery could return quickly, but the responses would even be processed if they were received later, so I don't think that's the issue. The initial discovery should discover *all* IP addresses, later triggering a discovery because an IP address wasn't found should always be the exceptional case ! If you start members in turn, then they should easily form a cluster and not even merge. Here's what can happen on a merge: - The view is A|1={A,B}, both A and B have IP addresses for A and B - The view splits into A|2={A} and B|2={B} - A now marks B's IP address as removable and B marks A's IP address as removable - If the cache grows to over 500 entries (TP.logical_addr_cache_max_size) or TP.logical_addr_cache_expiration milliseconds elapse (whichever comes first), the entries marked as removable are removed - If, *before that* the merge view A|3={A,B} is installed, A unmarks B and B unmarks A, so the entries won't get removed So a hypothesis of how those IP addresses get removed could be that the cluster had a couple of merges, that didn't heal for 2 minutes (?) hard to believe though... We have to get to the bottom of this, so it would be great if you had a program that reproduced this, that I could send myself. The main question is why the IP address for the target is gone and/or why the IP address wasn't received in the first place. In any case, replacing MERGE2 with MERGE3 might help a bit, as MERGE3 [1] periodically broadcasts IP address/logical name and logical address: "An INFO message carries the logical name and physical address of a member. Compared to MERGE2, this allows us to immediately send messages to newly merged members, and not have to solicit this information first. " (copied from the documentation) >>> Note that everything is blocked at this point, we >>> won't send another message in the entire cluster until we got the physical address. Understood. Let me see if I can block sending of the message for a max time (say 2 seconds) until I get the IP address. Not very nice, and I prefer a different approach (plus we need to see why this happens in the first place anyway)... >> As I said; this is an exceptional case, probably caused by Sanne >> starting 12 channels inside the same JVM, at the same time, therefore >> causing a traffic spike, which results in dropped discovery requests or >> responses. >> > > Bela, we create the caches on a single thread, so we never have more > than one node joining at the same time. > At most we could have some extra activity if one node can't join the > existing cluster and starts a separate partition, but hardly enough to > cause congestion. Hmm, does indeed not sound like an issue... >> After than, when F wants to talk to C, it asks the cluster for C's IP >> address, and that should be a few ms at most. >> > > Ok, so when F wanted to send the ClusteredGetCommand request to C, > PING got the physical address right away. But the ClusteredGetCommand > had to wait for STABLE to kick in and for C to ask for retransmission > (because we didn't send any other messages). Yep. Before I implement some blocking until we have the IP address, or a timeout elapses, I'd like to try to get to the bottom of this problem first ! > Maybe *we* should use RSVP for our ClusteredGetCommands, since those > can never block... Actually, we don't want to retransmit the request > if we already got a response from another node, so it would be best if > we could ask for retransmission of a particular request explicitly ;-) I'd rather implement the blocking approach above ! :-) > I wonder if we could also decrease desired_avg_gossip and > stability_delay in STABLE. After all, an extra STABLE round can't slow > us when we're not doing anything, and when we are busy we're going to > hit the max_bytes limit much sooner than the desired_avg_gossip time > limit anyway. I don't think this is a good idea as it will generate more traffic. The stable task is not skipped when we have a lot of traffic, so this will compound the issue. >>> I'm also not sure what to make of these lines: >>> >>>>>> [org.jgroups.protocols.UDP] sanne-55119: no physical address for >>>>>> sanne-53650, dropping message >>>>>> [org.jgroups.protocols.pbcast.GMS] JOIN(sanne-55119) sent to >>>>>> sanne-53650 timed out (after 3000 ms), retrying >>> >>> It appears that sanne-55119 knows the logical name of sanne-53650, and >>> the fact that it's coordinator, but not its physical address. >>> Shouldn't all of this information have arrived at the same time? >> >> Hmm, correct. However, the logical names are kept in (a static) >> UUID.cache and the IP addresses in TP.logical_addr_cache. >> > > Ah, so if we have 12 nodes in the same VM they automatically know each > other's logical name - they don't need PING at all! Yes. Note that logical names are not the problem; even if we evict some logical name from the cache (and we do this only for removed members), JGroups will still work as it only needs UUIDs and IP addresses. > Does the logical cache get cleared on channel stop? I think that would > explain another weird thing I was seeing in the test suite logs, > sometimes everyone in a cluster would suddenly forget everyone else's > logical name and start logging UUIDs. On a view change, we remove all entries which are *not* in the new view. However, 'removing' is again simply marking those members as 'removable', and only if the cache grows beyond 500 (-Djgroups.uuid_cache.max_entries=500) entries will all entries older than 5 seconds (-Djgroups.uuid_cache.max_age=5000) be removed. (There is no separate reaper task running for this). So, yes, this can happen, but on the next discovery round, we'll have the correct values. Again, as I said, UUID.cache is not as important as TP.logical_addr_cache. > This is running the Transactional benchmark, so it would be simpler if > we enabled PING trace in the configuration and disabled it before the > actual benchmark starts. I'm going to try it myself :) How do you run 12 instances ? Did you change something in the config ? I'd be interested in trying the *exact* same config you're running, to see what's going on !

Pushed it for you, committing the exact configuration changes as well: git clone git://github.com/Sanne/InfinispanStartupBenchmark.git cd InfinispanStartupBenchmark git co my sh bench.sh If you look into bench.sh (as it is at https://github.com/Sanne/InfinispanStartupBenchmark/blob/my/bench.sh ) the lines 9, 15, 28 should be the most interesting. You need to run with Infinispan's default configuration to reproduce the issue, but I wouldn't mind you commenting on https://github.com/Sanne/InfinispanStartupBenchmark/blob/my/benchmark/src... as well. That's what I'm now using as default for further performance tests. Cheers, Sanne

...

[1] http://www.jgroups.org/manual-3.x/html/protlist.html#MERGE3 -- Bela Ban Lead JGroups (http://www.jgroups.org) JBoss / Red Hat _______________________________________________ infinispan-dev mailing list infinispan-dev(a)lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev

Bela Ban

1:40 a.m.

Your benchmark is giving me the creeps ! First, which version of JGroups / Infinispan does this use ? Second, is there a way to start this in an IDE rather than through maven ? Third, I don't think bench-jgroups.xml is picked up at all ! How do I make a change to bench-jgroups.xml and have Transactional *use* it ? This maven crap hides so much, I never know what's going on under the covers !@$@ Do I have to do mvn install or mvn package when I make a change to bench-jgroups.xml ? I don't think I'll ever switch to this f*cking piece of shit ! Please ping when as soon as you've calmed down... :-) Cheers, On 2/1/12 4:52 PM, Sanne Grinovero wrote:

...

On 1 February 2012 15:18, Bela Ban<bban(a)redhat.com> wrote: > > > On 2/1/12 10:25 AM, Dan Berindei wrote: > >>> That's not the way it works; at startup of F, it sends its IP address >>> with the discovery request. Everybody returns its IP address with the >>> discovery response, so even though we have F only talking to A (the >>> coordinator) initially, F will also know the IP addresses of A,B,C,D and E. >>> >> >> Ok, I stand corrected... since we start all the nodes on the same >> thread, each of them should reply to the discovery request of the next nodes. > > > Hmm, can you reproduce this every time ? If so, can you send me the > program so I can run it here ? > > >> However, num_initial_members was set to 3 (the Infinispan default). >> Could that make PING not wait for all the responses? If it's like >> that, then I suggest we set a (much) higher num_initial_members and a >> lower timeout in the default configuration. > > > Yes, the discovery could return quickly, but the responses would even be > processed if they were received later, so I don't think that's the issue. > > The initial discovery should discover *all* IP addresses, later > triggering a discovery because an IP address wasn't found should always > be the exceptional case ! > > If you start members in turn, then they should easily form a cluster and > not even merge. Here's what can happen on a merge: > - The view is A|1={A,B}, both A and B have IP addresses for A and B > - The view splits into A|2={A} and B|2={B} > - A now marks B's IP address as removable and B marks A's IP address as > removable > - If the cache grows to over 500 entries > (TP.logical_addr_cache_max_size) or TP.logical_addr_cache_expiration > milliseconds elapse (whichever comes first), the entries marked as > removable are removed > - If, *before that* the merge view A|3={A,B} is installed, A unmarks B > and B unmarks A, so the entries won't get removed > > So a hypothesis of how those IP addresses get removed could be that the > cluster had a couple of merges, that didn't heal for 2 minutes (?) hard > to believe though... > > We have to get to the bottom of this, so it would be great if you had a > program that reproduced this, that I could send myself. The main > question is why the IP address for the target is gone and/or why the IP > address wasn't received in the first place. > > In any case, replacing MERGE2 with MERGE3 might help a bit, as MERGE3 > [1] periodically broadcasts IP address/logical name and logical address: > "An INFO message carries the logical name and physical address of a > member. Compared to MERGE2, this allows us to immediately send messages > to newly merged members, and not have to solicit this information first. > " (copied from the documentation) > > > >>>> Â Note that everything is blocked at this point, we >>>> won't send another message in the entire cluster until we got the physical address. > > > Understood. Let me see if I can block sending of the message for a max > time (say 2 seconds) until I get the IP address. Not very nice, and I > prefer a different approach (plus we need to see why this happens in the > first place anyway)... > > >>> As I said; this is an exceptional case, probably caused by Sanne >>> starting 12 channels inside the same JVM, at the same time, therefore >>> causing a traffic spike, which results in dropped discovery requests or >>> responses. >>> >> >> Bela, we create the caches on a single thread, so we never have more >> than one node joining at the same time. >> At most we could have some extra activity if one node can't join the >> existing cluster and starts a separate partition, but hardly enough to >> cause congestion. > > > Hmm, does indeed not sound like an issue... > > >>> After than, when F wants to talk to C, it asks the cluster for C's IP >>> address, and that should be a few ms at most. >>> >> >> Ok, so when F wanted to send the ClusteredGetCommand request to C, >> PING got the physical address right away. But the ClusteredGetCommand >> had to wait for STABLE to kick in and for C to ask for retransmission >> (because we didn't send any other messages). > > > Yep. Before I implement some blocking until we have the IP address, or a > timeout elapses, I'd like to try to get to the bottom of this problem > first ! > > >> Maybe *we* should use RSVP for our ClusteredGetCommands, since those >> can never block... Actually, we don't want to retransmit the request >> if we already got a response from another node, so it would be best if >> we could ask for retransmission of a particular request explicitly ;-) > > > I'd rather implement the blocking approach above ! :-) > > >> I wonder if we could also decrease desired_avg_gossip and >> stability_delay in STABLE. After all, an extra STABLE round can't slow >> us when we're not doing anything, and when we are busy we're going to >> hit the max_bytes limit much sooner than the desired_avg_gossip time >> limit anyway. > > > I don't think this is a good idea as it will generate more traffic. The > stable task is not skipped when we have a lot of traffic, so this will > compound the issue. > > >>>> I'm also not sure what to make of these lines: >>>> >>>>>>> [org.jgroups.protocols.UDP] sanne-55119: no physical address for >>>>>>> sanne-53650, dropping message >>>>>>> [org.jgroups.protocols.pbcast.GMS] JOIN(sanne-55119) sent to >>>>>>> sanne-53650 timed out (after 3000 ms), retrying >>>> >>>> It appears that sanne-55119 knows the logical name of sanne-53650, and >>>> the fact that it's coordinator, but not its physical address. >>>> Shouldn't all of this information have arrived at the same time? >>> >>> Hmm, correct. However, the logical names are kept in (a static) >>> UUID.cache and the IP addresses in TP.logical_addr_cache. >>> >> >> Ah, so if we have 12 nodes in the same VM they automatically know each >> other's logical name - they don't need PING at all! > > > Yes. Note that logical names are not the problem; even if we evict some > logical name from the cache (and we do this only for removed members), > JGroups will still work as it only needs UUIDs and IP addresses. > > >> Does the logical cache get cleared on channel stop? I think that would >> explain another weird thing I was seeing in the test suite logs, >> sometimes everyone in a cluster would suddenly forget everyone else's >> logical name and start logging UUIDs. > > > On a view change, we remove all entries which are *not* in the new view. > However, 'removing' is again simply marking those members as > 'removable', and only if the cache grows beyond 500 > (-Djgroups.uuid_cache.max_entries=500) entries will all entries older > than 5 seconds (-Djgroups.uuid_cache.max_age=5000) be removed. (There is > no separate reaper task running for this). > > So, yes, this can happen, but on the next discovery round, we'll have > the correct values. Again, as I said, UUID.cache is not as important as > TP.logical_addr_cache. > > >> This is running the Transactional benchmark, so it would be simpler if >> we enabled PING trace in the configuration and disabled it before the >> actual benchmark starts. I'm going to try it myself :) > > > How do you run 12 instances ? Did you change something in the config ? > I'd be interested in trying the *exact* same config you're running, to > see what's going on ! Pushed it for you, committing the exact configuration changes as well: git clone git://github.com/Sanne/InfinispanStartupBenchmark.git cd InfinispanStartupBenchmark git co my sh bench.sh If you look into bench.sh (as it is at https://github.com/Sanne/InfinispanStartupBenchmark/blob/my/bench.sh ) the lines 9, 15, 28 should be the most interesting. You need to run with Infinispan's default configuration to reproduce the issue, but I wouldn't mind you commenting on https://github.com/Sanne/InfinispanStartupBenchmark/blob/my/benchmark/src... as well. That's what I'm now using as default for further performance tests. Cheers, Sanne > > > [1] http://www.jgroups.org/manual-3.x/html/protlist.html#MERGE3 > > -- > Bela Ban > Lead JGroups (http://www.jgroups.org) > JBoss / Red Hat > _______________________________________________ > infinispan-dev mailing list > infinispan-dev(a)lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev

-- Bela Ban Lead JGroups (http://www.jgroups.org) JBoss / Red Hat

Sanne Grinovero

2:12 a.m.

On 1 February 2012 16:40, Bela Ban <bban(a)redhat.com> wrote:

...

Your benchmark is giving me the creeps !

Manik was the original author, I've only been adapting it slightly to identify performance issues. I wouldn't have used Maven either, but it's serving me well especially since it turns out I have to frequently change JGroups version ;-)

...

First, which version of JGroups / Infinispan does this use ? Second, is there a way to start this in an IDE rather than through maven ? Third, I don't think bench-jgroups.xml is picked up at all ! How do I make a change to bench-jgroups.xml and have Transactional *use* it ?

1) The intention is to compare Infinispan versions, and see how we make progress, so the JGroups version is defined as each different Infinispan version was released with, or in case of a SNAPSHOT the version it's currently using. So, assuming you have a fresh snapshot of Infinispan 5.1.0-SNAPSHOT, it's JGroups 3.0.4.Final 2) As I pointed out in the previous mail, I've reconfigured it for you to not pick up the bench-jgroups.xml but go with Infinispan's default UDP configuration. You have to comment line 15 of bench.sh to use the bench-jgroups.xml, but then I can't reproduce the issue anymore.

...

This maven crap hides so much, I never know what's going on under the covers !@$@ Do I have to do mvn install or mvn package when I make a change to bench-jgroups.xml ?

Aren't you using bench.sh as I pointed out in the script in my previous mail? It does mvn install before running it. But it's also configured to NOT use bench-jgroups.xml, but rather jgroups-udp.xml.

...

I don't think I'll ever switch to this f*cking piece of shit !

I don't think Maven is to be blamed today! What's wrong? Anyway one of the nice things of this little benchmark is exactly that it it's a single class with a main file, so you can just import it in you IDE and run. Any IDE will pick the correct dependencies, thanks to Maven. Just that if you do, it will use the default test properties as hardcoded in the test class org.infinispan.benchmark.Transactional: please set the same environment variable as bench.sh does, unless you don't want to run my same configuration. Just ping me for any doubt. Cheers, Sanne

...

Please ping when as soon as you've calmed down... :-) Cheers, On 2/1/12 4:52 PM, Sanne Grinovero wrote: > > On 1 February 2012 15:18, Bela Ban<bban(a)redhat.com> wrote: >> >> >> >> On 2/1/12 10:25 AM, Dan Berindei wrote: >> >>>> That's not the way it works; at startup of F, it sends its IP address >>>> with the discovery request. Everybody returns its IP address with the >>>> discovery response, so even though we have F only talking to A (the >>>> coordinator) initially, F will also know the IP addresses of A,B,C,D >>>> and E. >>>> >>> >>> Ok, I stand corrected... since we start all the nodes on the same >>> thread, each of them should reply to the discovery request of the next >>> nodes. >> >> >> >> Hmm, can you reproduce this every time ? If so, can you send me the >> program so I can run it here ? >> >> >>> However, num_initial_members was set to 3 (the Infinispan default). >>> Could that make PING not wait for all the responses? If it's like >>> that, then I suggest we set a (much) higher num_initial_members and a >>> lower timeout in the default configuration. >> >> >> >> Yes, the discovery could return quickly, but the responses would even be >> processed if they were received later, so I don't think that's the issue. >> >> The initial discovery should discover *all* IP addresses, later >> triggering a discovery because an IP address wasn't found should always >> be the exceptional case ! >> >> If you start members in turn, then they should easily form a cluster and >> not even merge. Here's what can happen on a merge: >> - The view is A|1={A,B}, both A and B have IP addresses for A and B >> - The view splits into A|2={A} and B|2={B} >> - A now marks B's IP address as removable and B marks A's IP address as >> removable >> - If the cache grows to over 500 entries >> (TP.logical_addr_cache_max_size) or TP.logical_addr_cache_expiration >> milliseconds elapse (whichever comes first), the entries marked as >> removable are removed >> - If, *before that* the merge view A|3={A,B} is installed, A unmarks B >> and B unmarks A, so the entries won't get removed >> >> So a hypothesis of how those IP addresses get removed could be that the >> cluster had a couple of merges, that didn't heal for 2 minutes (?) hard >> to believe though... >> >> We have to get to the bottom of this, so it would be great if you had a >> program that reproduced this, that I could send myself. The main >> question is why the IP address for the target is gone and/or why the IP >> address wasn't received in the first place. >> >> In any case, replacing MERGE2 with MERGE3 might help a bit, as MERGE3 >> [1] periodically broadcasts IP address/logical name and logical address: >> "An INFO message carries the logical name and physical address of a >> member. Compared to MERGE2, this allows us to immediately send messages >> to newly merged members, and not have to solicit this information first. >> " (copied from the documentation) >> >> >> >>>>> Â Note that everything is blocked at this point, we >>>>> >>>>> won't send another message in the entire cluster until we got the >>>>> physical address. >> >> >> >> Understood. Let me see if I can block sending of the message for a max >> time (say 2 seconds) until I get the IP address. Not very nice, and I >> prefer a different approach (plus we need to see why this happens in the >> first place anyway)... >> >> >>>> As I said; this is an exceptional case, probably caused by Sanne >>>> starting 12 channels inside the same JVM, at the same time, therefore >>>> causing a traffic spike, which results in dropped discovery requests or >>>> responses. >>>> >>> >>> Bela, we create the caches on a single thread, so we never have more >>> than one node joining at the same time. >>> At most we could have some extra activity if one node can't join the >>> existing cluster and starts a separate partition, but hardly enough to >>> cause congestion. >> >> >> >> Hmm, does indeed not sound like an issue... >> >> >>>> After than, when F wants to talk to C, it asks the cluster for C's IP >>>> address, and that should be a few ms at most. >>>> >>> >>> Ok, so when F wanted to send the ClusteredGetCommand request to C, >>> PING got the physical address right away. But the ClusteredGetCommand >>> had to wait for STABLE to kick in and for C to ask for retransmission >>> (because we didn't send any other messages). >> >> >> >> Yep. Before I implement some blocking until we have the IP address, or a >> timeout elapses, I'd like to try to get to the bottom of this problem >> first ! >> >> >>> Maybe *we* should use RSVP for our ClusteredGetCommands, since those >>> can never block... Actually, we don't want to retransmit the request >>> if we already got a response from another node, so it would be best if >>> we could ask for retransmission of a particular request explicitly ;-) >> >> >> >> I'd rather implement the blocking approach above ! :-) >> >> >>> I wonder if we could also decrease desired_avg_gossip and >>> stability_delay in STABLE. After all, an extra STABLE round can't slow >>> us when we're not doing anything, and when we are busy we're going to >>> hit the max_bytes limit much sooner than the desired_avg_gossip time >>> limit anyway. >> >> >> >> I don't think this is a good idea as it will generate more traffic. The >> stable task is not skipped when we have a lot of traffic, so this will >> compound the issue. >> >> >>>>> I'm also not sure what to make of these lines: >>>>> >>>>>>>> [org.jgroups.protocols.UDP] sanne-55119: no physical address for >>>>>>>> sanne-53650, dropping message >>>>>>>> [org.jgroups.protocols.pbcast.GMS] JOIN(sanne-55119) sent to >>>>>>>> sanne-53650 timed out (after 3000 ms), retrying >>>>> >>>>> >>>>> It appears that sanne-55119 knows the logical name of sanne-53650, and >>>>> the fact that it's coordinator, but not its physical address. >>>>> Shouldn't all of this information have arrived at the same time? >>>> >>>> >>>> Hmm, correct. However, the logical names are kept in (a static) >>>> UUID.cache and the IP addresses in TP.logical_addr_cache. >>>> >>> >>> Ah, so if we have 12 nodes in the same VM they automatically know each >>> other's logical name - they don't need PING at all! >> >> >> >> Yes. Note that logical names are not the problem; even if we evict some >> logical name from the cache (and we do this only for removed members), >> JGroups will still work as it only needs UUIDs and IP addresses. >> >> >>> Does the logical cache get cleared on channel stop? I think that would >>> explain another weird thing I was seeing in the test suite logs, >>> sometimes everyone in a cluster would suddenly forget everyone else's >>> logical name and start logging UUIDs. >> >> >> >> On a view change, we remove all entries which are *not* in the new view. >> However, 'removing' is again simply marking those members as >> 'removable', and only if the cache grows beyond 500 >> (-Djgroups.uuid_cache.max_entries=500) entries will all entries older >> than 5 seconds (-Djgroups.uuid_cache.max_age=5000) be removed. (There is >> no separate reaper task running for this). >> >> So, yes, this can happen, but on the next discovery round, we'll have >> the correct values. Again, as I said, UUID.cache is not as important as >> TP.logical_addr_cache. >> >> >>> This is running the Transactional benchmark, so it would be simpler if >>> we enabled PING trace in the configuration and disabled it before the >>> actual benchmark starts. I'm going to try it myself :) >> >> >> >> How do you run 12 instances ? Did you change something in the config ? >> I'd be interested in trying the *exact* same config you're running, to >> see what's going on ! > > > Pushed it for you, committing the exact configuration changes as well: > > git clone git://github.com/Sanne/InfinispanStartupBenchmark.git > cd InfinispanStartupBenchmark > git co my > sh bench.sh > > If you look into bench.sh > (as it is at > https://github.com/Sanne/InfinispanStartupBenchmark/blob/my/bench.sh > ) > the lines 9, 15, 28 should be the most interesting. > > You need to run with Infinispan's default configuration to reproduce the > issue, > but I wouldn't mind you commenting on > > https://github.com/Sanne/InfinispanStartupBenchmark/blob/my/benchmark/src... > as well. That's what I'm now using as default for further performance > tests. > > Cheers, > Sanne > >> >> >> [1] http://www.jgroups.org/manual-3.x/html/protlist.html#MERGE3 >> >> -- >> Bela Ban >> Lead JGroups (http://www.jgroups.org) >> JBoss / Red Hat >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev(a)lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev -- Bela Ban Lead JGroups (http://www.jgroups.org) JBoss / Red Hat

Dan Berindei

3:46 p.m.

...

On 1 February 2012 16:40, Bela Ban <bban(a)redhat.com> wrote: > Your benchmark is giving me the creeps ! Manik was the original author, I've only been adapting it slightly to identify performance issues. I wouldn't have used Maven either, but it's serving me well especially since it turns out I have to frequently change JGroups version ;-) > First, which version of JGroups / Infinispan does this use ? Second, is > there a way to start this in an IDE rather than through maven ? Third, I > don't think bench-jgroups.xml is picked up at all ! How do I make a change > to bench-jgroups.xml and have Transactional *use* it ? 1) The intention is to compare Infinispan versions, and see how we make progress, so the JGroups version is defined as each different Infinispan version was released with, or in case of a SNAPSHOT the version it's currently using. So, assuming you have a fresh snapshot of Infinispan 5.1.0-SNAPSHOT, it's JGroups 3.0.4.Final 2) As I pointed out in the previous mail, I've reconfigured it for you to not pick up the bench-jgroups.xml but go with Infinispan's default UDP configuration. You have to comment line 15 of bench.sh to use the bench-jgroups.xml, but then I can't reproduce the issue anymore. > This maven crap hides so much, I never know what's going on under the covers > !@$@ Do I have to do mvn install or mvn package when I make a change to > bench-jgroups.xml ? Aren't you using bench.sh as I pointed out in the script in my previous mail? It does mvn install before running it. But it's also configured to NOT use bench-jgroups.xml, but rather jgroups-udp.xml. > I don't think I'll ever switch to this f*cking piece of shit ! I don't think Maven is to be blamed today! What's wrong? Anyway one of the nice things of this little benchmark is exactly that it it's a single class with a main file, so you can just import it in you IDE and run. Any IDE will pick the correct dependencies, thanks to Maven. Just that if you do, it will use the default test properties as hardcoded in the test class org.infinispan.benchmark.Transactional: please set the same environment variable as bench.sh does, unless you don't want to run my same configuration. Just ping me for any doubt. Cheers, Sanne > > Please ping when as soon as you've calmed down... :-) > Cheers, > > > > > On 2/1/12 4:52 PM, Sanne Grinovero wrote: >> >> On 1 February 2012 15:18, Bela Ban<bban(a)redhat.com> wrote: >>> >>> >>> >>> On 2/1/12 10:25 AM, Dan Berindei wrote: >>> >>>>> That's not the way it works; at startup of F, it sends its IP address >>>>> with the discovery request. Everybody returns its IP address with the >>>>> discovery response, so even though we have F only talking to A (the >>>>> coordinator) initially, F will also know the IP addresses of A,B,C,D >>>>> and E. >>>>> >>>> >>>> Ok, I stand corrected... since we start all the nodes on the same >>>> thread, each of them should reply to the discovery request of the next >>>> nodes. >>> >>> >>> >>> Hmm, can you reproduce this every time ? If so, can you send me the >>> program so I can run it here ? >>> >>> >>>> However, num_initial_members was set to 3 (the Infinispan default). >>>> Could that make PING not wait for all the responses? If it's like >>>> that, then I suggest we set a (much) higher num_initial_members and a >>>> lower timeout in the default configuration. >>> >>> >>> >>> Yes, the discovery could return quickly, but the responses would even be >>> processed if they were received later, so I don't think that's the issue. >>> >>> The initial discovery should discover *all* IP addresses, later >>> triggering a discovery because an IP address wasn't found should always >>> be the exceptional case ! >>> >>> If you start members in turn, then they should easily form a cluster and >>> not even merge. Here's what can happen on a merge: >>> - The view is A|1={A,B}, both A and B have IP addresses for A and B >>> - The view splits into A|2={A} and B|2={B} >>> - A now marks B's IP address as removable and B marks A's IP address as >>> removable >>> - If the cache grows to over 500 entries >>> (TP.logical_addr_cache_max_size) or TP.logical_addr_cache_expiration >>> milliseconds elapse (whichever comes first), the entries marked as >>> removable are removed >>> - If, *before that* the merge view A|3={A,B} is installed, A unmarks B >>> and B unmarks A, so the entries won't get removed >>> >>> So a hypothesis of how those IP addresses get removed could be that the >>> cluster had a couple of merges, that didn't heal for 2 minutes (?) hard >>> to believe though... >>> >>> We have to get to the bottom of this, so it would be great if you had a >>> program that reproduced this, that I could send myself. The main >>> question is why the IP address for the target is gone and/or why the IP >>> address wasn't received in the first place. >>> >>> In any case, replacing MERGE2 with MERGE3 might help a bit, as MERGE3 >>> [1] periodically broadcasts IP address/logical name and logical address: >>> "An INFO message carries the logical name and physical address of a >>> member. Compared to MERGE2, this allows us to immediately send messages >>> to newly merged members, and not have to solicit this information first. >>> " (copied from the documentation) >>> >>> >>> >>>>>> Â Note that everything is blocked at this point, we >>>>>> >>>>>> won't send another message in the entire cluster until we got the >>>>>> physical address. >>> >>> >>> >>> Understood. Let me see if I can block sending of the message for a max >>> time (say 2 seconds) until I get the IP address. Not very nice, and I >>> prefer a different approach (plus we need to see why this happens in the >>> first place anyway)... >>> >>> >>>>> As I said; this is an exceptional case, probably caused by Sanne >>>>> starting 12 channels inside the same JVM, at the same time, therefore >>>>> causing a traffic spike, which results in dropped discovery requests or >>>>> responses. >>>>> >>>> >>>> Bela, we create the caches on a single thread, so we never have more >>>> than one node joining at the same time. >>>> At most we could have some extra activity if one node can't join the >>>> existing cluster and starts a separate partition, but hardly enough to >>>> cause congestion. >>> >>> >>> >>> Hmm, does indeed not sound like an issue... >>> >>> >>>>> After than, when F wants to talk to C, it asks the cluster for C's IP >>>>> address, and that should be a few ms at most. >>>>> >>>> >>>> Ok, so when F wanted to send the ClusteredGetCommand request to C, >>>> PING got the physical address right away. But the ClusteredGetCommand >>>> had to wait for STABLE to kick in and for C to ask for retransmission >>>> (because we didn't send any other messages). >>> >>> >>> >>> Yep. Before I implement some blocking until we have the IP address, or a >>> timeout elapses, I'd like to try to get to the bottom of this problem >>> first ! >>> >>> >>>> Maybe *we* should use RSVP for our ClusteredGetCommands, since those >>>> can never block... Actually, we don't want to retransmit the request >>>> if we already got a response from another node, so it would be best if >>>> we could ask for retransmission of a particular request explicitly ;-) >>> >>> >>> >>> I'd rather implement the blocking approach above ! :-) >>> >>> >>>> I wonder if we could also decrease desired_avg_gossip and >>>> stability_delay in STABLE. After all, an extra STABLE round can't slow >>>> us when we're not doing anything, and when we are busy we're going to >>>> hit the max_bytes limit much sooner than the desired_avg_gossip time >>>> limit anyway. >>> >>> >>> >>> I don't think this is a good idea as it will generate more traffic. The >>> stable task is not skipped when we have a lot of traffic, so this will >>> compound the issue. >>> >>> >>>>>> I'm also not sure what to make of these lines: >>>>>> >>>>>>>>> [org.jgroups.protocols.UDP] sanne-55119: no physical address for >>>>>>>>> sanne-53650, dropping message >>>>>>>>> [org.jgroups.protocols.pbcast.GMS] JOIN(sanne-55119) sent to >>>>>>>>> sanne-53650 timed out (after 3000 ms), retrying >>>>>> >>>>>> >>>>>> It appears that sanne-55119 knows the logical name of sanne-53650, and >>>>>> the fact that it's coordinator, but not its physical address. >>>>>> Shouldn't all of this information have arrived at the same time? >>>>> >>>>> >>>>> Hmm, correct. However, the logical names are kept in (a static) >>>>> UUID.cache and the IP addresses in TP.logical_addr_cache. >>>>> >>>> >>>> Ah, so if we have 12 nodes in the same VM they automatically know each >>>> other's logical name - they don't need PING at all! >>> >>> >>> >>> Yes. Note that logical names are not the problem; even if we evict some >>> logical name from the cache (and we do this only for removed members), >>> JGroups will still work as it only needs UUIDs and IP addresses. >>> >>> >>>> Does the logical cache get cleared on channel stop? I think that would >>>> explain another weird thing I was seeing in the test suite logs, >>>> sometimes everyone in a cluster would suddenly forget everyone else's >>>> logical name and start logging UUIDs. >>> >>> >>> >>> On a view change, we remove all entries which are *not* in the new view. >>> However, 'removing' is again simply marking those members as >>> 'removable', and only if the cache grows beyond 500 >>> (-Djgroups.uuid_cache.max_entries=500) entries will all entries older >>> than 5 seconds (-Djgroups.uuid_cache.max_age=5000) be removed. (There is >>> no separate reaper task running for this). >>> >>> So, yes, this can happen, but on the next discovery round, we'll have >>> the correct values. Again, as I said, UUID.cache is not as important as >>> TP.logical_addr_cache. >>> >>> >>>> This is running the Transactional benchmark, so it would be simpler if >>>> we enabled PING trace in the configuration and disabled it before the >>>> actual benchmark starts. I'm going to try it myself :) >>> >>> >>> >>> How do you run 12 instances ? Did you change something in the config ? >>> I'd be interested in trying the *exact* same config you're running, to >>> see what's going on ! >> >> >> Pushed it for you, committing the exact configuration changes as well: >> >> git clone git://github.com/Sanne/InfinispanStartupBenchmark.git >> cd InfinispanStartupBenchmark >> git co my >> sh bench.sh >> >> If you look into bench.sh >> (as it is at >> https://github.com/Sanne/InfinispanStartupBenchmark/blob/my/bench.sh >> ) >> the lines 9, 15, 28 should be the most interesting. >> >> You need to run with Infinispan's default configuration to reproduce the >> issue, >> but I wouldn't mind you commenting on >> >> https://github.com/Sanne/InfinispanStartupBenchmark/blob/my/benchmark/src... >> as well. That's what I'm now using as default for further performance >> tests. >> >> Cheers, >> Sanne >> >>> >>> >>> [1] http://www.jgroups.org/manual-3.x/html/protlist.html#MERGE3 >>> >>> -- >>> Bela Ban >>> Lead JGroups (http://www.jgroups.org) >>> JBoss / Red Hat >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev(a)lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > -- > Bela Ban > Lead JGroups (http://www.jgroups.org) > JBoss / Red Hat _______________________________________________ infinispan-dev mailing list infinispan-dev(a)lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev

Bela Ban

3:53 p.m.

I can also reproduce it by now, in JGroups: I simply create 12 members in a loop... Don't need the bombastic Transactional test Looking into it. On 2/2/12 7:46 AM, Dan Berindei wrote:

...

Hi Bela I think I found why you weren't seeing the warnings. The bench-log4j.xml in github master is configured to log only to the log file (benchmark.log). If you add an<appender-ref ref="CONSOLE"/> you'll see the warnings on the console as well. I am now able to reproduce it pretty reliably, even from IntelliJ. I created a run configuration for the Transactional class, I set the module to infinispan-5.1-SNAPSHOT, and I added these JVM arguments: -Dbench.loops=3000 -Dbench.writerThreads=33 -Dbench.readerThreads=10 -Dbench.dist=true -Dbench.transactional=true -Dbench.numkeys=500 -Dbench.payloadsize=10240 -Dbench.nodes=12 -Djava.net.preferIPv4Stack -Dlog4j.configuration=bench-log4j.xml In Sanne's thread dump it was a ClusteredGetCommand that was waiting for a physical address, but in my test runs it's usually a JOIN that's failing. And it seems that it happens because we got a discovery response from 3 nodes (that is our initial num_initial_members), but none of them was the coordinator: 2012-02-02 08:04:41,672 TRACE [org.jgroups.protocols.PING] (main) discovery took 8 ms: responses: 3 total (2 servers (0 coord), 1 clients) 2012-02-02 08:04:41,672 TRACE [org.jgroups.protocols.pbcast.GMS] (main) localhost-47135: initial_mbrs are localhost-36539 localhost-30921 2012-02-02 08:04:41,672 DEBUG [org.jgroups.protocols.pbcast.GMS] (main) election results: {localhost-32436=2} 2012-02-02 08:04:41,672 DEBUG [org.jgroups.protocols.pbcast.GMS] (main) sending JOIN(localhost-47135) to localhost-32436 2012-02-02 08:04:41,673 TRACE [org.jgroups.protocols.UNICAST2] (main) localhost-47135: created connection to localhost-32436 (conn_id=0) 2012-02-02 08:04:41,673 TRACE [org.jgroups.protocols.UNICAST2] (main) localhost-47135 --> DATA(localhost-32436: #1, conn_id=0, first) 2012-02-02 08:04:41,673 TRACE [org.jgroups.protocols.UDP] (main) sending msg to localhost-32436, src=localhost-47135, headers are GMS: GmsHeader[JOIN_REQ]: mbr=localhost-47135, UNICAST2: DATA, seqno=1, first, UDP: [channel_name=ISPN] 2012-02-02 08:04:41,673 WARN [org.jgroups.protocols.UDP] (main) localhost-47135: no physical address for localhost-32436, dropping message Cheers Dan On Wed, Feb 1, 2012 at 7:12 PM, Sanne Grinovero<sanne(a)infinispan.org> wrote: > On 1 February 2012 16:40, Bela Ban<bban(a)redhat.com> wrote: >> Your benchmark is giving me the creeps ! > > Manik was the original author, I've only been adapting it slightly to > identify performance issues. I wouldn't have used Maven either, but > it's serving me well especially since it turns out I have to > frequently change JGroups version ;-) > >> First, which version of JGroups / Infinispan does this use ? Second, is >> there a way to start this in an IDE rather than through maven ? Third, I >> don't think bench-jgroups.xml is picked up at all ! How do I make a change >> to bench-jgroups.xml and have Transactional *use* it ? > > 1) The intention is to compare Infinispan versions, and see how we > make progress, so the JGroups version is defined as each different > Infinispan version was released with, or in case of a SNAPSHOT the > version it's currently using. > > So, assuming you have a fresh snapshot of Infinispan 5.1.0-SNAPSHOT, > it's JGroups 3.0.4.Final > > 2) As I pointed out in the previous mail, I've reconfigured it for you > to not pick up the bench-jgroups.xml but go with Infinispan's default > UDP configuration. > You have to comment line 15 of bench.sh to use the bench-jgroups.xml, > but then I can't reproduce the issue anymore. > >> This maven crap hides so much, I never know what's going on under the covers >> !@$@ Do I have to do mvn install or mvn package when I make a change to >> bench-jgroups.xml ? > > Aren't you using bench.sh as I pointed out in the script in my > previous mail? It does mvn install before running it. > But it's also configured to NOT use bench-jgroups.xml, but rather > jgroups-udp.xml. > >> I don't think I'll ever switch to this f*cking piece of shit ! > > I don't think Maven is to be blamed today! What's wrong? > > Anyway one of the nice things of this little benchmark is exactly that > it it's a single class with a main file, so you can just import it in > you IDE and run. Any IDE will pick the correct dependencies, thanks to > Maven. Just that if you do, it will use the default test properties as > hardcoded in the test class org.infinispan.benchmark.Transactional: > please set the same environment variable as bench.sh does, unless you > don't want to run my same configuration. > > Just ping me for any doubt. > > Cheers, > Sanne > >> >> Please ping when as soon as you've calmed down... :-) >> Cheers, >> >> >> >> >> On 2/1/12 4:52 PM, Sanne Grinovero wrote: >>> >>> On 1 February 2012 15:18, Bela Ban<bban(a)redhat.com> wrote: >>>> >>>> >>>> >>>> On 2/1/12 10:25 AM, Dan Berindei wrote: >>>> >>>>>> That's not the way it works; at startup of F, it sends its IP address >>>>>> with the discovery request. Everybody returns its IP address with the >>>>>> discovery response, so even though we have F only talking to A (the >>>>>> coordinator) initially, F will also know the IP addresses of A,B,C,D >>>>>> and E. >>>>>> >>>>> >>>>> Ok, I stand corrected... since we start all the nodes on the same >>>>> thread, each of them should reply to the discovery request of the next >>>>> nodes. >>>> >>>> >>>> >>>> Hmm, can you reproduce this every time ? If so, can you send me the >>>> program so I can run it here ? >>>> >>>> >>>>> However, num_initial_members was set to 3 (the Infinispan default). >>>>> Could that make PING not wait for all the responses? If it's like >>>>> that, then I suggest we set a (much) higher num_initial_members and a >>>>> lower timeout in the default configuration. >>>> >>>> >>>> >>>> Yes, the discovery could return quickly, but the responses would even be >>>> processed if they were received later, so I don't think that's the issue. >>>> >>>> The initial discovery should discover *all* IP addresses, later >>>> triggering a discovery because an IP address wasn't found should always >>>> be the exceptional case ! >>>> >>>> If you start members in turn, then they should easily form a cluster and >>>> not even merge. Here's what can happen on a merge: >>>> - The view is A|1={A,B}, both A and B have IP addresses for A and B >>>> - The view splits into A|2={A} and B|2={B} >>>> - A now marks B's IP address as removable and B marks A's IP address as >>>> removable >>>> - If the cache grows to over 500 entries >>>> (TP.logical_addr_cache_max_size) or TP.logical_addr_cache_expiration >>>> milliseconds elapse (whichever comes first), the entries marked as >>>> removable are removed >>>> - If, *before that* the merge view A|3={A,B} is installed, A unmarks B >>>> and B unmarks A, so the entries won't get removed >>>> >>>> So a hypothesis of how those IP addresses get removed could be that the >>>> cluster had a couple of merges, that didn't heal for 2 minutes (?) hard >>>> to believe though... >>>> >>>> We have to get to the bottom of this, so it would be great if you had a >>>> program that reproduced this, that I could send myself. The main >>>> question is why the IP address for the target is gone and/or why the IP >>>> address wasn't received in the first place. >>>> >>>> In any case, replacing MERGE2 with MERGE3 might help a bit, as MERGE3 >>>> [1] periodically broadcasts IP address/logical name and logical address: >>>> "An INFO message carries the logical name and physical address of a >>>> member. Compared to MERGE2, this allows us to immediately send messages >>>> to newly merged members, and not have to solicit this information first. >>>> " (copied from the documentation) >>>> >>>> >>>> >>>>>>> Â Note that everything is blocked at this point, we >>>>>>> >>>>>>> won't send another message in the entire cluster until we got the >>>>>>> physical address. >>>> >>>> >>>> >>>> Understood. Let me see if I can block sending of the message for a max >>>> time (say 2 seconds) until I get the IP address. Not very nice, and I >>>> prefer a different approach (plus we need to see why this happens in the >>>> first place anyway)... >>>> >>>> >>>>>> As I said; this is an exceptional case, probably caused by Sanne >>>>>> starting 12 channels inside the same JVM, at the same time, therefore >>>>>> causing a traffic spike, which results in dropped discovery requests or >>>>>> responses. >>>>>> >>>>> >>>>> Bela, we create the caches on a single thread, so we never have more >>>>> than one node joining at the same time. >>>>> At most we could have some extra activity if one node can't join the >>>>> existing cluster and starts a separate partition, but hardly enough to >>>>> cause congestion. >>>> >>>> >>>> >>>> Hmm, does indeed not sound like an issue... >>>> >>>> >>>>>> After than, when F wants to talk to C, it asks the cluster for C's IP >>>>>> address, and that should be a few ms at most. >>>>>> >>>>> >>>>> Ok, so when F wanted to send the ClusteredGetCommand request to C, >>>>> PING got the physical address right away. But the ClusteredGetCommand >>>>> had to wait for STABLE to kick in and for C to ask for retransmission >>>>> (because we didn't send any other messages). >>>> >>>> >>>> >>>> Yep. Before I implement some blocking until we have the IP address, or a >>>> timeout elapses, I'd like to try to get to the bottom of this problem >>>> first ! >>>> >>>> >>>>> Maybe *we* should use RSVP for our ClusteredGetCommands, since those >>>>> can never block... Actually, we don't want to retransmit the request >>>>> if we already got a response from another node, so it would be best if >>>>> we could ask for retransmission of a particular request explicitly ;-) >>>> >>>> >>>> >>>> I'd rather implement the blocking approach above ! :-) >>>> >>>> >>>>> I wonder if we could also decrease desired_avg_gossip and >>>>> stability_delay in STABLE. After all, an extra STABLE round can't slow >>>>> us when we're not doing anything, and when we are busy we're going to >>>>> hit the max_bytes limit much sooner than the desired_avg_gossip time >>>>> limit anyway. >>>> >>>> >>>> >>>> I don't think this is a good idea as it will generate more traffic. The >>>> stable task is not skipped when we have a lot of traffic, so this will >>>> compound the issue. >>>> >>>> >>>>>>> I'm also not sure what to make of these lines: >>>>>>> >>>>>>>>>> [org.jgroups.protocols.UDP] sanne-55119: no physical address for >>>>>>>>>> sanne-53650, dropping message >>>>>>>>>> [org.jgroups.protocols.pbcast.GMS] JOIN(sanne-55119) sent to >>>>>>>>>> sanne-53650 timed out (after 3000 ms), retrying >>>>>>> >>>>>>> >>>>>>> It appears that sanne-55119 knows the logical name of sanne-53650, and >>>>>>> the fact that it's coordinator, but not its physical address. >>>>>>> Shouldn't all of this information have arrived at the same time? >>>>>> >>>>>> >>>>>> Hmm, correct. However, the logical names are kept in (a static) >>>>>> UUID.cache and the IP addresses in TP.logical_addr_cache. >>>>>> >>>>> >>>>> Ah, so if we have 12 nodes in the same VM they automatically know each >>>>> other's logical name - they don't need PING at all! >>>> >>>> >>>> >>>> Yes. Note that logical names are not the problem; even if we evict some >>>> logical name from the cache (and we do this only for removed members), >>>> JGroups will still work as it only needs UUIDs and IP addresses. >>>> >>>> >>>>> Does the logical cache get cleared on channel stop? I think that would >>>>> explain another weird thing I was seeing in the test suite logs, >>>>> sometimes everyone in a cluster would suddenly forget everyone else's >>>>> logical name and start logging UUIDs. >>>> >>>> >>>> >>>> On a view change, we remove all entries which are *not* in the new view. >>>> However, 'removing' is again simply marking those members as >>>> 'removable', and only if the cache grows beyond 500 >>>> (-Djgroups.uuid_cache.max_entries=500) entries will all entries older >>>> than 5 seconds (-Djgroups.uuid_cache.max_age=5000) be removed. (There is >>>> no separate reaper task running for this). >>>> >>>> So, yes, this can happen, but on the next discovery round, we'll have >>>> the correct values. Again, as I said, UUID.cache is not as important as >>>> TP.logical_addr_cache. >>>> >>>> >>>>> This is running the Transactional benchmark, so it would be simpler if >>>>> we enabled PING trace in the configuration and disabled it before the >>>>> actual benchmark starts. I'm going to try it myself :) >>>> >>>> >>>> >>>> How do you run 12 instances ? Did you change something in the config ? >>>> I'd be interested in trying the *exact* same config you're running, to >>>> see what's going on ! >>> >>> >>> Pushed it for you, committing the exact configuration changes as well: >>> >>> git clone git://github.com/Sanne/InfinispanStartupBenchmark.git >>> cd InfinispanStartupBenchmark >>> git co my >>> sh bench.sh >>> >>> If you look into bench.sh >>> (as it is at >>> https://github.com/Sanne/InfinispanStartupBenchmark/blob/my/bench.sh >>> ) >>> the lines 9, 15, 28 should be the most interesting. >>> >>> You need to run with Infinispan's default configuration to reproduce the >>> issue, >>> but I wouldn't mind you commenting on >>> >>> https://github.com/Sanne/InfinispanStartupBenchmark/blob/my/benchmark/src... >>> as well. That's what I'm now using as default for further performance >>> tests. >>> >>> Cheers, >>> Sanne >>> >>>> >>>> >>>> [1] http://www.jgroups.org/manual-3.x/html/protlist.html#MERGE3 >>>> >>>> -- >>>> Bela Ban >>>> Lead JGroups (http://www.jgroups.org) >>>> JBoss / Red Hat >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev(a)lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >> -- >> Bela Ban >> Lead JGroups (http://www.jgroups.org) >> JBoss / Red Hat > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev(a)lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev

-- Bela Ban Lead JGroups (http://www.jgroups.org) JBoss / Red Hat

Manik Surtani

Sunday, 5 February Sun, 5 Feb

1:53 a.m.

On 2 Feb 2012, at 07:53, Bela Ban wrote:

...

I can also reproduce it by now, in JGroups: I simply create 12 members in a loop... Don't need the bombastic Transactional test

Yup; Transactional was made to benchmark and profile 2-phase transactions - as the name suggests! ;) - in Infinispan. Not JGroups.

...

Looking into it.

So this is what was addressed in 3.0.5? Cheers Manik -- Manik Surtani manik(a)jboss.org twitter.com/maniksurtani Lead, Infinispan http://www.infinispan.org

Bela Ban

7:04 p.m.

On 2/4/12 5:53 PM, Manik Surtani wrote:

...

On 2 Feb 2012, at 07:53, Bela Ban wrote: > I can also reproduce it by now, in JGroups: I simply create 12 members > in a loop... > > Don't need the bombastic Transactional test Yup; Transactional was made to benchmark and profile 2-phase transactions - as the name suggests! ;) - in Infinispan. Not JGroups. > Looking into it. So this is what was addressed in 3.0.5?

Yes, there are 2 changes which reduce the chance of missing IP addresses. The workaround is to set PING.num_initial_members to 12 -- Bela Ban Lead JGroups (http://www.jgroups.org) JBoss / Red Hat

Manik Surtani

1:53 a.m.

Apologies for being so late on this thread. On 1 Feb 2012, at 18:12, Sanne Grinovero wrote:

...

Yup, this is why I used Maven here. To be able to easily swap Infinispan and JGroups versions just by commenting/uncommenting a few lines of XML rather than juggling jar files.

...

> I don't think I'll ever switch to this f*cking piece of shit ! I don't think Maven is to be blamed today! What's wrong? Anyway one of the nice things of this little benchmark is exactly that it it's a single class with a main file, so you can just import it in you IDE and run. Any IDE will pick the correct dependencies, thanks to Maven. Just that if you do, it will use the default test properties as hardcoded in the test class org.infinispan.benchmark.Transactional: please set the same environment variable as bench.sh does, unless you don't want to run my same configuration.

Yup, again the test is designed to be run via Maven or in your IDE - as Sanne pointed out, the IDE picks up dependencies from the Maven pom. Further, I also run it using JProfiler's IntelliJ plugin when I want to profile it. -- Manik Surtani manik(a)jboss.org twitter.com/maniksurtani Lead, Infinispan http://www.infinispan.org

4475

days inactive

4481

days old

infinispan-dev@lists.jboss.org

Manage subscription

14 comments

4 participants

tags (0)

participants (4)

Bela Ban
Dan Berindei
Manik Surtani
Sanne Grinovero

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

again: "no physical address"