[infinispan-dev] [infinispan-internal] Unstable Cluster

Bela Ban bban at redhat.com
Tue Mar 5 01:49:54 EST 2013



On 3/4/13 6:35 PM, Dan Berindei wrote:
>
> On Mon, Mar 4, 2013 at 10:28 AM, Bela Ban <bban at redhat.com> wrote:
>
>     Another note: in general, would it make sense to use shorter names ?
>     E.g. instead of
>
>     ** New view: [jdg-perf-01-60164|9] [jdg-perf-01-60164,
>     jdg-perf-01-24167, jdg-perf-01-53841, jdg-perf-01-39558,
>     jdg-perf-01-8977, jdg-perf-01-49115, jdg-perf-01-24774,
>     jdg-perf-01-5758, jdg-perf-01-37137, jdg-perf-01-45330,
>     jdg-perf-01-24793, jdg-perf-01-35602, jdg-perf-02-7751,
>     jdg-perf-02-37056, jdg-perf-02-50381, jdg-perf-02-53449,
>     jdg-perf-02-64954, jdg-perf-02-34066, jdg-perf-02-61515,
>     jdg-perf-02-65045 ...]
>
>
>     we could have
>     ** New view: [1|9] [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
>     16, 17, 18, 19, 20, ...]
>
>     This makes reading logs *much* easier than having those long names.
>
>
> Yes and no... I sometimes find it useful to have a somewhat longer name,
> as searching/filtering for a node in the log is pretty much impossible
> with a name like "1".


Yes, however if you have context it's not an issue, e.g. in JGroups I
often prefix my messages with the member name ("name:"). So, for example,
if you're looking for unicast traffic received by 5, "5: <--" would work
as the grep argument.

I agree this isn't useful when you want to follow the address of a
member throughout the log, across all protocols.
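
A rough Java sketch of the prefix-based filtering described above. It assumes every log line starts with the member's name followed by a colon; the class name, method name and sample lines are made up for illustration and are not real JGroups output.

```java
import java.util.List;
import java.util.stream.Collectors;

final class PrefixFilter {
    // Keeps only the lines that start with "<member>: <--",
    // i.e. (under the assumed log layout) traffic received by that member.
    static List<String> receivedBy(String member, List<String> lines) {
        String prefix = member + ": <--";
        return lines.stream()
                    .filter(line -> line.startsWith(prefix))
                    .collect(Collectors.toList());
    }
}
```

Anchoring the match at the start of the line matters with names this short: a plain substring search for "5: <--" would also hit lines written by a member named 15.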

> I also think we need the random number at the end so that we can debug
> problems with node restarts. In JDG/AS7 they don't add a random number,
> and it was very difficult to see what was happening when a node name
> appeared twice in the consistent hash. But we could make the random
> number shorter.


Would maintaining a base name (A) and appending a short counter that is
incremented on every restart help ? E.g. A1, when restarted A2 ? The
problem is that we'd have to store the number on disk...


>     If we wanted the host name to be part of a cluster name, we could use
>     the alphabet, e.g. A=jdg-perf-01, B=jdg-perf-02:
>
>     ** New view: [A1|9][A1, A2, A3, B4, B6, C2, C3, ...]
>
>     This is of course tied to a given host naming scheme. But oftentimes,
>     host names include numbers, so perhaps we could use a regexp to extract
>     that number and use it as a prefix to the name, e.g.
>     cluster-01 first instance: 1-1
>     cluster-02 2nd instance: 1-2
>     etc.
>
>     Thoughts ?
>
>
> Are you thinking of an automatic way of assigning a letter+digit
> combination to a node on startup? We also use the node name for some
> other stuff (e.g. thread names), so I'm not sure if it's feasible to
> wait until we have connected to the JGroups cluster to set the node name
> dynamically.
>
> For RadarGun we could use a static system where we configure a node name
> for each slave and then RadarGun passes the node name to Infinispan via
> a system property. No Infinispan changes required (except perhaps making
> the random number in the node name optional.)


I think this would already help; for apps / tests that we control, we 
should strive to make names as short as possible. I often find myself 
awking and seding a log file before I even look at it, as names are way 
too long.
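
For logs that already contain the long names, the awk/sed step could be sketched in Java roughly as follows. This assumes names of the form <text>-<host number>-<random suffix>, as in jdg-perf-01-60164, and assigns each one an alias <host>-<n> in order of first appearance. The class name, the regex and the alias scheme are illustrative assumptions, not anything JGroups or Infinispan provides.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NameShortener {
    // Assumed layout: letters and hyphens, then "-<host number>-<random suffix>".
    private static final Pattern NODE =
        Pattern.compile("\\b[A-Za-z][A-Za-z-]*-(\\d+)-(\\d+)\\b");

    private final Map<String, String> aliases = new HashMap<>();   // long name -> alias
    private final Map<Integer, Integer> perHost = new HashMap<>(); // host number -> instances seen

    // Replaces every long node name in the line with its short alias.
    String shorten(String line) {
        Matcher m = NODE.matcher(line);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String alias = aliases.get(m.group());
            if (alias == null) {
                int host = Integer.parseInt(m.group(1));
                int instance = perHost.merge(host, 1, Integer::sum);
                alias = host + "-" + instance;
                aliases.put(m.group(), alias);
            }
            m.appendReplacement(out, Matcher.quoteReplacement(alias));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Feeding it the line "** New view: [jdg-perf-01-60164|9] [jdg-perf-01-60164, jdg-perf-01-24167, jdg-perf-02-7751]" yields "** New view: [1-1|9] [1-1, 1-2, 2-1]". The same instance has to be used for the whole file so that a given node keeps the same alias throughout.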

Would it help if a base name could be set in a channel, e.g. 
channel.setBaseName("A") and then we maintain a file A in temp storage 
to which we write 1 (first time, file doesn't exist) ? Next time we'd 
read the file (which matches the base name) and write the incremented 
number (2) ? So we'd have member A1, then A2 and so on...
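
A minimal sketch of that file-backed counter in plain Java. Note that channel.setBaseName() is only the proposal above, not an existing JGroups method; the RestartCounter class and its nextName() helper are hypothetical names used to show the read-increment-write step.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class RestartCounter {
    // Returns baseName followed by the next counter value and stores that
    // value in a file named after the base name inside dir.
    static String nextName(Path dir, String baseName) throws IOException {
        Path file = dir.resolve(baseName);
        int next = 1; // first start: the file doesn't exist yet
        if (Files.exists(file)) {
            String stored = new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
            next = Integer.parseInt(stored) + 1;
        }
        Files.write(file, Integer.toString(next).getBytes(StandardCharsets.UTF_8));
        return baseName + next;
    }
}
```

Calling nextName(dir, "A") on an empty directory returns A1, the next call A2, and so on. The sketch does no locking, so two processes starting at the same time with the same base name could pick the same number, and the count starts over whenever temp storage is wiped.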

Thoughts ?

-- 
Bela Ban, JGroups lead (http://www.jgroups.org)

