The getCache() timeout should not be increased at all. Instead I would
propose that getCache() returns a functional cache immediately, even
if the cache didn't receive any data, and it works solely as an L1
cache until the administrator allows it to join. I'd even make it
possible to designate a cache as an L1-only cache, so it's never an
owner for any key.
I agree that would be very nice, but it is also much more complex to
implement in 5.2: a functional L1 means that the other nodes must
accept this node as part of the grid, including for L1 invalidation
purposes.
So my proposal to block until ready is meant as a first step, and I
think it would still be very useful for people wanting to boot ~100
nodes. Blocking the application is not a big deal, as you're only
delaying the boot of an application which was likely not even powered
on before.
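
To illustrate what "blocking until ready" could look like from the
application's side, here is a minimal sketch. It is not Infinispan
code: StartupGate, markReady() and awaitReady() are hypothetical names,
and I'm assuming the join permission arrives as a single signal from
the administrator or an external tool.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not part of Infinispan: the application blocks in
// awaitReady() until something external calls markReady().
class StartupGate {
    private final CountDownLatch ready = new CountDownLatch(1);

    // Called once the node has been allowed to join (assumed external signal).
    void markReady() {
        ready.countDown();
    }

    // Returns true if the node became ready within the timeout, false if the
    // timeout elapsed or the waiting thread was interrupted.
    boolean awaitReady(long timeout, TimeUnit unit) {
        try {
            return ready.await(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return false;
        }
    }
}
```

The caller only pays for the wait at boot time, which is the point made
above: nothing that was already running gets delayed.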
When adding several new nodes, I just want them to be added "all at
once", preventing intermediate rehashing: until all have joined you
should block rehash. That's a manual (or more likely externally
automated) step; it will not be engaged for long, nor would it
replace normal behaviour when disabled.
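
A rough sketch of the "add all at once" behaviour, to make the
semantics concrete. Everything here is hypothetical (RehashGate,
suspendRehash(), resumeRehash() are made-up names, and a counter stands
in for the actual rehash); it only shows that joiners arriving while
rehash is blocked are queued and then applied in a single rehash.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not Infinispan code: queues joiners while rehash is
// suspended and applies them all in one rehash when it is resumed.
class RehashGate {
    private final Set<String> members = new LinkedHashSet<>();
    private final Set<String> pendingJoiners = new LinkedHashSet<>();
    private boolean rehashEnabled = true; // normal behaviour by default
    private int rehashCount = 0;          // stands in for real rehash work

    // Manual or externally automated step: block rehashing.
    synchronized void suspendRehash() {
        rehashEnabled = false;
    }

    // Re-enable rehashing; all queued joiners become members in one rehash.
    synchronized void resumeRehash() {
        rehashEnabled = true;
        if (!pendingJoiners.isEmpty()) {
            members.addAll(pendingJoiners);
            pendingJoiners.clear();
            rehashCount++;
        }
    }

    // A node asks to join: rehash immediately if enabled, otherwise queue it.
    synchronized void join(String node) {
        if (members.contains(node) || pendingJoiners.contains(node)) {
            return; // already known, nothing to do
        }
        if (rehashEnabled) {
            members.add(node);
            rehashCount++;
        } else {
            pendingJoiners.add(node);
        }
    }

    synchronized List<String> owners() {
        return new ArrayList<>(members);
    }

    synchronized int rehashCount() {
        return rehashCount;
    }
}
```

With the gate suspended, booting 100 nodes costs one rehash instead of
100; with the gate left enabled, each join rehashes as it does today.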