[ISPN-125] Reading part of dynamic externalizer mappings
by Galder Zamarreno
Hi,
Re: https://jira.jboss.org/jira/browse/ISPN-125
I've been thinking more about this and the reading part still isn't
very clear to me. At runtime, it's very easy to check the externalizer
map to see if there's an externalizer for class A and, if there isn't
any, inspect the class for the @Marshallable annotation, instantiate
A's Externalizer class and add it to the map.
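A rough sketch of that write-side lookup, with simplified stand-ins for the annotation, interface and registry (none of these are the actual Infinispan types):

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Minimal stand-ins for illustration only; the real Infinispan
// annotation and interface differ.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Marshallable {
    Class<?> externalizer();
    int id();
}

interface Externalizer {}

class NoOpExternalizer implements Externalizer {}

@Marshallable(externalizer = NoOpExternalizer.class, id = 1)
class Payload {}

class ExternalizerRegistry {
    // Dynamic registration forces these to be concurrent maps,
    // which is part of the performance concern raised in this mail.
    private final ConcurrentMap<Class<?>, Externalizer> byClass = new ConcurrentHashMap<>();
    private final ConcurrentMap<Integer, Externalizer> byId = new ConcurrentHashMap<>();

    // Write path: look up the externalizer, lazily registering it
    // from the @Marshallable annotation on a miss.
    Externalizer forClass(Class<?> clazz) {
        Externalizer ext = byClass.get(clazz);
        if (ext == null) {
            Marshallable m = clazz.getAnnotation(Marshallable.class);
            if (m == null)
                throw new IllegalArgumentException("Not @Marshallable: " + clazz);
            try {
                ext = (Externalizer) m.externalizer().getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
            Externalizer prev = byClass.putIfAbsent(clazz, ext);
            if (prev != null) {
                ext = prev; // another thread registered first
            } else {
                byId.put(m.id(), ext);
            }
        }
        return ext;
    }

    // Read path: only works if the id was registered beforehand,
    // which is exactly the problem with the reading side.
    Externalizer forId(int id) {
        return byId.get(id);
    }
}
```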
However, the reading part is a fairly difficult problem to crack
because all you'd get would be an id, which you'd need to map back to
an Externalizer that will be used for reading. How does the reader
know the id-to-Externalizer mapping?
One option, rather brutal, would be to scan the entire classpath for a
class whose @Marshallable annotation carries the index you're after.
Another option would be that the 1st time someone writes an instance of
class A, the String form of its fully qualified class name is added to
the wire. The reader can then check the externalizer map and, if the id
is not present, read the class name, resolve it and add the mapping.
However, this breaks the moment two different nodes send a class for
the 1st time. They would both add the class name, but only the 1st
message to arrive would resolve it correctly; when the 2nd one arrives
on the node, the mapping would already be there, so the reader would
try to read the object directly and fail because the class name is
still on the wire.
Does anyone have any other ideas?
It's worth bearing in mind that adding this dynamic behaviour adds
complexity to the code and forces previously non-concurrent collections
to become concurrent, so performance will degrade compared to the
current solution. Users will always be able to implement
j.i.Externalizable instead, which is less performant than our internal
solution but might be reasonable enough.
One thing is for sure here, the indexes need to be tied down and I'm
already working on it.
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache
16 years, 2 months
Infinispan configuration - all in one solution
by Vladimir Blagojevic
Recently, I was assigned to complete
https://jira.jboss.org/jira/browse/ISPN-97
The driving force behind this JIRA was to minimize maintenance of the
Infinispan XML configuration. We have scarce resources, but we needed
to: maintain a professional-looking HTML configuration reference, keep
a well-documented XML schema for the configuration files, and achieve
seamless unmarshalling of XML configuration files into the
configuration object model. Since the configuration reference and
schema need to be generated for every release, we wanted to completely
automate this process.
The obvious candidate libraries for seamless unmarshalling of XML
configuration files into the object model were JAXB and JBossXB. We
wanted to keep our current configuration class hierarchy loosely
coupled with the XML schema - i.e. not every type in the XML schema
mapped to a class in the configuration hierarchy. In other words, we
wanted to group configuration attributes into many XML elements but
did not want as many classes in our configuration class hierarchy. Any
solution to this problem, including JBossXB, requires maintaining a
mapping between XML schema types and our configuration class
hierarchy. Not a big problem, but this was the first in a series of
hurdles.
Like any other HTML configuration reference, Infinispan's should
display default values and their types, constraints for those values,
human-readable descriptions and so on. If we were to generate the
configuration reference automatically, we needed to keep this
information somewhere, extract it, and spit it out. In a JAXB/JBossXB
based solution, the obvious place to keep this information was in a
"hand maintained" schema. But a schema, although it carries most of
this information, is a blueprint; it does not contain all instances of
all possible configuration elements. For example, our schema
constrains the <loader> element to a certain form, but it does not
document all the configuration options for all possible loaders. The
source code of those loader classes does! Another problem with keeping
documentation in a hand-maintained schema was that all the information
in the schema replicated what already existed in the source code. We
already had the comments, attribute types and default values in the
source. Why keep a mirror of this data and then, on top of it, hand
maintain it along with a mapping from schema types to our
configuration class hierarchy?
A proper solution to all of these minor issues, issues that create a
lot of maintenance headache, lies in switching to the source code as
the main source of all the information we need. All we had to do was
add a bit of metadata in the form of annotations. The algorithm to
unmarshall an XML configuration file into the object model is about
300 lines of code [1]; the documentation generator [2] and schema
generator [3] are around 200 lines each.
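To illustrate the idea, here is a minimal sketch of what source-level metadata plus a reference generator could look like. The annotation name, its attributes and the sample config class are all made up for this example; the real Infinispan annotations differ:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

// Hypothetical annotation carrying the metadata the generators need.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface ConfigurationAttribute {
    String name();
    String description();
}

// Sample config class: name, type, default and description all live
// in the source, next to the code they document.
class EvictionConfig {
    @ConfigurationAttribute(name = "maxEntries",
            description = "Maximum number of entries in a cache instance.")
    int maxEntries = -1;

    @ConfigurationAttribute(name = "wakeUpInterval",
            description = "Interval between eviction thread runs, in milliseconds.")
    long wakeUpInterval = 5000;
}

class ReferenceGenerator {
    // Walks a configuration class and renders one doc line per
    // annotated field, pulling the default from a fresh instance.
    static String generate(Class<?> configClass) {
        Object defaults;
        try {
            defaults = configClass.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        StringBuilder sb = new StringBuilder();
        for (Field f : configClass.getDeclaredFields()) {
            ConfigurationAttribute a = f.getAnnotation(ConfigurationAttribute.class);
            if (a == null) continue;
            f.setAccessible(true);
            try {
                sb.append(a.name()).append(" (").append(f.getType().getSimpleName())
                  .append(", default ").append(f.get(defaults)).append("): ")
                  .append(a.description()).append('\n');
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return sb.toString();
    }
}
```

The same metadata can feed the HTML reference, the schema generator and the unmarshaller, which is what removes the duplication.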
In conclusion, out of the source code itself you automatically get a
kick-ass configuration reference, a well-documented XML schema and XML
configuration unmarshalling into the object model. All of this with
almost zero maintenance cost.
Yes, indeed, we are almost reinventing the wheel here, but this is the
solution we need, and we would gladly contribute to a project that
does exactly what we need. At the moment I doubt that we can do this
with JBossXB; Alexey, please correct me if I am wrong. Since we
already have documentation reference maintenance problems in other
projects, maybe we should extend JBossXB with an annotation-based
framework and apply it in other projects as well?
The automatically generated configuration reference is included as an
attachment. To view it, download both the stylesheet and config.html
into a local directory and open config.html in your favorite browser.
Regards,
Vladimir
[1]
http://anonsvn.jboss.org/repos/infinispan/trunk/core/src/main/java/org/in...
[2]
http://anonsvn.jboss.org/repos/infinispan/trunk/tools/src/main/java/org/i...
[3]
http://anonsvn.jboss.org/repos/infinispan/trunk/tools/src/main/java/org/i...
16 years, 2 months
Hibernate Search alternative Directory distribution
by Emmanuel Bernard
Here are the concall notes on how to cluster and copy Hibernate Search
indexes using non-file-system approaches.
Forget JBoss Cache, forget plain JGroups, and focus on Infinispan.
Start with Infinispan in replication mode (the most stable code) and
then try distribution. It should be interesting to test the dist algo
and see how well the L1 cache behaves in a search environment.
For the architecture, we will try the following approaches in
decreasing order of interest (if the first one works like a charm, we
stick with it):
1. share the same grid cache between the master and the slaves
2. have a local cache on the master where indexing is done and
manually copy over the chunks of changed data to the grid
This requires storing some metadata (namely the list of chunks for a
given index and the last update time for each chunk) to implement the
same incremental-copy algorithm as the one in
FSMaster/SlaveDirectoryProvider.
3. have a local cache on the master where indexing is done and
manually copy over the chunks of changed data to the grid. Each slave
copies from the grid to a local version of the index and uses the
local version for search.
When writing the InfinispanDirectory (inspired by the RAMDirectory and
the JBossCacheDirectory), one needs to consider that Infinispan has a
flat structure. The key has to contain:
- the index name
- the chunk name
Both together will essentially form the unique identifier.
Each chunk should have its size limited (Lucene does that already, AFAIK).
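A sketch of such a composite key for the flat cache; the class and field names here are assumptions, not the final design:

```java
import java.util.Objects;

// Composite key for a Lucene chunk in a flat Infinispan cache.
// Index name + chunk name together are the unique identifier.
final class ChunkKey {
    private final String indexName; // which Lucene index this chunk belongs to
    private final String chunkName; // the file/chunk inside that index

    ChunkKey(String indexName, String chunkName) {
        this.indexName = indexName;
        this.chunkName = chunkName;
    }

    // Proper equals/hashCode are essential: the cache locates and
    // replicates entries by this key.
    @Override public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ChunkKey)) return false;
        ChunkKey k = (ChunkKey) o;
        return indexName.equals(k.indexName) && chunkName.equals(k.chunkName);
    }

    @Override public int hashCode() {
        return Objects.hash(indexName, chunkName);
    }

    @Override public String toString() {
        return indexName + "|" + chunkName;
    }
}
```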
Question on the metadata: one needs to keep the last update time and
the list of chunks. Because Infinispan is not queryable, we need to
store that as metadata:
- should it be on each chunk (i.e. the last update time and size of
each chunk)
- on a dedicated metadata chunk, i.e. one metadata chunk per chunk
plus a chunk containing the list
- on a single metadata chunk (I fear conflicts and inconsistencies)
On changes or reads, explore the use of Infinispan transactions to
ensure repeatable-read semantics. Is it necessary? A file system does
not guarantee that anyway.
In the case of replication, make sure an FD back end can be activated
in case the grid goes to the unreachable clouds of total inactivity.
Question to Manik: do you have a cluster to play with once we reach
this stage?
Good luck Lukasz, and ask for help if you are stuck or unsure of a
decision :)
16 years, 2 months
[ISPN-116] Async cache store: aggregation of multiple changes on a single key
by Galder Zamarreno
Hi,
Re: https://jira.jboss.org/jira/browse/ISPN-116
I can see two ways of providing such feature:
1. Rather than using a queue, use a data structure similar to the one
used in the data container, so that when a newer value arrives for a
key already present in the queue, the value can be swapped (a map-like
lookup on the key keeps this O(1)) while maintaining the queue-like
FIFO operations required to empty it.
2. Based on a suggestion from amin59 in
http://www.jboss.org/index.html?module=bb&op=viewtopic&p=4240980#4240980,
make the current queue store only keys and make the draining process
query the cache for the latest value. Between the time the data was
put in the cache and the time it's stored, the data could have expired
or been removed altogether, so such a query would need to make sure it
doesn't hit the cache loader again. Also, if the data had been
removed, how would the async store know that from cache.get()? The
queue would need to record somehow that a key was actually removed.
The benefit of the 1st option is that we can take advantage of fast
drainTo()-like operations when draining it, and also that no
contention is added to the cache itself. The downside is the need for
a more complex data structure.
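A minimal sketch of what the 1st option's structure could look like, assuming coarse synchronization over a LinkedHashMap is acceptable (a real implementation would likely need finer-grained concurrency; all names here are illustrative):

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// FIFO structure with O(1) keyed replacement: LinkedHashMap keeps
// insertion order, while put() on an already-queued key swaps the
// value in place without changing that order.
class CoalescingStoreQueue<K, V> {
    private final LinkedHashMap<K, V> queue = new LinkedHashMap<>();

    // A newer modification for an already-queued key replaces the
    // old one, so only the latest value ever hits the store.
    synchronized void enqueue(K key, V value) {
        queue.put(key, value);
    }

    // Drain everything in FIFO order of first enqueue, analogous to
    // a drainTo()-style bulk operation.
    synchronized List<Map.Entry<K, V>> drain() {
        List<Map.Entry<K, V>> drained = new ArrayList<>(queue.size());
        for (Map.Entry<K, V> e : queue.entrySet())
            drained.add(new AbstractMap.SimpleImmutableEntry<>(e));
        queue.clear();
        return drained;
    }

    synchronized int size() {
        return queue.size();
    }
}
```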
The benefit of the 2nd option is keeping a simple queue, a less
complex data structure, but each time a key is drained we have to
query the cache, which could slow down the draining process.
My preference is for 1, but to use it I'd need the data container
collection to be made more generic.
Thoughts?
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache
16 years, 3 months
Why does ClusteredGetResponseValidityFilter maintain pendingResponders?
by Galder Zamarreno
Hi all,
Why does ClusteredGetResponseValidityFilter maintain a list of
pendingResponders? Wouldn't it be more efficient if needMoreResponses()
returned as soon as one positive response has been received regardless
of who's pending to return anything? I'm assuming here that
SuccessfulResponse means that the clustered get returned what we're
trying to get.
I implemented something very similar for HAJNDI, in such a way that as
soon as a lookup had succeeded, we wouldn't wait for anyone else.
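For illustration, a filter with that behaviour could look like the sketch below. The interface is a local stand-in mirroring the JGroups RspFilter contract (isAcceptable / needMoreResponses), and SuccessfulResponse is mocked so the example is self-contained:

```java
// Local stand-in for the JGroups RspFilter contract, defined here
// only so the sketch compiles on its own.
interface ResponseFilter {
    boolean isAcceptable(Object response, String sender);
    boolean needMoreResponses();
}

// Mock of a successful clustered-get reply; in Infinispan this would
// be the actual SuccessfulResponse type.
class SuccessfulResponse {
    final Object value;
    SuccessfulResponse(Object value) { this.value = value; }
}

// Stops waiting as soon as one positive response arrives, without
// tracking which members are still pending to reply.
class FirstSuccessFilter implements ResponseFilter {
    private volatile boolean done;

    @Override public boolean isAcceptable(Object response, String sender) {
        if (response instanceof SuccessfulResponse) {
            done = true; // first hit wins; everyone else is ignored
            return true;
        }
        return false;
    }

    @Override public boolean needMoreResponses() {
        return !done;
    }
}
```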
Cheers,
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache
16 years, 3 months
property tags in configuration files
by Mircea Markus
Continuing a conversation Vladimir and I had on IRC about how
<property> tags can be listed in the documentation.
<property> tags are a way to configure extension points (e.g.
CacheStore implementations, ScheduledExecutorFactory implementations,
etc.), e.g.:
<replicationQueueScheduledExecutor
factory="org.infinispan.executors.DefaultScheduledExecutorFactory">
<property name="threadNamePrefix" value="ReplicationQueueThread"/>
</replicationQueueScheduledExecutor>
The "threadNamePrefix" property is specific to
DefaultScheduledExecutorFactory; other implementations of
ScheduledExecutorFactory would not recognize it and/or would use
different properties.
So there's no way to handle "property" tags like the other 'regular'
attributes that map to fixed configuration elements: e.g.
transportClass maps to GlobalConfiguration#transportClass, etc.
Moreover, in some situations these properties are not used to call
setters on the target objects but are passed all together through a
java.util.Properties, which makes them impossible or harder to
annotate.
On the other hand, it would be very nice to have all the properties of
the default extension points (e.g. FileCacheStore,
JdbcStringBasedCacheStore, etc.) documented through annotations.
A possible way I see to make this work is:
1) create a new annotation to denote an extension point,
@ExtensionPoint(defaultImplementations={ClassName1, ClassName2,
ClassName3})
e.g.
interface CacheLoaderConfig {
   @ExtensionPoint(defaultImplementations={FileCacheStore.class,
   JdbcStringBasedCacheStore.class}) //this would break module
   //dependency; guess strings should be used instead
   void setCacheLoaderClassName(String s);
}
2) For each existing extension point the tool would create a separate
table containing all its specific properties, obtained by parsing the
class's @Property tags.
3) Replace java.util.Properties usage in all current extension points
with setters that can be annotated: this way we would remain
consistent across the code, and IMHO the code would be more readable.
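A sketch of what the proposed annotation could look like, using plain strings for the default implementations to sidestep the module dependency mentioned in 1). The name and shape are the proposal, not existing code, and the package prefixes of the store classes are omitted:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Proposed annotation marking a setter as an extension point, with
// default implementations named as strings to avoid a compile-time
// dependency on the modules that contain them.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface ExtensionPoint {
    String[] defaultImplementations();
}

interface CacheLoaderConfig {
    @ExtensionPoint(defaultImplementations = {
            "FileCacheStore",            // package prefix omitted
            "JdbcStringBasedCacheStore"}) // package prefix omitted
    void setCacheLoaderClassName(String className);
}

class ExtensionPointReader {
    // What the doc tool would do: read the annotation back and list
    // the default implementations whose @Property tags to parse.
    static String[] defaultsOf(Class<?> configClass, String methodName) {
        try {
            Method m = configClass.getMethod(methodName, String.class);
            ExtensionPoint ep = m.getAnnotation(ExtensionPoint.class);
            return ep == null ? new String[0] : ep.defaultImplementations();
        } catch (NoSuchMethodException e) {
            return new String[0];
        }
    }
}
```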
wdyt?
Cheers,
Mircea
16 years, 3 months
DIST and putAll calls
by Galder Zamarreno
Hi,
Re: https://jira.jboss.org/jira/browse/ISPN-120
While looking at this, I've started wondering how DIST deals with
putAll() calls. Based on
MultipleKeysRecipientGenerator.generateRecipients(), it looks like a
union is made of the recipients of all the keys in the putAll(). Is
that how it's supposed to work?
Theoretically, it looks to me like there should be a put op called for
each of the keys, but obviously doing it the current way is more
performant, in spite of creating more copies in the cluster than the
configured number of owners.
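For illustration, the union-of-owners behaviour could be sketched like this; the Owners interface stands in for the real consistent hash, and the names are not the actual Infinispan API:

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Stand-in for the consistent-hash lookup used by DIST.
interface Owners {
    List<String> locate(Object key, int numOwners);
}

class MultipleKeysRecipients {
    // Union of the owner lists of every key in the putAll(): the
    // whole multi-key command is sent to all of these nodes, which
    // is why more copies than numOwners per key can end up existing.
    static Set<String> generateRecipients(Collection<?> keys, Owners ch, int numOwners) {
        Set<String> recipients = new LinkedHashSet<>();
        for (Object key : keys)
            recipients.addAll(ch.locate(key, numOwners));
        return recipients;
    }
}
```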
Thoughts?
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache
16 years, 3 months
deadlock detection on local caches
by Mircea Markus
Hi,
The original design [1] covers the situation in which two replicating
transactions are in a deadlock.
Here is an idea for a similar mechanism intended to detect deadlocks
on a local cache.
T1, T2: two transactions running on the same local cache.
T1 has a lock on k1 and tries to acquire k2.
T2 has a lock on k2 and tries to acquire k1.
Each thread runs the following algorithm (similar to the one described
in [1]):
lock(int timeout)
{
   while (currentTime < startTime + timeout)
   {
      if (acquire(smallTimeout)) break;
      testEDD(globalTransaction, key);
   }
}

//run by both T1 and T2:
//globalTransaction - the tx that couldn't acquire the lock within 'smallTimeout'
//key - the key that couldn't be locked by globalTransaction
testEDD(globalTransaction, key)
{
   globalTransaction.setLockIntention(key); //we intend to lock the given key...
   Object lockOwner = getLockOwner(key);
   if (isLocallyOriginatedTx(lockOwner)) {
      Object keyToBeLocked = lockOwner.getLockIntention();
      if (globalTransaction.ownsLock(keyToBeLocked)) {
         //this is a deadlock situation
         //use a coin toss to determine which tx should roll back!!!
         return;
      } else {
         return; //go back to the main loop in the 'lock(int timeout)' method
      }
   }
}
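One detail worth noting: the "coin toss" has to be deterministic, since both transactions run the check independently and must agree on which one rolls back. A common trick, sketched here under the assumption that each tx carries a globally ordered id, is to compare the ids:

```java
// Deterministic tie-break for the deadlock case: both sides evaluate
// the same comparison, so exactly one of them decides to roll back.
// The numeric tx-id scheme is an assumption for this sketch.
class DeadlockCoinToss {
    // Returns true if 'my' transaction should roll back when
    // deadlocked with the other one: lower id wins, higher backs off.
    static boolean shouldRollback(long myTxId, long otherTxId) {
        return myTxId > otherTxId;
    }
}
```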
Cheers,
Mircea
[1] http://tinyurl.com/nunmyu
16 years, 3 months