Local state transfer before going over network
by Galder Zamarreño
Not sure if the idea has come up before, but while at GeeCON last week I was discussing state transfer improvements in replicated environments with one of the attendees:
The idea is that in a replicated environment, when a cache manager shuts down, it would dump its memory contents to a cache store (i.e. a local filesystem) and, when it starts up again, instead of going over the network to do state transfer, it would load the state from the local filesystem, which would be much quicker. Obviously, at times the cache manager would crash or fail while dumping the memory contents, in which case it would fall back on state transfer over the network. I think it's an interesting idea since it could reduce the amount of state transfer to be done. It's true, though, that there are other tricks if you're having issues with state transfer, such as the use of a cluster cache loader.
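To make the fallback concrete, here's a minimal sketch of the startup decision; LocalDump, ClusterTransfer and the marker file are made-up stand-ins, not real Infinispan APIs:

import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative interfaces standing in for the real machinery.
interface LocalDump { void loadIntoCache(); }
interface ClusterTransfer { void pullStateFromCluster(); }

final class StartupStateLoader {
   // Marker written only once a clean-shutdown dump has completed.
   private static final String MARKER = "dump.ok";

   static void loadState(Path dumpDir, LocalDump dump, ClusterTransfer transfer) {
      if (Files.exists(dumpDir.resolve(MARKER))) {
         // Clean shutdown last time: preload from the local filesystem.
         dump.loadIntoCache();
      } else {
         // Crash or incomplete dump: fall back to network state transfer.
         transfer.pullStateFromCluster();
      }
   }
}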
WDYT?
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache
Old JBoss repo in pom.xml
by Galder Zamarreño
Hi all,
So, what's our current approach towards hardcoding maven repositories in the pom.xml files?
Should we allow JBoss repos to be defined in master/parent/pom.xml? This was added by Adrian C when he upgraded JClouds:
<repository>
  <id>jboss</id>
  <url>http://repository.jboss.org/maven2</url>
</repository>
First of all, this is a deprecated repo and I'm not sure it should even be amongst the configured repositories.
Secondly, the idea so far has been that users configure the JBoss Maven repo in their settings.xml - http://community.jboss.org/wiki/MavenGettingStarted-Users
Where do we stand now?
Cheers,
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache
chunking ability on the JDBC cacheloader
by Sanne Grinovero
As mentioned on the user forum [1], people setting up a JDBC cacheloader need to be able to define the size of the columns to be used. The Lucene Directory has a feature to autonomously chunk the segment contents at a configurable byte size, and so does the GridFS; still, there are other metadata objects which Lucene currently doesn't chunk as they are "fairly small" (but of undefined and possibly growing size), and in a more general sense anybody using the JDBC cacheloader faces the same problem: what column size do I need to use?
While in most cases the maximum size can be estimated, this is still not good enough: when you're wrong, the byte array might get truncated, so I think the CacheLoader should take care of this.
What would you think of:
- adding a max_chunk_size option to JdbcStringBasedCacheStoreConfig and JdbcBinaryCacheStore
- having them store values bigger than max_chunk_size across multiple rows (sketched below)
- this will need transactions, which are currently not being used by the cacheloaders
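For illustration, a minimal sketch of the split/join logic such a store could use (names are made up; each chunk would map to one row keyed by something like (key, chunkIndex), and all chunks of one value would have to be written atomically, hence the transaction point above):

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ValueChunker {

   // Split a value into chunks of at most maxChunkSize bytes,
   // one chunk per database row.
   static List<byte[]> split(byte[] value, int maxChunkSize) {
      List<byte[]> chunks = new ArrayList<>();
      for (int from = 0; from < value.length; from += maxChunkSize) {
         int to = Math.min(from + maxChunkSize, value.length);
         chunks.add(Arrays.copyOfRange(value, from, to));
      }
      return chunks;
   }

   // Reassemble the chunks (read back in chunkIndex order) into the value.
   static byte[] join(List<byte[]> chunks) {
      int total = chunks.stream().mapToInt(c -> c.length).sum();
      byte[] value = new byte[total];
      int offset = 0;
      for (byte[] chunk : chunks) {
         System.arraycopy(chunk, 0, value, offset, chunk.length);
         offset += chunk.length;
      }
      return value;
   }
}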
It looks to me like only the JDBC cacheloader has these issues, as the other stores I'm aware of are more "blob oriented". Could it be worth building this abstraction at a higher level instead of in the JDBC cacheloader?
Cheers,
Sanne
[1] - http://community.jboss.org/thread/166760
Transport related configuration timeouts
by Vladimir Blagojevic
While working on ISPN-83 I realized that we have to define ordering relationships between all our transport related timeouts and verify that they make sense as a configuration instance is being processed.
Alphabetically we have the following timeouts in our configuration elements:
<async>: flushLockTimeout, shutdownTimeout
<deadlockDetection>: spinDuration
<hash>: rehashRpcTimeout
<locking>: lockAcquisitionTimeout
<singletonStore>: pushStateTimeout
<stateRetrieval>: logFlushTimeout, timeout, initialRetryWaitTime
<sync>: replTimeout
<transaction>: cacheStopTimeout
<transport>: distributedSyncTimeout
My suggestions so far:
flushLockTimeout < shutdownTimeout
spinDuration < lockAcquisitionTimeout < replTimeout < rehashRpcTimeout
replTimeout < distributedSyncTimeout < <stateRetrieval>:timeout
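As a strawman, here is how those orderings could be asserted while a configuration instance is processed; Config is a made-up flat stand-in for our real configuration beans:

final class TimeoutValidator {

   static void validate(Config c) {
      require(c.flushLockTimeout < c.shutdownTimeout,
              "flushLockTimeout < shutdownTimeout");
      require(c.spinDuration < c.lockAcquisitionTimeout
              && c.lockAcquisitionTimeout < c.replTimeout
              && c.replTimeout < c.rehashRpcTimeout,
              "spinDuration < lockAcquisitionTimeout < replTimeout < rehashRpcTimeout");
      require(c.replTimeout < c.distributedSyncTimeout
              && c.distributedSyncTimeout < c.stateRetrievalTimeout,
              "replTimeout < distributedSyncTimeout < stateRetrieval timeout");
   }

   private static void require(boolean condition, String rule) {
      if (!condition) throw new IllegalStateException("Invalid configuration: " + rule);
   }

   // Illustrative flat view of the timeouts listed above, in millis.
   static final class Config {
      long flushLockTimeout, shutdownTimeout, spinDuration, lockAcquisitionTimeout,
           replTimeout, rehashRpcTimeout, distributedSyncTimeout, stateRetrievalTimeout;
   }
}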
Am I overlooking something? Let's hear your thoughts in your area of expertise!
Regards,
Vladimir
5.0.0.CR4 for 1st June
by Galder Zamarreño
Hi all,
I'm planning to release CR4 next Wednesday, 1st of June, and between now and then I'll be trying to narrow down more testsuite failures and fix the suggestions in the Klocwork report.
Cheers,
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache
weekly questions
by Mircea Markus
Hi Jonathan,
We're thinking about doing some significant changes/improvements around transactions [1]. I'd like to pick your brain especially on improvement #1, as it changes the way we acquire locks quite a bit.
With this new locking approach, we don't acquire the WL for a transaction until the prepare phase. On each write we keep a copy of the written key on the node where the transaction is executed, so that each subsequent access in the same transaction reads/updates this copy. The same goes for reads.
When the transaction commits, we acquire the WL at prepare time. This looks very similar to optimistic CC, except that there's no conflict verification before commit.
It also seems to preserve repeatable read transaction isolation correctly.
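A minimal sketch of that flow, with a made-up LockManager interface standing in for the real locking machinery:

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class TxContext {
   private final Map<Object, Object> localCopies = new HashMap<>();
   private final Set<Object> writtenKeys = new HashSet<>();

   // First access copies the entry into the tx; repeatable read then
   // serves every later read in this tx from the local copy.
   Object read(Object key, Map<Object, Object> cache) {
      return localCopies.computeIfAbsent(key, cache::get);
   }

   void write(Object key, Object value) {
      localCopies.put(key, value);   // no write lock acquired here
      writtenKeys.add(key);
   }

   // Only at prepare time are write locks acquired, and only for the
   // keys this transaction actually wrote.
   void prepare(LockManager locks) {
      writtenKeys.forEach(locks::acquireWriteLock);
   }

   interface LockManager { void acquireWriteLock(Object key); }
}

A conflict check (e.g. write skew detection) would slot into prepare(), which is exactly the step that's missing compared to classic optimistic CC.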
This is a very significant change to our transaction model - do you see any problems, or have any suggestions, around it?
Cheers,
Mircea
[1] http://community.jboss.org/wiki/PossibleLockingImprovements
Failure looking up the river marshaller under AS 7 environment
by "이희승 (Trustin Lee)"
Hi folks,
I'm trying to run Infinispan under AS 7 (i.e. JBoss Modules). I succeeded in running an EmbeddedCacheManager, HotRodServer, and MemcachedServer. However, it fails when a new node joins the cluster:
http://pastebin.com/pGfxSWJP
The root cause of the failure is that GenericJBossMarshaller fails to find the RiverMarshallerFactory. So I set the TCCL, but it didn't help at all. Even setting the TCCL to RiverMarshallerFactory.class.getClassLoader() doesn't seem to help. Any clues?
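For reference, the TCCL attempt looks roughly like this (joinTrigger is an illustrative stand-in for whatever kicks off the join):

import org.jboss.marshalling.river.RiverMarshallerFactory;

final class TcclWorkaround {
   // Swap the TCCL to a loader that can see River before the operation
   // that triggers marshalling, then restore it afterwards.
   static void joinWithRiverVisible(Runnable joinTrigger) {
      Thread current = Thread.currentThread();
      ClassLoader previous = current.getContextClassLoader();
      current.setContextClassLoader(RiverMarshallerFactory.class.getClassLoader());
      try {
         joinTrigger.run();   // e.g. starting the clustered cache manager
      } finally {
         current.setContextClassLoader(previous);
      }
   }
}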
Cheers
--
Trustin Lee, http://gleamynode.net/
Hibernate OGM documentation
by Emmanuel Bernard
Hi all
I've been working on the Hibernate OGM documentation. I am still very unhappy with what I have, but it's a start.
You can read it at http://docs.jboss.org/hibernate/ogm/3.0/reference/en-US/html_single/
You can contribute to it by forking on github https://github.com/hibernate/hibernate-ogm
To build the documentation:
- go to hibernate-ogm-documentation/manual
- run mvn install -DbuildDocs=true
You will get the result in target/docbook/publish/en-US
I'm interested in all kinds of feedback:
- general feel
- what part is confusing
- what you think is missing (besides the TODOs)
And of course if you can contribute some part, that would be awesome.
Emmanuel
Architectural question about a system built on Infinispan
by Attila-Mihaly Balazs
Hello all,
I'm new to Infinispan, so sorry for any stupid questions. I'm evaluating it for a medium-size processing platform and would like to get some feedback on the feasibility of the architecture I've come up with after reading the documents I found.
The system will have two components:
- a GUI component which displays (a subset of) data and generates commands
- a datastore / processing component which holds the data and changes it
by reacting to the commands sent by the GUI
Important considerations are:
- high availability in the datastore tier
- low latency
- optimal data transfer from the data store to the GUI (i.e. only deltas / changed elements should be transferred)
My current ideas are the following:
- use a set of HotRod servers with DIST mode and the number of copies set to a value I would be comfortable with (I'm thinking 2 or 3 currently)
- use these servers to store both the current state and the commands
(this works out nicely, since I need to keep the commands for later
auditing)
- make hashing such that commands and the objects on which they operate get to the same subset of servers
Question: how can I control this? I don't want to control the specific node, just to ensure that objects A and B get to the same subset of servers (see the sketch after this list)
- on each HotRod server, add custom interceptors [1] which listen for the command objects and, when one is intercepted, modify the corresponding object accordingly
- the GUI would write the commands to the correct HotRod servers through topology-aware clients
- the GUI would contain a local cache with a subset of objects. These objects would be synchronised with the HotRod servers (i.e. when the objects change in the datastore tier / HotRod tier, the change is propagated to the GUI)
Question: what is the best way to achieve this (to synchronise a local cache with a subset of data from a set of HotRod servers)? The only option I'm aware of currently is continuous queries [4]
- inside the data tier there would be "supporting" information which is needed by nodes but may not necessarily be in the local node (think for example of configuration which can be updated at runtime, but also more dynamic information). From what I've read, the L1 cache feature [5] would be perfect for this, except for the fact that it uses invalidation when the data changes rather than sending an update (i.e. if the data changes, it is invalidated and the non-local nodes have to fetch it again)
Question: is it possible to configure the L1 cache mechanism such that the original node sends updates when the data changes rather than invalidations?
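On the colocation question in the list above: assuming the distribution hash is driven by the key's hashCode (worth verifying for your version; Infinispan's grouping support, where available, is the more explicit route), one illustrative trick is to hash only a shared routing component of the key, so a command and its target object land on the same owners:

final class RoutedKey {
   private final String routingId;  // e.g. the id of the object a command targets
   private final String localPart;  // distinguishes command vs. object, etc.

   RoutedKey(String routingId, String localPart) {
      this.routingId = routingId;
      this.localPart = localPart;
   }

   @Override
   public int hashCode() {
      // Hash only the routing id so related keys map to the same owners.
      return routingId.hashCode();
   }

   @Override
   public boolean equals(Object o) {
      if (!(o instanceof RoutedKey)) return false;
      RoutedKey other = (RoutedKey) o;
      return routingId.equals(other.routingId) && localPart.equals(other.localPart);
   }
}

Equality still considers the full key, so distinct commands for the same object remain distinct entries even though they hash together.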
How optimal is the solution I came up with? How could it be improved?
I've read about the Distributed Data Stream Processing Framework in Infinispan [3], but it seems to be more of a one-off solution (i.e. generate a report about all the existing objects at a given moment) rather than something which reacts to a new command as soon as it is written to the cache.
I'm looking to implement a data grid where each node contains the data and the code to operate on the data. I will also be evaluating Hazelcast and GigaSpaces, but currently Infinispan seems to be the better alternative since it could be reused in multiple places in the architecture, making it easier to maintain and to understand. The JBoss Data Grid [2] also sounds interesting, but unfortunately it's not available yet.
Best regards,
Attila Balazs
[1] http://community.jboss.org/wiki/InfinispanCustomInterceptors
[2] http://www.jboss.com/edg6-early-access/
[3] http://community.jboss.org/wiki/DistributedDataStreamProcessingFrameworkI...
[4] http://community.jboss.org/wiki/ContinuousQueryWithInfinispan
[5] http://community.jboss.org/wiki/ClusteringModes#L1
"Link Pull Request" workflow
by Galder Zamarreño
Hi all,
So, the ISPN JIRA project now has this workflow available, which means that when a pull req is ready, you just click on it and it will pop up a dialog to enter the pull req address.
In theory, the issue is supposed to be marked as Resolved once the Git pull req has been applied, but I dunno if this happens automatically (it'd be cool if it did!)
Cheers,
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache