[Design of JBossCache] - Data gravitation cleanup process and stale nodes
by manik.surtani@jboss.com
This is related to JBCACHE-1258.
Currently there is an issue where stale structural nodes are not cleaned up after a gravitation event. There are really two cases to consider - let's start with the first, and easier, of the two:
1. Data owner has been shut down or has crashed.
Let's consider a cluster of 3 servers, with the following state:
{{{
Server A:
/a/b/c
Server B:
/_BUDDY_BACKUP_/Server_A/a/b/c
Server C:
}}}
Now server A shuts down (or crashes).
{{{
Server B:
/_BUDDY_BACKUP_/Server_A/a/b/c
Server C:
}}}
And a request for /a/b/c comes in to Server C. Server C then broadcasts a gravitation request, followed by a gravitation cleanup call. What we are now left with is:
{{{
Server B:
/_BUDDY_BACKUP_/Server_A/a/b
/_BUDDY_BACKUP_/Server_C/a/b/c
Server C:
/a/b/c
}}}
The reason for this is that the Fqn requested is /a/b/c, hence the Fqn gravitated is /a/b/c and, similarly, the gravitation cleanup call broadcast is for /a/b/c. So Server B removes backup state for /a/b/c, but /a/b still remains, consuming unnecessary memory.
I suggest that during a cleanup call parent nodes are removed as well, provided they have no other children. So we wind our way back up the tree and remove all empty parents up to /_BUDDY_BACKUP_/Server_A. And finally, if /_BUDDY_BACKUP_/Server_A is also empty, and Server A is no longer in the cluster, the buddy backup region should be removed as well.
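To illustrate, here is a minimal sketch of the pruning walk. The `Node` class and method names are hypothetical, not the actual JBossCache API, and this sketch prunes all empty ancestors unconditionally; the real implementation would stop at the buddy backup region root unless the data owner has left the cluster.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical tree node; not the real JBossCache Node API.
class Node {
    final String name;
    final Node parent;
    final Map<String, Node> children = new HashMap<>();

    Node(String name, Node parent) {
        this.name = name;
        this.parent = parent;
    }

    Node addChild(String childName) {
        Node c = new Node(childName, this);
        children.put(childName, c);
        return c;
    }
}

public class GravitationCleanup {
    /**
     * Remove the gravitated node from its parent, then keep pruning
     * upward as long as each parent is left with no other children.
     * (A real implementation would stop at the backup region root
     * unless the data owner is gone; this sketch prunes all the way.)
     */
    static void removeWithEmptyParents(Node node) {
        Node parent = node.parent;
        if (parent == null) return;
        parent.children.remove(node.name);
        if (parent.children.isEmpty()) {
            removeWithEmptyParents(parent);
        }
    }

    public static void main(String[] args) {
        // Server B's state after Server A dies: /_BUDDY_BACKUP_/Server_A/a/b/c
        Node root = new Node("/", null);
        Node leaf = root.addChild("_BUDDY_BACKUP_").addChild("Server_A")
                        .addChild("a").addChild("b").addChild("c");

        removeWithEmptyParents(leaf);
        // /a/b, /a, Server_A and the whole backup region are gone.
        System.out.println(root.children.isEmpty()); // prints "true"
    }
}
```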
Does anyone see this causing problems?
2. Data owner is still alive.
The same initial state:
{{{
Server A:
/a/b/c
Server B:
/_BUDDY_BACKUP_/Server_A/a/b/c
Server C:
}}}
Server A is still alive, but Server C asks for /a/b/c:
{{{
Server A:
/a/b
/_BUDDY_BACKUP_/Server_C/a/b/c
Server B:
/_BUDDY_BACKUP_/Server_A/a/b
/_BUDDY_BACKUP_/Server_C/a/b/c
Server C:
/a/b/c
}}}
Now, going through parents during a gravitation cleanup and removing empty nodes in Server B's backup region may make sense, but can this be applied to A's main tree as well? This is where things get tricky, since application logic may depend on the existence of /a or /a/b, even if they are empty. It doesn't matter on Server B, since that is a backup region and the application on B has no direct access to it.
Thoughts? Should we just make it a configurable property {{}}, which could be set to {{LEAVE_PARENTS}}, {{CLEAN_EMPTY_PARENTS_ON_BACKUP}} or {{CLEAN_ALL_EMPTY_PARENTS}}, with the last one as the default? I'm concerned we're already en route to configuration hell, and am keen to limit configuration parameters to the absolute minimum. If we have a sensible enough default, I'd rather not make this configurable.
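If it were made configurable, the options could be modelled as a simple enum. A sketch only - the enum name, the `DEFAULT` constant and the `cleansParents` helper are hypothetical:

```java
// Hypothetical configuration enum; the constant names are the options above.
enum GravitationCleanupPolicy {
    LEAVE_PARENTS,
    CLEAN_EMPTY_PARENTS_ON_BACKUP,
    CLEAN_ALL_EMPTY_PARENTS;

    // Proposed default, if the property is made configurable at all.
    public static final GravitationCleanupPolicy DEFAULT = CLEAN_ALL_EMPTY_PARENTS;

    /** Whether empty parents may be pruned in the given region. */
    public boolean cleansParents(boolean inBackupRegion) {
        switch (this) {
            case LEAVE_PARENTS:                 return false;
            case CLEAN_EMPTY_PARENTS_ON_BACKUP: return inBackupRegion;
            default:                            return true; // CLEAN_ALL_EMPTY_PARENTS
        }
    }
}
```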
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4117764#4117764
Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=4117764
18 years, 3 months
[Design the new POJO MicroContainer] - VFS (URL) cache
by alesj
I was looking into how to cache some things with VirtualFileURLConnection.
A simple initial attempt, to push things forward:
- http://anonsvn.jboss.org/repos/jbossas/projects/vfs/branches/cache-work/
What I tried to do is register every created VirtualFileHandler in the cache - the key is the VirtualFileHandler's VFS URL:
| if (vfsUrl != null)
| CacheLocator.getUrlCache().put(vfsUrl, this);
|
So when a VirtualFileURLConnection tries to access a VirtualFile(Handler), we don't need to re-create the whole structure if it already exists.
| public static VirtualFileHandler resolveVirtualFileHandler(URL vfsurl, String relativePath) throws IOException
| {
| VFS vfs = VFS.getVFS(vfsurl);
| return vfs.findChild(relativePath).getHandler();
| }
|
| public synchronized VirtualFileHandler getVirtualFileHandler() throws IOException
| {
| if (handler == null)
| {
| boolean trace = log.isTraceEnabled();
| handler = CacheLocator.getUrlCache().get(vfsurl);
| if (handler == null)
| {
| handler = resolveVirtualFileHandler(vfsurl, relativePath);
| if (trace)
| log.trace("Resolving virtual file handler: " + vfsurl + "/" + relativePath);
| }
| else if (trace)
| log.trace("Handler resolved from cache: " + vfsurl + "/" + relativePath);
| }
| return handler;
| }
|
| public InputStream getInputStream() throws IOException
| {
| return getVirtualFileHandler().openStream();
| }
|
Since an AbstractVFSDeployment holds a strong ref to the VirtualFile root, the whole structure should already be there (once touched), hence it should be possible to implement a simple cache with weak refs.
This didn't seem to be the case (or is it?), as the weak refs were quickly GC'ed - at least that's how it looked, since there were no cache hits.
So an attempt was made with soft refs. :-)
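For reference, a soft-reference URL cache along these lines could look something like the sketch below. This is not the actual CacheLocator code; the class name is made up. Keying on the external form rather than on the URL itself also sidesteps `URL.equals()`, which can do DNS lookups.

```java
import java.lang.ref.SoftReference;
import java.net.URL;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of a URL -> handler cache using soft references, so entries
// survive until the JVM is under memory pressure (unlike weak refs,
// which can be collected as soon as no strong reference remains).
class SoftUrlCache<V> {
    private final Map<String, SoftReference<V>> map = new ConcurrentHashMap<>();

    public void put(URL url, V value) {
        map.put(url.toExternalForm(), new SoftReference<>(value));
    }

    public V get(URL url) {
        SoftReference<V> ref = map.get(url.toExternalForm());
        if (ref == null) return null;
        V value = ref.get();
        if (value == null) {
            map.remove(url.toExternalForm()); // referent was GC'ed, drop entry
        }
        return value;
    }
}
```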
This speeds up AS5 quite a bit, but I got a few failures. :-)
| 2008-01-07 22:00:14,125 TRACE [org.jboss.virtual.plugins.vfs.VirtualFileURLConnection] Handler resolved from cache: vfsfile:/C:/projects/jboss5/trunk/build/output/jboss-5.0.0.Beta3/server/default/deploy/jboss-local-jdbc.rar/META-INF/ra.xml
| 2008-01-07 22:00:14,140 DEBUG [org.jboss.resource.deployers.RARParserDeployer] Error during deploy: vfsfile:/C:/projects/jboss5/trunk/build/output/jboss-5.0.0.Beta3/server/default/deploy/jboss-local-jdbc.rar
| org.jboss.deployers.spi.DeploymentException: Error parsing meta data jboss-local-jdbc.rar/META-INF/ra.xml
| at org.jboss.deployers.spi.DeploymentException.rethrowAsDeploymentException(DeploymentException.java:49)
| at org.jboss.deployers.vfs.spi.deployer.ObjectModelFactoryDeployer.parse(ObjectModelFactoryDeployer.java:124)
| at org.jboss.deployers.vfs.spi.deployer.AbstractVFSParsingDeployer.parse(AbstractVFSParsingDeployer.java:86)
| at org.jboss.deployers.spi.deployer.helpers.AbstractParsingDeployerWithOutput.createMetaData(AbstractParsingDeployerWithOutput.java:223)
| at org.jboss.deployers.spi.deployer.helpers.AbstractParsingDeployerWithOutput.createMetaData(AbstractParsingDeployerWithOutput.java:199)
| at org.jboss.deployers.spi.deployer.helpers.AbstractParsingDeployerWithOutput.deploy(AbstractParsingDeployerWithOutput.java:162)
| at org.jboss.deployers.plugins.deployers.DeployerWrapper.deploy(DeployerWrapper.java:169)
| at org.jboss.deployers.plugins.deployers.DeployersImpl.doInstallParentFirst(DeployersImpl.java:853)
| at org.jboss.deployers.plugins.deployers.DeployersImpl.install(DeployersImpl.java:794)
| at org.jboss.dependency.plugins.AbstractControllerContext.install(AbstractControllerContext.java:327)
| at org.jboss.dependency.plugins.AbstractController.install(AbstractController.java:1309)
| at org.jboss.dependency.plugins.AbstractController.incrementState(AbstractController.java:734)
| at org.jboss.dependency.plugins.AbstractController.resolveContexts(AbstractController.java:862)
| at org.jboss.dependency.plugins.AbstractController.resolveContexts(AbstractController.java:784)
| at org.jboss.dependency.plugins.AbstractController.change(AbstractController.java:622)
| at org.jboss.dependency.plugins.AbstractController.change(AbstractController.java:411)
| at org.jboss.deployers.plugins.deployers.DeployersImpl.process(DeployersImpl.java:498)
| at org.jboss.deployers.plugins.main.MainDeployerImpl.process(MainDeployerImpl.java:506)
| at org.jboss.system.server.profileservice.ProfileServiceBootstrap.loadProfile(ProfileServiceBootstrap.java:245)
| at org.jboss.system.server.profileservice.ProfileServiceBootstrap.start(ProfileServiceBootstrap.java:131)
| at org.jboss.bootstrap.AbstractServerImpl.start(AbstractServerImpl.java:408)
| at org.jboss.Main.boot(Main.java:208)
| at org.jboss.Main$1.run(Main.java:534)
| at java.lang.Thread.run(Thread.java:595)
| Caused by: org.jboss.xb.binding.JBossXBException: Failed to parse source: vfsfile:/C:/projects/jboss5/trunk/build/output/jboss-5.0.0.Beta3/server/default/deploy/jboss-local-jdbc.rar/META-INF/ra.xml@1,1
| at org.jboss.xb.binding.parser.sax.SaxJBossXBParser.parse(SaxJBossXBParser.java:177)
| at org.jboss.xb.binding.UnmarshallerImpl.unmarshal(UnmarshallerImpl.java:186)
| at org.jboss.deployers.vfs.spi.deployer.ObjectModelFactoryDeployer.parse(ObjectModelFactoryDeployer.java:120)
| ... 22 more
| Caused by: org.xml.sax.SAXException: Content is not allowed in prolog. @ vfsfile:/C:/projects/jboss5/trunk/build/output/jboss-5.0.0.Beta3/server/default/deploy/jboss-local-jdbc.rar/META-INF/ra.xml[1,1]
| at org.jboss.xb.binding.parser.sax.SaxJBossXBParser$MetaDataErrorHandler.fatalError(SaxJBossXBParser.java:438)
| at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
| at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
| at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
| at org.apache.xerces.impl.XMLScanner.reportFatalError(Unknown Source)
| at org.apache.xerces.impl.XMLDocumentScannerImpl$PrologDispatcher.dispatch(Unknown Source)
| at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
| at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
| at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
| at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
| at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
| at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
| at org.jboss.xb.binding.parser.sax.SaxJBossXBParser.parse(SaxJBossXBParser.java:173)
| ... 24 more
|
I don't see exactly what causes this.
The input stream should come from this piece of code in NoCopyNestedJarHandler:
| public InputStream openStream() throws IOException
| {
| return getJar().getInputStream(getEntry());
| }
|
Which should be there, even if cached.
Looking at the log, I do get a lot of cache hits, which suggests this is on the right track.
Any thoughts or other ideas on how to cache VFS?
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4117760#4117760
Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=4117760
[Design of JBossCache] - Re: Implicit marshalled values - a better way of handling re
by bstansberry@jboss.com
"jason.greene(a)jboss.com" wrote : Thats not a deadlock, the updates would be serialized though.
Sure you can deadlock. When I said "node A" and "node B" I meant different cache instances in the cluster. They both acquire local write locks on the same node in the tree, insert/update their key, then try to acquire the WL globally as part of tx commit. That fails.
Assume pessimistic locking here (which may not be an issue if we do this far enough in the future, but partly I'm thinking about whether I want to try it this way now).
anonymous wrote : I believe this can be solved by modifying putForExternalRead to not only check for node existence, but also key existence.
Yeah, that's the solution; I'm just not sure how simple it is, since once you start looking inside nodes you might have to start thinking about locking issues etc. that don't exist if you just check node existence. But it might be trivial.
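To make the idea concrete, here is a toy sketch of key-level putForExternalRead semantics - abort not just when the node exists, but when the specific key already exists in it. All names here are illustrative, not the real JBossCache API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy node-keyed store, for illustration only.
class ToyCache {
    private final Map<String, Map<String, Object>> nodes = new ConcurrentHashMap<>();

    /** Returns true if the put was applied, false if it was a no-op. */
    public boolean putForExternalRead(String fqn, String key, Object value) {
        Map<String, Object> node =
            nodes.computeIfAbsent(fqn, f -> new ConcurrentHashMap<>());
        // Key-level check: a second PFER for the same key is silently
        // dropped, so concurrent loads on different cache instances
        // cannot end up clashing on the same key.
        return node.putIfAbsent(key, value) == null;
    }

    public Object get(String fqn, String key) {
        Map<String, Object> node = nodes.get(fqn);
        return node == null ? null : node.get(key);
    }
}
```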
anonymous wrote : In the 2 node solution, which does not seem necessary, the UUID does not need to be known by all nodes. It is generated once, for the entire lifespan of the second node.
Let's pretend a bit that the 2 node solution is necessary, in case it leads somewhere. :) You can have concurrent putForExternalRead calls on different cache instances, each of which would store a different UUID for the same entity. You'd end up with two copies of the entity in the cache.
Hmm -- actually you'd get a weird effect where the PFER call for inserting the key/uuid would be aborted when propagated (since the key already exists on the remote node) but the PFER for the uuid node would succeed.
OK, let's ignore the 2 node solution. ;) Lots of problems like that; weirdness when Hibernate suspends transactions, and now we're dealing with doing multiple cache writes.
anonymous wrote : Keep them coming!
With OL, we'd have versioning problems, since the version is applied to the node, not the key/value pair. 2 node solution rises from the dead....
Architecturally, it's probably cleaner to have a cache per SessionFactory, with the default classloader for deserialization being the deployment's classloader. The only negative to that seems to be a tx spanning session factories, which is probably not a common case.
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4117754#4117754
Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=4117754
[Design of Clustering on JBoss (Clusters/JBoss)] - Re: Next Gen Web Tier Load Balancing Design
by bstansberry@jboss.com
Thanks for the input, Andy. The discussion of how the server side works in the wiki doc is still very thin. We also had a conf call right before I went on holiday; I need to update the docs for that.
How the HASingleton should work is an open issue. If the load balance calculation is stateless, it's a trivial HASingleton problem; if the current master fails/shuts down, the new one takes over and starts sending data to the mod_cluster instances. But, in reality the load balance calculation is unlikely to be stateless; e.g. most will probably use some sort of time-decay function. So, we need something like a master/slave, where the slaves maintain the necessary state to take over the load balance calculation if they are elected master.
A few approaches to this come to mind:
1) Nodes multicast their node data to the cluster, so any node that is interested in maintaining state has it. I don't much like that, as it's pretty chatty in a large cluster; if the underlying JGroups channel isn't UDP-based, it's a lot of traffic.
2) Nodes are aware of who the master and slaves are, and send multiple unicasts. Again, chatty.
3) Master knows who the slaves are and sends a copy of aggregated, pre-digested state to them every time it recalculates. This seems the most logical.
In any case, a good load balancing algorithm should have smoothing functions built in, to ensure the results do not change too radically if some state gets dropped.
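As one example of such a time-decay/smoothing function, an exponentially weighted moving average could be used. This class is purely illustrative, not a proposed API:

```java
// Sketch of an exponentially weighted moving average for a load metric.
// Each new sample moves the smoothed value only part of the way, so a
// dropped or outlier sample cannot swing the result radically.
class DecayingLoadMetric {
    private final double alpha;   // weight of the newest sample, 0 < alpha <= 1
    private double smoothed;
    private boolean initialized;

    DecayingLoadMetric(double alpha) {
        this.alpha = alpha;
    }

    double record(double sample) {
        smoothed = initialized ? alpha * sample + (1 - alpha) * smoothed : sample;
        initialized = true;
        return smoothed;
    }
}
```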
The current HASingleton infrastructure should support these options pretty well; any HASingletonSupport subclass is a regular service that adds a couple of extra lifecycle stages to the standard four:
create
start
become master
stop master
stop
destroy
They also get callbacks for cluster topology changes. So it's easy for the ModClusterService to run on each node in "started" mode, monitoring state while its position in the topology makes it a "slave", and then take over interacting with the mod_cluster instances when it "becomes master".
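A rough sketch of how such a service might hook those stages, assuming HASingletonSupport exposes "become master"/"stop master" callbacks along the lines of startSingleton()/stopSingleton(); everything else here is illustrative:

```java
// Illustrative only; the real service would extend HASingletonSupport.
class ModClusterServiceSketch /* extends HASingletonSupport */ {
    private volatile boolean master;

    // "become master": start pushing aggregated, pre-digested
    // load-balance state to the mod_cluster instances.
    public void startSingleton() {
        master = true;
        // begin sending data to httpd...
    }

    // "stop master": fall back to slave mode, keep monitoring state
    // so we can take over the calculation again if re-elected.
    public void stopSingleton() {
        master = false;
    }

    public boolean isMaster() {
        return master;
    }
}
```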
Re: communication across the firewall, the intent is to use http/https from the AS side to the httpd side for the load balancing information. What port and whether http or https is used depends on how the user configures httpd. Basically, mod_cluster functions as an httpd mount, similar to the jkstatus web app that comes with mod_jk. Here's the config you add to httpd.conf for that; I assume mod_cluster would use something similar (with a different "Allow from"):
| <Location /status/>
| JkMount jkstatus
| Order deny,allow
| Deny from all
| Allow from 127.0.0.1
| </Location>
|
For normal request traffic from mod_cluster to JBossWeb, communication will use AJP. The AS-side endpoint will be the standard AJP Connector; there is no intent to modify that at all for this. So, if Mladen et al decide to add TLS support to AJP, then it would be handled.
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4117745#4117745
Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=4117745