[JBoss JIRA] Created: (MODCLUSTER-67) httpd crashes on windows when you kill JBoss AS
by Brian Stansberry (JIRA)
httpd crashes on windows when you kill JBoss AS
-----------------------------------------------
Key: MODCLUSTER-67
URL: https://jira.jboss.org/jira/browse/MODCLUSTER-67
Project: mod_cluster
Issue Type: Bug
Affects Versions: 1.0.0.CR1
Environment: Windows XP SP3
mod_cluster 1.0.0.CR1 using mod_proxy_ajp and mod_advertise
2 node cluster running an AS build that is approx AS 5.1.0.Beta1
Reporter: Brian Stansberry
Assignee: Jean-Frederic Clere
When playing with 1.0.0.CR1 on my Windows laptop I decided to see how it would react to hard-killing an AS instance. I was using the demo client, sending about 40 req/sec to a 2 node cluster. I used Windows Task Manager to kill one of the AS processes. Twice in a row this immediately resulted in httpd crashing.
Windows asked if I wanted to send an error report; I agreed, hoping I'd be able to extract some info to include with this report. It gave me a ton of likely useful stuff, but sadly it puts it in a GUI dialog where there's no way to copy and paste it.
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://jira.jboss.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
15 years, 7 months
[JBoss JIRA] Updated: (MODCLUSTER-66) HAModClusterService needs to handle cluster splits
by Paul Ferraro (JIRA)
[ https://jira.jboss.org/jira/browse/MODCLUSTER-66?page=com.atlassian.jira.... ]
Paul Ferraro updated MODCLUSTER-66:
-----------------------------------
Fix Version/s: (was: 1.0.0.CR1)
> HAModClusterService needs to handle cluster splits
> --------------------------------------------------
>
> Key: MODCLUSTER-66
> URL: https://jira.jboss.org/jira/browse/MODCLUSTER-66
> Project: mod_cluster
> Issue Type: Task
> Affects Versions: 1.0.0.Beta4
> Reporter: Brian Stansberry
> Assignee: Paul Ferraro
>
> The case where a split of the JGroups group occurs but nodes are still able to contact the httpd servers needs to be handled. There is a brief discussion of this on https://www.jboss.org/community/docs/DOC-11431 under "Split-Brain Syndrome". The problem is that split-brain will result in the nodes of each partition removing the other partition's nodes from httpd, leaving no nodes active.
> The wiki page describes a simple approach. A more complex approach would be to take a "primary partition" approach, whereby, say, an initial cluster of size n==6 {A, B, C, D, E, F} splits into two clusters {A, B, C, D} and {E, F}. To continue to handle requests, a partition would need to have at least Math.floor((float) n / 2 + 1) members.
> What kind of approach is appropriate would probably depend on the deployed webapps and how they interact with the cluster. If there is no clustered state that can become inconsistent across the cluster split, the simple approach described on the wiki can work fine (an HAModClusterService master doesn't disable a node if httpd reports it is still available). If there is shared state that needs to remain consistent (e.g. a clustered Hibernate Second Level Cache) then primary partition works better.
> Most likely this overall problem will be resolved in stages, e.g. the simple approach from the wiki first.
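As a rough illustration of the "primary partition" rule described in the issue, the sketch below applies the Math.floor((float) n / 2 + 1) threshold to the n==6 example. The class and method names (PrimaryPartition, requiredMembers, mayHandleRequests) are hypothetical and are not part of mod_cluster or HAModClusterService; this only shows the arithmetic, not how membership would actually be obtained from JGroups.

```java
// Hypothetical helper, not part of mod_cluster: illustrates the
// majority threshold quoted in the issue description.
public final class PrimaryPartition {
    private PrimaryPartition() {}

    // Minimum number of members a partition needs, given the size of
    // the cluster before the split (formula taken from the issue text).
    public static int requiredMembers(int initialSize) {
        return (int) Math.floor((float) initialSize / 2 + 1);
    }

    // True if a partition of the given size may keep handling requests.
    public static boolean mayHandleRequests(int initialSize, int partitionSize) {
        return partitionSize >= requiredMembers(initialSize);
    }

    public static void main(String[] args) {
        int n = 6; // initial cluster {A, B, C, D, E, F}
        System.out.println(mayHandleRequests(n, 4)); // partition {A, B, C, D}
        System.out.println(mayHandleRequests(n, 2)); // partition {E, F}
    }
}
```

With n==6 the threshold is 4, so {A, B, C, D} would stay active and {E, F} would not. Note that under this rule an even 3/3 split leaves neither side as the primary partition.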
15 years, 8 months