FYI. All systems appear nominal after this weekend's planned outage.
---
A few jobs [1], [2] are still red after the reboot, but most are due to
one slave's disk being full, which has been corrected (JBIDE-8120). A
few appear to be JUnit test failures which SHOULD be ok if respun on
another slave (JBIDE-8065).
[1]
http://hudson.qa.jboss.com/hudson/view/DevStudio_Stable_Branch/portlet/da...
[2]
http://hudson.qa.jboss.com/hudson/view/DevStudio_Trunk/portlet/dashboard_...
---
More details here, and copied below:
http://post-office.corp.redhat.com/archives//outage-list/2011-January/msg...
-------- Original Message --------
Subject: Re: Hudson check-up after the Westford power outage
Date: Sun, 16 Jan 2011 07:57:09 -0500 (EST)
From: Michael Harvey <mharvey(a)redhat.com>
To: Vojtech Juranek <vjuranek(a)redhat.com>
CC: Len DiMaggio <ldimaggi(a)redhat.com>, Matthew Schick
<mschick(a)redhat.com>, jpechane <jpechane(a)redhat.com>,
Rajesh Rajasekaran <rrajasek(a)redhat.com>, Prabhat Jha
<pjha(a)redhat.com>, Martin Vecera <mvecera(a)redhat.com>, Ondrej
Skutka <oskutka(a)redhat.com>, Juraci Costa <jcosta(a)redhat.com>,
Lukas Petrovicky <lpetrovi(a)redhat.com>, akostadi <akostadi(a)redhat.com>,
Tomas Herfert <therfert(a)redhat.com>, max
<max.andersen(a)redhat.com>, Anne-Louise Tangring
<atangrin(a)redhat.com>, Nick Boldt <nboldt(a)redhat.com>
Hi All,
After talking to Alex and reading the emails, here's the status:
Verification Checklist AFTER the Power Outage:
1. Review that Hudson slaves are connected and executing jobs
(Vojtech/Alex): Tomas investigating slaves that are down
2. Plugins are loaded (Vojtech/Alex): OK
3. Time trigger is working (Vojtech/Alex): OK
4. SVN polling is working (Vojtech/Alex): OK
5. Builds are being published on public Hudson (Vojtech/Alex): OK
6. Hudson dirs are backed up (Tomas): Tomas to update.
Hopefully we'll keep the call short so we can use the time to
investigate/resolve the open items.
Thanks,
================================================================
Mike Harvey, Manager, JBoss QE mharvey(a)redhat.com
Red Hat, Inc.
919-754-4819
http://opensource.com/
----- Original Message -----
From: "Vojtech Juranek" <vjuranek(a)redhat.com>
To: "Nick Boldt" <nboldt(a)redhat.com>
Cc: "Len DiMaggio" <ldimaggi(a)redhat.com>, "Matthew Schick"
<mschick(a)redhat.com>, "Michael Harvey" <mharvey(a)redhat.com>,
"jpechane" <jpechane(a)redhat.com>, "Rajesh Rajasekaran"
<rrajasek(a)redhat.com>, "Prabhat Jha" <pjha(a)redhat.com>,
"Martin Vecera" <mvecera(a)redhat.com>, "Ondrej Skutka"
<oskutka(a)redhat.com>, "Juraci Costa" <jcosta(a)redhat.com>,
"Lukas Petrovicky" <lpetrovi(a)redhat.com>, "akostadi"
<akostadi(a)redhat.com>, "Tomas Herfert" <therfert(a)redhat.com>,
"max" <max.andersen(a)redhat.com>, "Anne-Louise Tangring"
<atangrin(a)redhat.com>
Sent: Sunday, January 16, 2011 7:24:06 AM
Subject: Re: Hudson check-up after the Westford power outage
Hi Nick
So far everything looks good after the outage.
Attached is a list of all that I've checked.
One issue was reported to helpdesk:
https://issues.jboss.org/browse/JBIDE-8120
The disk was full on dev02; it is now fixed.
Thanks
Vojta
> *From: *"Matthew Schick" <mschick(a)redhat.com>
> *To: *"Len DiMaggio" <ldimaggi(a)redhat.com>
> *Cc: *"Michael Harvey" <mharvey(a)redhat.com>, "jpechane"
> <jpechane(a)redhat.com>, "Vojtech Juranek" <vjuranek(a)redhat.com>,
> "Rajesh Rajasekaran" <rrajasek(a)redhat.com>, "Prabhat Jha"
> <pjha(a)redhat.com>, "Martin Vecera" <mvecera(a)redhat.com>, "Ondrej
> Skutka" <oskutka(a)redhat.com>, "Juraci Costa" <jcosta(a)redhat.com>,
> "Lukas Petrovicky" <lpetrovi(a)redhat.com>, "akostadi"
> <akostadi(a)redhat.com>, "Tomas Herfert" <therfert(a)redhat.com>, "max"
> <max.andersen(a)redhat.com>, "Nick Boldt" <nboldt(a)redhat.com>,
> "Anne-Louise Tangring" <atangrin(a)redhat.com>
> *Sent: *Friday, January 14, 2011 10:33:34 AM
> *Subject: *Re: Hudson check-up after the Westford power outage
>
> Yup, I'll be there. Can't promise I'll be entirely coherent, but I'll
> be on the call. :)
>
> On Fri, 2011-01-14 at 10:31 -0500, Len DiMaggio wrote:
> > 'Morning Matt,
> >
> > I had arranged a meeting for 08:00 on Sunday Jan 16 - I thought
> > we could use the Hudson DNS change doc -
> > https://docspace.corp.redhat.com/docs/DOC-50942 - as a guide for the
> > post power outage Hudson check.
> >
> > Question - are you still planning on attending this meeting? I
> > want to be certain to get everyone the correct dial-in info.
> >
> > Thanks!,
> > Len
> >
> >
> >
> > ----- Original Message -----
> >
> >
> > From: "Matthew Schick" <mschick(a)redhat.com>
> > To: "Leonard Dimaggio" <ldimaggi(a)redhat.com>
> > Cc: "Michael Harvey" <mharvey(a)redhat.com>, "jpechane"
> > <jpechane(a)redhat.com>, "Vojtech Juranek" <vjuranek(a)redhat.com>,
> > "Rajesh Rajasekaran" <rrajasek(a)redhat.com>, "Prabhat Jha"
> > <pjha(a)redhat.com>, "Martin Vecera" <mvecera(a)redhat.com>, "Ondrej
> > Skutka" <oskutka(a)redhat.com>, "Juraci Costa" <jcosta(a)redhat.com>,
> > "Lukas Petrovicky" <lpetrovi(a)redhat.com>, "akostadi"
> > <akostadi(a)redhat.com>, "Tomas Herfert" <therfert(a)redhat.com>, "max"
> > <max.andersen(a)redhat.com>, "Nick Boldt" <nboldt(a)redhat.com>,
> > "Anne-Louise Tangring" <atangrin(a)redhat.com>
> > Sent: Wednesday, January 5, 2011 12:04:31 PM
> > Subject: Re: Hudson check-up after the Westford power outage
> >
> > The rough timeline for the outage is as follows (all times EST):
> > Jan 14th:
> > 5:00 PM - All non-essential lab systems brought offline. If
> > there are Hudson slaves we can include here, let me know.
> > Jan 15th:
> > 4:30 AM - We bring all services down
> > 8:00 AM - Power off
> > 6:00 PM - Power is back, IT begins bringing their systems online
> > 8:00 PM - Core systems (including Hudson master/storage) begin
> > coming back online
> > 10:00 PM - Core systems should all be back, begin lab systems
> >
> > We'll be doing at least basic sanity checking on Hudson when it's
> > brought online; comprehensive testing will gate on folks being
> > around to test. Barring any unforeseen emergencies, Hudson should be
> > fully functional by this meeting time.
> >
> > FYI, we'll have a more comprehensive timeline up on docspace on
> > Monday and I'll pass that link around once it's up.
> >
>
> --
> Matthew Schick
> GPG: 98211610 (9214 656C 606A 6B75 3325 2D04 B067 F237 9821 1610)
> Supervisor, Systems Administrators, Engineering Operations
> Red Hat, Inc.
>
> --
>
> Len DiMaggio (ldimaggi(a)redhat.com)
> JBoss by Red Hat
> 314 Littleton Road
> Westford, MA 01886 USA
> tel: 978.392.3179
> cell: 781.472.9912
>
> http://www.redhat.com