[jbosstools-dev] Hudson is back up and running well after this weekend's planned outage

Nick Boldt nboldt at redhat.com
Sun Jan 16 08:32:38 EST 2011


FYI. All systems appear nominal after this weekend's planned outage.

---

A few jobs [1], [2] are still red after the reboot, but most are due to 
one slave's disk being full, which has been corrected (JBIDE-8120). A 
few appear to be JUnit test failures which SHOULD be ok if respun on 
another slave (JBIDE-8065).

[1] 
http://hudson.qa.jboss.com/hudson/view/DevStudio_Stable_Branch/portlet/dashboard_portlet_13/
[2] 
http://hudson.qa.jboss.com/hudson/view/DevStudio_Trunk/portlet/dashboard_portlet_9/

---

More details here, and copied below:

http://post-office.corp.redhat.com/archives//outage-list/2011-January/msg00106.html

-------- Original Message --------
Subject: Re: Hudson check-up after the Westford power outage
Date: Sun, 16 Jan 2011 07:57:09 -0500 (EST)
From: Michael Harvey <mharvey at redhat.com>
To: Vojtech Juranek <vjuranek at redhat.com>
CC: Len DiMaggio <ldimaggi at redhat.com>,
        Matthew Schick <mschick at redhat.com>,
        jpechane <jpechane at redhat.com>,
        Rajesh Rajasekaran <rrajasek at redhat.com>,
        Prabhat Jha <pjha at redhat.com>,
        Martin Vecera <mvecera at redhat.com>,
        Ondrej Skutka <oskutka at redhat.com>,
        Juraci Costa <jcosta at redhat.com>,
        Lukas Petrovicky <lpetrovi at redhat.com>,
        akostadi <akostadi at redhat.com>,
        Tomas Herfert <therfert at redhat.com>,
        max <max.andersen at redhat.com>,
        Anne-Louise Tangring <atangrin at redhat.com>,
        Nick Boldt <nboldt at redhat.com>

Hi All,

After talking to Alex and reading the emails, here's the status:

Verification Checklist AFTER the Power Outage:

     1. Hudson slaves are connected and executing jobs (Vojtech/Alex):
        Tomas is investigating slaves that are down
     2. Plugins are loaded (Vojtech/Alex): OK
     3. Time trigger is working (Vojtech/Alex): OK
     4. SVN polling is working (Vojtech/Alex): OK
     5. Builds are being published on public Hudson (Vojtech/Alex): OK
     6. Hudson dirs are backed up (Tomas): Tomas to update.

Hopefully we'll keep the call short so we can use the time to 
investigate/resolve the open items.

Thanks,

================================================================
Mike Harvey, Manager, JBoss QE          mharvey at redhat.com 
Red Hat, Inc.
919-754-4819
http://opensource.com/

----- Original Message -----
From: "Vojtech Juranek" <vjuranek at redhat.com>
To: "Nick Boldt" <nboldt at redhat.com>
Cc: "Len DiMaggio" <ldimaggi at redhat.com>, "Matthew Schick" 
<mschick at redhat.com>, "Michael Harvey" <mharvey at redhat.com>, "jpechane" 
<jpechane at redhat.com>, "Rajesh Rajasekaran" <rrajasek at redhat.com>, 
"Prabhat Jha" <pjha at redhat.com>, "Martin Vecera" <mvecera at redhat.com>, 
"Ondrej Skutka" <oskutka at redhat.com>, "Juraci Costa" 
<jcosta at redhat.com>, "Lukas Petrovicky" <lpetrovi at redhat.com>, 
"akostadi" <akostadi at redhat.com>, "Tomas Herfert" <therfert at redhat.com>, 
"max" <max.andersen at redhat.com>, "Anne-Louise Tangring" 
<atangrin at redhat.com>
Sent: Sunday, January 16, 2011 7:24:06 AM
Subject: Re: Hudson check-up after the Westford power outage

Hi Nick

> So far everything looks good after the outage.
>
> Attached is a list of all that I've checked.
>
> One issue reported to helpdesk: https://issues.jboss.org/browse/JBIDE-8120.

the disk was full on dev02, now it's fixed
Thanks
Vojta

> >     *From: *"Matthew Schick" <mschick at redhat.com>
> >     *To: *"Len DiMaggio" <ldimaggi at redhat.com>
> >     *Cc: *"Michael Harvey" <mharvey at redhat.com>, "jpechane"
> >     <jpechane at redhat.com>, "Vojtech Juranek" <vjuranek at redhat.com>,
> >     "Rajesh Rajasekaran" <rrajasek at redhat.com>, "Prabhat Jha"
> >     <pjha at redhat.com>, "Martin Vecera" <mvecera at redhat.com>, "Ondrej
> >     Skutka" <oskutka at redhat.com>, "Juraci Costa" <jcosta at redhat.com>,
> >     "Lukas Petrovicky" <lpetrovi at redhat.com>, "akostadi"
> >     <akostadi at redhat.com>, "Tomas Herfert" <therfert at redhat.com>, "max"
> >     <max.andersen at redhat.com>, "Nick Boldt" <nboldt at redhat.com>,
> >     "Anne-Louise Tangring" <atangrin at redhat.com>
> >     *Sent: *Friday, January 14, 2011 10:33:34 AM
> >     *Subject: *Re: Hudson check-up after the Westford power outage
> >
> >     Yup, I'll be there. Can't promise I'll be entirely coherent, but I'll
> >     be on the call. :)
> >
> >     On Fri, 2011-01-14 at 10:31 -0500, Len DiMaggio wrote:
> >      > 'Morning Matt,
> >      >
> >      > I had arranged a meeting for 08:00 on Sunday Jan 16 - I thought
> >      > we could use the Hudson DNS change doc -
> >      > https://docspace.corp.redhat.com/docs/DOC-50942 - as a guide for the
> >      > post power outage Hudson check.
> >      >
> >      > Question - are you still planning on attending this meeting? I
> >      > want to be certain to get everyone the correct dial-in info.
> >      >
> >      > Thanks!,
> >      > Len
> >      >
> >      >
> >      >
> >      > ----- Original Message -----
> >      >
> >      >
> >      > From: "Matthew Schick" <mschick at redhat.com>
> >      > To: "Leonard Dimaggio" <ldimaggi at redhat.com>
> >      > Cc: "Michael Harvey" <mharvey at redhat.com>, "jpechane"
> >      > <jpechane at redhat.com>, "Vojtech Juranek" <vjuranek at redhat.com>,
> >      > "Rajesh Rajasekaran" <rrajasek at redhat.com>, "Prabhat Jha"
> >      > <pjha at redhat.com>, "Martin Vecera" <mvecera at redhat.com>, "Ondrej
> >      > Skutka" <oskutka at redhat.com>, "Juraci Costa" <jcosta at redhat.com>,
> >      > "Lukas Petrovicky" <lpetrovi at redhat.com>, "akostadi"
> >      > <akostadi at redhat.com>, "Tomas Herfert" <therfert at redhat.com>, "max"
> >      > <max.andersen at redhat.com>, "Nick Boldt" <nboldt at redhat.com>,
> >      > "Anne-Louise Tangring" <atangrin at redhat.com>
> >      > Sent: Wednesday, January 5, 2011 12:04:31 PM
> >      > Subject: Re: Hudson check-up after the Westford power outage
> >      >
> >      > The rough timeline for the outage is as follows (all times EST):
> >      > Jan 14th:
> >      > 5:00 PM - All non-essential lab systems brought offline. If
> >      > there are Hudson slaves we can include here, let me know.
> >      > Jan 15th:
> >      > 4:30 AM - We bring all services down
> >      > 8:00 AM - Power off
> >      > 6:00 PM - Power is back, IT begins bringing their systems online
> >      > 8:00 PM - Core systems (including Hudson master/storage) begin
> >      > coming back online
> >      > 10:00 PM - Core systems should all be back, begin lab systems
> >      >
> >      > We'll be doing at least basic sanity checking on Hudson when it's
> >      > brought online; comprehensive testing will gate on folks being
> >      > around to test. Barring any unforeseen emergencies, Hudson should
> >      > be fully functional by this meeting time.
> >      >
> >      > FYI, we'll have a more comprehensive timeline up on docspace on
> >      > Monday and I'll pass that link around once it's up.
> >      >
> >
> >     --
> >     Matthew Schick
> >     GPG: 98211610 (9214 656C 606A 6B75 3325 2D04 B067 F237 9821 1610)
> >     Supervisor, Systems Administrators, Engineering Operations
> >     Red Hat, Inc.
> >
> > --
> >
> > Len DiMaggio (ldimaggi at redhat.com)
> > JBoss by Red Hat
> > 314 Littleton Road
> > Westford, MA 01886 USA
> > tel: 978.392.3179
> > cell: 781.472.9912
> > http://www.redhat.com

