[jbosstools-dev] Hudson is back up and running well after this weekend's planned outage
Nick Boldt
nboldt at redhat.com
Sun Jan 16 08:32:38 EST 2011
FYI. All systems appear nominal after this weekend's planned outage.
---
A few jobs [1], [2] are still red after the reboot, but most are due to
one slave's disk being full, which has been corrected (JBIDE-8120). A
few appear to be JUnit test failures which SHOULD be ok if respun on
another slave (JBIDE-8065).
[1]
http://hudson.qa.jboss.com/hudson/view/DevStudio_Stable_Branch/portlet/dashboard_portlet_13/
[2]
http://hudson.qa.jboss.com/hudson/view/DevStudio_Trunk/portlet/dashboard_portlet_9/
---
More details here, and copied below:
http://post-office.corp.redhat.com/archives//outage-list/2011-January/msg00106.html
-------- Original Message --------
Subject: Re: Hudson check-up after the Westford power outage
Date: Sun, 16 Jan 2011 07:57:09 -0500 (EST)
From: Michael Harvey <mharvey at redhat.com>
To: Vojtech Juranek <vjuranek at redhat.com>
CC: Len DiMaggio <ldimaggi at redhat.com>, Matthew Schick
<mschick at redhat.com>, jpechane <jpechane at redhat.com>,
Rajesh Rajasekaran <rrajasek at redhat.com>, Prabhat Jha
<pjha at redhat.com>, Martin Vecera <mvecera at redhat.com>, Ondrej
Skutka <oskutka at redhat.com>, Juraci Costa <jcosta at redhat.com>,
Lukas Petrovicky <lpetrovi at redhat.com>, akostadi <akostadi at redhat.com>,
Tomas Herfert <therfert at redhat.com>, max
<max.andersen at redhat.com>, Anne-Louise Tangring
<atangrin at redhat.com>, Nick Boldt <nboldt at redhat.com>
Hi All,
After talking to Alex and reading the emails, here's the status:
Verification Checklist AFTER the Power Outage:
1. Verify that Hudson slaves are connected and executing jobs
(Vojtech/Alex): Tomas investigating slaves that are down
2. Plugins are loaded (Vojtech/Alex): OK
3. Time trigger is working (Vojtech/Alex): OK
4. SVN polling is working (Vojtech/Alex): OK
5. Builds are being published on public Hudson (Vojtech/Alex): OK
6. Hudson dirs are backed up (Tomas): Tomas to update.
Hopefully we'll keep the call short so we can use the time to
investigate/resolve the open items.
Thanks,
================================================================
Mike Harvey, Manager, JBoss QE mharvey at redhat.com
Red Hat, Inc.
919-754-4819
http://opensource.com/
----- Original Message -----
From: "Vojtech Juranek" <vjuranek at redhat.com>
To: "Nick Boldt" <nboldt at redhat.com>
Cc: "Len DiMaggio" <ldimaggi at redhat.com>, "Matthew Schick"
<mschick at redhat.com>, "Michael Harvey" <mharvey at redhat.com>, "jpechane"
<jpechane at redhat.com>, "Rajesh Rajasekaran" <rrajasek at redhat.com>,
"Prabhat Jha" <pjha at redhat.com>, "Martin Vecera" <mvecera at redhat.com>,
"Ondrej Skutka" <oskutka at redhat.com>, "Juraci Costa"
<jcosta at redhat.com>, "Lukas Petrovicky" <lpetrovi at redhat.com>,
"akostadi" <akostadi at redhat.com>, "Tomas Herfert" <therfert at redhat.com>,
"max" <max.andersen at redhat.com>, "Anne-Louise Tangring"
<atangrin at redhat.com>
Sent: Sunday, January 16, 2011 7:24:06 AM
Subject: Re: Hudson check-up after the Westford power outage
Hi Nick
> So far everything looks good after the outage.
>
> Attached is a list of all that I've checked.
>
> One issue reported to helpdesk: https://issues.jboss.org/browse/JBIDE-8120.
The disk was full on dev02; it's fixed now.
Thanks
Vojta
> > *From: *"Matthew Schick" <mschick at redhat.com>
> > *To: *"Len DiMaggio" <ldimaggi at redhat.com>
> > *Cc: *"Michael Harvey" <mharvey at redhat.com>, "jpechane"
> > <jpechane at redhat.com>, "Vojtech Juranek" <vjuranek at redhat.com>,
> > "Rajesh Rajasekaran" <rrajasek at redhat.com>, "Prabhat Jha"
> > <pjha at redhat.com>, "Martin Vecera" <mvecera at redhat.com>, "Ondrej
> > Skutka" <oskutka at redhat.com>, "Juraci Costa" <jcosta at redhat.com>,
> > "Lukas Petrovicky" <lpetrovi at redhat.com>, "akostadi"
> > <akostadi at redhat.com>, "Tomas Herfert" <therfert at redhat.com>, "max"
> > <max.andersen at redhat.com>, "Nick Boldt" <nboldt at redhat.com>,
> > "Anne-Louise Tangring" <atangrin at redhat.com>
> > *Sent: *Friday, January 14, 2011 10:33:34 AM
> > *Subject: *Re: Hudson check-up after the Westford power outage
> >
> > Yup, I'll be there. Can't promise I'll be entirely coherent, but I'll
> > be on the call. :)
> >
> > On Fri, 2011-01-14 at 10:31 -0500, Len DiMaggio wrote:
> > > 'Morning Matt,
> > >
> > > I had arranged a meeting for 08:00 on Sunday Jan 16 - I thought
> > > we could use the Hudson DNS change doc -
> > > https://docspace.corp.redhat.com/docs/DOC-50942 - as a guide for the
> > > post power outage Hudson check.
> > >
> > > Question - are you still planning on attending this meeting? I
> > > want to be certain to get everyone the correct dial-in info.
> > >
> > > Thanks!,
> > > Len
> > >
> > >
> > >
> > > ----- Original Message -----
> > >
> > >
> > > From: "Matthew Schick" <mschick at redhat.com>
> > > To: "Leonard Dimaggio" <ldimaggi at redhat.com>
> > > Cc: "Michael Harvey" <mharvey at redhat.com>, "jpechane"
> > > <jpechane at redhat.com>, "Vojtech Juranek" <vjuranek at redhat.com>,
> > > "Rajesh Rajasekaran" <rrajasek at redhat.com>, "Prabhat Jha"
> > > <pjha at redhat.com>, "Martin Vecera" <mvecera at redhat.com>, "Ondrej
> > > Skutka" <oskutka at redhat.com>, "Juraci Costa" <jcosta at redhat.com>,
> > > "Lukas Petrovicky" <lpetrovi at redhat.com>, "akostadi"
> > > <akostadi at redhat.com>, "Tomas Herfert" <therfert at redhat.com>, "max"
> > > <max.andersen at redhat.com>, "Nick Boldt" <nboldt at redhat.com>,
> > > "Anne-Louise Tangring" <atangrin at redhat.com>
> > > Sent: Wednesday, January 5, 2011 12:04:31 PM
> > > Subject: Re: Hudson check-up after the Westford power outage
> > >
> > > The rough timeline for the outage is as follows (all times EST):
> > > Jan 14th:
> > > 5:00 PM - All non-essential lab systems brought offline. If
> > > there are Hudson slaves we can include here, let me know.
> > > Jan 15th:
> > > 4:30 AM - We bring all services down
> > > 8:00 AM - Power off
> > > 6:00 PM - Power is back, IT begins bringing their systems online
> > > 8:00 PM - Core systems (including Hudson master/storage) begin coming
> > > back online
> > > 10:00 PM - Core systems should all be back, begin lab systems
> > >
> > > We'll be doing at least basic sanity checking on Hudson when it's
> > > brought online; comprehensive testing will gate on folks being around to
> > > test. Barring any unforeseen emergencies, Hudson should be fully
> > > functional by this meeting time.
> > >
> > > FYI, we'll have a more comprehensive timeline up on docspace on Monday
> > > and I'll pass that link around once it's up.
> > >
> >
> > --
> > Matthew Schick
> > GPG: 98211610 (9214 656C 606A 6B75 3325 2D04 B067 F237 9821 1610)
> > Supervisor, Systems Administrators, Engineering Operations
> > Red Hat, Inc.
> >
> > --
> >
> > Len DiMaggio (ldimaggi at redhat.com)
> > JBoss by Red Hat
> > 314 Littleton Road
> > Westford, MA 01886 USA
> > tel: 978.392.3179
> > cell: 781.472.9912
> > http://www.redhat.com