Subject: OSCER: Boomer Outage Extended
From: Brandon George
Date: Wed, 26 Mar 2014 18:56:17 -0500
Unfortunately, we experienced a major problem today: all of
the InfiniBand switches connecting the compute nodes failed to
come back up after rebooting to load the new firmware image.
We are working diligently with the vendor to resolve
the issue and hope to have InfiniBand restored some
time after the additional hardware required for recovery
arrives tomorrow.
Accordingly, the Boomer maintenance outage has been extended
through tomorrow. However, we will be re-opening login nodes
boomer1 and boomer3 shortly for access to data. We'll send
further updates when we have new information.
Our apologies for the inconvenience,
On Wed, Mar 26, 2014 at 08:19:29AM -0500, OSCER Support wrote:
> Just a reminder that today's maintenance is beginning now.
> On Tue, Mar 25, 2014 at 10:50:19AM -0500, OSCER Support wrote:
> > OSCER users,
> > This Wednesday, March 26th, we will take all of Boomer down for
> > maintenance. The outage is expected to last from 8:00 a.m. to midnight.
> > Some jobs may not be dispatched before this outage because of a
> > system reservation. You can work around this by re-submitting
> > your job with a WALLTIME shorter than the time remaining before
> > the start of the outage.
> > Our apologies for the inconvenience,
> > -brandon
> > --
> > Brandon George, RHCE, CSSGB (email@example.com, 405.325.5113)
> > Manager of Operations
> > OU Supercomputing Center for Education & Research (OSCER)
> > University of Oklahoma Information Technology