Sunday, January 27, 2008

HP C-Class Blades single point of failure

We powered down the SAN for maintenance about a week ago. Before doing that, we had to power down all the servers connected directly to the SAN. We currently have a C-class blade enclosure with five servers. After the SAN upgrade was completed, we tried to power the enclosure back up. Guess what? The Onboard Administrator of the enclosure failed to start up. We attempted to power up the servers and that failed too. We contacted HP support, and the answer was a hardware failure. The replacement part arrived at a UPS store close to Pearson airport, so we decided to go there and pick it up ourselves instead of waiting for delivery. Once we replaced the onboard admin, all the servers powered up. After that, update it with the latest firmware and reload the config from backup. If you don't have a config file backup, it won't take long to reconfigure it by hand. Still, make sure you save the config before powering down the whole enclosure.

HP support told me the servers would keep running if the onboard admin failed while they were powered on. However, once the servers were powered down, we could not power them back up until the onboard admin was replaced. One thing I forgot to ask was, "Can I replace the onboard admin while the servers are running?"
