GlowHost Web Hosting Forums  

#1
05-20-2008
Matt
A big Power Blip at GNAX Datacenter

We still do not have official news as to what happened, why it happened, and what will be done to prevent it in the future.

What we do know is that there was a power outage at GNAX that lasted about 5 minutes. Unfortunately, when this happens it causes hard reboots on all of the affected devices. While the vast majority of the devices were unaffected overall (apart from the reboot), some of them have gone into fsck, which is the check Linux runs to make sure that the filesystems are not damaged, repairing them if they are.

Right now Vern and Ratbite are the only 2 remaining servers in this status.

If you have any VPS, it means you are on Vern and it is running a disk check. Vern was also scheduled for another upgrade this week, but while this machine is down, we may take advantage of the downtime and get the upgrades installed on it this evening. We have to see what the workload looks like and what sort of damage we are looking at on the disks for this unit.

We have techs looking at both devices right now and will have them back online as soon as possible.
__________________
:::::
01001100 00110011 00110011 00110111

#2
05-20-2008
jmarcv
From GNAX

5/20 Power Outage RFO
Severe storm cells came through the North Georgia region this evening. AtlantaNAP experienced an over current fault outage on one of our 2 main feeds. The affected feed is the original feed, which currently has the most load connected to it. The more systems that are connected to a feed, the more of the lightning and over current will try to pass into it; i.e. if you don't have very much load on it, like our new feed, which is currently at only 1/6th load, then current does not try to flow to it very much. Our first system is currently at 65% load, so it tried to absorb much more of the lightning strike than the other one, hence the main breaker going into over current fault.

I have spoken with all of our key electrical engineers associated with the building at this point. According to Georgia Power and our PSSI and Cummins engineers, we likely took a lightning strike to the utility very near the facility, which caused an over current fault on the main incoming breaker of our first set of switchgear. The breaker is designed to trip in the event of this kind of fault to protect the gear (your computers) inside the building from being burned up by the lightning strike.

When this type of fault happens, the computer will not start the generators until an engineer verifies where the fault is. This is because a fault inside the wiring plant could also cause this kind of over current, for example a main short if a feeder wire carrying main current in the building were to become damaged.
In that case it would be very dangerous to turn the power back on manually or to force a manual start of the gen sets and push current to the system with a fault remaining. Lives and machinery could be lost.

We dispatched several of our staff to visually inspect for faults (we did not want to turn something on and have it fry everyone's gear). They found none, verified that it was likely a lightning strike, and manually started the generators to restore power. Unfortunately, the UPS system is only designed to carry that load for 10 minutes, which was not enough time for us to safely verify and do a manual start.
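
The timing described above can be sketched with a little arithmetic. This is only an illustration: the 10-minute UPS figure comes from the RFO, while the function name and the restore times in the loop are hypothetical values, not measured ones. A restore time of about 15 minutes would be consistent with the roughly 5-minute outage reported earlier in this thread, but that is an inference, not something GNAX has stated.

```python
def outage_minutes(ups_runtime_min: float, restore_time_min: float) -> float:
    """Minutes without power: the part of the restore time the UPS cannot cover."""
    return max(0.0, restore_time_min - ups_runtime_min)


UPS_RUNTIME_MIN = 10.0  # design figure quoted in the RFO

if __name__ == "__main__":
    # Hypothetical times (in minutes) to inspect for faults and
    # manually start the generators; these are assumptions.
    for restore_time in (8.0, 10.0, 15.0):
        lost = outage_minutes(UPS_RUNTIME_MIN, restore_time)
        print(f"restore in {restore_time:.0f} min -> {lost:.0f} min without power")
```

If the inspection and manual start fit inside the UPS window, nobody loses power; every minute beyond it is a minute of outage.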

This is apparently a rare event: a direct utility strike that close, which does not get dissipated before it hits us. The farther away from your site the strike occurs, the more other load and grounds there are to dissipate it before it gets to you.

The good news is we did not burn up any equipment.

Some of you did not lose power because you were connected to the other, lightly loaded incoming feed. Since it is only 18% loaded at this point, it was not enough of a load to overwhelm the breaker.

Some of you lost network connectivity because downstream feeder switches that your computers are connected to are only single power supply units.

We are in the process of examining a facility-wide network upgrade that will move to a newer chassis-based solution throughout the facility. We started looking at this as a way to offer new service capabilities that many of you have been asking for. It is a costly upgrade that will bring redundancy, but it also brings some pitfalls, since you have more connections into a single chassis. We are still looking at this and will keep you up to date on the direction we decide to take.

They have told me that under normal operating conditions there is really nothing we could have done and we should simply be glad we had good equipment installed that kept our computers from being fried.

I am thankful that I am not looking at a lot of damaged equipment that could not simply be turned back on - that would be a disaster I do not want to deal with. At this point it seems like the new switchgear with over current protection was a good investment.

#3
05-20-2008
Matt

From the techs at the DC:

Vern is running fsck on the VZ directory. Ratbite is up, but they cannot get it to reach the outside network; they are checking on it now.

#4
05-20-2008
omarfilip

Any news why vern is taking so long to come online?

#5
05-20-2008
Matt

Fsck takes a long time on VPS because VPS servers are usually:

(very large sites + an OS + cPanel) * "X" servers

which takes up a lot of disk space that needs to be checked. Basically it's like checking 15 to 20 or so dedicated servers, and the machine cannot come back online until fsck checks them all.
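
As a rough back-of-the-envelope sketch of the formula above: every size and the check rate below are made-up numbers, not Vern's real figures, and the function names are hypothetical. Real fsck time also depends on filesystem size, inode count, and disk speed, so treating it as a simple rate is a simplifying assumption.

```python
def total_data_gb(servers: int, sites_gb: float, os_gb: float, cpanel_gb: float) -> float:
    """(very large sites + an OS + cPanel) * "X" servers, in gigabytes."""
    return servers * (sites_gb + os_gb + cpanel_gb)


def fsck_hours(data_gb: float, gb_checked_per_hour: float) -> float:
    """Estimated check time, assuming a constant (hypothetical) check rate."""
    return data_gb / gb_checked_per_hour


if __name__ == "__main__":
    # Hypothetical per-VPS sizes and check rate, for illustration only.
    one = total_data_gb(servers=1, sites_gb=10.0, os_gb=2.0, cpanel_gb=1.0)
    many = total_data_gb(servers=20, sites_gb=10.0, os_gb=2.0, cpanel_gb=1.0)
    print(f"1 server:   {one:.0f} GB, about {fsck_hours(one, 65.0):.1f} h")
    print(f"20 servers: {many:.0f} GB, about {fsck_hours(many, 65.0):.1f} h")
```

Whatever the real numbers are, the check on the host scales with the number of VPS accounts on it, and nobody on the machine is back until the whole thing finishes.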

Most people on VPS are on the upgrade path to dedicated and that is why they have such large sites.

It is one of the largest drawbacks to VPS hosting IMHO.

Last edited by Matt; 05-20-2008 at 11:58 PM.

#6
05-21-2008
Websync

How long is long? Plus, how long have they been down? The one day I decide to take a few hours off.

Curious minds want to know if Vern will be back up and running by the start of business tomorrow - PST.

#7
05-21-2008
omarfilip

Down since 7:24 PM Central time

#8
05-21-2008
Websync

Thanks for the reply omarfilip.

#9
05-21-2008
Matt

As you may have noticed, Vern is finally done with fsck. If any of you have problems, please open a ticket. Alex is standing by waiting for your problems. We would have updated you a little sooner about the server being online, but we were busy putting out other fires.

I am sure most of you noticed the machines back online. Sorry that it took so long to respond.

#10
05-21-2008
Matt

Still working on getting parts for Ratbite. Looks like some of them were hosed in the power fault. We haven't forgotten about you. We will try to get it solved before the phones start ringing for you.

All times are GMT -5.

Copyright 2000-2007 GlowHost.com