24 Hours of Server Hell

It started around 11:00PM last night. I installed the .NET Framework 3.5 SP1 on one of our servers (a web server) and rebooted. I was then unable to connect to that server. After trying to connect for many minutes, I remoted into another one of our servers (a db server). From there I was able to access the web server just fine. After trying many things, I was at a loss. The web server was no longer accessible via www, ftp, or rdp from outside the local network.

This is about when I called in the support team at our hosting provider. I opened a few tickets (several were already active from the alerts the down server was generating), explaining the situation. After going back and forth with techs and trying things for a few hours, I finally had to sleep at around 3:30AM.

At this point the db server was just fine.

This morning I woke to find that neither of the servers was accessible. I spoke briefly with one of the support crew from the hosting provider, and with a coworker of mine, asking them to work together while I went to an ultrasound with my wife (it’s a boy!).

When I returned, I spoke to both our hosting provider and my coworker and was pretty disappointed. Basically they were at a loss. On top of that, from reading some of the support tickets, I saw that, somehow, my request for help after installing the .NET framework had been translated into “ignore those alerts the customer is working on that server” – and was ignored all night.

The tech at our hosting provider indicated that my options were to either “re-kick” the server, or restore a system state backup from the 13th for the web server while they looked into the db server. We agreed to have the system state backup restored.

A few minutes later, we got some very mixed news:

  • The good: the servers were both back up
  • The good: it was a firewall issue – rebooting our firewall solved the issue
  • The bad: it had taken roughly twelve hours to figure this out
  • The ugly: the system state backup had already been restored!

Things only got better from there (and by better I mean shoot me). At this point I started to investigate the state of the .NET 3.5 framework install after the system backup was restored. What I found was not good. I was unable to install the release at all due to the unknown state of the framework on the system. As I tried a few solutions I found online, the servers again became inaccessible.

Really? … Really??

At this point I contacted our hosting provider again and suggested they might want to keep looking into that whole firewall issue. They rebooted the firewall again to make the systems accessible, then spent the next few hours assessing, and eventually replacing, the firewall.

Things are fine connectivity-wise now, but the .NET framework installs are a mess. I used a .NET framework cleanup tool and was successful in reinstalling version 2.0 SP1 of the framework, but neither 3.0 nor 3.5 will install.

All because a firewall was going on the fritz.

I can’t wait to continue trying to recover from this tomorrow. Good thing these servers aren’t used for much at this point.
