Move Completed! Apr 08, 2011 09:24
Another drive started failing in our old server, so we decided to push forward with our move as an emergency. As of now the move is fully completed and all of our DNS entries point to our new location.

Since we had to rush in moving over all our software, there may be pieces of AlliedModders that are busted. Please post here if you have any problems, or e-mail me at dvander-at-alliedmods-dot-net. Note that until your DNS updates, you will be going through a proxy and the site may seem slow.

This is a huge step forward for us that has been in planning for months, and I'm excited to see it come to fruition.

I would like to HUGELY thank Scott Ehlert (DS) for helping with the emergency migration, as well as revamping a bunch of our infrastructure in the process. I'd also like to thank MorgyN for helping diagnose a problem in our network configuration, and LumiStance for helping us get into the colocation game.

And last, I'd like to thank all of our donors, who made this transition possible. Thanks for your support!
.: by BAILOPAN 38 comments

We're Moving! Apr 02, 2011 22:30
Hello, everyone. Two months ago we suffered a pretty catastrophic disk failure, and have since vowed to upgrade our infrastructure. Thanks to a successful donations drive, we're doing exactly that.

A few weeks ago we purchased three new servers. Today, pRED and I moved them into a colocation center, and over the next two weeks DS and I will be migrating all of our sites and services.

Things may be a little bumpy as IPs change and we take our old rented server offline. I'm planning for a few outages during non-peak hours this weekend and the next two weekends. I'll post news again once we're completely moved.

Thanks for your support!
.: by BAILOPAN 21 comments

Bugzilla Back Up Feb 26, 2011 21:40
The bug tracker is now working again! I've taken the opportunity to upgrade to Bugzilla 4.0 as well. If you have any problems, please post here.

If you've forgotten, the URL is https://bugs.alliedmods.net/

In other news, AlliedModders will be moving to a new home in a few weeks. I have ordered three new servers that we will be colocating in northern California. I will post more once we're close to the move.
.: by BAILOPAN 6 comments

500 Victories Feb 11, 2011 02:38
Tentatively, I'm declaring victory over the "HTTP 500 - Internal Server Errors" everyone has come to know and love. I know, I'll be sad to see them go too.

The actual problem, and the steps it took to find it, are in my latest blog post: http://www.bailopan.net/blog/?p=855

I'm going to continue to monitor for website problems over the next few weeks.

Note that major portions of our site, such as the bug tracker, are still down. As mentioned in the last post, we didn't have full backups, so I'm waiting for a professional recovery service to try to restore our data. Hopefully it won't be much longer; they claim to be wrapping up.

Once I know more, I'll post again about the recovery progress and future plans. I really appreciate everyone's patience and generosity. Thanks for your support!
.: by BAILOPAN 28 comments

Downtime Over Jan 22, 2011 08:11
Hello, Everyone.

I'd like to talk about what happened this week, what the current state of recovery is, and what you can do to help.

If you don't want to read, the bottom line is: we need your help! Our webserver has gone through a bit of shock.

The dirty secret is that we have always kept the donation goal significantly less than our actual costs. In the past, I felt like we should get by on what we can, without asking more of people. Perhaps that was the right attitude a few years ago, but now the community has really grown. That's awesome! But it means we have to be more proactive and responsible about our infrastructure.

So, if you want to help, please donate! We need to upgrade our hardware, backup capabilities, and more.

I'll be talking more over the next few weeks as we bring things online and start on longer-term improvements.

What Happened

Early Wednesday morning, all AlliedModders websites became very slow. We'd come to recognize this as an intermittent problem, usually causing site errors, and always characterized by extremely high disk I/O wait times. What we didn't realize is that our primary hard drive had been failing, and on Wednesday it failed completely.

We did not have RAID, so it quickly became a worst-case situation. We had partial backups, but I didn't know what was included, and the backup system wouldn't let me check without a working operating system. So I decided the best course was to keep the server offline and try to copy as much data as I could before the drive completely failed. But the drive quickly degraded so much that I decided it was best not to attempt anything further.

Meanwhile, communication with our provider wasn't good. I now know how to deal with this better in the future, but suffice it to say we wasted a lot of time. I didn't want to replace the drive without first securing physical ownership of the old one, in order to send it to a recovery service. We got that negotiated on Thursday night. Then we had the drive replaced and an identical one added for RAID-1.

Very, very early Friday morning, I reinstalled the operating system and restored our partial backups.

Recovery

The damage report is pretty good. Our partial backups had enough to restore:
  • Forums, avatars, most attachments (ONLINE)
  • SourceMod Sites (ONLINE)
  • Metamod:Source, AMX Mod X sites (ONLINE)
  • AM Wiki (ONLINE)
  • @alliedmods.net e-mails

Our partial backups did not include:
  • Bugzilla
  • @alliedmods.net e-mail service
  • WC3Mods
  • Superhero Mod
  • AMXBans
  • UAIO
  • CSDM/CS:S DM
  • Some forum attachments (possibly from Monday through Wednesday)

What was not affected:
  • Source Code repositories, hgweb (ONLINE)
  • Buildbots


This list isn't comprehensive. Our partial backups don't have anything that could otherwise be easily recovered, so a lot of our infrastructure may simply be broken. Files might be missing, pages might not work, services might be down, etc. I will try to list those in a second post, and cross them off as they come back online.

Why didn't you do X, Y, Z, etc?

I've gotten a lot of suggestions, rants, and complaints from people about various things over the past few days. Why didn't we have RAID? Why didn't we do complete backups? Why don't we switch hosting? Some of it has been really helpful. I especially owe MatthiasVance, asherkin, devicenull and others in #smdevs and #sourcemod for their advice.

It's important to put this site into perspective. It started out of my first college dorm room. It was a computer sitting next to my desktop, made from scrap parts. When it broke, we had our first donations drive to buy a new server. In 2005, we started renting a dedicated server. There was no way I could afford it as a college student, and we worked out a deal with SteamFriends (then, GameConnect) to be sponsored. That ended in 2006.

We've always run things on a tight budget, and our whole motif is kind of, "We're scrappy, but we get things done!" We didn't have any backups at all until 2008. Off-site backup charges by the GB, so I was pretty selective in choosing what to back up. We didn't have a drive fail until 2010.

But it's clear we as a community have grown really big, and that's awesome. We almost always meet the donation goal, which is a spectacular testament to how much people care about the project. It sucks when things like this happen. So immediately, here's what I'm doing:
  • We now use RAID.
  • We will begin backing up things that were missed by the partial backup scheme.
  • The old drive is being sent to a data recovery service. Hopefully we can get more data back.
  • We will start running monitoring software to detect future problems.
  • The site will be bumpy over the next few weeks as little missing pieces are discovered.
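
As an illustration of the kind of check such monitoring software could perform, here is a minimal sketch, assuming a Linux host that exposes /proc/stat. It is not the monitoring setup used on this site. The function names, the sampling interval, and any alert threshold are hypothetical. It estimates what share of CPU time was spent waiting on disk I/O between two samples, which is the symptom described above.

```python
"""Sketch: estimate CPU I/O wait from two /proc/stat samples (Linux only)."""
import time


def parse_cpu_line(stat_text):
    """Return the aggregate 'cpu' counters from /proc/stat text as a list of ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        # The aggregate line is labeled exactly "cpu"; per-core lines are "cpu0", "cpu1", ...
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two counter samples."""
    deltas = [b - a for a, b in zip(before, after)]
    # Field order: user, nice, system, idle, iowait, irq, softirq, steal, ...
    # Only the first eight are summed; guest time is already included in user/nice.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total


def sample_iowait(interval=5.0, path="/proc/stat"):
    """Read the counters twice, `interval` seconds apart (hypothetical helper)."""
    with open(path) as handle:
        before = parse_cpu_line(handle.read())
    time.sleep(interval)
    with open(path) as handle:
        after = parse_cpu_line(handle.read())
    return iowait_percent(before, after)
```

Calling `sample_iowait()` from a scheduled job and alerting when the result stays above some chosen threshold would flag sustained I/O stalls. On its own this cannot tell a failing drive apart from a merely busy one, so it would complement disk health checks and not replace them.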


Thanks for your patience and support. I'll answer questions in this thread, or by e-mail if you're more comfortable with that.
.: by BAILOPAN 198 comments

© Copyright 2004-2024 SourceMod Dev Team