Lessons From The Great Malware Disaster Of 2011

Submitted by Brian on January 27th, 2011 at 12:38 PM

[Note: iPhone app is currently broken; that is the #1 priority in terms of fixes. Hope to have it up by Monday.]

This has nothing to do with Michigan football but the least I can do to help the greater health of the internet is to offer some measure of advice for people who find themselves hacked in the face.

I'm not an expert. Please read the comments for people disagreeing with me, as they may well be better at this than I am. But I just went through this, and if you're in the same boat, here's what happened to me and what I took from it.


Boatmurdered. BURN. ALL BURN.

"Last known good" may not be as good as you think. We have a backup. That backup overwrites itself on a nightly basis. Correction: that backup overwrote itself on a nightly basis. Going forward we wanted to be able to roll back up to a week.

However, we found out that would not have helped us here. Some of our infected files were last modified in early January. A "last known good" configuration from last weekend would have still featured multiple scripts with backdoors that Eastern European hackers could jump in through.

We're still going to change our backup system so that it keeps more snapshots (an injection attack would be more susceptible to a DB rollback, I think), and we are going to keep a billion and two backups of the actual code so that if, God forbid, something like this happens again, we have a reference point to pull forward the stuff we customized and don't want to lose.


BURN. ALL BURN. I'm not pulling anything forward except select bits and pieces I can hand-inspect. The rest of it dies in a fire. I thought we were destroyed until my brother asked "how long would it take to recreate it from scratch?" This was the moment in the movie when the camera zooms out and the city becomes transparent. It would take… um… maybe a couple hours.

The defining feature of a CMS is that everything is in the database. So if you're confident the database isn't the issue, you can pick that out, raze the world, download and install all your crap, and not have to worry about finding every last piece of corrupted code. You're going to break a few things when the new versions of your modules don't work exactly as expected, but it's way better than the alternative.

Then change your FTP password over SSH. And then, if you're paranoid (i.e., us now), turn FTP off entirely for a while. We had to use plain FTP, which is not very secure, because for some reason enabling encryption turned directory listing into a cripplingly slow process. A reader related an experience in which a compromised local computer had been giving away FTP passwords, giving hackers direct access to the server. We're not taking any chances despite my incessant scanning.

Burn, all burn exception: we pulled the "files" folder forward despite it being too massive to check because it's all data and those folders are locked down by server permissions so they can't execute anything. Everything else was pored over.

Why we thought it wasn't the database. Well, one, we found plenty of stuff indicating the server had taken a direct hit in the form of scripts that included helpful comments like "webshell by oRb." We brought those shells up and didn't find any database functionality.

Also, injection attacks usually don't affect the entire site; they're more likely to be hostile code submitted by users (something Drupal is good about preventing) that affects only the pages it's submitted on. The malware was being delivered via the CSS and JS files, which are amongst the few bits of the page you're reading that don't come from the DB. While the server corruption could have in turn hit the DB, we didn't see obvious avenues for that, and all of the problems were segregated from said DB.

We're now watching it closely just in case, but the evidence pointed to something other than an SQL injection.

What to search for. This article is fairly comprehensive but I'd also suggest looking for "unescape" or the string "%3C%69%66%72%61%6D%65." If you run that through the unescape function you get "<iframe". What are the chances that's helpful code? Not so good.

Don't waste your time with "StopBadware." This is the site you get funneled to if you click the I'm-so-screwed button on the Google warning page. Their extremely awesome advice is to look for the bad things and remove them. They list scripts, redirects, and iframes as the main ways you transmit the bad things—okay, probably helpful—and then offer this up:

There exist several free and paid website scanning services on the Internet that can help you zero in on specific badware on your site. There are also tools that you can use on your web server and/or on a downloaded copy of the files from your website to search for specific text.

Awesome! Where are they? Which are the best ones?

StopBadware does not list or recommend such services, but the volunteers in our online community will be glad to point you to their favorites.

Fu. The "online community" at "badwarebusters" mostly consists of people screaming about erroneous hits. About four threads pop up per day and they can go days without a response. If you're looking to do something quickly it's useless.

That's annoying. This is the worst advice possible:

Once you have located the code that is causing the badware behavior, removing it is often as simple as deleting the offending code from all files in which it appears. Sometimes, it is easier, if you have a clean backup of your site’s contents, to re-upload all of the site’s files, though be careful about overwriting files that may have changed since your last backup.

They've just glossed over the difference between the offending iframe and the code that generated it. Backdoors are not mentioned. This section needs to be replaced with:


Whoever wrote it should be horsewhipped. The next section is about "preventing future infection" when the previous section has essentially advised a n00b who needs to be informed that scripts and iframes are bad, mmmmkay, that "removing the offending code" "often" solves the problem. False. Burn. All burn. 

If you aren't already, sign up with Google's Webmaster Tools. We first found out from them that the aggregated JS file was an issue, and they periodically updated their findings to let us know we still hadn't killed the problems. Tip: if you're aggregating JS and CSS, you may want to stop temporarily for more precise identification of the end destinations.

The files Google flags are delivery points, not the sources. You have to find the sources, or just burn everything to the ground.

Don't get notifications other than security notifications. This site now runs dozens of Drupal modules, some of which have release changelogs that read, in their entirety, "fixed typo X." After a while you stop checking just to see that some random module has done some stuff you don't care about, and then you don't know when certain modules are out of date. We're still not sure what the attack vector was, but one of the main candidates was known, patched holes in Drupal. I went from weekly updates about everything to daily updates about security only. Drupal shouldn't have other options.

Status. We're not entirely out of the woods yet, but it's looking promising, and we have installed various alarms in the system to blare at us whenever anything unexpected (a file getting updated outside of the areas where that's supposed to happen) goes down. Hopefully if there is another breach we will catch it long before anything starts getting delivered.