
3 Epic Fails of Backup and Recovery

Auditors — especially seasoned IT auditors — are familiar with the scene of sitting with a nervous backup and recovery analyst once a year for a walkthrough. If you haven’t had the pleasure of experiencing it, here’s the drill:

It begins with rapport-building (read: uncomfortable) small talk mixed with a few questions about the overview of the company’s backup strategy. Next, the auditor observes the analyst painstakingly capture a sampling of automated backup jobs using the Snipping Tool or the Print Screen key. More niceties are exchanged and the meeting is over. Kid stuff, right?

Well, that is until you see stories about hardware failure with data loss. Turns out, we are really good at capturing periodic backup files… but not so great at recovering and restoring the data. It's pretty embarrassing.

Here are three noteworthy backup and recovery snafus worthy of a prize:

Lock it up, Lockheed!

Most recently, a database run by Lockheed Martin went kaput in May and the company spent two weeks trying to recover data. On June 6, the company gave up and notified the U.S. Air Force (whose data was on the database). According to CNET, “More than 100,000 internal investigations records dating back to 2004 have been lost.”

Salesfail.com

Also in May, Salesforce.com had a nightmare of a day after experiencing a “file integrity issue […] resolved by restoring the database from an earlier backup,” according to an eWeek article. The company had to break it to the public that about 5 hours of data had been permanently lost. Yikes! Maybe we need to run an incremental backup job more than once a day…

Lost in the Amazon jungle

Or who can forget that time in 2011 when Amazon lost customers’ stored data that they thought was safely tucked away on an Amazon cloud server? Maybe that’s one of the reasons some people are still wary of cloud storage technology.

My question… who did the IT audit of these systems? Or, maybe the better question is: Were these systems in scope? Most likely, no (especially if not in scope for SOX compliance). 

Should they have been? Maybe… and if any of these systems were in scope for SOC 2, they would have been tested. Remember, that’s the only time we care about information availability (or lack thereof).

Plus, I wonder how many of these incidents have gone unreported in the media. My guess is a ton!

Nevertheless, I used to think it was silly to set up a walkthrough for backups. Honestly, determining if a scheduled backup conforms with the policy document didn’t seem like the most fruitful exercise, especially not when you compare it with privileged user access. But maybe it is a big deal after all.
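For the curious, the schedule-versus-policy check can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `find_policy_gaps`, the sample timestamps, and the 24-hour policy interval are all hypothetical and not taken from any real backup tool or from the incidents above.

```python
from datetime import datetime, timedelta


def find_policy_gaps(job_times, max_interval):
    """Return (previous, current) pairs of consecutive completed backups
    that are further apart than the policy allows."""
    ordered = sorted(job_times)
    return [
        (prev, cur)
        for prev, cur in zip(ordered, ordered[1:])
        if cur - prev > max_interval
    ]


# Hypothetical sample of completed backup jobs (the January 3 run is missing).
completed = [
    datetime(2024, 1, 1, 1, 0),
    datetime(2024, 1, 2, 1, 0),
    datetime(2024, 1, 4, 1, 0),
    datetime(2024, 1, 5, 1, 0),
]

# Hypothetical policy: at least one completed backup every 24 hours.
policy = timedelta(hours=24)

gaps = find_policy_gaps(completed, policy)
for prev, cur in gaps:
    print(f"Policy gap: no completed backup between {prev} and {cur}")
```

Note what this does not tell you: a clean result only says the jobs ran on schedule. It says nothing about whether any of those backups can actually be restored, which is where the three stories above went wrong.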

That leaves only one question: Should auditors be the de facto watchdog for information system data loss? Are we even qualified for the task? Share your thoughts.

Image: Someecards