US airline Delta has revealed that the total cost of its recent five-hour IT outage (which saw over 2,000 flights either cancelled or severely delayed over a three-day period in August) was approximately US$150 million. This figure shows how much of an impact even a small amount of IT downtime can have on an organisation and should serve as a stark warning to others, according to Peter Groucutt, managing director of disaster recovery provider Databarracks.
Groucutt states that it is imperative businesses take action to ensure that effective disaster recovery and business continuity plans are in place to avoid a similar outcome:
“When it comes to air travel, arguably the biggest impact of downtime is the knock-on effect it has. The cancellation of a single flight affects availability of aircraft and crew, and also causes a ripple effect amongst other scheduled departures. Issues can take days to resolve as we saw with Delta.
“While Delta haven’t disclosed the breakdown of the $150 million cost, there are a number of items this would have included. Tangible costs are likely to include lost revenue, refunds paid to customers, the cost of extra staff to resolve the crisis and also fines and penalties from regulators. On top of this there are those intangible costs such as reputational damage and defected customers. When you consider that the outage was for only five hours, it shows the impact IT downtime can have.”
Groucutt continued: “Ultimately, this outage demonstrates how dependent we are on our IT systems. An effective system can streamline processes, make significant cost savings and dramatically improve productiveness across a business, but this dependency is a double-edged sword. In the case of any downtime across our IT systems, the cost to the business is much greater.
“Last week we saw British Airways (BA) hit by an IT outage across its check-in systems, and staff were forced to resort to manual processes for checking in passengers, such as hand-writing boarding passes. Arguably BA were lucky they were able to do this, as our reliance on IT now means that in many cases during an outage there are far fewer manual jobs that employees can do to remain productive.
“To address this it’s critical that regular testing is carried out across your disaster recovery process in order to identify bottlenecks which can slow down recovery. During testing you should throw very specific scenarios at the plan and see how you would cope. This helps to identify any gaps you may have in your plan.
“Realistic Recovery Time Objectives (RTOs) should be written into your DR plan and, if disaster does strike, these should be strictly adhered to in order to avoid costs spiralling further out of control,” Groucutt concluded.