With last week’s Moody’s credit rating downgrades, a continuing crisis in the Eurozone, and stuttering consumer confidence, there really couldn’t be a worse time for a big bank’s IT systems to crash. But that is exactly what happened at the RBS Group. Systems affecting millions of customers at RBS, Ulster Bank and NatWest crashed, delaying transactions for days and prompting an outburst of public and media criticism.
The RBS Group IT glitch is not alone, of course, and IT-related operational problems continue to plague household names. Banking giant HSBC faced a similar problem in May 2012; RIM had to offer a $100 application voucher as a sweetener to customers who had endured a three-day network outage; and the Dixons Group websites crashed for 48 hours over the 2010 holiday period.
So what happened? The RBS error apparently resulted from an upgrade to the mainframe batch scheduling system (Phil Virgo, Computer Weekly, June 24th). Vanson Bourne’s April 2012 CIO survey suggests that this is in fact an industry-wide problem that many organisations could face when tackling application complexity:
- Nearly one fifth (18%) say their app portfolio contains legacy apps that no one knows how to update and that they are afraid to touch
- Nearly one fifth (18%) say their app portfolio contains redundant apps that are eating up unnecessary MIPS – but that they don’t have a means to identify and/or retire them
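Why does one failed scheduler upgrade ripple into days of delayed transactions? Overnight batch runs are chains of dependent jobs, and when the scheduler loses track of what has completed, everything downstream is at risk. As a rough illustration only (the job names are hypothetical, not RBS’s actual jobs), here is a toy dependency-aware batch runner that holds dependent jobs when a prerequisite fails, rather than running them against stale data:

```python
# Toy illustration of dependency-aware batch scheduling.
# Job names are hypothetical. If an upstream job fails, its dependents
# are skipped (held) instead of being run against incomplete data.

from collections import OrderedDict

# job name -> (list of prerequisite jobs, callable returning True on success)
JOBS = OrderedDict([
    ("ingest_transactions", ([], lambda: True)),
    ("update_balances",     (["ingest_transactions"], lambda: True)),
    ("generate_statements", (["update_balances"], lambda: True)),
])

def run_batch(jobs):
    """Run jobs in order; skip any job whose prerequisites did not succeed."""
    status = {}
    for name, (deps, task) in jobs.items():
        if any(status.get(d) != "ok" for d in deps):
            status[name] = "skipped"  # held: a prerequisite did not complete
            continue
        status[name] = "ok" if task() else "failed"
    return status
```

The point of the sketch is the failure mode: one failed job in the middle of the chain leaves every downstream job held until operators intervene, which is exactly how a single scheduler problem turns into a multi-day customer-facing backlog.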
Without knowing any further specifics, it is difficult to pinpoint the root cause, or to offer a definitive solution. We would always recommend a review of the following three key areas to help isolate and resolve such major glitches.
First – know your systems
Unexpected results happen all too often because the impact of a change has not been fully anticipated. This suggests not only a lack of understanding on the part of those responsible, but also a lack of supporting technology to establish the right level of insight into the impacts of change. Educated and well-trained staff are a step forward, but automated, comprehensive analysis technology covering the whole application portfolio helps determine the impacts of major system changes.
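At its core, this kind of impact analysis is a graph problem: given a map of which applications depend on which, a change to one component flags everything that directly or transitively depends on it. A minimal sketch, with an entirely hypothetical dependency map (real analysis tools would extract this graph from source code and job definitions automatically):

```python
# Minimal sketch of change-impact analysis over a dependency graph.
# Module names are hypothetical; in practice the graph would be mined
# from the application portfolio, not written by hand.

from collections import deque

# module -> list of modules it depends on
DEPENDS_ON = {
    "online_banking": ["payments", "accounts"],
    "statements":     ["accounts"],
    "payments":       ["batch_scheduler"],
    "accounts":       ["batch_scheduler"],
}

def impacted_by(changed, depends_on):
    """Return every module that directly or transitively depends on `changed`."""
    # Invert the graph: dependency -> set of modules that use it.
    callers = {}
    for mod, deps in depends_on.items():
        for d in deps:
            callers.setdefault(d, set()).add(mod)
    # Breadth-first walk over the inverted graph.
    seen, queue = set(), deque([changed])
    while queue:
        for caller in callers.get(queue.popleft(), ()):
            if caller not in seen:
                seen.add(caller)
                queue.append(caller)
    return seen
```

In this toy map, `impacted_by("batch_scheduler", DEPENDS_ON)` returns every other module, which mirrors the RBS situation: a change to one low-level scheduling component reaches every customer-facing service built on top of it.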
Second – empower your IT teams
While system changes may be made in isolation, they usually have an effect on other systems, and a variety of users. Enabling the developers to work together so they are able to check across application boundaries and test impacts across other teams allows them to execute changes with more confidence and quality.
Third – improve the quality and delivery process
Testing mainframe changes, especially in an outsourced environment, puts testing teams under stringent timeframe and resource constraints. Additional checks or quality procedures outside those preset constraints are difficult to allow for, because there is a direct mainframe cost implication. Taking some of the testing processes away from the mainframe environment provides greater flexibility, allowing additional quality checks to be implemented at low cost.
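One practical form of this is exercising batch business rules as ordinary unit tests on commodity hardware, where extra test runs cost no mainframe MIPS. A minimal sketch, with a hypothetical interest-accrual rule standing in for real batch logic:

```python
# Sketch: batch business logic exercised off-host as a plain unit test,
# so extra quality checks consume no mainframe resources.
# The interest rule and figures here are hypothetical.

from decimal import Decimal

def daily_interest(balance, annual_rate_pct, days_in_year=365):
    """One day's simple interest on a balance, rounded to the penny."""
    interest = (Decimal(balance) * Decimal(annual_rate_pct)
                / Decimal(100) / days_in_year)
    return interest.quantize(Decimal("0.01"))

def test_daily_interest():
    # A balance of 3650.00 at 2% p.a. accrues 0.20 per day.
    assert daily_interest("3650.00", "2.0") == Decimal("0.20")
    # Edge case: zero balance accrues nothing.
    assert daily_interest("0.00", "2.0") == Decimal("0.00")
```

Tests like these can run on every code change at negligible cost, leaving the constrained mainframe test windows for the integration and volume testing that genuinely needs the host environment.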
By focusing on the operational requirements of an already-streamlined mainframe IT team, you can identify potential bottlenecks and process inefficiencies that put the delivery of robust services at risk, and hopefully avoid the kind of high-profile disaster the organisation can ill afford.