At last the dust is settling on RBS’s recent computing problems: a fiasco which has seen people without access to their own money, and at least one person spending the weekend ‘at Her Majesty’s pleasure’ because his payment to the courts could not be verified. What can be learned by others whose business depends on the continued effective operation of their IT systems?
First, let’s look at the problem. It is reported to have been triggered by attempts to upgrade application software. Nothing unusual about that, as application software is being upgraded all the time, and frequently, despite extensive testing, it fails to work first time round in the live environment.
Problems occur in such cases when the systems cannot be returned swiftly and effectively to a safe operating position afterwards. And if this really was part of the failure at RBS, it could have led to the backlog in processing credits to customers’ accounts: a fairly straightforward but high-volume process with no room to catch up if you miss one, two or three nights’ processing.
In computing, the golden rule before considering any changes is to secure your current position (system-wise) and make sure you can easily back out your changes, return to the original position and carry on processing as before if anything goes wrong. These are the fundamental principles of change management, a process on which companies that depend on IT spend millions to guard against situations like this.
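As a rough illustration of that golden rule, here is a minimal Python sketch. It is not a description of how RBS or any bank manages change: the function `apply_with_backout`, the dictionary standing in for a "system", and the example upgrade and check are all hypothetical, chosen only to show the order of events: secure the current position, apply the change, verify it, and back out if anything goes wrong.

```python
import copy


class ChangeFailed(Exception):
    """Raised when a change was applied but did not pass verification."""


def apply_with_backout(system, change, verify):
    """Apply `change` to `system` (a dict); restore the original on failure.

    Returns True if the change was kept, False if it was backed out.
    """
    # Secure the current position before touching anything.
    snapshot = copy.deepcopy(system)
    try:
        change(system)
        if not verify(system):
            raise ChangeFailed("post-change verification failed")
    except Exception:
        # Back out: return to the original position so that
        # processing can carry on as before.
        system.clear()
        system.update(snapshot)
        return False
    return True


# Hypothetical example: an upgrade that accidentally wipes account data.
system = {"version": 1, "balances": {"alice": 100}}


def bad_upgrade(s):
    s["version"] = 2
    s["balances"].clear()


def balances_intact(s):
    return bool(s["balances"])


ok = apply_with_backout(system, bad_upgrade, balances_intact)
```

In this toy case the verification fails, so the change is backed out and `system` is left exactly as it was before the upgrade. Real change management involves far more than an in-memory copy, but the sequence of steps is the same.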
The worst possible IT disaster is for something to wipe out your data centre. But even in these cases companies work to a 48-hour disaster recovery strategy to have their key systems up and running again.
So why would a fairly predictable failure in an upgrade to a bank’s application software apparently have resulted in no easy, safe and secure point to return to, and an outage longer than they would expect to endure even if the whole data centre had been wiped out?
The information in the media is scant at best. Whatever the underlying problem, it appears that those involved in resolving it are keeping their cards close to their chests. In such situations, I think it is generally safe to assume that, away from the public eye, some serious questions are being asked about how internal change management processes and, in particular, backout procedures can be improved.
Another consequence of the spotlight falling on the computer systems of one of the UK’s major banks is that it has created a perception among the general public that many of the components of these systems are ‘old’.
It’s easy to associate ‘old’ (the mainframe workhorse that puts money into your account) with ‘bad’, and to see all those fancy new Web, handheld and tablet devices that allow you to manage your account and spend your money as ‘good’. And there has been some tendency to make this assumption in relation to the problems at RBS.
However, the truth is that all our major banks, and most other big companies and organisations, have ‘older’ mainframe technology at the heart of their computer systems, working hand in hand with the ‘new’.
In reality, while the mainframe platform is old in the sense that it has a long history, today’s mainframes use the latest hardware developments and technology, are vastly more powerful than those from the early days, and are considered some of the most durable and cost-effective computing platforms available. So let’s not lay the blame on the platform.