Large storage systems, holding what is commonly known as Big Data, do not yield to data recovery efforts. Costs aside, the turnaround time of a recovery exceeds what is acceptable for business, even for Small Big Data cases. There are no quick repair options, and the standard approach of making a new copy is both costly and lengthy. Even worse, the complexity of a big storage system often thwarts any kind of recovery.
Big Data Recovery
From the practical business standpoint, Big Data is not recoverable after a data loss. Once the data store goes down, that is the end of it. Surprisingly, the problem is not that Big Data is technically unrecoverable. Irreversible damage does happen, perhaps more often with Big Data than with home users and small systems, but that is not the main issue. The most significant aspect of Big Data recovery is that turnaround times, often measured in months, are too long to be practically useful. Let's take a look at the reasons why.
In-Place Repair Is Impossible
Filesystems used in big storage systems are not repairable in place. With storage that big, you cannot repair a damaged filesystem so that it works as before. With NTFS, you are looking at CHKDSK (the built-in repair tool) run times of weeks. ReFS has no CHKDSK, but it does have one advantage: if you worked with a mirror or other redundant storage, or if the integrity check was enabled on a simple space, you can at least determine which files are damaged. Other common filesystems do not provide this feature by regular means (ZFS being a notable exception).
Time Is Actually Of More Concern Than Cost
Regular filesystem drivers are optimised for maximum speed and parallel operations, so filesystem performance is normally limited by bus throughput. Data recovery tools, in contrast, are designed for the greatest possible stability rather than throughput, because it is known in advance that they will have to deal with inconsistent filesystem data.
All this means the data recovery industry lags behind the growth of data volumes and the performance of disk subsystems. While a regular filesystem driver reads data at bus speed, a typical data recovery tool reads at the speed of a single drive, which is about 50-60 MB/s for a rotational hard drive and 200-300 MB/s for an SSD.
At best, copying the recovered data takes five hours per terabyte. Note that you often have to double this time, for example when dealing with a significantly damaged filesystem. In such cases the entire data capacity must be scanned twice: first during analysis, and then again when copying the recovered data.
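The arithmetic behind these figures is simple enough to sketch. The following back-of-the-envelope estimate uses the single-drive speeds quoted above; the `passes` parameter captures the doubling caused by a separate analysis scan (the function name and structure are my own, for illustration only):

```python
def recovery_hours(capacity_tb, read_mb_s, passes=1):
    """Estimate wall-clock hours to read the full capacity `passes` times
    at a sustained single-drive speed of `read_mb_s` MB/s."""
    seconds = capacity_tb * 1_000_000 / read_mb_s * passes
    return seconds / 3600

# One pass over 1 TB at ~60 MB/s (rotational drive): roughly 4.6 hours,
# which matches the "five hours per terabyte" rule of thumb above.
print(round(recovery_hours(1, 60), 1))

# Significantly damaged 100 TB filesystem, scanned twice:
print(round(recovery_hours(100, 60, passes=2)))  # ~926 hours, i.e. over a month
```

The second figure ignores setup time, drive failures mid-recovery, and the copying target's own speed, so real-world turnaround is longer still.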
In my company, we had a case involving a ReFS storage slightly upwards of 100 TB, not too large a storage by cloud standards. The recovery, while reasonably successful, took a couple of months. If you need to recover a 100 TB storage, preparation and setup alone take at least three days, and material costs for equipment start around $3000-$4000 (3 TB Toshiba hard drives at $120 each, plus $740 for a HighPoint RAID controller to connect the disks; NewEgg prices, October 2013).
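A rough sketch of where that equipment figure comes from, using the October 2013 prices quoted above (the function is hypothetical and assumes only the occupied portion of the storage needs a recovery target; the quoted $3000-$4000 range corresponds to roughly 75-80 TB of actual data rather than the full 100 TB capacity):

```python
import math

def equipment_cost(data_tb, drive_tb, drive_price, controller_price):
    """Drives needed to hold the recovered data, plus one controller."""
    drives = math.ceil(data_tb / drive_tb)
    return drives, drives * drive_price + controller_price

# Worst case: the full 100 TB must be copied to 3 TB drives at $120 each.
print(equipment_cost(100, 3, 120, 740))  # (34, 4820)

# ~75 TB of actual data lands inside the quoted $3000-$4000 range.
print(equipment_cost(75, 3, 120, 740))   # (25, 3740)
```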
Since we are talking about a business, we can assume this is not too high a cost, though few businesses are indifferent to it. The material aspect, however, is nothing compared to the time. You can outsource the recovery to a specialised lab and they will take care of the equipment, but there is no way to get the result any faster. Quite often, by the time the recovery completes, the restored data is no longer relevant to you: the moment has passed.
Sometimes Recovery Is Just Not Possible
Even a person who is not tech-savvy understands that Big Data requires uncommon methods of data storage. The capacity of stored data grows steadily, driving the development of new technologies to store and organise it. No more than a year ago, Microsoft came up with the Storage Spaces technology, which allows combining hundreds of disks into pools and creating storage units of up to half a petabyte.
In such a system, data is laid out on the disks in a complicated manner, and in case of failure the probability of recovering it is much lower than, say, for a home user's single drive. The ZFS filesystem from Sun Microsystems, designed to store large amounts of data, has the same recovery problems because of the complexity of its organisation.
With Big Data, recovery is impractical both because of unacceptable recovery times and because of the complexity of the storage techniques. It is therefore much better to invest money and effort in a backup system, and in the salary of a system administrator who ensures the storage system works properly, than to try to recover data after a failure.