Few organisations today would deny that company data, and the reliability of the storage technology that holds it, are mission-critical. A company’s ability to operate and offer a seamless service depends on the inherent reliability of the equipment and systems in place.
Equipment that works erratically or IT networks that fail repeatedly become a liability. For every hour a computer remains out of action due to data loss, there is a huge cost implication. A recent independent survey estimated that the loss of 20 Mbytes of sales-related data costs an organisation $17,000, while the loss of 20 Mbytes of financial data costs around $19,000.
This article examines some of the issues facing businesses, and what organisations and government departments should consider when focusing on the reliability of their mid-range storage solutions.
It is well recognised that today’s open-systems environments create unique challenges for storage users. Round-the-clock uptime requirements (24 hours a day, 365 days a year), coupled with the sheer volume of data throughput in today’s enterprise, put a huge focus on choosing the right systems to support the business. Furthermore, because capacity growth is unpredictable, there will always be a demand for reliable, cost-efficient and scalable mid-range storage solutions.
So, what are the top considerations when selecting mid-range storage for the enterprise?
When it comes to reliability, storage systems typically provide the required set of high-availability features, such as automated I/O path failover, redundant components, RAID protection, global hot spares and mirrored data cache with battery back-up.
Differentiation among these systems comes with additional aspects of the storage system design that can significantly improve data availability, integrity and protection. Some vendors have gone above and beyond the basic high availability features with technologies such as proactive monitoring, background repair, advanced protection and extensive diagnostic features to deliver 99.999% availability for uninterrupted access to data.
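To put the 99.999% figure in perspective, the short calculation below converts an availability target into a yearly downtime budget. It is only an arithmetic sketch: the function name is illustrative, and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a non-leap year


def allowed_downtime_minutes(availability_percent):
    """Return the yearly downtime budget implied by an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for target in (99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% availability -> {minutes:.2f} minutes of downtime per year")
```

On these assumptions, "five nines" leaves a little over five minutes of downtime per year, compared with almost nine hours at 99.9%.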
Having a disaster recovery plan is one thing; avoiding disaster in the first place should be the primary goal. RAID protects data on the disk drive in the event of failure, but why wait for failure? Some systems can now monitor the health of a disk drive and indicate when corrective action needs to be taken, proactively protecting the data. Correcting unrecoverable read errors can be a seamless process that goes unnoticed by the application and the administrator.
A drive left unchecked can become degraded over time after too many recoveries. In most cases a drive will show warning signs before failing; however, not all storage systems are capable of looking for them. Proactive drive health monitoring examines every completed drive I/O and tracks the rate of errors reported by the drives, as well as drive performance and the degradation often associated with unreported internal drive issues.
Using predictive failure analysis technology to indicate that a drive is showing signs of impending failure, systems can now issue a critical alert message and take corrective action deemed safe and necessary to protect the data.
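The idea of tracking error rates and performance over recent I/Os can be sketched as follows. This is a minimal illustration, not any vendor's implementation: the class name, the window size and both thresholds are assumptions chosen for the example.

```python
from collections import deque


class DriveHealthMonitor:
    """Tracks recent I/O outcomes for one drive (hypothetical helper, not a vendor API)."""

    def __init__(self, window=1000, max_error_rate=0.01, max_mean_latency_ms=50.0):
        # Only the most recent `window` I/Os are kept, so old history ages out.
        self.errors = deque(maxlen=window)
        self.latencies_ms = deque(maxlen=window)
        self.max_error_rate = max_error_rate            # assumed threshold
        self.max_mean_latency_ms = max_mean_latency_ms  # assumed threshold

    def record_io(self, had_error, latency_ms):
        """Record one completed I/O: whether the drive reported an error, and its latency."""
        self.errors.append(bool(had_error))
        self.latencies_ms.append(float(latency_ms))

    def error_rate(self):
        return sum(self.errors) / len(self.errors) if self.errors else 0.0

    def mean_latency_ms(self):
        return sum(self.latencies_ms) / len(self.latencies_ms) if self.latencies_ms else 0.0

    def is_suspect(self):
        """True when reported errors or slow responses exceed the assumed thresholds."""
        return (self.error_rate() > self.max_error_rate
                or self.mean_latency_ms() > self.max_mean_latency_ms)


monitor = DriveHealthMonitor(window=100)
for _ in range(100):
    monitor.record_io(had_error=False, latency_ms=5.0)
print(monitor.is_suspect())  # False: no errors, low latency

for _ in range(5):
    monitor.record_io(had_error=True, latency_ms=5.0)
print(monitor.is_suspect())  # True: 5 of the last 100 I/Os reported errors
```

Checking latency as well as reported errors reflects the point that some internal drive problems show up only as degraded performance. A real system would raise an alert or begin copying data to a hot spare once a drive is flagged.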
I/O paths can also be continually monitored. Paths that require abnormal retry activity are marked as degraded and no further I/O is sent down them. The administrator is alerted and is able to repair the defective path, assuring continued availability. At the same time, performance is optimised because the controller does not spend time attempting I/Os on a failing path.
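Path monitoring of this kind can be sketched in a few lines. The example below is illustrative only: the class, the path names and the retry limit are assumptions, and a production controller would use richer statistics than a single per-I/O retry count.

```python
class PathMonitor:
    """Marks an I/O path as degraded once an I/O needs too many retries (illustrative only)."""

    def __init__(self, paths, max_retries_per_io=3):
        self.paths = list(paths)                      # kept in preference order
        self.max_retries_per_io = max_retries_per_io  # assumed limit
        self.degraded = set()
        self.alerts = []                              # messages for the administrator

    def record_io(self, path, retries):
        """Record how many retries one I/O needed on the given path."""
        if retries > self.max_retries_per_io and path not in self.degraded:
            self.degraded.add(path)
            self.alerts.append(f"path {path} degraded: {retries} retries on one I/O")

    def choose_path(self):
        """Return the first path that is not degraded."""
        for path in self.paths:
            if path not in self.degraded:
                return path
        raise RuntimeError("no healthy I/O path available")

    def mark_repaired(self, path):
        """Return a path to service after the administrator has repaired it."""
        self.degraded.discard(path)


monitor = PathMonitor(["controller-A", "controller-B"])
monitor.record_io("controller-A", retries=5)
print(monitor.choose_path())  # controller-B
print(monitor.alerts)
```

Because degraded paths are skipped in `choose_path`, no time is spent retrying on a failing path, which is the performance benefit described above.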
Background detection and repair of drive errors
When a bad data block is discovered during a read operation, most enterprise-class storage systems use redundancy data to recreate the lost data. Encountering one of these uncorrectable errors while reconstructing a failed drive can, however, cause disaster, because the redundancy needed to recreate the block is already in use. User-initiated background media scans proactively check drives for defects and initiate repairs before they cause a problem.
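How redundancy data recreates a lost block can be shown with simple XOR parity. The sketch assumes a single-parity scheme in the style of RAID 5; the article does not specify a RAID level, and the block contents and function name are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks in one stripe, plus the parity block computed from them.
data = [b"ACCT", b"SALE", b"LOGS"]
parity = xor_blocks(data)

# Suppose the second block becomes unreadable: XOR the survivors with the parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])  # True
```

With single parity, exactly one missing block per stripe can be recreated. If a drive has already failed and a second block in the same stripe turns out to be unreadable during the rebuild, there is nothing left to reconstruct it from, which is why finding and repairing media errors in the background matters.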
Advanced protection features
When data is trusted to your storage system, protecting its integrity and security is vital. Today there are a number of key technologies that go above and beyond other offerings in this respect. For example, an additional level of data integrity verification can be provided by using RAID redundancy information to perform a final validation check before returning the requested information to the host application.
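A final validation check of this kind can be sketched as a parity comparison performed before data is handed back. This is an assumption-laden illustration: it assumes single XOR parity per stripe, and `read_verified` and `IntegrityError` are hypothetical names, not part of any product.

```python
class IntegrityError(Exception):
    """Raised when a stripe no longer matches its stored parity."""


def parity_of(blocks):
    """Compute the byte-wise XOR parity of equally sized blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def read_verified(data_blocks, stored_parity, index):
    """Return data_blocks[index] only if the whole stripe still matches its parity."""
    if parity_of(data_blocks) != stored_parity:
        raise IntegrityError("stripe does not match stored parity")
    return data_blocks[index]


stripe = [b"ACCT", b"SALE", b"LOGS"]
stored_parity = parity_of(stripe)
print(read_verified(stripe, stored_parity, 1))  # b'SALE'

corrupted = [b"ACCT", b"SALF", b"LOGS"]  # one byte silently changed
try:
    read_verified(corrupted, stored_parity, 1)
except IntegrityError as exc:
    print("read rejected:", exc)
```

The check catches corruption that the drive itself did not report, at the cost of reading the full stripe on each verified read.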
Data security is equally important; a company’s data is one of its most valuable assets. It is particularly difficult as well as critical to protect data-at-rest, i.e. data stored on a hard drive or other storage device.
All disk drives eventually leave the data centre, whether through repair, service, theft or disposal, and most drives that leave the data centre are operable and readable. Encryption services available in some storage systems today combine local key management and drive-level encryption for comprehensive data security that protects data throughout the lifecycle of the drive, without sacrificing performance or ease of use.
Redundant components ensure data is accessible when components fail or become degraded. The longer it takes to diagnose a failure or problem, the longer overall system performance is degraded and data is put at risk.
Instant notification of failing devices enables them to be quickly and efficiently replaced and repaired. The collection of extensive diagnostic and statistical data provides comprehensive fault isolation and simplifies analysis of unanticipated events. For example, the Capture All Support Data (CASD) command offered by some of the storage systems based on LSI technology provides 15 different diagnostic and log outputs in a single package, ensuring that the support team has information needed to resolve unanticipated issues in a timely manner.
Site disasters range in scale but can have the same end result: data loss. How critical the data is determines its recovery point objective (RPO), the amount of recent data an organisation can afford to lose, and its recovery time objective (RTO), the time within which service must be restored. Multiple replication options are now available, designed to protect against a range of disasters and ensure data is back online as quickly as possible. Local replication features protect against accidentally deleted files and data corruption, while remote mirrors duplicate primary site data to an off-site location.
Uninterrupted access to information and its unwavering protection are critical to a company’s health and success. Mid-range storage often forms the very backbone of an organisation’s ability to transact and back up its important data, and this could be the difference that enables a company to survive and prosper. Data storage has three main elements: availability, integrity and security. This article has discussed some of the features that users should consider in order to optimise each of them.
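As a rough way of relating replication choices to these objectives, the sketch below compares a replication schedule with an RPO and an estimated restore time with an RTO. It assumes periodic replication, where the worst-case data loss is roughly one replication interval; the function name and the figures are purely illustrative.

```python
from datetime import timedelta


def check_recovery_plan(replication_interval, estimated_restore_time, rpo, rto):
    """Compare a replication schedule against RPO and RTO targets.

    Assumes periodic replication, so up to one interval of data may be lost.
    """
    return {
        "meets_rpo": replication_interval <= rpo,
        "meets_rto": estimated_restore_time <= rto,
    }


# Illustrative figures only: hourly remote copies, four hours to restore.
result = check_recovery_plan(
    replication_interval=timedelta(hours=1),
    estimated_restore_time=timedelta(hours=4),
    rpo=timedelta(minutes=15),
    rto=timedelta(hours=8),
)
print(result)  # {'meets_rpo': False, 'meets_rto': True}
```

In this example the restore time is acceptable but hourly copies cannot satisfy a 15-minute RPO, so a more frequent or continuous form of replication would be needed for that data.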