In the effort to optimise available resources, especially the power of multi-core CPUs, virtualisation has all but eliminated the boundaries associated with physical machines. Within the box, multiple virtual machines can be established to take advantage of available CPU and memory resources.
One obvious consequence of eliminating physical machines is the elimination of their physical connections. Those connections can be virtualised, but ultimately there are fewer pipes to go around. A server configured with enough cores and system memory to host many guest OSs and their applications can therefore create bottlenecks in other parts of the datapath.
“IT departments are consolidating servers and implementing virtualisation to simplify their environments and lower the total cost of ownership. Some organisations are achieving consolidation ratios of as high as 100:1, and 20:1 consolidation is commonplace, using VMware and multi-core server technology.
“But success in reducing the number of servers, and the concomitant reduction in management and environmental costs, has a price: Each reduction in physical server presence also subtracts a number of NIC ports. A proliferation of virtual machines on a single server competes for that server’s resources, often resulting in I/O bottlenecks (Source: Dell Whitepaper, 10GbE, Servers, Storage and Virtualisation – Interoperability Review and Highlights, June 2009).”
Taking dozens of underutilised physical servers and consolidating them onto a single physical server doesn’t just increase the CPU utilisation; it increases storage I/O requirements as well. The mixture of unrelated workloads on a single machine makes the I/O considerably more random and increases storage performance requirements.
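The effect of mixing workloads can be illustrated with a small simulation. This is only a sketch under stated assumptions: the eight VMs, their block ranges and the random interleaving are hypothetical, and a real hypervisor schedules I/O in more structured ways. Each VM reads its own region strictly sequentially, yet the merged stream that reaches the storage is mostly non-sequential.

```python
import random

def sequential_fraction(requests):
    """Fraction of requests whose block address directly follows the previous one."""
    if len(requests) < 2:
        return 1.0
    hits = sum(1 for prev, cur in zip(requests, requests[1:]) if cur == prev + 1)
    return hits / (len(requests) - 1)

def blend(streams, seed=0):
    """Interleave per-VM request streams at random, keeping each VM's own order."""
    rng = random.Random(seed)
    cursors = [0] * len(streams)
    remaining = [i for i, s in enumerate(streams) if s]
    merged = []
    while remaining:
        i = rng.choice(remaining)          # pick which VM issues the next request
        merged.append(streams[i][cursors[i]])
        cursors[i] += 1
        if cursors[i] == len(streams[i]):  # this VM has finished its stream
            remaining.remove(i)
    return merged

# Hypothetical workload: 8 VMs, each reading 1,000 consecutive blocks
# from its own region of the shared storage.
vm_streams = [list(range(vm * 1_000_000, vm * 1_000_000 + 1000)) for vm in range(8)]

single = sequential_fraction(vm_streams[0])
blended = sequential_fraction(blend(vm_streams))
print(f"one VM alone:      {single:.2f} of requests are sequential")
print(f"eight VMs blended: {blended:.2f} of requests are sequential")
```

In this model a request counts as sequential only when the same VM happens to issue two requests in a row, so the sequential share falls as more VMs share the server.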
HDDs Falling Behind
In traditional data storage models, the hard drives can often present a performance bottleneck. RAID controller DRAM cache can help, but generally, overall performance remains limited by the spinning media. One solution has been to over-provision hard drives: buy more than a strict capacity calculation requires, then set up the drives so that only the outermost tracks are used. The platter spins at a constant rate, so the outer tracks pass under the head faster, and confining data to a narrow band keeps head movement short. Hence, those are the tracks where access to the data is fastest. This is known as “short stroking” the drives. The problem is that this solution can be expensive and is often ineffective.
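The cost of short stroking follows from simple arithmetic. The sketch below uses hypothetical figures (a 10TB requirement, 600GB drives, and only the outer 25 percent of each drive in use) and ignores RAID overhead and spares.

```python
import math

def drives_needed(required_gb, drive_gb, usable_fraction=1.0):
    """Drives required when only `usable_fraction` of each drive holds data."""
    return math.ceil(required_gb / (drive_gb * usable_fraction))

# Hypothetical figures, for illustration only.
full_stroke = drives_needed(10_000, 600)                         # whole drive used
short_stroke = drives_needed(10_000, 600, usable_fraction=0.25)  # outer quarter only
print(full_stroke, short_stroke)
```

Under these assumptions the short-stroked configuration needs roughly four times as many drives, along with the enclosures, power and cooling that go with them.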
Scaling the Data Path Up and Out
Locally, 10GbE addresses many of those pressures on the front end, and SAS technology, both 6Gb/s and the soon-to-be-released 12Gb/s version, helps deliver on the back end, but there is a lot more that can be done to keep data flowing at a balanced rate. At present, host-based RAID suppliers such as LSI are seeing a rapid migration from the older 3Gb/s SAS products of the mid-2000s to 6Gb/s SAS. The timing of this transition seems to coincide with the uptake of server virtualisation and the migration to 10GbE. Products implementing the 12Gb/s SAS interface are not far behind.
Switching to SAS Switching
Based on today’s 6Gb/s SAS technology, LSI and others have introduced SAS switching for storage area networks and scale-out capacity expansion. A SAS switch can take advantage of end-to-end 6Gb/s SAS performance and extend it by using wide SAS ports, which combine several SAS channels into a single link for higher aggregate performance between a server and its external JBOD storage racks.
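To put wide ports in perspective, the aggregate bandwidth can be estimated with a few lines of arithmetic. This sketch assumes a four-lane wide port, which is a common width but not one the vendors above are quoted on, and it relies on the 8b/10b line encoding used by 3Gb/s, 6Gb/s and 12Gb/s SAS, under which each data byte occupies ten bits on the wire. Protocol overhead is ignored, so these are upper bounds per direction.

```python
def sas_lane_mb_per_s(line_rate_gbps):
    """Peak payload rate of one SAS lane in MB/s, assuming 8b/10b encoding."""
    return line_rate_gbps * 1000 / 10   # 10 line bits carry 1 data byte

def wide_port_mb_per_s(line_rate_gbps, lanes):
    """Peak payload rate of a wide port that aggregates `lanes` SAS lanes."""
    return sas_lane_mb_per_s(line_rate_gbps) * lanes

# 10GbE carries 10 Gb/s of data, i.e. 1250 MB/s before Ethernet framing overhead.
ETHERNET_10G_MB_PER_S = 10 * 1000 / 8

for rate in (3, 6, 12):
    x4 = wide_port_mb_per_s(rate, lanes=4)   # lanes=4 is an assumed port width
    print(f"{rate}Gb/s SAS x4: {x4:.0f} MB/s "
          f"({x4 / ETHERNET_10G_MB_PER_S:.2f}x a 10GbE link)")
```

On these assumptions a single 6Gb/s x4 wide port offers about twice the raw bandwidth of one 10GbE link, which is the kind of headroom the back end needs when several front-end ports are busy.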
Solid State, but Smarter
This is where solid state storage comes in, although it is still very expensive compared with traditional HDDs. We’ve all read the studies or seen the tech news feeds that tell us how heavily we use a very small portion of the data we generate: the notion of “hot data.”
In a recent podcast, Simon Johnson, data recovery practice lead at Glasshouse Technologies, stated, “We’ve observed in a number of profiles, that 90 percent of all stored data is not touched during a three-month period, and beyond that, the industry figures that 70 percent to 80 percent of data in organisations is inactive or never accessed.”
One solution that has emerged recently involves building entire arrays out of solid state storage. However, if only a small portion of the data generated is actually used, then it becomes hard to justify the cost, just in case you might need fast access to some small percentage of the data stored there. SSDs are still significantly more expensive per GB than HDDs, though this is improving as volumes grow.
So we’re left with this situation: small amounts of data are hot. Over-provisioning HDDs is expensive and not terribly effective. SSD arrays are fast but expensive, and we are still only accessing a fraction of the data.
The Middle Way
As the Buddha has instructed, let us take the middle path. A number of storage vendors offer intelligent, dynamic software optimisation tools that turn a small number of SSDs into a dedicated cache pool for reading and writing hot data. This middle path can provide the performance boost needed to keep up with the order-of-magnitude increase in front-end network speeds, from 1GbE to 10GbE. In some cases, performance improvements of up to 13X have been demonstrated, mapping nicely to the performance gains suggested for 10GbE.
This intelligent, dynamic caching software can be paired up with solid state storage behind a disk interface, so-called SSDs, or with solid state storage that plugs directly into a PCIe slot. Some software runs on a specifically designed RAID controller to cache hot data in a direct attach storage (DAS) model, while other packages will run at a server level and dynamically cache hot data for any volumes and any virtual machines hosted.
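The principle behind hot-data caching can be sketched in a few lines. The example below is not any vendor's algorithm: it is a plain least-recently-used (LRU) read cache, and the class name, cache size and workload skew (90 percent of reads going to 10 percent of the blocks) are all hypothetical. Commercial products use their own heuristics and also handle writes.

```python
import random
from collections import OrderedDict

class HotDataCache:
    """Minimal LRU read cache standing in for an SSD cache pool in front of HDDs."""

    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.blocks = OrderedDict()   # insertion order tracks recency of use
        self.hits = 0
        self.misses = 0

    def read(self, block):
        """Return which tier served the read: 'ssd' on a hit, 'hdd' on a miss."""
        if block in self.blocks:
            self.blocks.move_to_end(block)    # mark as most recently used
            self.hits += 1
            return "ssd"
        self.misses += 1
        self.blocks[block] = True             # promote the block into the cache
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)   # evict the least recently used block
        return "hdd"

    @property
    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

def skewed_workload(n_requests, total_blocks, hot_fraction, hot_share, seed=0):
    """Yield block reads where `hot_share` of requests hit the hot blocks."""
    rng = random.Random(seed)
    hot_blocks = max(1, int(total_blocks * hot_fraction))
    for _ in range(n_requests):
        if rng.random() < hot_share:
            yield rng.randrange(hot_blocks)                 # hot region
        else:
            yield rng.randrange(hot_blocks, total_blocks)   # cold region

# Hypothetical sizing: the cache holds 15% of a 10,000-block volume.
cache = HotDataCache(capacity_blocks=1_500)
for block in skewed_workload(100_000, 10_000, hot_fraction=0.10, hot_share=0.90):
    cache.read(block)
print(f"share of reads served from the SSD tier: {cache.hit_rate:.2f}")
```

With a workload this skewed, a cache far smaller than the volume ends up serving most reads, which is why a handful of SSDs in front of HDDs can stand in for an all-solid-state array.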
To conclude, do you not agree that the current and anticipated improvements to the SAS interface, along with the options currently available in solid state storage and acceleration software, are excellent complements to the performance available from 10GbE pipes on the front end?