Most predictions about the storage industry seem to be either in the weeds (“will FCoE take off, and in what flavor?”) or driven by the most sensational news possible (“Cisco will buy EMC and Oracle will buy NetApp!”). This list tries to take a balanced view and to take positions on the key drivers transforming our industry. With that in mind, here are the first five of the ten most important trends in storage, with predictions about what will change in 2011:
- Capacity demands will continue to grow exponentially
It is amazing how little this one is mentioned. Think about what happened to the bandwidth market when it was realized that internet usage was doubling every year. OK, that might be a painful reminder for those of us who experienced the telecom bubble first hand. The point is that one of the world’s largest industries was radically transformed. And yet commentators seem to assume that change in the storage industry will *slow down* now that consolidation has occurred, despite 75% year on year growth in unstructured data demand.
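To make the compounding concrete, here is a minimal sketch of what 75% year on year growth does to a capacity requirement. The 100 TB starting point and the five-year horizon are assumptions picked purely for illustration; only the 75% rate comes from the paragraph above.

```python
import math


def capacity_after(years, start_tb=100.0, annual_growth=0.75):
    """Capacity needed after `years` of compounding growth.

    start_tb is a hypothetical starting footprint; annual_growth=0.75
    is the 75% year on year figure quoted in the text.
    """
    return start_tb * (1.0 + annual_growth) ** years


def doubling_time_years(annual_growth=0.75):
    """Years needed for demand to double at a constant annual growth rate."""
    return math.log(2) / math.log(1.0 + annual_growth)


if __name__ == "__main__":
    for year in range(6):
        print(f"year {year}: {capacity_after(year):8.1f} TB")
    print(f"doubling time: {doubling_time_years():.2f} years")
```

At that rate demand doubles in a little over a year and grows more than sixteenfold in five years, which is the same kind of curve that reshaped the bandwidth market.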
- Storage contributes more profits to IT vendors than any other piece of the IT industry
Again, we don’t hear this one talked about a lot. But with analysts saying that storage is 42% of enterprise IT spending and growing, and with vendor margins in storage at least 3x-4x vendor margins in the server business, the math is pretty simple. There is more money being made by storage vendors than by any other segment of enterprise IT. If I learned one thing in economics it is that just as nature abhors a vacuum, capitalist markets abhor extraordinary profits. Many storage buyers are able to read 10-Qs as well and can see that they are paying at least 4x the mark-up for storage that they pay for servers.
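The "pretty simple" math can be sketched as follows. The 42% spending share and the 3x-4x margin multiple are the figures quoted above; treating all non-storage spending as if it earned server-like margins is my simplifying assumption, so read the output as a rough illustration rather than an analyst number.

```python
def storage_profit_share(storage_spend_share=0.42, margin_multiple=3.0):
    """Estimate storage's share of total vendor profit.

    Assumption: every non-storage dollar earns a baseline margin of 1 unit,
    and every storage dollar earns `margin_multiple` times that baseline.
    """
    storage_profit = storage_spend_share * margin_multiple
    other_profit = (1.0 - storage_spend_share) * 1.0
    return storage_profit / (storage_profit + other_profit)


if __name__ == "__main__":
    for multiple in (3.0, 4.0):
        share = storage_profit_share(margin_multiple=multiple)
        print(f"margin multiple {multiple:.0f}x -> {share:.0%} of profit")
```

Under those assumptions storage accounts for roughly two thirds to three quarters of the profit pool, even though it is less than half of the spending.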
- Hardware commoditization
The hardware is getting much, much, much better in the server market, so much so that it has left the proprietary approaches to storage hardware behind. With Sandy Bridge from Intel the gulf will be oceanic in breadth. Hardware RAID will fade away thanks to Sandy Bridge based CPUs that for storage will be 10x or more faster than Nehalem based systems and that will include SAS-2 and 3 on the chip. Intel predicts that the vast majority of storage hardware will be based on Sandy Bridge. Sandy Bridge will also bring much less expensive 10GbE; Intel predicts that 10GbE will cross over with 1GbE in 2011. And what Intel says, goes.
You also see solutions like the Storage Bridge Bay from Supermicro and others beginning to proliferate. These are industry standard form factors based upon commodity hardware and built to address storage requirements. They include HA in a box: dual everything, with no single point of failure.
There will continue to be start-ups that exploit the newest hardware more quickly than the legacy providers, but like smaller server vendors I predict that they will have a hard time differentiating themselves over the longer term. Buyers are already noticing that they are paying 5-10x as much for Intel and similar components when they come in their array as when they buy them themselves. And they are motivated to figure this out by trends 1) and 2) above.
Buyers that are pushed by 1) and 2) to reevaluate their storage strategy are trying and buying NexentaStor based systems at a pace that has led us to grow at almost 4x per year.
- Silent data corruption gets louder
A deep, dark secret in the storage industry is that over time systems suffer from bit-rot and other insidious, incremental failures that make it increasingly likely that your data just won’t be available when you need it. Actually, commentators like Robin Harris raised the alarm years ago, but little seems to be getting done about it. You can read Robin’s excellent post on the subject here. You can also read about silent data corruption reports on Amazon’s S3 here.
With disks and systems getting larger the math is not pretty: if you are an average storage user you are likely losing data without being aware of it. The answer is to use 3) to answer 4), that is, to use today’s massively more powerful hardware to checksum everything, end to end, in order to find and then even correct the corrupted data. Today, as far as I know, ZFS is the only file system that is able to do this. All other solutions for data integrity operate at the level of hardware RAID and cannot offer as airtight a solution.
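Here is a toy sketch of the find-and-correct idea: keep a checksum apart from the data, verify it on every read, and repair a bad copy from a good one. This is not ZFS code. The `MirroredBlock` class is hypothetical, and SHA-256 from Python's `hashlib` is used only because it is convenient.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return a hex digest used to verify a block on every read."""
    return hashlib.sha256(data).hexdigest()


class MirroredBlock:
    """Hypothetical two-copy block with its checksum kept apart from the data."""

    def __init__(self, data: bytes):
        self.copies = [bytearray(data), bytearray(data)]
        self.expected = checksum(data)

    def read(self) -> bytes:
        # Find the first copy whose contents still match the stored checksum.
        good = None
        for copy in self.copies:
            if checksum(bytes(copy)) == self.expected:
                good = bytes(copy)
                break
        if good is None:
            raise IOError("all copies failed checksum verification")
        # Self-heal: overwrite any copy that silently went bad.
        for i, copy in enumerate(self.copies):
            if bytes(copy) != good:
                self.copies[i] = bytearray(good)
        return good


block = MirroredBlock(b"quarterly results")
block.copies[0][0] ^= 0x01  # simulate a single flipped bit on one disk
data = block.read()         # the bad copy is detected and repaired here
```

A plain mirror without the checksum could see that the two copies differ but could not tell which one to trust. The separately stored checksum is what lets the read path pick the good copy and fix the other.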
- Developers, developers, developers and the cloud
The cloud and open source software both are rising in importance because enterprises are increasingly listening to developers and/or because developers are tired of listening to corporate IT and are forging on to get their projects done now.
If you are a developer you are as likely to work for a large enterprise as you are to work for an IT company. We have dozens of customers that have 500+ developers. And my sense is that all of these developers are tired of having to fight for space on the SAN or NAS. Instead of waiting for corporate IT to allow them to do their jobs, they are downloading and building their own NAS, or buying a perfectly good one for less than the cost of a golf outing with their legacy SAN provider, or simply putting their code up in the cloud.
I had a large hosting company tell me in December last year that they ran the numbers and found that they could not buy and power on a legacy SAN (at a 55% street discount) for the price at which Amazon sells S3 services; their conclusion was that if they kept buying legacy storage they would be out of business in a few years. As a result many, many hosting companies are looking for ways to run storage on industry standard hardware just like Google and Amazon; the problem is that Google and Amazon have thousands of the world’s top computer scientists. It turns out that building your own file system and the storage management services to turn that file system into something reliable and useful is extremely difficult.
We are seeing similar dynamics on the part of enterprise IT shops that see Amazon as a competitor. I met the week of Christmas with a gigantic enterprise that happens to be a leading publishing company; we chatted about the growth of unsanctioned cloud usage on the part of their 1,100 developers, and the IT director pointed out that their internal bill back charges for storage are 40-50x as high as Amazon’s.
Those are the top 5 storage industry drivers as I see them. I’ll pick up on the next 5 in a later post. In the meantime, what have I missed? Are commentators actually starting with the basics such as 1, 2 and 3 and making the right inferences? If so, which commentators do you recommend?