When getting to grips with your big data project, which will your organisation need, Hadoop or a data warehouse? Actually, that was a trick question. Not only can Hadoop and data warehouses work well alongside each other, organisations actually benefit from using the two together.
Data warehouses keep data from many sources in one central place and make it usable. A data warehouse is best used to process your important, structured data and to store it where business intelligence tools and dashboards can find it easily. But it’s weaker and slower for analytic processing and some types of transformation.
That’s where Hadoop comes into its own. Hadoop allows large networks of computers to break down and parcel out large compute challenges. Although Hadoop is weak at interactive queries and data management, it’s good at gulping down your raw, unstructured, and complex data. Together, the two technologies form a symbiotic relationship.
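To make the idea of breaking down and parcelling out a compute challenge more concrete, here is a minimal sketch of the map, shuffle and reduce pattern that Hadoop's MapReduce model is built around. It is plain Python running on one machine, not Hadoop's own API; the function names and the sample lines are invented purely for illustration.

```python
from collections import defaultdict
from itertools import chain


def split_into_chunks(records, n_chunks):
    """Parcel the input out into chunks, one per (pretend) worker."""
    return [records[i::n_chunks] for i in range(n_chunks)]


def map_chunk(chunk):
    """Map step: a worker emits (word, 1) pairs for its own chunk only."""
    return [(word.lower(), 1) for line in chunk for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group every emitted value under its key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_groups(grouped):
    """Reduce step: combine each key's values into a single total."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(records, n_chunks=3):
    chunks = split_into_chunks(records, n_chunks)
    # In a real cluster each chunk would be mapped on a different node.
    mapped = chain.from_iterable(map_chunk(c) for c in chunks)
    return reduce_groups(shuffle(mapped))


# Hypothetical raw, unstructured input lines.
raw_lines = [
    "order shipped late",
    "order arrived damaged",
    "late order refunded",
]
counts = word_count(raw_lines)
```

The result does not depend on how many chunks the work is split into, which is what lets a framework of this kind spread the same job across as many machines as are available.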
Hadoop & Data Warehouses In Partnership
Over the years, organisations have built data warehouses to hold the ever-increasing amounts of business data being generated. The success IT has had in capturing data has made a rich store available to help guide business decisions. Unfortunately, the sheer volume of data in the warehouse can overwhelm the data explorer!
When it takes a significant effort to sift the data and come up with a result, it makes sense to retain the benefits of the effort. Imagine, for example, the data that business executives use to project their inventory needs for next year. The data set might be massive, and when the need is identified there could be too little time to model it, restructure it, or otherwise prepare it for the data warehouse. When the executives are done with it, perhaps in only a week, they’ll dispose of it. That’s when Hadoop can step up to store and refine the data and send a sample to the data warehouse.
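The "store, refine and send a sample" flow described above might look something like the following sketch. It assumes the raw data arrives as comma-separated text lines, and it uses an in-memory SQLite database purely as a stand-in for the data warehouse; the table name, column names and sample rows are all hypothetical, and a real pipeline would run the refining step on the Hadoop cluster itself.

```python
import random
import sqlite3


def refine(raw_rows):
    """Keep only well-formed 'sku, quantity' rows and convert quantities to integers."""
    clean = []
    for row in raw_rows:
        parts = row.split(",")
        if len(parts) != 2:
            continue  # wrong number of fields
        sku, qty = parts[0].strip(), parts[1].strip()
        if sku and qty.isdigit():
            clean.append((sku, int(qty)))
    return clean


def load_sample(rows, sample_size, conn, seed=0):
    """Insert a reproducible random sample of the refined rows into the stand-in warehouse."""
    rng = random.Random(seed)
    sample = rng.sample(rows, min(sample_size, len(rows)))
    conn.execute(
        "CREATE TABLE IF NOT EXISTS inventory_sample (sku TEXT, qty INTEGER)"
    )
    conn.executemany("INSERT INTO inventory_sample VALUES (?, ?)", sample)
    conn.commit()
    return len(sample)


# Hypothetical raw input, including some rows that should be rejected.
raw = ["A100, 12", "B200, seven", "C300, 40", "malformed", "D400, 5", ", 3"]

warehouse = sqlite3.connect(":memory:")  # stand-in for the real warehouse
refined = refine(raw)
loaded = load_sample(refined, sample_size=2, conn=warehouse)
```

Only the small, cleaned sample reaches the warehouse, so the business intelligence tools see tidy, structured rows while the bulky raw data stays on the Hadoop side until it is no longer needed.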
Your big data isn’t a replacement for data warehousing and you needn’t consider it a siloed project. Big data is a part of the constantly evolving IT environment. You can and should use both Hadoop and data warehouses to manage your big data needs.
Enterprise data sets are large, detailed, and often complex. For the typical report recipient or spreadsheet user looking for the underlying patterns, this is a worst-case combination. Spreadsheets have a lot of power, but are hard to use against large, detailed data. Reports are often designed to serve as detailed lookups so they have rows and rows of summarised data, making it difficult to see overall patterns and trends.
With the right business intelligence capabilities integrated with your data sources, you can keep spreadsheets in their place. You can extract data, run reports, create interactive visualisations and tell stories with your data in less time, and without data fatigue. Investigate ways to give users the flexibility to display data in whatever format makes the patterns most visible. Consider working with the natural way people interact with visual representations of data when they perform analysis. Not everyone is a ‘table head’!