Hadoop is a distributed storage and processing framework that allows the storage of any type of data, much of which would have been discarded in the past because making it usable would have been too difficult and expensive. The value of big data and Hadoop comes through on-the-fly modeling of data that might actually be useful and which, when integrated with an existing big data and analytics environment, can enrich business insights.
BIG DATA INTEGRATION: THE MOST IMPORTANT VARIABLE
Limited reusability is, to a large extent, a function of poor integration. In fact, integration may be the most important variable in the equation for big data success.
Forrester Research has written that 80% of the value in big data comes through integration. The big-picture idea is that the highest-value big data should be readily accessible to the right users, governed by robust and clearly defined business rules and governance structures. Deeper data sets, such as legacy transactional data and long-tail customer histories, may only need reliable storage and robust data management so that data scientists and data explorers can review and model them when it makes sense to do so.
Big data integration is also about thinking big. In this instance, “big” actually means holistically, inclusively and multi-dimensionally. Dots must be connected, islands of data bridged and functional silos plugged into each other (if not broken down entirely).
High degrees of integration. Well-designed ecosystems. Unified architectures. Data and analytics centricity. Not every item on that short list is strictly required for a big data program to function, but these are the difference-making attributes that ensure big data programs work effectively.