Archive for March 2014

The big issue in enterprise big data: linking a massive number of data islands


Recent studies suggest that organizations have started to see the opportunity and benefits of adopting big data technology to drive business decisions (Forbes on IDG survey). While the IDG survey suggests that enterprise investment in big data analytics will increase steadily in 2014, other surveys still show no sign of rapid growth in investment by organizations (CNN iReport on Bain & Company survey).

The real issue that may underlie this observation is the nature of the big data problem in the enterprise, which has to be understood and addressed to support greater adoption. Enterprise big data is usually characterized as a set of analytical tools and techniques for processing large volumes of data efficiently (mainly on top of a stack of HDFS, Hadoop, NoSQL, etc.). However, arguments are emerging that the enterprise big data problem is not about size, that small data analysis is the next big thing, and that loosely coupled small data could be the more interesting aspect of big data in the enterprise.

While I strongly assert the importance of small data in the enterprise, I would go a step further: the big data problem in the enterprise today is how to make sense of a massive number of data islands, many small and some large. Some are centered around employees and generated by them; some are shared in group settings through sharing and social media tools inside the enterprise; some are stored in large enterprise application databases and document repositories; and some hold information outside the enterprise walls that the enterprise may care about in order to serve its customers better. The overarching problem in this context is how to link this data, interpret and understand it, and make it available for data and business analytics.

One trend to watch in this space is the development of graph databases and graph knowledge representation, and how they evolve to intelligently discover entities and their relationships and make the resulting graph available for analysis. Graph database providers have focused on, and advanced a great deal in, improving the performance of data analysis on top of knowledge graphs, but more innovation is needed in forming knowledge graphs over data islands.
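
To make the idea of forming a graph over data islands a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the three islands (`crm`, `wiki`, `tickets`), their field names, and the choice of a normalized email address as the shared entity key are all hypothetical. Real entity discovery would need far more than exact matching on one field; this only shows the shape of the linking step.

```python
from collections import defaultdict

# Hypothetical records from three separate "data islands" (illustrative only).
crm = [{"id": "crm-1", "email": "ana@example.com", "name": "Ana Diaz"}]
wiki = [{"id": "wiki-7", "author_email": "ana@example.com", "title": "Q1 pricing notes"}]
tickets = [{"id": "tkt-42", "reporter": "ANA@example.com ", "subject": "Invoice error"}]


def normalize(value):
    """Crude entity key: trim whitespace and lowercase (an assumption, not real entity resolution)."""
    return value.strip().lower()


def build_graph(islands):
    """Build an undirected graph linking each record to the entity it mentions.

    islands: list of (island_name, records, key_field) tuples.
    Returns a dict mapping each node to the set of its neighbours.
    """
    edges = defaultdict(set)
    for island_name, records, key_field in islands:
        for record in records:
            entity = ("person", normalize(record[key_field]))
            node = (island_name, record["id"])
            edges[entity].add(node)
            edges[node].add(entity)
    return edges


def linked_records(graph, island, record_id):
    """Return records from any island that share an entity with the given record."""
    start = (island, record_id)
    result = set()
    for entity in graph.get(start, ()):
        result |= graph.get(entity, set())
    result.discard(start)
    return result


graph = build_graph([
    ("crm", crm, "email"),
    ("wiki", wiki, "author_email"),
    ("tickets", tickets, "reporter"),
])

# Which records on other islands relate to the CRM contact?
print(sorted(linked_records(graph, "crm", "crm-1")))
```

Once records from different islands hang off the same entity node, an analytics question such as "what else do we know about this customer?" becomes a short graph traversal instead of a set of one-off joins between systems.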