I am bullish on Hadoop and other NoSQL technologies. Long-term, I believe they will be instrumental in providing quantum leaps in efficiency for existing businesses. But even more, I believe that mainstream BigData will open up brand new opportunities that were simply unavailable before. Right now we focus on applying BigData to user activity and clickstream analysis. Why? Because that’s where the data is. But that condition will not persist. There will be oceans of structured and semi-structured data to analyze. The chicken-and-egg situation with the tools and the data will evolve, and brand new application scenarios will open up.
So I’m bullish.
On the other hand, I don’t think Hadoop is ready for prime time today. Why? Let me count the reasons:
- The Foundations are not finished. The Hadoop community is still expending significant energy laying basic foundations. Here’s a blog post from three months ago detailing the internal organization and operation of Hadoop 2.0. Look at the raft of terminology this article foists on unsuspecting Hadoop novices: ApplicationMaster, ApplicationsManager (different!), NodeManager, Container Launch Context, and on and on. And these are all problems that have been solved before: we saw similar resource management designs with EJB containers, and before that with antecedents like Transarc’s Encina Monitor from 1992, with its node manager, container manager, nanny processes and so on. Developers (the users of Hadoop) don’t want or need to know about these details.
- The Rate of Change is still very high. Versioning and naming are still in flux. 0.20? 2.0? 0.23? YARN? MRv2? You might think that version numbers are a minor detail, but until the terminology and version numbers converge, enterprises will have difficulty adopting Hadoop. In addition, the actual application model is still in flux. For enterprise apps, change is expensive and confusing. People cannot afford to keep up with all the changes across so many moving targets.
- The Ecosystem is nascent. There aren’t enough companies that are oriented around making money on these technologies. Banks – a key adopter audience – are standing back waiting for the dust to settle. Consulting shops are watching and waiting. As a broader ecosystem of companies pops up and invests, enterprises will find it easier to get from where they are into the land of Hadoop.