Lately I have been thinking about how lucky we are to be graduates of the 1.5/2.0 bubble times.
I'm thinking of the things we did in the early 2000s, the challenges we faced on the technological front, and how handy it all has become these days. As an example, I think of the time (and $$$) we invested at Shopping.com in building a scalable and fast search engine. Nowadays you get ~95% of it free from open source products.
For start-ups this is truly a lifesaver. You want to invest your developers' time in building your core technology, not throw your investors' money at building infrastructure.
As promised in my last post, I want to share my thoughts about the Hadoop project.
The project started as part of the Lucene Nutch project and grew considerably after Google unveiled its GFS and MapReduce white papers. I'm not going to tell you all about Hadoop now, but if you are interested there is a great presentation about it here.
The big advantage of Hadoop and its sub-projects is that they make handling *huge* amounts of data easy and cost-effective. Imagine that you have 'unlimited' storage for large amounts of data. Your data is redundant and fail-safe, and you can process it in parallel using many machines simultaneously.
Basically, that's one of the biggest gifts we got from Google and the moonlighters of the Apache Hadoop community.
But... this was not enough to make it something widely used. To achieve that goal you need to make things easy to use. Writing MapReduce tasks in Java or C++ is a nice, challenging task for engineers who want to sharpen their brains with the MapReduce way of thinking, but it will not make the framework a commodity that everybody can use. So people invented human-friendly languages like Pig (from Yahoo!), Hive (from Facebook) and Cloudbase that make it a lot easier for analysts without strong programming skills to access the Big Fat Data and process it easily.
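To give a feel for the MapReduce way of thinking, here is a minimal sketch of the classic word count written in plain Python. It is not the Hadoop API and it runs on a single machine; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample data are my own, made up for illustration. The point is only to show how a simple question has to be split into a map step, a grouping step, and a reduce step.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group values by key, standing in for what a framework does
    between the map and reduce steps (hypothetical helper)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word, counts):
    """Collapse all the counts emitted for one word into a single total."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    # Made-up sample input, just to exercise the three steps.
    sample = ["big data is big", "data is everywhere"]
    print(word_count(sample))
```

In a real cluster the map and reduce steps would run on many machines at once, and the grouping in between would be done by the framework. With a language like Pig or Hive, the same question is closer to a single "group by word and count" statement, which is why those languages open the data up to people who are not full-time programmers.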
It's great when really big things become so easy to use and... for free. First it was Linux vs. a paid OS, then MySQL vs. a paid DB, and now Hadoop vs. investing a lot of engineering effort. I assume this is just the beginning of many great things we will see on top of this infrastructure.