I am in a graduate-level course in business intelligence and data mining. For
my term project I want to run some data analysis using Hadoop and gain good
experience for when I graduate. I was led to using Cloudera Manager Free
4.0 and, after some trials, I now have it installed on my own small server
(Fujitsu Primergy MX130S2, 4 GB total RAM).
So what's next? Online searches give me only vague overviews of the MapReduce
methodology, with no step-by-step guides to running my first M/R job, such as
a word count, on the Cloudera system. [ example -
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/#running-a-mapreduce-job
].
Links to learning material would be great, as would advice on how to adapt
non-Cloudera tutorials to the CDH distro.
Thanks -matt