Feb 2, 2008 at 5:15 pm
I don't think that they would be all that difficult as long as you have a
large enough problem.
EM methods for discrete problems like HMMs, as well as the closely related
variational Bayesian methods, depend mostly on counting instances. Indeed,
Gibbs sampling techniques for hidden-variable models depend on the same sort
of thing. A good example is the Buntine and Jakulin paper on DCA.
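To make the counting point concrete, here is a minimal sketch in Python. It uses a toy problem chosen only for illustration, not an HMM: each session flips one of two biased coins several times, and which coin was used is hidden. The names (em_step, sessions) and all the numbers are assumptions, not anything from Hadoop or Mahout. The E-step accumulates fractional head and tail counts per coin, and the M-step turns those counts into new estimates.

```python
def em_step(sessions, theta_a, theta_b):
    """One EM iteration for an equal-weight mixture of two biased coins.

    sessions: list of (heads, tails) counts; the coin used in each is hidden.
    theta_a, theta_b: current guesses for each coin's probability of heads.
    """
    heads_a = tails_a = heads_b = tails_b = 0.0
    for heads, tails in sessions:
        # Likelihood of this session under each coin. The binomial
        # coefficient is identical for both coins, so it cancels out.
        like_a = theta_a ** heads * (1 - theta_a) ** tails
        like_b = theta_b ** heads * (1 - theta_b) ** tails
        weight_a = like_a / (like_a + like_b)
        weight_b = 1.0 - weight_a
        # E-step: add fractional counts instead of whole ones.
        heads_a += weight_a * heads
        tails_a += weight_a * tails
        heads_b += weight_b * heads
        tails_b += weight_b * tails
    # M-step: the new estimates are just ratios of the expected counts.
    return heads_a / (heads_a + tails_a), heads_b / (heads_b + tails_b)


# Hypothetical data: five sessions of ten flips each.
sessions = [(5, 5), (9, 1), (8, 2), (4, 6), (7, 3)]
theta_a, theta_b = 0.6, 0.5
for _ in range(10):
    theta_a, theta_b = em_step(sessions, theta_a, theta_b)
print(theta_a, theta_b)
```

The only state carried between iterations is a handful of sums, which is why this kind of method looks so much like counting.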
Map-reduce is famously good at this sort of counting problem. In general,
for methods analogous to EM, you will have a map-reduce step for the
expectation phase and one for the maximization phase. Both steps are very
much like word counting except that it just takes a bit of math to figure
out which words you think you are counting.
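As a rough sketch of how that could look for an HMM, the Python below simulates one EM iteration in a map-reduce shape, in plain Python rather than on Hadoop. All names (map_sequence, reduce_counts, maximize, em_iteration) are hypothetical. The mapper runs forward-backward on one sequence and emits (key, expected count) pairs, the reducer sums them exactly as in word counting, and the maximization normalizes the totals. It assumes short sequences (no scaling, so long ones would underflow) and that every state receives some nonzero count. The maximization is done in one place here for brevity; on a large model it could be its own map-reduce step keyed by state.

```python
from collections import defaultdict


def map_sequence(obs, start, trans, emit):
    """Mapper: forward-backward on one sequence, yielding (key, expected count)."""
    n, length = len(start), len(obs)
    # Forward pass: alpha[t][i] = P(obs[0..t], state at t is i).
    alpha = [[0.0] * n for _ in range(length)]
    for i in range(n):
        alpha[0][i] = start[i] * emit[i][obs[0]]
    for t in range(1, length):
        for j in range(n):
            alpha[t][j] = emit[j][obs[t]] * sum(
                alpha[t - 1][i] * trans[i][j] for i in range(n))
    # Backward pass: beta[t][i] = P(obs[t+1..] | state at t is i).
    beta = [[1.0] * n for _ in range(length)]
    for t in range(length - 2, -1, -1):
        for i in range(n):
            beta[t][i] = sum(trans[i][j] * emit[j][obs[t + 1]] * beta[t + 1][j]
                             for j in range(n))
    likelihood = sum(alpha[length - 1])
    for t in range(length):
        for i in range(n):
            # Posterior probability of being in state i at time t.
            gamma = alpha[t][i] * beta[t][i] / likelihood
            if t == 0:
                yield ("start", i), gamma
            yield ("emit", i, obs[t]), gamma
            if t + 1 < length:
                for j in range(n):
                    # Posterior probability of the transition i -> j at time t.
                    xi = (alpha[t][i] * trans[i][j] * emit[j][obs[t + 1]]
                          * beta[t + 1][j] / likelihood)
                    yield ("trans", i, j), xi


def reduce_counts(pairs):
    """Reducer: sum the values for each key, as in word counting."""
    totals = defaultdict(float)
    for key, value in pairs:
        totals[key] += value
    return totals


def maximize(totals, n_states, n_symbols):
    """M-step: normalize the summed expected counts into probabilities."""
    def normalize(row):
        total = sum(row)
        return [x / total for x in row]
    start = normalize([totals[("start", i)] for i in range(n_states)])
    trans = [normalize([totals[("trans", i, j)] for j in range(n_states)])
             for i in range(n_states)]
    emit = [normalize([totals[("emit", i, k)] for k in range(n_symbols)])
            for i in range(n_states)]
    return start, trans, emit


def em_iteration(sequences, start, trans, emit):
    """One full EM iteration: map over sequences, reduce, then maximize."""
    pairs = (pair for seq in sequences
             for pair in map_sequence(seq, start, trans, emit))
    totals = reduce_counts(pairs)
    return maximize(totals, len(start), len(emit[0]))
```

Each call to map_sequence depends only on its own sequence and the current parameters, so the sequences can be spread over as many machines as you like; only the small table of summed counts has to come back together.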
Just like with word counting, if you are doing a tiny example, MR will be
much slower. If you are working on a very large problem, though, it can be much
faster.
On 2/2/08 3:43 AM, "edward yoon" wrote:
I thought of Hidden Markov Models (HMMs) as absolutely impossible on the MR model.
If anyone has some information, please let me know.
On 2/2/08, edward yoon wrote:
I read an interesting piece of information in that NIPS paper, and I
implemented it, but
Now, there are too many mailing lists for me to read.
Lucene, Core, Hbase, Pig, Solr, Mahout ..... :(
On 2/2/08, gopi wrote:
I'm definitely excited about Machine Learning Algorithms being implemented
into this project!
I'm currently a student studying Machine Learning, and would love to help
out in every possible manner.
On Jan 25, 2008 5:55 PM, Grant Ingersoll wrote:
(Apologies for cross-posting)
The Lucene PMC is pleased to announce the creation of the Mahout
Machine Learning project, located at http://lucene.apache.org/mahout.
Mahout's goal is to create a suite of practical, scalable machine
learning libraries. Our initial plan is to utilize Hadoop (http://hadoop.apache.org)
to implement a variety of algorithms including naive Bayes, neural
networks, support vector machines and k-Means, among others. While
our initial focus is on these algorithms, we welcome other machine
learning ideas as well.
Naturally, we are looking for volunteers to help grow the community
and make the project successful. So, if machine learning is your
thing, come on over and lend a hand!
Edward yoon @ NHN, corp.