(Apologies for cross-posting)

The Lucene PMC is pleased to announce the creation of the Mahout
Machine Learning project, located at http://lucene.apache.org/mahout.
Mahout's goal is to create a suite of practical, scalable machine
learning libraries. Our initial plan is to utilize Hadoop
(http://hadoop.apache.org) to implement a variety of algorithms including
naive Bayes, neural networks, support vector machines and k-Means, among
others. While
our initial focus is on these algorithms, we welcome other machine
learning ideas as well.

Naturally, we are looking for volunteers to help grow the community
and make the project successful. So, if machine learning is your
thing, come on over and lend a hand!

Cheers,
Grant Ingersoll

http://lucene.apache.org/mahout


  • Bradford Stephens at Jan 25, 2008 at 5:36 pm
    Quite an interesting initiative -- I'll keep my eye on it!
  • Gopi at Feb 2, 2008 at 8:09 am
    I'm definitely excited about Machine Learning Algorithms being implemented
    into this project!
    I'm currently a student studying Machine Learning, and would love to help
    out in every possible manner.

    Thanks
    Chaitanya Sharma
  • Edward yoon at Feb 2, 2008 at 11:34 am
    I read an interesting piece of information in that NIPS paper, and I
    implemented it, but

    Now there are too many mailing lists for me to read.
    Lucene, Core, Hbase, Pig, Solr, Mahout ..... :(

    Too distributed.

    --
    B. Regards,
    Edward yoon @ NHN, corp.
  • Edward yoon at Feb 2, 2008 at 11:43 am
    I thought Hidden Markov Models (HMM) were absolutely impossible on the MR model.
    If anyone has some information, please let me know.

    Thanks.

    --
    B. Regards,
    Edward yoon @ NHN, corp.
  • Ted Dunning at Feb 2, 2008 at 5:15 pm
    I don't think that they would be all that difficult as long as you have a
    large enough problem.

    EM methods for discrete problems like HMMs, as well as the closely related
    variational Bayesian methods, depend mostly on counting instances. Indeed,
    Gibbs sampling for hidden-variable techniques depends on the same sort of
    thing. A good example is the Buntine and Jakulin paper on DCA.

    Map-reduce is famously good at this sort of counting problem. In general
    for methods analogous to EM, you will have a map-reduce step for the
    estimation phase and one for the maximization phase. Both steps are very
    much like word counting except that it just takes a bit of math to figure
    out which words you think you are counting.

    Just like with word counting, if you are doing a tiny example, MR will be
    much slower. If you are working on a very large problem, though, it can be
    much faster.
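
A minimal sketch of the "EM as counting" idea described above, in plain Python rather than Hadoop. It assumes a toy model that is not from the thread: each record is a string of coin flips ('H'/'T') produced by one of several hidden coins. The function names (`e_step_map`, `reduce_counts`, `m_step`, `em`) are hypothetical. The map step emits fractional counts keyed like words in a word count, the reduce step sums them per key, and the maximization step turns the summed counts into new parameters.

```python
from collections import defaultdict


def e_step_map(record, params):
    """Map: emit fractional (expected) counts for one record of 'H'/'T' flips."""
    weights, heads_prob = params
    heads = record.count("H")
    tails = len(record) - heads
    # Unnormalized likelihood of the record under each hidden coin.
    likes = [w * (p ** heads) * ((1 - p) ** tails)
             for w, p in zip(weights, heads_prob)]
    total = sum(likes)
    for k, like in enumerate(likes):
        r = like / total              # posterior responsibility of coin k
        yield (k, "n"), r             # expected number of records from coin k
        yield (k, "H"), r * heads     # expected heads attributed to coin k
        yield (k, "T"), r * tails     # expected tails attributed to coin k


def reduce_counts(pairs):
    """Reduce: sum the values per key, exactly as in word counting."""
    totals = defaultdict(float)
    for key, value in pairs:
        totals[key] += value
    return totals


def m_step(totals, n_components, n_records):
    """Maximization: convert summed expected counts into new parameters."""
    weights = [totals[(k, "n")] / n_records for k in range(n_components)]
    heads_prob = [totals[(k, "H")] / (totals[(k, "H")] + totals[(k, "T")])
                  for k in range(n_components)]
    return weights, heads_prob


def em(records, params, iterations=20):
    """Run a fixed number of EM rounds; each round is one map + one reduce."""
    for _ in range(iterations):
        pairs = (pair for rec in records for pair in e_step_map(rec, params))
        totals = reduce_counts(pairs)
        params = m_step(totals, len(params[0]), len(records))
    return params
```

On a real cluster the per-record map calls would run in parallel and the reduce would be keyed by `(component, symbol)`; here both run in one process only to show the shape of the computation.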

  • Peter W. at Feb 6, 2008 at 6:04 pm
    Hello,

    This Mahout project seems very interesting.

    Any problem that has components reducible
    with MapReduce and can then be described as a
    linear equation would be an excellent candidate.

    Most Nutch developers probably don't need HMM
    but instead the power method to iterate over
    Markov chains or Perron-Frobenius.

    However, some of that work as it pertains to
    the web has been patented so it would be more
    productive for the Hadoop community to focus
    on other areas such as adjacency matrices,
    SALSA or bipartite graphs using Hbase.
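
For readers unfamiliar with the power method mentioned above, here is a minimal single-machine sketch, assuming a small row-stochastic transition matrix given as nested lists. The function name `stationary_distribution` and its parameters are hypothetical. For an irreducible, aperiodic chain, the Perron-Frobenius theorem guarantees that repeated multiplication converges to a unique stationary distribution.

```python
def stationary_distribution(transition, iterations=100, tol=1e-12):
    """Power method: repeatedly apply dist <- dist * P until it stops changing.

    `transition[i][j]` is the probability of moving from state i to state j,
    so every row is assumed to sum to 1.
    """
    n = len(transition)
    dist = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iterations):
        # One vector-matrix product: new mass of j is the mass flowing in.
        nxt = [sum(dist[i] * transition[i][j] for i in range(n))
               for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, dist)) < tol:
            return nxt
        dist = nxt
    return dist
```

Each iteration is a single vector-matrix product, which is the part that could be distributed: every state emits its outgoing mass keyed by destination, and a sum per destination gives the next vector.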

    Bye,

    Peter W.

  • Ted Dunning at Feb 6, 2008 at 6:32 pm
    I don't think anybody has figured out how to patent the Lanczos algorithm
    itself!


Discussion Overview
group: common-user @
categories: hadoop
posted: Jan 25, '08 at 12:26p
active: Feb 6, '08 at 6:32p
posts: 8
users: 6
website: hadoop.apache.org...
irc: #hadoop
