FAQ
Hi all,
I have a basic question about how MapReduce is used in search
engines:

1. How is MapReduce used in search?
2. Does Google use MapReduce for its search engine? If so, how? Please
explain the architecture or flow of how Google or other search engines
work, and what part MapReduce plays in it.

Please explain.

With Regards,
B.Yuhendar


-----------------------------------------
This email was sent using TCEMail Service.
Thiagarajar College of Engineering
Madurai-625 015, India

  • Saikat Kanjilal at Jul 30, 2010 at 2:08 pm
    Hello Yuhendar, I'll add as much as I can at a high level, from what I have learned so far about MapReduce, to answer your questions:
    1) The goal of MapReduce is to perform a distributed computation: break a large, computation-intensive problem into smaller chunks, solve those chunks independently, and finally combine the results. The problem in this case is search. You have a master node and a set of slave nodes. The master (in Hadoop this is the JobTracker; the NameNode is the corresponding master for HDFS storage) takes input from the client in the form of a job and hands pieces of that job to the slaves, which solve their pieces of the problem and return the results. Those intermediate results are then gathered and combined in the reduce phase, and the final output is presented back to the client. A more concrete example is the distributed grep problem, which is a form of searching for a particular word (or document) in a huge dataset. Take a look at the Hadoop examples or the Hadoop web page to learn more about this.
    2) Google, to my understanding, uses its own internal implementation of the general MapReduce algorithm, among other things to process data held in its datastore, Bigtable, which is a distributed, multi-dimensional sorted map.
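
The distributed grep example mentioned above can be sketched in a few lines. This is only a single-process illustration of the map, shuffle, and reduce steps, not Hadoop's or Google's implementation; the function names (`map_grep`, `reduce_grep`, `run_grep`) and the sample chunks are made up for this sketch, and a real framework would run each map call on a different machine.

```python
from collections import defaultdict

def map_grep(chunk_id, lines, pattern):
    """Map step: emit (pattern, (chunk_id, line_no, line)) for each matching line."""
    for line_no, line in enumerate(lines, start=1):
        if pattern in line:
            yield pattern, (chunk_id, line_no, line)

def reduce_grep(key, values):
    """Reduce step: collect all matches for one key in a stable order."""
    return key, sorted(values)

def run_grep(chunks, pattern):
    # Shuffle: group mapper output by key, as the framework would do
    # between the map and reduce phases.
    grouped = defaultdict(list)
    for chunk_id, lines in chunks.items():
        for key, value in map_grep(chunk_id, lines, pattern):
            grouped[key].append(value)
    return dict(reduce_grep(k, v) for k, v in grouped.items())

# Hypothetical input: each chunk stands in for one split of a huge dataset.
chunks = {
    "chunk-0": ["hadoop runs mapreduce jobs", "nothing to see here"],
    "chunk-1": ["search engines build an index", "mapreduce can build that index"],
}
matches = run_grep(chunks, "mapreduce")
```

Each chunk is scanned independently, which is what lets the work spread across many nodes; only the matching lines travel to the reduce step.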

    My 2 cents. Regards.
    Date: Fri, 30 Jul 2010 11:53:49 +0530
    Subject: Re: MapReduce Usage in Search Engines
    From: yuhendar@tce.edu
    To: common-dev@hadoop.apache.org

  • Jeff Zhang at Jul 30, 2010 at 2:19 pm
    As I understand it, Google may use MapReduce to build the index, but would
    not use it in the search phase, because the search phase needs low latency,
    and MapReduce is a batch system that does not provide that.
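
To illustrate that split, here is a minimal sketch, assuming a toy in-memory corpus: the index is built offline in map/reduce style, and a query is then answered by a direct lookup with no MapReduce job involved. The names `map_index`, `reduce_index`, `build_index`, and `search` are hypothetical, and nothing here claims to reflect how Google actually does it.

```python
from collections import defaultdict

def map_index(doc_id, text):
    """Map step: emit (term, doc_id) once per distinct term in a document."""
    for term in set(text.lower().split()):
        yield term, doc_id

def reduce_index(term, doc_ids):
    """Reduce step: turn all doc ids seen for a term into a sorted posting list."""
    return term, sorted(doc_ids)

def build_index(docs):
    # Offline batch phase: group mapper output by term, then reduce each group.
    grouped = defaultdict(list)
    for doc_id, text in docs.items():
        for term, value in map_index(doc_id, text):
            grouped[term].append(value)
    return dict(reduce_index(t, ids) for t, ids in grouped.items())

def search(index, query):
    """Query phase: intersect posting lists; only dictionary lookups are needed."""
    postings = [set(index.get(t, ())) for t in query.lower().split()]
    return sorted(set.intersection(*postings)) if postings else []

# Hypothetical corpus for illustration.
docs = {
    1: "Hadoop MapReduce builds the index",
    2: "Search needs low latency",
    3: "MapReduce is a batch system",
}
index = build_index(docs)
hits = search(index, "mapreduce index")
```

The expensive part (`build_index`) can take minutes or hours as a batch job, while `search` only touches the finished index, which is why the two phases can have very different latency requirements.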

    --
    Best Regards

    Jeff Zhang
  • Otis Gospodnetic at Jul 31, 2010 at 3:13 am
    MapReduce tends to be used for massive (re)indexing.
    See http://search-lucene.com/?q=hadoop+mapreduce&fc_project=Solr&fc_project=Lucene
    for how Lucene/Solr people are using MapReduce.

    For example, in a recent project we used MapReduce (streaming with JRuby,
    actually) together with Solr (the Embedded version, to be more precise) to speed
    up the indexing of a 20 GB index that used to take a couple of hours. Now it
    takes 7 minutes, because it is parallelized to the Nth degree.
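
The general idea of parallel indexing can be sketched as follows. This is not the JRuby/Embedded Solr setup described above: it is a toy Python illustration that assigns documents to shards and indexes each shard independently, with a thread pool standing in for separate map tasks (threads in CPython will not give a real CPU speedup). The helper names (`shard_for`, `index_shard`, `index_in_parallel`) and the sample documents are assumptions made for the example.

```python
import zlib
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def shard_for(doc_id, num_shards):
    # crc32 is stable across runs, unlike Python's built-in hash() for str.
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards

def index_shard(docs):
    """Build a small term -> doc ids index for one shard (stand-in for a real indexer)."""
    index = defaultdict(list)
    for doc_id, text in sorted(docs):
        for term in sorted(set(text.lower().split())):
            index[term].append(doc_id)
    return dict(index)

def index_in_parallel(docs, num_shards):
    # Partition the documents so that each one lands in exactly one shard.
    shards = [[] for _ in range(num_shards)]
    for doc_id, text in docs.items():
        shards[shard_for(doc_id, num_shards)].append((doc_id, text))
    # Index every shard independently; on a cluster each would be its own task.
    with ThreadPoolExecutor(max_workers=num_shards) as pool:
        return list(pool.map(index_shard, shards))

# Hypothetical documents for illustration.
docs = {"doc-a": "solr index", "doc-b": "hadoop streaming", "doc-c": "solr hadoop"}
shard_indexes = index_in_parallel(docs, num_shards=2)
```

Because the shards share no state, adding more workers shortens the wall-clock time roughly in proportion, up to the point where I/O or merging the shards becomes the bottleneck.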


    MapReduce can also be used for various kinds of machine learning data
    crunching: query log analysis, content analysis, NLP, building better
    relevance models for search, and so on. See http://mahout.apache.org .
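
As one small example of query log analysis, counting how often each query occurs fits the map/reduce shape directly. The sketch below is a plain Python illustration with a made-up log and hypothetical function names (`map_queries`, `reduce_counts`); it does not use Mahout or Hadoop.

```python
from collections import Counter

def map_queries(log_lines):
    """Map step: normalize each logged query and emit (query, 1)."""
    for line in log_lines:
        query = " ".join(line.lower().split())  # lowercase, collapse whitespace
        if query:  # skip empty log lines
            yield query, 1

def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each distinct query."""
    totals = Counter()
    for query, count in pairs:
        totals[query] += count
    return totals

# Hypothetical query log for illustration.
log = ["Hadoop tutorial", "solr  facets", "hadoop tutorial", "", "mahout"]
counts = reduce_counts(map_queries(log))
top = counts.most_common(1)
```

The resulting counts could then feed later steps such as query suggestions or relevance tuning.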

    Otis
    ----
    Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
    Lucene ecosystem search :: http://search-lucene.com/



Discussion Overview
group: common-dev @
categories: hadoop
posted: Jul 30, '10 at 6:24a
active: Jul 31, '10 at 3:13a
posts: 4
users: 4
website: hadoop.apache.org...
irc: #hadoop
