If I have a query that I would normally run against a database, and I want to
run it on data loaded across multiple nodes in Hadoop, will the query be
distributed over all the datanodes so that it returns results faster, or will it
just be sent to one node? If it goes to one node, is there a way to get it
distributed instead?
Thanks
Divij


  • Amogh Vasekar at Jul 15, 2009 at 6:28 am
    Confused. What do you mean by "query be distributed over all datanodes or just 1 node"? If your data is small enough to fit in just one block (which Hadoop replicates), then just one task will be run (assuming the default input split).
    If the data is spread across multiple blocks, you can still make it run on just one compute node by setting your input split size to be large enough (yes, there are use cases for this, when the whole data set has to be fed to a single mapper). Otherwise, the job will be scheduled on numerous nodes, with each task getting a block / chunk (of the input split size you set) of your actual data. The nodes picked for running your job depend on data locality, to reduce network latency.

    Thanks,
    Amogh

    -----Original Message-----
    From: Divij Durve
    Sent: Wednesday, July 15, 2009 2:32 AM
    To: common-user@hadoop.apache.org; core-user@hadoop.apache.org
    Subject: Question about job distribution

    If I have a query that I would normally run against a database, and I want to
    run it on data loaded across multiple nodes in Hadoop, will the query be
    distributed over all the datanodes so that it returns results faster, or will it
    just be sent to one node? If it goes to one node, is there a way to get it
    distributed instead?
    Thanks
    Divij
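
As a rough illustration of the reply above, here is a minimal Python sketch of how the number of map tasks could follow from the file size, the block size and the input split settings. It is not Hadoop code. The function names are made up for this example, and the formula (split size = max(min size, min(max size, block size)), with a 10% slack on the last split) is an assumption modeled on Hadoop's FileInputFormat, which may differ between versions.

```python
# Simplified model of input-split counting (an assumption, not Hadoop's code).
SPLIT_SLOP = 1.1  # assumed slack: the last split may be up to 10% larger
MB = 1024 * 1024


def compute_split_size(block_size, min_size=1, max_size=2**63 - 1):
    """Split size is the block size, clamped between min_size and max_size."""
    return max(min_size, min(max_size, block_size))


def count_splits(file_size, block_size, min_size=1, max_size=2**63 - 1):
    """Return how many splits (and so map tasks) one file would produce."""
    split_size = compute_split_size(block_size, min_size, max_size)
    remaining = file_size
    splits = 0
    # Carve off full-size splits while clearly more than one split remains.
    while remaining / split_size > SPLIT_SLOP:
        splits += 1
        remaining -= split_size
    # Whatever is left over becomes the final split.
    if remaining > 0:
        splits += 1
    return splits


# Data that fits in one 64 MB block: a single task.
small = count_splits(10 * MB, 64 * MB)
# 1 GB spread over 64 MB blocks: one task per block.
spread = count_splits(1024 * MB, 64 * MB)
# Same 1 GB, but with the minimum split size raised above the file size.
forced_single = count_splits(1024 * MB, 64 * MB, min_size=2048 * MB)

print(small, spread, forced_single)
```

In this model the three cases come out as 1, 16 and 1 splits, which matches the description: small data runs as one task, multi-block data is spread over many tasks, and a large enough split size forces everything into a single mapper.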

Discussion Overview
group: common-user @
categories: hadoop
posted: Jul 14, '09 at 9:02p
active: Jul 15, '09 at 6:28a
posts: 2
users: 2
website: hadoop.apache.org...
irc: #hadoop

2 users in discussion

Amogh Vasekar: 1 post Divij Durve: 1 post
