I have run into a problem. I want to sort the numbers from 1 to 100, with
one map task handling 1 to 50 and another handling 51 to 100.
How should I configure the JobConf?

--
Huang Qian(黄骞)
Institute of Remote Sensing and GIS,Peking University
Phone: (86-10) 5276-3109
Mobile: (86) 1590-126-8883
Address:Rm.554,Building 1,ChangChunXinYuan,Peking
Univ.,Beijing(100871),CHINA


  • Amandeep Khurana at Oct 5, 2009 at 10:48 pm
    Do you need a strict partitioning of the input? If so, why?
    Why not just feed the data into the job and let it split
    automatically? You can process records differently based on the input
    if you need to.

    --


    Amandeep Khurana
    Computer Science Graduate Student
    University of California, Santa Cruz
  • Huang Qian at Oct 6, 2009 at 1:39 am
    I am a beginner at Hadoop. How can I configure a job with two map tasks
    that use the same mapper class but different datasets? For example, I
    want to sort the numbers from 1 to 100, using one task to handle 1 to 50
    and the other to handle 51 to 100. I want to control which dataset is
    sent to each mapper. How can I do that? Can anyone help me?
  • Huang Qian at Oct 6, 2009 at 1:46 am
    The real problem is that I want to use different mappers to handle
    different HBase data. For example, the data is stored in different
    HTables, so I need different mappers that each connect to a different
    HTable and read its data. How can I do that?

  • Amogh Vasekar at Oct 6, 2009 at 5:24 am
    Hi Huang,

    I haven't worked with HBase, but in general: if you want to control
    which chunk of data goes as a whole to a single mapper, the easiest way
    is to compress that chunk into a single file, making as many such files
    as needed. If you need to know which file is currently being processed,
    you can read map.input.file from the configuration (does it correspond
    to the HBase table??) and do file-specific operations as needed.
    Hope this helps.

    Amogh

    -----Original Message-----
    From: Huang Qian
    Sent: Tuesday, October 06, 2009 7:15 AM
    To: common-user@hadoop.apache.org
    Subject: Re: How can I assign the same mapper class with different data?

    The real problem is that I want to use different mappers to handle
    different HBase data. For example, the data is stored in different
    HTables, so I need different mappers that each connect to a different
    HTable and read its data. How can I do that?

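
Amogh's suggestion (one compressed file per chunk, then look up map.input.file inside the mapper) can be sketched as follows. This is only an illustration under several assumptions, not what anyone in the thread actually ran: it uses Hadoop Streaming with a Python mapper rather than the Java JobConf API, the file names (nums-1-50.txt.gz and so on) and all function names are made up for the example, and it says nothing about HBase tables. It relies on two Hadoop behaviors: gzip files are not split, so each file is read as a whole by one map task, and Streaming exports job configuration values as environment variables with dots replaced by underscores (map.input.file becomes map_input_file; newer releases name the property mapreduce.map.input.file).

```python
import gzip
import os
import sys


def write_split_files(directory, ranges):
    """Write one gzip file per (low, high) range, one number per line.

    Because gzip files are not splittable, each file produced here is
    consumed in full by a single map task.
    """
    paths = []
    for low, high in ranges:
        # Hypothetical naming scheme, chosen only for this example.
        path = os.path.join(directory, f"nums-{low}-{high}.txt.gz")
        with gzip.open(path, "wt") as handle:
            for n in range(low, high + 1):
                handle.write(f"{n}\n")
        paths.append(path)
    return paths


def current_input_file(environ=None):
    """Return the path of the file this map task is reading, or ""."""
    if environ is None:
        environ = os.environ
    # Older releases expose map.input.file, newer ones
    # mapreduce.map.input.file; Streaming turns the dots into underscores.
    for name in ("map_input_file", "mapreduce_map_input_file"):
        if name in environ:
            return environ[name]
    return ""


def map_lines(lines, input_file):
    """Yield (number, label) pairs, where label is the input file's name.

    The label is where file-specific behavior would hook in: a real mapper
    could branch on it instead of just emitting it.
    """
    label = os.path.basename(input_file) or "unknown"
    for line in lines:
        line = line.strip()
        if line:
            yield int(line), label


def main():
    """Streaming entry point: read records from stdin, emit key<TAB>value."""
    for number, label in map_lines(sys.stdin, current_input_file()):
        print(f"{number}\t{label}")
```

To try it, call write_split_files(some_dir, [(1, 50), (51, 100)]) to produce the two input files, upload them as the job input, and invoke main() from the mapper script passed to Hadoop Streaming. Amandeep's alternative (let the framework split the input and branch on each record's value) would need only a different test inside map_lines.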

Discussion Overview
group: common-user @
categories: hadoop
posted: Oct 5, '09 at 10:31p
active: Oct 6, '09 at 5:24a
posts: 5
users: 3
website: hadoop.apache.org...
irc: #hadoop
