How to access contents of a Map Reduce job's working directory
I have just started exploring Hadoop, but I am now stuck.

I want to run a MapReduce job in Hadoop which needs to create a "setup"
folder in the working directory. During execution the job will generate
some additional text files within this "setup" folder. The problem is I
don't know how to access or move this setup folder's contents to my local
file system, as at the end of the job the job directory will be cleaned up.

It would be great if you can help.

Regards
Shrish


  • Smriti singh at Aug 2, 2011 at 12:53 am
  • Harsh J at Aug 2, 2011 at 6:04 am
    Smriti,

    By working directory, do you mean the task attempt's working directory
    or the global job staging directory?


    --
    Harsh J
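Harsh's distinction matters because, in Java, a relative path such as `new File("setup")` resolves against the JVM's current working directory, and for a map task that is the task attempt's local scratch directory, which the framework deletes when the attempt ends. A minimal plain-Java illustration (no Hadoop required):

```java
import java.io.File;

public class WorkingDirDemo {
    public static void main(String[] args) {
        // A relative File resolves against the JVM's current working
        // directory (the "user.dir" system property). Inside a running
        // map task, user.dir is the task attempt's scratch directory on
        // the node's local disk, so anything created there disappears
        // when the attempt is cleaned up.
        File setup = new File("setup");
        String expected = System.getProperty("user.dir") + File.separator + "setup";
        System.out.println(setup.getAbsolutePath().equals(expected)); // prints "true"
    }
}
```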
  • Smriti singh at Aug 2, 2011 at 10:19 am
    Hi Harsh, let me explain this in detail:

This is what I am trying to do in my mapper:

    File setupFolder = new File(setupFolderName);
    setupFolder.mkdirs();

    MARD mard = new MARD(setupFolder);
    Text valuz = new Text();
    IntWritable intval = new IntWritable();
    File original = new File("Vca1652.txt");
    File mardedxml = new File("Vca1652-mardedxml.txt");
    File marded = new File("Vca1652-marded.txt");

    mardedxml.createNewFile();
    marded.createNewFile();
    NormalisationStats stats;
    try {
        stats = mard.normaliseFile(original, mardedxml, marded, 50.0);
        // This method requires access to the myMardfolder
        System.out.println(stats);
    } catch (MARDException e) {
        e.printStackTrace();
    }


    Now,
    1. This mard.normalise() creates files in the "setup" folder.
    2. I have no control over this method, as I just got a jar (mard.jar) to
    call it.
    3. mard.normalise() searches for a folder called "foul" in the working
    directory and throws an exception if it is not found. It is this folder's
    data that mard.normalise() processes to generate the files in the "setup"
    folder. I passed this folder to the working directory through the
    -archives option (by first compressing it).
    4. I am not using the "input path" data in any way in the mapper.
    5. Hence I am not using the keys and values generated.
    6. I am using an identity reducer, as there is no need for any reduction.
    7. Hence the output is also of no use to me.
    8. I need to get the contents of the "setup" folder, but I don't know how
    to do so.


    * I might be wrong in the way I am doing this, because I have had no
    formal Hadoop training; I have just learned it by reading articles on the
    net.

    Thanking you in anticipation

    Regards

    Smriti
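One common way to keep such per-task files (a sketch, not something from this thread; the HDFS destination path is illustrative) is to copy the task-local "setup" folder into HDFS before the task attempt's directory is cleaned up, e.g. from the mapper's cleanup() method:

```java
// Sketch only, assuming the new org.apache.hadoop.mapreduce API;
// the destination path "/user/smriti/setup-out" is illustrative.
@Override
protected void cleanup(Context context)
        throws IOException, InterruptedException {
    FileSystem fs = FileSystem.get(context.getConfiguration());
    // "setup" resolves against the task attempt's local working directory.
    Path localSetup = new Path("setup");
    // Include the task attempt id so concurrent tasks don't collide.
    Path dest = new Path("/user/smriti/setup-out/" + context.getTaskAttemptID());
    // delSrc = false, overwrite = true
    fs.copyFromLocalFile(false, true, localSetup, dest);
}
```

After the job finishes, the folder can be pulled to the local file system with `hadoop fs -get /user/smriti/setup-out <localdir>`.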

  • Subroto Sanyal at Aug 2, 2011 at 10:38 am
    Hi Smriti,



    I would suggest you use a custom OutputCommitter, an extension of
    FileOutputCommitter, which will help you achieve your desired
    functionality.



    Regards,

    Subroto Sanyal
    Computers make very fast and accurate mistakes.......
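Subroto's suggestion might be sketched as follows (the class name and destination path are illustrative, assuming the new-API FileOutputCommitter). Since the committer is supplied by the OutputFormat, you would also subclass your output format so that its getOutputCommitter() returns this class:

```java
// Sketch only: assumes a Hadoop job using the new mapreduce API; the
// "setup" folder name and HDFS destination path are illustrative.
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter;

public class SetupFolderCommitter extends FileOutputCommitter {

    public SetupFolderCommitter(Path outputPath, TaskAttemptContext context)
            throws IOException {
        super(outputPath, context);
    }

    @Override
    public void commitTask(TaskAttemptContext context) throws IOException {
        // First let the normal committer promote the task's regular output.
        super.commitTask(context);
        // Then save the task-local "setup" folder to HDFS before the
        // framework deletes the task attempt's working directory.
        FileSystem fs = FileSystem.get(context.getConfiguration());
        Path src = new Path("setup");                  // task-local folder
        Path dst = new Path("/user/smriti/setup-out"); // illustrative HDFS path
        fs.copyFromLocalFile(false, true, src, dst);
    }
}
```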



  • Tom uno at Aug 2, 2011 at 10:39 am
    2011/8/2 Subroto Sanyal <subrotosanyal@huawei.com>


Discussion Overview
group: mapreduce-user @ hadoop.apache.org
categories: hadoop
posted: Aug 2, '11 at 12:50a
active: Aug 2, '11 at 10:39a
posts: 6
users: 5
website: hadoop.apache.org
irc: #hadoop

