Hello,

I am new to Hadoop MapReduce programming and need to write a MapReduce
program. I have an input folder that contains 10 documents in text
format. My aim is to write a MapReduce program that reads each text file and
produces a separate word count for each file. My input is processed line by
line: the map function is called once for each line of text. But I need the
file name inside the map function. How can I get the file name in my map function?
Similarly, I need to write the output for each file separately. Is that
possible?
My Hadoop version is 0.20.2.
Please help me.
Thanks in advance.

  • Harsh J at Apr 1, 2011 at 7:52 pm
    Hello,

    (Inline reply.)
    On Fri, Apr 1, 2011 at 8:35 PM, ranjith k wrote:
    > Hello,
    > I am new to Hadoop MapReduce programming and need to write a MapReduce
    > program. I have an input folder that contains 10 documents in text
    > format. My aim is to write a MapReduce program that reads each text file and
    > produces a separate word count for each file. My input is processed line by
    > line: the map function is called once for each line of text. But I need the
    > file name inside the map function. How can I get the file name in my map function?
    This is covered in the docs, as part of the Map/Reduce Tutorial itself.
    Have a look at the table right below the paragraph this link points to:
    http://hadoop.apache.org/common/docs/r0.20.0/mapred_tutorial.html#Task+JVM+Reuse
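
To make the pointer above concrete: the table in that part of the tutorial lists properties localized into each task's job configuration, including `map.input.file` (the file the map is reading from). With the old `org.apache.hadoop.mapred` API in 0.20.x, a mapper can read it in `configure(JobConf job)` via `job.get("map.input.file")`. The sketch below is plain Java with no Hadoop dependency, so it runs on its own; it only illustrates one way to use that value, namely prefixing every emitted word with the file name so counts stay separate per file. The class name, method names, and sample paths are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class FileTaggedWordCount {

    // Hypothetical helper: reduce a full input path (the kind of value
    // "map.input.file" is assumed to hold) to its last path component.
    static String baseName(String inputPath) {
        int slash = inputPath.lastIndexOf('/');
        return slash < 0 ? inputPath : inputPath.substring(slash + 1);
    }

    // Stand-in for the map step on one line of text:
    // emit one "fileName<TAB>word" key per whitespace-separated token.
    static List<String> mapLine(String inputPath, String line) {
        String file = baseName(inputPath);
        List<String> keys = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(line);
        while (tokens.hasMoreTokens()) {
            keys.add(file + "\t" + tokens.nextToken());
        }
        return keys;
    }

    // Stand-in for the reduce step: sum the occurrences of each composite key.
    static Map<String, Integer> countKeys(List<String> keys) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String key : keys) {
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Sample paths and lines are made up for illustration.
        List<String> emitted = new ArrayList<>();
        emitted.addAll(mapLine("hdfs://namenode/input/doc1.txt", "to be or not to be"));
        emitted.addAll(mapLine("hdfs://namenode/input/doc2.txt", "to do"));
        countKeys(emitted).forEach((key, n) -> System.out.println(key + "\t" + n));
    }
}
```

In a real job the composite key would be a `Text` and the count an `IntWritable`; because the file name is part of the key, the usual word-count reducer works unchanged.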
    > Similarly, I need to write the output for each file separately. Is that
    > possible?
    You can get some level of control over output file naming by using the
    MultipleOutputs class.
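
As a rough illustration of that suggestion: in 0.20.2, `MultipleOutputs` lives in `org.apache.hadoop.mapred.lib`; outputs are registered at job setup (`addNamedOutput` / `addMultiNamedOutput`) and written through `getCollector(...)` in the reducer. As far as I recall, the names it accepts may contain only letters and digits, so a file name such as `doc1.txt` would need to be converted first; please check that against the Javadoc for your version. The sketch below is plain Java without Hadoop and only shows the routing idea, assuming the reducer sees keys of the form "fileName<TAB>word". All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PerFileOutputRouter {

    // Hypothetical helper: keep only letters and digits so the result could
    // serve as an output name. Note that distinct file names can collide
    // after this step (e.g. "a-b.txt" and "ab.txt").
    static String toOutputName(String fileName) {
        return fileName.replaceAll("[^A-Za-z0-9]", "");
    }

    // Route "fileName<TAB>word" -> count records into one bucket per file.
    // Each bucket stands in for one separately named output.
    // Keys are assumed to be well formed, i.e. to contain a tab.
    static Map<String, List<String>> route(Map<String, Integer> counts) {
        Map<String, List<String>> outputs = new TreeMap<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            String[] parts = entry.getKey().split("\t", 2);
            String name = toOutputName(parts[0]);
            outputs.computeIfAbsent(name, k -> new ArrayList<>())
                   .add(parts[1] + "\t" + entry.getValue());
        }
        return outputs;
    }

    public static void main(String[] args) {
        // Made-up sample counts for two input files.
        Map<String, Integer> counts = new TreeMap<>();
        counts.put("doc1.txt\tbe", 2);
        counts.put("doc1.txt\tto", 2);
        counts.put("doc2.txt\tto", 1);
        route(counts).forEach((name, lines) -> System.out.println(name + " -> " + lines));
    }
}
```

In an actual reducer, the bucket lookup would be replaced by a call to the collector for the matching output name, and `MultipleOutputs.close()` would be called when the reducer is closed.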
    > My Hadoop version is 0.20.2.
    --
    Harsh J
    http://harshj.com
  • Ranjith k at Apr 4, 2011 at 4:31 pm
    Thank you.

    --
    Ranjith k
    +918129419842

Discussion Overview
group: mapreduce-user @
categories: hadoop
posted: Apr 1, '11 at 3:05p
active: Apr 4, '11 at 4:31p
posts: 3
users: 2
website: hadoop.apache.org...
irc: #hadoop
