I am struggling with an issue in Hadoop streaming involving the "-file" option.

First I tried the very basic example in streaming:

hadoop@ubuntu:/usr/local/hadoop$ bin/hadoop jar contrib/streaming/hadoop-streaming-0.20.203.0.jar \
    -mapper org.apache.hadoop.mapred.lib.IdentityMapper \
    -reducer /bin/wc \
    -inputformat KeyValueTextInputFormat \
    -input gutenberg/* \
    -output gutenberg-outputtstchk22

which worked absolutely fine.

Next, I copied the IdentityMapper.java source code, compiled it, placed the
resulting class file in the /home/hadoop folder, and ran the following in the
terminal:

hadoop@ubuntu:/usr/local/hadoop$ bin/hadoop jar contrib/streaming/hadoop-streaming-0.20.203.0.jar \
    -file ~/IdentityMapper.class \
    -mapper IdentityMapper.class \
    -reducer /bin/wc \
    -inputformat KeyValueTextInputFormat \
    -input gutenberg/* \
    -output gutenberg-outputtstch6

The execution failed with the following error in the stderr file:

java.io.IOException: Cannot run program "IdentityMapper.class":
java.io.IOException: error=2, No such file or directory

I then tried copying the IdentityMapper.class file into the Hadoop
installation directory and ran the following:

hadoop@ubuntu:/usr/local/hadoop$ bin/hadoop jar contrib/streaming/hadoop-streaming-0.20.203.0.jar \
    -file IdentityMapper.class \
    -mapper IdentityMapper.class \
    -reducer /bin/wc \
    -inputformat KeyValueTextInputFormat \
    -input gutenberg/* \
    -output gutenberg-outputtstch5

Unfortunately, I got the same error again.

It would be great if you could help me with this, as I cannot move forward
until it is resolved.


***I am trying this after hadoop-streaming failed with a different class file,
to determine whether the problem lies in the class file itself or in the way
I am using it.


Thanking you in anticipation


  • Robert Evans at Jul 22, 2011 at 4:06 pm
    From a practical standpoint, if you just leave off the -mapper option, streaming will run an IdentityMapper for you. I don't believe that -mapper will understand something.class as a class file that should be loaded and used as the mapper. I think you need to specify the class name, including the package, to get it to load, as you did with org.apache.hadoop.mapred.lib.IdentityMapper. I am not sure what changes you made to IdentityMapper.java before recompiling, but to get it on the classpath you probably need to ship it as a jar, not as a single file. I believe that you can use -libjars to ship it and add it to the classpath of the JVM, but I am not positive of that.
    --Bobby Evans
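
The suggestion above (ship the class in a jar and name it by its fully qualified class name) might look roughly like the sketch below. It is not a confirmed fix. It assumes the recompiled class was declared in a hypothetical package `mypkg`. The jar name `mymapper.jar` and the output directory `gutenberg-output-libjars` are also made up for illustration. `-libjars` is a generic Hadoop option, so it is placed before the streaming-specific options. The script only prints the two commands so they can be reviewed before being run by hand.

```shell
#!/bin/sh
# Sketch only: package a custom mapper class into a jar and hand it to streaming.
# Assumptions (not from the thread): package "mypkg", jar "mymapper.jar",
# and output directory "gutenberg-output-libjars" are hypothetical names.

STREAMING_JAR=contrib/streaming/hadoop-streaming-0.20.203.0.jar
MAPPER_CLASS=mypkg.IdentityMapper   # hypothetical fully qualified class name
MAPPER_JAR=mymapper.jar             # hypothetical jar holding mypkg/IdentityMapper.class

# Print the commands (dry run) instead of executing them.
streaming_commands() {
  # Step 1: bundle the compiled class, keeping its package directory layout.
  echo "jar cf $MAPPER_JAR mypkg/IdentityMapper.class"
  # Step 2: run streaming, with the generic -libjars option ahead of the
  # streaming options and the mapper given as a class name, not a file name.
  echo "bin/hadoop jar $STREAMING_JAR -libjars $MAPPER_JAR -mapper $MAPPER_CLASS -reducer /bin/wc -inputformat KeyValueTextInputFormat -input 'gutenberg/*' -output gutenberg-output-libjars"
}

streaming_commands
```

The "Cannot run program" error in the original post suggests that streaming treated `IdentityMapper.class` as an executable command to launch, which is why the sketch passes a class name rather than a file name.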

    On 7/22/11 10:18 AM, "Shrish" wrote: (original message, quoted in full above)

Discussion Overview
group: common-user @
categories: hadoop
posted: Jul 22, '11 at 3:25p
active: Jul 22, '11 at 4:06p
posts: 2
users: 2
website: hadoop.apache.org...
irc: #hadoop
