I think this should be a common use case. I'm trying to invoke an
executable from my mapper. Because my executable takes streaming input, I
used something like the following:

String lCmdStr = "hadoop dfs -cat file | myExec";

Process lChldProc = Runtime.getRuntime().exec(lCmdStr);

All executable and file names are fully qualified names with paths. I got
the following errors:

cat: File does not exist: |
cat: File does not exist: myExec

It looks like this kind of streaming/pipe method is not supported from a
mapper? I can take the exact command string and run it directly on a
Hadoop server and it works. Does anybody have any experience with it?

I also tried to use Hadoop streaming and it did not work either. It did
not give any error, but nothing happened either. My program is supposed
to write a file on the local system and it's not there. I'm at my wits'
end. Any help is most appreciated. Hadoop version 0.20.2.

String lCmdStr = "hadoop jar hadoop-0.20.2-streaming.jar -input
inputFile -output outputFile -mapper myExec";

Thanks very much,
Grace

  • Robert Evans at Aug 25, 2011 at 4:33 pm
    I think this is a Java issue. I don't think it is launching a shell to run your command. I think it is just splitting on whitespace and then passing all the args to hadoop. What you want to do is to run

    sh -c 'hadoop dfs -cat file | myExec'
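
As a minimal sketch of the point above (not the poster's actual mapper code): `Runtime.exec(String)` tokenizes its argument on whitespace with a `StringTokenizer`, so `|` and `myExec` reach `hadoop dfs -cat` as extra file names. Handing the whole line to `sh -c` as a single argument lets the shell build the pipe. The class name `ShellPipeExample`, its helper methods, and the `echo | tr` stand-in pipeline are illustrative assumptions, and `/bin/sh` is assumed to exist on the node.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper class, for illustration only.
class ShellPipeExample {

    // Mimics how Runtime.exec(String) splits a command line:
    // plain whitespace tokenization, no shell handling of '|' or quotes.
    static List<String> tokenize(String command) {
        List<String> args = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(command);
        while (st.hasMoreTokens()) {
            args.add(st.nextToken());
        }
        return args;
    }

    // Runs the whole command line through /bin/sh so the shell sets up the pipe.
    // Returns the combined stdout/stderr of the pipeline.
    static String runInShell(String command) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder("/bin/sh", "-c", command);
        pb.redirectErrorStream(true);   // merge stderr into stdout
        Process p = pb.start();
        p.getOutputStream().close();    // we send nothing to the child's stdin

        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (InputStream in = p.getInputStream()) {
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
        }
        String output = new String(buf.toByteArray(), StandardCharsets.UTF_8);
        int exit = p.waitFor();
        if (exit != 0) {
            throw new IOException("command exited with " + exit + ": " + output);
        }
        return output;
    }

    public static void main(String[] args) throws Exception {
        // What the single-string exec effectively passes along:
        System.out.println(tokenize("hadoop dfs -cat file | myExec"));
        // prints [hadoop, dfs, -cat, file, |, myExec]

        // Harmless stand-in for "hadoop dfs -cat file | myExec".
        System.out.print(runInShell("echo hello | tr a-z A-Z")); // prints HELLO
    }
}
```

In a real mapper the argument to `runInShell` would be the original pipeline string. Note that calling `Runtime.getRuntime().exec("sh -c '...'")` with one string would still be split on whitespace, which is why the sketch passes the command as a separate list element.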

    Or, with streaming, write a small shell script that has the command in it and then tell the mapper/reducer to use that. The key with streaming is that you have to make sure that you read stdin before going on, or it will error out:

    #!/bin/sh
    hadoop fs -cat file | myExec;
    #Read all of the input to this mapper.
    cat > /dev/null;
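
If the streaming mapper command were a Java program rather than a shell script, the `cat > /dev/null` step could be sketched roughly as below: read standard input to the end and discard it before exiting. The class name `DrainStdin` and the `drain` helper are hypothetical, and this only illustrates the "consume stdin" idea from the post; it is not a full streaming mapper.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical example class, for illustration only.
class DrainStdin {

    // Reads and discards everything from the stream until end of input.
    // Returns the number of bytes consumed.
    static long drain(InputStream in) throws IOException {
        byte[] chunk = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(chunk)) != -1) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // The real work (for example, launching the external pipeline)
        // would go here, before the input is drained.
        long consumed = drain(System.in);
        // Report on stderr so nothing extra is emitted as mapper output.
        System.err.println("discarded " + consumed + " bytes of mapper input");
    }
}
```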


    --Bobby Evans

  • Zhixuan Zhu at Aug 25, 2011 at 7:14 pm
    A shell script actually worked great! Thanks so much!

    Grace

    -----Original Message-----
    From: Robert Evans
    Sent: Thursday, August 25, 2011 11:32 AM
    To: common-dev@hadoop.apache.org
    Subject: Re: Question about invoking an executable from Hadoop mapper


Discussion Overview
group: common-dev @
categories: hadoop
posted: Aug 25, '11 at 3:32p
active: Aug 25, '11 at 7:14p
posts: 3
users: 2
website: hadoop.apache.org...
irc: #hadoop
