Hello @all!

I am using Hadoop (version 0.14.3) and I am trying to run Hadoop Streaming
with programs written in C.

First, I compiled and linked my C files, then specified the resulting
executables as mapper and reducer for hadoop-streaming.

The job seems to run, as no error message appears in the web interface.
The mapper receives its input data correctly, but produces no output.
Consequently, the resulting output file is empty as well.

Am I making a mistake with stdin or stdout?

Could anyone give me a small example of a simple mapper and a simple reducer
in C (not C++)?

Thanks a lot in advance.

Christian

------------------------------------------------------
University of Economics and Business Administration
Research Institute for Computational Methods
UZA 2
Augasse 2-6
A-1090 Wien

Tel: +43-1-31336-5251
E-Mail: kremnitzer@ai.wu-wien.ac.at
Web: http://www.wu-wien.ac.at/firm
------------------------------------------------------


  • Lohit Vijayarenu at Oct 30, 2007 at 4:10 pm
    Hello Christian,

    For Hadoop Streaming it does not matter whether the executable is written in C, C++, or any other language.
    You just have to make sure that it reads from STDIN and writes to STDOUT.
    Take, for example, the UNIX cat command, which reads and writes exactly as needed:

    [hadoop-trunk]$ cat < README.txt
    For the latest information about Hadoop, please visit our website at:

    http://lucene.apache.org/hadoop/

    and our wiki, at:

    http://wiki.apache.org/lucene-hadoop/


    You could use it as your mapper to cat your input files like this:

    hadoop jar hadoop-streaming.jar -mapper "cat" -reducer NONE -input <your input> -output <your output>

    Alternatively, you might want to take a look at the logs in $HADOOP_LOG_DIR to see if anything went wrong.

    Thanks,
    Lohit

    ----- Original Message ----
    From: Christian Kremnitzer <kremnitzer@ai.wu-wien.ac.at>
    To: hadoop-user@lucene.apache.org
    Sent: Tuesday, October 30, 2007 12:54:36 AM
    Subject: Hadoop-Streaming with C


  • Christian Kremnitzer at Oct 31, 2007 at 6:05 am
    Thanks, Lohit, for your help!

    I followed your instructions exactly. Your example, as well as the
    wordcount example in Python, works fine.

    With my C code, I always get an empty result file. When I inspect
    the job via the web interface, it seems that my mapper does not
    pass on any data (so the reducer receives nothing and
    cannot write anything to the result file).

    Could you provide a small example mapper and reducer in C,
    along with the exact command to run them from the shell? It seems
    that I keep making the same (small, but important) error :)

    Christian
  • Christian Kremnitzer at Oct 31, 2007 at 6:40 am
    My stderr log file shows the following:
    /tmp/hadoop-kremnitzer/mapred/local/taskTracker/jobcache/job_200710301154_0053/work/mapper.out:
    /tmp/hadoop-kremnitzer/mapred/local/taskTracker/jobcache/job_200710301154_0053/work/mapper.out:
    cannot execute binary file

    I only have a /tmp/hadoop-kremnitzer/mapred/local/*jobTracker* folder

    Christian
  • Lohit Vijayarenu at Oct 31, 2007 at 8:44 am
    Does it depend on shared libraries that are not accessible at
    runtime?
    Are you passing all dependencies using the -file option?

    Also, try running it on a single-node cluster and debugging it as described
    here: http://wiki.apache.org/lucene-hadoop/HowToDebugMapReducePrograms

    Here is a simple, quick-and-dirty C program that behaves like cat:

    /* Simple program to read stdin and write to stdout */
    #include <stdio.h>

    int main(void) {
        char buffer[256];

        /* Copy stdin to stdout in chunks of up to 255 characters. */
        while (fgets(buffer, sizeof buffer, stdin)) {
            fputs(buffer, stdout);
        }
        return 0;
    }

    hadoop jar $HADOOP_HOME/hadoop-streaming.jar -mapper "./mycat" -input cinput -output tmp_out -reducer NONE


    ----- Original Message ----
    From: Christian Kremnitzer <kremitzer@ai.wu-wien.ac.at>
    To: hadoop-user@lucene.apache.org
    Sent: Tuesday, October 30, 2007 11:34:07 PM
    Subject: Re: Hadoop-Streaming with C



Discussion Overview
group: common-user @
categories: hadoop
posted: Oct 30, '07 at 3:49p
active: Oct 31, '07 at 8:44a
posts: 5
users: 3
website: hadoop.apache.org...
irc: #hadoop
