Hi,
I want to write a file to HDFS using Hadoop Pipes. Can anyone tell me how
to do that?
I'm using an external library that writes its output to disk, so I probably
have to read that output and write it to the distributed filesystem?
I found FSDataOutputStream, but it's a Java class.
Can anyone help?
Moreover, can anyone tell me where I can find good documentation about
Hadoop Pipes? Nearly everything I find is Java-specific. I looked at the
Hadoop Pipes source and it looked very restricted; can I do everything in
Hadoop Pipes that's possible in Java?

Thanks for your help,
horson
--
View this message in context: http://old.nabble.com/writing-files-to-HDFS-%28from-c%2B%2B-pipes%29-tp26681351p26681351.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.


  • Prakhar Sharma at Dec 7, 2009 at 6:44 pm
    Hi Horson,
    Unfortunately, there is no documentation available for the Pipes
    API. It's not just that; the API itself is quite weak and unstable.
    Only a few of the examples given in the Pipes distro work, and it
    appears there are very few people who use Hadoop Map/Reduce through
    the Pipes API. I have myself been hacking my way around the Pipes
    API for quite some time now, and I guess that's the only way
    around it.

    Thanks,
    Prakhar
    On Mon, Dec 7, 2009 at 1:38 PM, horson wrote:

    > Hi,
    > I want to write a file to HDFS using Hadoop Pipes. Can anyone tell
    > me how to do that?
    > I'm using an external library that writes its output to disk, so I
    > probably have to read that output and write it to the distributed
    > filesystem?
    > I found only FSDataOutputStream, a Java class.
    > Can anyone help?
    > Moreover, can anyone tell me where I can find good documentation
    > about Hadoop Pipes? Nearly everything I find is Java-specific or
    > general information about MapReduce. I looked at the Hadoop Pipes
    > source and it looked very restricted; can I do everything in Hadoop
    > Pipes that's possible in Java?
    >
    > Thanks for your help,
    > horson
  • Owen O'Malley at Dec 7, 2009 at 11:26 pm

    On Dec 7, 2009, at 10:44 AM, Prakhar Sharma wrote:

    > Unfortunately, there is no documentation available for the Pipes
    > API. It's not just that; the API itself is quite weak and unstable.

    *sigh* I agree that there should be more documentation. I'd love it if
    someone could write some up and submit it.

    > Only a few of the examples given in the Pipes distro work.

    Hmm... If they are broken, please file a JIRA on them. They all used to
    work fine.

    > And it appears there are very few people who use Hadoop Map/Reduce
    > through the Pipes API.

    Mostly it is used when there is a large base of C++ code to run under
    map/reduce. The largest Hadoop MapReduce application in the world is a
    Pipes-based job (Yahoo's WebMap).

    -- Owen
  • Owen O'Malley at Dec 7, 2009 at 11:30 pm

    On Dec 7, 2009, at 10:05 AM, horson wrote:

    > I want to write a file to HDFS using Hadoop Pipes. Can anyone tell
    > me how to do that?

    You either use a Java OutputFormat, which is the easiest, or you use
    libhdfs to write to HDFS from C++.

    > I looked at the Hadoop Pipes source and it looked very restricted;
    > can I do everything in Hadoop Pipes that's possible in Java?

    No, not everything is supported. It does support record readers,
    mappers, combiners, reducers, record writers, and counters from C++.
    It also provides the entire job configuration as a string->string map.

    -- Owen
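The libhdfs route Owen mentions amounts to reading the external library's local output file and streaming it into HDFS through Hadoop's C API. The sketch below is minimal and unverified; it assumes a working libhdfs build (hdfs.h, shipped with Hadoop) and a configured default filesystem, and both file paths are illustrative, not taken from the thread.

```c
#include <stdio.h>
#include <fcntl.h>   /* O_WRONLY */
#include "hdfs.h"    /* libhdfs C API, shipped with Hadoop */

int main(void) {
    /* Connect to the default filesystem from the Hadoop configuration. */
    hdfsFS fs = hdfsConnect("default", 0);
    if (!fs) { fprintf(stderr, "cannot connect to HDFS\n"); return 1; }

    /* Open an HDFS file for writing (path is illustrative). */
    hdfsFile out = hdfsOpenFile(fs, "/user/horson/output.dat",
                                O_WRONLY, 0, 0, 0);
    if (!out) { fprintf(stderr, "cannot open HDFS file\n"); return 1; }

    /* Copy the external library's local output into HDFS. */
    FILE* in = fopen("/tmp/local-output.dat", "rb");  /* illustrative */
    char buf[65536];
    size_t n;
    while (in && (n = fread(buf, 1, sizeof(buf), in)) > 0) {
        hdfsWrite(fs, out, buf, (tSize)n);
    }
    if (in) fclose(in);

    hdfsCloseFile(fs, out);
    hdfsDisconnect(fs);
    return 0;
}
```

This needs to be compiled against libhdfs and run with a reachable cluster (libhdfs itself talks to HDFS through an embedded JVM), so it cannot be tested standalone.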
  • Prakhar Sharma at Dec 9, 2009 at 1:54 am
    Hi Owen,

    > It also provides the entire job configuration as a string->string map.

    Can you provide an example of how to do this? I am trying to write a
    DNA sequence assembler using Hadoop MapReduce to improve the
    throughput of the assembler. I have to call runTask() repeatedly with
    different settings for different invocations and am not clear how to
    do so.
    (The reason for my comments was that I had a hard time making the
    Pipes API and libhdfs work, and I am still not clear about some use
    cases.)

    Thanks,
    Prakhar
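The string->string configuration map Owen refers to is reached from the Pipes C++ side via the task context's JobConf. The sketch below shows the shape of that access; the property name "assembler.kmer.size" is invented for illustration (any key passed to the job, e.g. via -D key=value, would appear the same way), and the mapper/reducer bodies are trivial placeholders.

```cpp
#include <string>
#include "hadoop/Pipes.hh"
#include "hadoop/TemplateFactory.hh"

// A mapper that reads a setting out of the job configuration.
class ConfiguredMapper : public HadoopPipes::Mapper {
public:
  ConfiguredMapper(HadoopPipes::TaskContext& context) {
    // getJobConf() exposes the whole job configuration as string->string.
    const HadoopPipes::JobConf* conf = context.getJobConf();
    // "assembler.kmer.size" is an invented key for illustration.
    kmerSize_ = conf->hasKey("assembler.kmer.size")
                    ? conf->getInt("assembler.kmer.size")
                    : 21;  // fallback default
  }
  void map(HadoopPipes::MapContext& context) {
    // ... use kmerSize_ while processing context.getInputValue() ...
    context.emit(context.getInputKey(), context.getInputValue());
  }
private:
  int kmerSize_;
};

// A pass-through reducer, just to complete the task.
class PassReducer : public HadoopPipes::Reducer {
public:
  PassReducer(HadoopPipes::TaskContext&) {}
  void reduce(HadoopPipes::ReduceContext& context) {
    while (context.nextValue()) {
      context.emit(context.getInputKey(), context.getInputValue());
    }
  }
};

int main() {
  // runTask() drives the map/reduce protocol for this child process.
  return HadoopPipes::runTask(
      HadoopPipes::TemplateFactory<ConfiguredMapper, PassReducer>());
}
```

The binary is launched per task by the Pipes framework (hadoop pipes -program ...), so "calling runTask() with different settings" comes down to submitting separate jobs whose configurations carry different values; it requires the Hadoop Pipes headers and a cluster to run.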

Discussion Overview
group: common-user @ hadoop
posted: Dec 7, '09 at 6:06p
active: Dec 9, '09 at 1:54a
posts: 5
users: 3
website: hadoop.apache.org...
irc: #hadoop

People

Translate

site design / logo © 2022 Grokbase