Grokbase Groups Pig user March 2010
Hi,

Does anyone have experience running a MultiStorage-like UDF on Elastic
MapReduce? Basically, we are trying to store output into multiple
directories based on certain field values. We have had some success
writing a UDF that extends MultiStorage in piggybank to write to HDFS,
but we couldn't get the same UDF to write to S3. We also couldn't find
MultiStorage in Amazon's version of piggybank. Any suggestions on how
we can achieve that on Elastic MapReduce when writing to S3? Thanks in
advance!

Thanks,
Jialong


  • Jennie Cochran-Chinn at Mar 4, 2010 at 7:24 pm
    Amazon's extension allows one to read from and write to both S3 and
    HDFS, whereas the last time I checked, the non-Amazon version only
    allows one or the other, but not both. The MultiStorage in the regular
    piggybank is not written to support multiple file systems, which would
    be my guess as to why it's not in Amazon's version of piggybank. You
    could perhaps try to extend MultiStorage to write to multiple file
    systems.

    Jennie
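
    To make the suggestion above a bit more concrete, here is a minimal
    sketch in plain Java of the core idea: derive each output directory
    from a field value, and take the file system from the scheme of the
    output location instead of assuming a single default one. This is not
    the piggybank MultiStorage code and not Amazon's extension; the class
    name, method names, bucket name, and host are all hypothetical, and no
    Hadoop classes are used so that the example runs on its own.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from piggybank or Amazon's build.
final class PartitionedOutputSketch {

    // Scheme of the output location, e.g. "s3" or "hdfs". A bare path has no
    // scheme, which here stands for "use the cluster's default file system".
    static String schemeOf(String baseLocation) {
        String scheme = URI.create(baseLocation).getScheme();
        return scheme == null ? "default" : scheme;
    }

    // Directory for one key value: <base>/<key>, avoiding a doubled slash.
    static String directoryFor(String baseLocation, String key) {
        String base = baseLocation.endsWith("/")
                ? baseLocation.substring(0, baseLocation.length() - 1)
                : baseLocation;
        return base + "/" + key;
    }

    // Groups tab-separated records by the field at keyIndex.
    // The map key is the target directory; insertion order is preserved.
    static Map<String, List<String>> partition(
            String baseLocation, List<String> records, int keyIndex) {
        Map<String, List<String>> byDirectory = new LinkedHashMap<>();
        for (String record : records) {
            String[] fields = record.split("\t", -1);
            String dir = directoryFor(baseLocation, fields[keyIndex]);
            byDirectory.computeIfAbsent(dir, d -> new ArrayList<>()).add(record);
        }
        return byDirectory;
    }

    public static void main(String[] args) {
        List<String> records = List.of("us\talice", "de\tbob", "us\tcarol");
        // Both locations are made-up examples.
        for (String base : List.of("s3://my-bucket/out", "hdfs://namenode:8020/out")) {
            System.out.println("file system scheme: " + schemeOf(base));
            partition(base, records, 0)
                    .forEach((dir, rows) -> System.out.println(dir + " -> " + rows));
        }
    }
}
```

    In a real Hadoop-based UDF, the analogous step would probably be to
    obtain the file system from the output path itself (Hadoop's
    Path.getFileSystem(conf) does this) rather than from the default
    configuration. Whether that is what prevents the UDF in this thread
    from writing to S3 is an assumption; the thread does not confirm it.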


Discussion Overview
group: user @
categories: pig, hadoop
posted: Mar 4, '10 at 7:16p
active: Mar 4, '10 at 7:24p
posts: 2
users: 2
website: pig.apache.org
