I've been up and down the docs, and I see people using gzipped files. But
when I try to load them, I get garbage. Basically, it loads the file as raw data
from the local file system.

test = LOAD 'file:///home/hadoop/testme1.gz' using PigStorage('\u0002');

dump test;

Even when I load it from S3, still no dice.

I'm using Pig on Amazon. Is it too old for this functionality?

Apache Pig version 0.3.1-amzn (r2485701)
compiled Aug 10 2009, 11:52:03

Hadoop 0.18
Subversion -r
Compiled by root on Sat Jan 16 02:29:24 UTC 2010

  • Jeff Zhang at Feb 27, 2010 at 6:40 am
    Pig 0.3 does not support loading gz files in local mode.
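
To illustrate what "garbage" means here: when a loader does not decompress a .gz file, it splits the compressed bytes on the delimiter instead of the original text. The Python sketch below is only an illustration of that effect, not Pig's actual code path. The sample records and the helper name `split_fields` are made up; only the `\u0002` delimiter comes from the script in the question.

```python
import gzip

DELIM = b"\x02"  # same delimiter as PigStorage('\u0002') in the question


def split_fields(data):
    """Split raw bytes into lines, then each line into fields on DELIM.

    Hypothetical helper, used only for this illustration.
    """
    return [line.split(DELIM) for line in data.splitlines()]


# Made-up sample records with two fields each.
plain = b"alice\x021\nbob\x022\n"
compressed = gzip.compress(plain)

# What a gzip-aware loader would see: the original records.
good = split_fields(gzip.decompress(compressed))

# What a loader sees when it treats the .gz file as raw data:
# the stream starts with the gzip magic number, not with "alice".
raw_prefix = compressed[:2]

print(good)
print(raw_prefix)
```

Splitting `compressed` directly with `split_fields` yields fields that bear no relation to the original records, which matches the symptom described in the question.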


    --
    Best Regards

    Jeff Zhang
  • Cory Radcliff at Feb 27, 2010 at 6:43 pm
    Oh, OK. I was thinking that gzip didn't work on local files, but it's local
    mode that doesn't support it.

    Thanks. Can I run newer versions of Pig on older versions of Hadoop?
  • Dmitriy Ryaboy at Feb 27, 2010 at 6:56 pm
    You can run Pig 0.4, but it will have the same issue.
    Pig 0.5 and later are only compatible with Hadoop 0.20+ (in fact 0.5 is
    essentially just 0.4 with the 0.20 compatibility patches).

    You can run in Hadoop mode, and have Hadoop running in pseudo-distributed
    mode locally. That should work.
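
If setting up pseudo-distributed Hadoop is not practical, another possible workaround (an assumption, not something confirmed in this thread) is to decompress the file before loading it in local mode. A minimal Python sketch follows; the function name `gunzip_to` and the output path in the usage comment are hypothetical.

```python
import gzip
import shutil


def gunzip_to(src_path, dst_path):
    """Decompress the .gz file at src_path into a plain file at dst_path."""
    with gzip.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # Stream in chunks so large files are not held in memory at once.
        shutil.copyfileobj(src, dst)


# Example usage (paths are illustrative):
# gunzip_to("/home/hadoop/testme1.gz", "/home/hadoop/testme1")
```

The decompressed file could then be loaded with the same `PigStorage('\u0002')` call, pointing LOAD at the plain file instead of the .gz one.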

    -D

Discussion Overview
group: user @
categories: pig, hadoop
posted: Feb 27, '10 at 4:17a
active: Feb 27, '10 at 6:56p
posts: 4
users: 3
website: pig.apache.org