I'm having a problem loading data from multiple paths in Pig. What I'm
trying to do is load data from a range of dates, so I would like to
specify an input of two globbed paths:

x = LOAD '2008/05/{26,27,28,29,30,31},2008/06/{1,2}'

Pig doesn't seem to like this, though, as it tries to interpret it as
a single path. The best I can do is to use UNION:

x1 = LOAD '2008/05/{26,27,28,29,30,31}';
x2 = LOAD '2008/06/{1,2}';
x = UNION x1, x2;

The downside to this is that I want to parameterize my paths, and
having a separate script for each number of paths in the input is
cumbersome.

Is there a better way of doing this? Are there any plans to support
multiple paths, and/or PathFilters?

Thanks,

Tom


  • Olga Natkovich at Jun 4, 2008 at 9:21 pm
    Tom,

    You are correct that currently we only allow a single glob in the load
    statement. It would not be hard to extend it to multiple globs. I have
    created a JIRA for it: https://issues.apache.org/jira/browse/PIG-252;
    maybe somebody will be interested to contribute a patch.

    Olga
    -----Original Message-----
    From: Tom White
    Sent: Wednesday, June 04, 2008 7:56 AM
    To: pig-user@incubator.apache.org
    Subject: Specifying multiple input paths
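Until multiple globs are supported in a single LOAD, one way to avoid keeping a separate script for each number of input paths is to generate the UNION workaround from a list of paths. The following is only a minimal sketch of that idea, not something proposed in the thread: the function name build_union_script and the alias naming scheme (x1, x2, ...) are hypothetical, and it assumes the generated text is then handed to Pig as an ordinary script.

```python
def build_union_script(paths, alias="x"):
    """Return Pig Latin that LOADs each path and UNIONs the results into `alias`.

    Hypothetical helper: the name and the alias scheme are illustrative only.
    """
    if not paths:
        raise ValueError("at least one input path is required")

    # A single path needs no UNION; load it directly into the target alias.
    if len(paths) == 1:
        return f"{alias} = LOAD '{paths[0]}';\n"

    lines = []
    parts = []
    for i, path in enumerate(paths, start=1):
        part = f"{alias}{i}"  # x1, x2, ...
        lines.append(f"{part} = LOAD '{path}';")
        parts.append(part)

    # Combine all intermediate relations into the final alias.
    lines.append(f"{alias} = UNION {', '.join(parts)};")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # The two globbed paths from the original question.
    print(build_union_script(["2008/05/{26,27,28,29,30,31}", "2008/06/{1,2}"]), end="")
```

Run with the two paths from the question, this emits the same three statements as the manual workaround; adding a third date range only means adding one more entry to the list.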

Discussion Overview
group: user @
categories: pig, hadoop
posted: Jun 4, '08 at 2:56p
active: Jun 4, '08 at 9:21p
posts: 2
users: 2
website: pig.apache.org