Grokbase Groups Pig user July 2011
We have our data in folders partitioned by day:

e.g. /user/pig/logs/2011/06/30

Is there any way to select the last x days to use as input?

Thanks

  • Nathan Bijnens at Jul 1, 2011 at 8:43 am
    You could use a separate LOAD statement for each day and then combine them with UNION.

    Otherwise you could write a UDF that loads multiple files at once.

    Best regards,
    Nathan


    ---
    nathan@nathan.gs : http://nathan.gs : http://twitter.com/nathan_gs

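
For the "load multiple files at once" route, a custom UDF may not be needed: Pig's LOAD accepts a comma-separated list of paths, so the list can be built outside Pig and passed in as a parameter. A minimal sketch, assuming GNU date (for -d) and the YYYY/MM/DD layout from the question; last_days_paths is a hypothetical helper name, not something from Pig or Hadoop:

```shell
# Print a comma-separated list of day directories for the last N days.
# Hypothetical helper; assumes GNU date (-d) and a YYYY/MM/DD layout.
last_days_paths() {
  n=$1                       # number of days to include
  end=${2:-$(date -u +%F)}   # last day to include, YYYY-MM-DD (default: today, UTC)
  base=${3:-/user/pig/logs}  # root of the day-partitioned tree
  paths=""
  i=0
  while [ "$i" -lt "$n" ]; do
    # Step back i days from the end date; -u sidesteps DST shifts.
    day=$(date -u -d "$end $i days ago" +%Y/%m/%d)
    paths="${paths:+$paths,}$base/$day"
    i=$((i + 1))
  done
  printf '%s\n' "$paths"
}
```

It could then be used as pig -param inputs="$(last_days_paths 7)" script.pig, with LOAD '$inputs' in the script (script.pig is a placeholder name). Because the list comes from the calendar rather than from what is actually on HDFS, a day with no directory would most likely make the LOAD fail.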
  • Gianmarco at Jul 1, 2011 at 10:21 am
    You should be able to do it with some bash trickery.

    %declare inputs `hls | cut -s -d '/' -f 2- | sort | tail -n $num | sed 's/^/\//' | paste -sd, -`

    Where $num is the number of day directories you want.
    Then you can use $inputs in your LOAD statement.

    I haven't tried this, so it might (will) contain bugs.

    --
    Gianmarco De Francisci Morales

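
The listing-based selection can also be done in a small shell function outside Pig, which makes it easy to try on sample input before involving a cluster. This is only a sketch: newest_paths is a hypothetical name, and it assumes the input looks like hadoop fs -ls output, where the path is the last column and the only field that starts with a slash.

```shell
# Pick the newest N day directories from `hadoop fs -ls`-style output on stdin.
# Hypothetical helper; assumes each listing line ends with an absolute path.
newest_paths() {
  n=$1
  # 1. Keep only the path part of lines that contain a '/'
  #    (this drops the "Found N items" header lines).
  # 2. Sort; YYYY/MM/DD paths sort chronologically as plain text.
  # 3. Keep the last N and join them with commas, no trailing comma.
  sed -n 's|^[^/]*/|/|p' | sort | tail -n "$n" | paste -sd, -
}
```

A possible invocation, assuming the layout from the question, is hadoop fs -ls '/user/pig/logs/*/*' | newest_paths 7, with the result passed to Pig via -param inputs=... and used as LOAD '$inputs'. Unlike a list computed from calendar dates, this only ever names directories that exist.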

Discussion Overview
group: user @
categories: pig, hadoop
posted: Jul 1, '11 at 5:06a
active: Jul 1, '11 at 10:21a
posts: 3
users: 3
website: pig.apache.org

3 users in discussion

Gianmarco: 1 post Nathan Bijnens: 1 post Mark: 1 post
