Grokbase Groups Pig dev May 2011
Need to clarify globbing on command line vs in load statement
-------------------------------------------------------------

Key: PIG-2054
URL: https://issues.apache.org/jira/browse/PIG-2054
Project: Pig
Issue Type: Improvement
Components: documentation
Reporter: Olga Natkovich
Assignee: Corinne Chandel
Fix For: 0.9.0


We have had several user reports saying that "globbing in Pig and Hadoop are not the same". Users base this on the fact that some patterns work from the hadoop command line but do not work in a Pig load statement.

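Pig uses Hadoop globbing, so the functionality is identical. However, when you run a command from the command line, the shell can perform some of the substitution itself before Hadoop ever sees the pattern, which gives the impression that the two behave differently. In the example below, the {10..23} range is shell brace expansion (as in bash), not Hadoop glob syntax; inside a quoted Pig load path, no shell is involved, so the range is never expanded.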

Example:

hadoop fs -ls /mydata/20110423{00,01,02,03,04,05,06,07,08,09,{10..23}}00/*/part* - this works
LOAD '/mydata/20110423{00,01,02,03,04,05,06,07,08,09,{10..23}}00/*/part*' - this does not
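
The difference can be reproduced without a cluster. The following is a minimal sketch, assuming bash is the shell in use; the hour range is shortened to 08..12 for brevity, and the variable names (unquoted, quoted, hours, pattern) are illustrative only. It shows what the shell hands to a command when the pattern is unquoted versus quoted, and one possible way to build an explicit alternation list that does not rely on the shell's {m..n} range:

```shell
#!/usr/bin/env bash
# Unquoted: bash performs brace expansion, including the {10..12} sequence,
# before the command runs, so the command receives five separate paths.
unquoted=$(echo /mydata/20110423{08,09,{10..12}}00)

# Quoted: the pattern is passed through literally, which is how the string
# inside a Pig LOAD statement is seen (no shell is involved there).
quoted=$(echo '/mydata/20110423{08,09,{10..12}}00')

# Build an explicit {00,01,...,23} alternation with no {m..n} range in it.
hours=$(printf '%02d,' {0..23})          # "00,01,...,23," with a trailing comma
pattern="/mydata/20110423{${hours%,}}00/*/part*"

echo "$unquoted"
echo "$quoted"
echo "$pattern"
```

The last line printed is a pattern that uses only comma-separated alternation and wildcards, so it should behave the same way in a Pig load statement as on the hadoop command line.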

We should add a note about this to the description of globbing.


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


  • Corinne Chandel (JIRA) at May 13, 2011 at 11:25 pm
    [ https://issues.apache.org/jira/browse/PIG-2054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Corinne Chandel resolved PIG-2054.
    ----------------------------------

    Resolution: Fixed
    Hadoop Flags: [Reviewed]

    Documentation updated.
    Fix will be included in the GA patch for PIG-1772.
    Thanks for the review, Olga!


Discussion Overview
group: dev @
categories: pig, hadoop
posted: May 9, 2011 at 6:52 pm
active: May 13, 2011 at 11:25 pm
posts: 2
users: 1
website: pig.apache.org
