Grokbase Groups Pig user April 2010
Hi,

I'm trying to read in a comma-separated file with a simple command:
a = load 'myfile' using PigStorage(',');

However, some lines in my file have a comma inside a quoted string, and
Pig is picking it up as 2 separate tokens:

Example:
A,b,c,d,"string-with,comma",F

I get the "string-with" and "comma" as 2 separate tokens.

Is there a way to configure the load using PigStorage command to respect
"escaped" delimiters? Or do I need to write my own loader? Or is there a
better way of doing this?

thanks

toli


  • Alan Gates at Apr 24, 2010 at 2:29 pm
    PigStorage doesn't have an escaping mechanism at the moment. You
    could create a load function that extends PigStorage and adds escaping
    for field delimiters.

    Alan.
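
The field-splitting part of the custom loader suggested in the reply could look roughly like the sketch below. It is only an illustration, not Pig code: the class `QuotedFieldSplitter` and its `split` method are hypothetical names, and the quoting rules (double quotes around a field, a doubled quote standing for a literal quote) are an assumption borrowed from common CSV practice. A real loader would call something like this on each input line before building its tuple.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Pig): splits one line into fields, honoring double quotes. */
final class QuotedFieldSplitter {

    private QuotedFieldSplitter() {}

    /** Splits {@code line} on {@code delimiter}, ignoring delimiters inside double-quoted sections. */
    static List<String> split(String line, char delimiter) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                if (inQuotes && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    // Assumed convention: a doubled quote inside a quoted field is a literal quote.
                    current.append('"');
                    i++;
                } else {
                    // Opening or closing quote: toggle the state and drop the quote character.
                    inQuotes = !inQuotes;
                }
            } else if (c == delimiter && !inQuotes) {
                // An unquoted delimiter ends the current field.
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        // The last field has no trailing delimiter, so add it explicitly.
        fields.add(current.toString());
        return fields;
    }
}
```

Called as `QuotedFieldSplitter.split(line, ',')` on the example line from the question, this yields six fields, with `string-with,comma` kept together as the fifth.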

Discussion Overview
group: user @
categories: pig, hadoop
posted: Apr 24, '10 at 2:28a
active: Apr 24, '10 at 2:29p
posts: 2
users: 2
website: pig.apache.org
