Grokbase Groups Pig user July 2010
Subject: Using Regex
From: Matthew Smith
Sent: Tue 7/20/2010 5:03 PM

All,

I am using Pig embedded in Java and need to use matches in my Pig job.
However, when I try to use escape characters in the Pig line, the
compiler complains. How do I use complex regex while embedding?

Sample code that is throwing errors:

myServer.registerQuery("filtered = FILTER firstcut BY dIP matches '\Q34.21.12.*\E';");

error: invalid escape sequence.

Thanks,

Matt


  • Brian Adams at Jul 20, 2010 at 9:14 pm
    Double escape, at least in our setup.

    So a word break that is usually \b needs to be written as '\\bthe\\b'.


  • Matthew Smith at Jul 20, 2010 at 9:39 pm
    myServer.registerQuery("filtered = FILTER firstcut BY dIP matches '\\Q32.21.12.\\E*';");

    throws runtime error:

    Exception in thread "main" org.apache.pig.impl.logicalLayer.parser.TokenMgrError: Lexical error at line 1, column 45. Encountered: "Q" (81), after : "\'\\"
        at org.apache.pig.impl.logicalLayer.parser.QueryParserTokenManager.getNextToken(QueryParserTokenManager.java:1693)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.jj_consume_token(QueryParser.java:7807)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.PUnaryCond(QueryParser.java:1581)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.PAndCond(QueryParser.java:1433)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.POrCond(QueryParser.java:1377)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.PCond(QueryParser.java:1343)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.FilterClause(QueryParser.java:1253)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:985)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:795)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:590)
        at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:60)
        at org.apache.pig.PigServer.parseQuery(PigServer.java:298)
        at org.apache.pig.PigServer.registerQuery(PigServer.java:266)
        at org.apache.pig.PigServer.registerQuery(PigServer.java:352)

    Thoughts?

  • Dmitriy Ryaboy at Jul 20, 2010 at 10:05 pm
    It's a terrible thing, but keep adding backslashes. Seriously.
    First, you need to escape the backslash so Java passes it through. Then you need
    to escape each of those backslashes so the Pig parser passes it through. So 4
    backslashes should do it.
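
The two escaping layers described here can be sketched in plain Java. This is only an illustration: `pigUnescape` is a hypothetical stand-in for whatever the Pig parser does to a quoted literal (assumed here to collapse each pair of backslashes into one, as the reply describes), and the pattern places `.*` outside the `\Q...\E` section, which is a choice made for this example rather than the original poster's exact expression.

```java
import java.util.regex.Pattern;

class EscapeLayers {
    // Hypothetical stand-in for the Pig parser's unescaping pass:
    // each pair of backslashes in the quoted literal becomes one.
    // An assumption for illustration, not Pig's actual code.
    static String pigUnescape(String literal) {
        return literal.replace("\\\\", "\\");
    }

    public static void main(String[] args) {
        // Four backslashes in Java source become two in the string handed to Pig.
        String sentToPig = "\\\\Q34.21.12.\\\\E.*";
        // After the simulated Pig pass, a single backslash reaches the regex engine.
        String regex = pigUnescape(sentToPig);

        System.out.println(sentToPig); // what registerQuery would receive
        System.out.println(regex);     // what the regex engine would see
        System.out.println(Pattern.matches(regex, "34.21.12.7"));
        System.out.println(Pattern.matches(regex, "34x21x12x7"));
    }
}
```

With only two backslashes in the Java source, the string handed to Pig would contain a single `\Q`, which is the kind of input the lexical error reported earlier in the thread complains about.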



  • Matthew Smith at Jul 20, 2010 at 10:31 pm
    Four slashes did it. Thanks!
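
One way to avoid counting backslashes by hand is to write the regex with ordinary Java escaping and let a small helper add the second layer. The sketch below is hypothetical: `toPigLiteral` and `filterQuery` are names invented for this example, not part of the Pig API, and it assumes the Pig parser only needs backslashes doubled inside a single-quoted literal. It does not handle single quotes inside the regex.

```java
class PigRegexLiteral {
    // Hypothetical helper: double every backslash so the regex survives
    // the Pig parser's unescaping pass, then wrap it in single quotes.
    static String toPigLiteral(String regex) {
        return "'" + regex.replace("\\", "\\\\") + "'";
    }

    // Builds a FILTER ... matches statement as a plain string.
    static String filterQuery(String alias, String input, String field, String regex) {
        return alias + " = FILTER " + input + " BY " + field
                + " matches " + toPigLiteral(regex) + ";";
    }

    public static void main(String[] args) {
        // Java-level escaping only: two backslashes in source, one in the string.
        String regex = "\\Q34.21.12.\\E.*";
        // The resulting string is what would be passed to registerQuery.
        System.out.println(filterQuery("filtered", "firstcut", "dIP", regex));
    }
}
```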




Discussion Overview
group: user @
categories: pig, hadoop
posted: Jul 20, '10 at 8:58p
active: Jul 20, '10 at 10:31p
posts: 5
users: 3
website: pig.apache.org
