FAQ
I think this regex fails because of an over-eager optimization:

"4 6" =~ m{
^
(?= .*? (\d) )
(?= .*? (??{ $1 - 2 }) )
}x;

Apparently, when the engine fails when (\d) matches 4, it ends up giving
up entirely, instead of backtracking (or forwardtracking) to allow (\d) to
match the 6.

Am I right, or am I crazy?

--
Jeff "japhy" Pinyan japhy@pobox.com http://www.pobox.com/~japhy/
RPI Acacia brother #734 http://www.perlmonks.org/ http://www.cpan.org/
** Look for "Regular Expressions in Perl" published by Manning, in 2002 **
<stu> what does y/// stand for? <tenderpuss> why, yansliterate of course.
[ I'm looking for programming work. If you like my work, let me know. ]

Search Discussions

  • Ronald J Kimball at Sep 23, 2002 at 1:09 am

    On Sun, Sep 22, 2002 at 06:15:12PM -0400, Jeff 'japhy' Pinyan wrote:
    I think this regex fails because of an over-eager optimization:

    "4 6" =~ m{
    ^
    (?= .*? (\d) )
    (?= .*? (??{ $1 - 2 }) )
    }x;

    Apparently, when the engine fails when (\d) matches 4, it ends up giving
    up entirely, instead of backtracking (or forwardtracking) to allow (\d) to
    match the 6.

    Am I right, or am I crazy?
    Something like this came up before, and Ilya explained that the assertions
    in Perl's regular expressions are independent elements. That is, they
    match whatever an independent expression would match at that point. The
    engine does not backtrack into or out of an assertion. A regex like the
    above should not be expected to match.

    Ronald
  • Hv at Sep 23, 2002 at 2:00 am
    Ronald J Kimball wrote:
    :On Sun, Sep 22, 2002 at 06:15:12PM -0400, Jeff 'japhy' Pinyan wrote:
    :> I think this regex fails because of an over-eager optimization:
    [...]
    :Something like this came up before, and Ilya explained that the assertions
    :in Perl's regular expressions are independent elements. That is, they
    :match whatever an independent expression would match at that point. The
    :engine does not backtrack into or out of an assertion. A regex like the
    :above should not be expected to match.

    It is possible that perl6 will support a flag to change this behaviour,
    which would make a small number of useful things possible that are
    difficult to achieve efficiently by other means. I also plan to check
    how difficult it might be to add such an option to 5.9 once I've
    cleaned the code up some.

    Hugo
  • Hv at Sep 23, 2002 at 2:14 am
    "Jeff 'japhy' Pinyan" wrote:
    :I think this regex fails because of an over-eager optimization:
    :
    : "4 6" =~ m{
    : ^
    : (?= .*? (\d) )
    : (?= .*? (??{ $1 - 2 }) )
    : }x;
    :
    :Apparently, when the engine fails when (\d) matches 4, it ends up giving
    :up entirely, instead of backtracking (or forwardtracking) to allow (\d) to
    :match the 6.
    :
    :Am I right, or am I crazy?

    As noted by Ronald, this is the defined behaviour.

    There was a thread about this back in January that had some interesting
    comments; it starts here:
    http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2002-01/msg00276.html

    For a simple example like the above, you can get around it by giving
    both orderings explicitly:
    m{ ^ .*? (?: (\d) .*? (??{ $1 - 2 }) | (-[12]|[0-7]) .*? (??{ $2 + 2 }) ) }x
    ... but it won't always give the same first match, it is rather less easy
    to read, and it won't scale well to more complex problems. Another
    approach is just to split it out to two patterns.

    Hugo
  • Jeff 'japhy' Pinyan at Sep 23, 2002 at 2:17 am
    On Sep 23, hv@crypt.org said:
    For a simple example like the above, you can get around it by giving
    both orderings explicitly:
    m{ ^ .*? (?: (\d) .*? (??{ $1 - 2 }) | (-[12]|[0-7]) .*? (??{ $2 + 2 }) ) }x
    ... but it won't always give the same first match, it is rather less easy
    to read, and it won't scale well to more complex problems. Another
    approach is just to split it out to two patterns.
    I'm trying to solve cryptoquotes, or at least make the guesswork easier.

    --
    Jeff "japhy" Pinyan japhy@pobox.com http://www.pobox.com/~japhy/
    RPI Acacia brother #734 http://www.perlmonks.org/ http://www.cpan.org/
    ** Look for "Regular Expressions in Perl" published by Manning, in 2002 **
    <stu> what does y/// stand for? <tenderpuss> why, yansliterate of course.
    [ I'm looking for programming work. If you like my work, let me know. ]

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupperl5-porters @
categoriesperl
postedSep 22, '02 at 10:15p
activeSep 23, '02 at 2:17a
posts5
users3
websiteperl.org

People

Translate

site design / logo © 2021 Grokbase