# New Ticket Created by Ivan Tubert
# Please include the string: [perl #15431]
# in the subject line of all future correspondence about this issue.
# <URL: http://rt.perl.org/rt2/Ticket/Display.html?id=15431 >



This is a bug report for perl from luisivan.tubert-brohman@yale.edu,
generated with the help of perlbug 1.26 running under perl 5.00503.

Regex bug: Getting two consecutive global matches to match with zero length

Global matches can match zero characters, leaving "pos" unchanged.
However, I have trouble getting two successful zero-character matches
in a row.

I have the following test program:

$s = 'abcde';
print "a?: ";
print $s =~ /\G(a?)/g ? "Matched '$1'\n" : "Failed!\n";
print "b?: ";
print $s =~ /\G(b?)/g ? "Matched '$1'\n" : "Failed!\n";
print "c?: ";
print $s =~ /\G(c?)/g ? "Matched '$1'\n" : "Failed!\n";

which gives the following output, as expected:

a?: Matched 'a'
b?: Matched 'b'
c?: Matched 'c'

Using $s = 'bcde' gives the expected result again:

a?: Matched ''
b?: Matched 'b'
c?: Matched 'c'

Same for $s = 'acde':

a?: Matched 'a'
b?: Matched ''
c?: Matched 'c'

But $s = 'cde' doesn't do what I expect:

a?: Matched ''
b?: Failed!
c?: Matched 'c'

The same kind of problem happens with $s = 'ade':

a?: Matched 'a'
b?: Matched ''
c?: Failed!
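
The five cases above can be reproduced in a single run. The following is a minimal sketch, not part of the original test program: the helper name run_case is hypothetical, and the three patterns are built by interpolating the letter into the same \G(x?) shape used above.

```perl
use strict;
use warnings;

# Run the three \G matches from the report against one string and
# return one result line per pattern. run_case is a hypothetical helper.
sub run_case {
    my ($s) = @_;    # fresh copy, so pos($s) starts out undefined
    my @out;
    for my $letter (qw(a b c)) {
        # Scalar-context //g: a single match attempt, anchored by \G.
        if ($s =~ /\G(${letter}?)/g) {
            push @out, "$letter?: Matched '$1'";
        }
        else {
            push @out, "$letter?: Failed!";
        }
    }
    return @out;
}

for my $s (qw(abcde bcde acde cde ade)) {
    print "\$s = '$s'\n";
    print "  $_\n" for run_case($s);
}
```

Only the 'cde' and 'ade' inputs should report a failure, and in both the failing match is the second of two consecutive zero-length matches.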

Running the $s = 'cde' case with "use re 'debug'" gives the following:
(...)
Compiling REx `\G(b?)'
size 10 first at 2
1: GPOS(2)
2: OPEN1(4)
4: CURLY {0,1}(8)
6: EXACT <b>(0)
8: CLOSE1(10)
10: END(0)
anchored(GPOS) GPOS minlen 0
(...)
Matching REx `\G(a?)' against `cde'
Setting an EVAL scope, savestack=3
0 <> <cde> | 1: GPOS
0 <> <cde> | 2: OPEN1
0 <> <cde> | 4: CURLY {0,1}
EXACT <a> can match 0 times out of 1...
Setting an EVAL scope, savestack=3
0 <> <cde> | 8: CLOSE1
0 <> <cde> | 10: END
Match successful!
Matching REx `\G(b?)' against `cde'
Setting an EVAL scope, savestack=3
0 <> <cde> | 1: GPOS
0 <> <cde> | 2: OPEN1
0 <> <cde> | 4: CURLY {0,1}
EXACT <b> can match 0 times out of 1...
Setting an EVAL scope, savestack=3
0 <> <cde> | 8: CLOSE1
0 <> <cde> | 10: END
Match possible, but length=0 is smaller than requested=1, failing!
failed...
Match failed
(...)

The first regular expression matches zero characters as expected. The second
regular expression matches zero characters, but then complains that the
length is smaller than requested=1. I don't think anyone requested that length,
and "minlen" was zero when the regex was compiled!
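
The same sequence can also be followed from Perl code, without the debug trace, by printing pos() after each attempt. This is only an illustrative sketch: the helper show_pos and the @trace array are hypothetical names, and the patterns are the ones from the test program.

```perl
use strict;
use warnings;

# Format pos() for printing; it is undef before any match and after a
# failed //g match. show_pos is a hypothetical helper.
sub show_pos {
    my ($p) = @_;
    return defined $p ? $p : 'undef';
}

my $s = 'cde';
my @trace;
push @trace, 'start: pos=' . show_pos(pos($s));

for my $letter (qw(a b c)) {
    # Scalar assignment gives scalar context, so //g makes one attempt.
    my $ok = $s =~ /\G(${letter}?)/g;
    push @trace, "$letter?: "
        . ($ok ? "matched '$1'" : 'failed')
        . ', pos=' . show_pos(pos($s));
}

print "$_\n" for @trace;
```

For 'cde' the trace should show pos at 0 after the empty a? match, undefined again after the b? failure, and 1 after c? matches. The reset after the failure is why the third match starts again at the beginning of the string.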

The same thing happens with perl 5.0, 5.6, and 5.8.

---
Site configuration information for perl 5.00503:

Configured by root at Mon Aug 30 23:08:56 EDT 1999.

Summary of my perl5 (5.0 patchlevel 5 subversion 3) configuration:
Platform:
osname=linux, osvers=2.2.5-22smp, archname=i386-linux
uname='linux porky.devel.redhat.com 2.2.5-22smp #1 smp wed jun 2 09:11:51 edt 1999 i686 unknown '
hint=recommended, useposix=true, d_sigaction=define
usethreads=undef useperlio=undef d_sfio=undef
Compiler:
cc='cc', optimize='-O2', gccversion=egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)
cppflags='-Dbool=char -DHAS_BOOL -I/usr/local/include'
ccflags ='-Dbool=char -DHAS_BOOL -I/usr/local/include'
stdchar='char', d_stdstdio=undef, usevfork=false
intsize=4, longsize=4, ptrsize=4, doublesize=8
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
alignbytes=4, usemymalloc=n, prototype=define
Linker and Libraries:
ld='cc', ldflags =' -L/usr/local/lib'
libpth=/usr/local/lib /lib /usr/lib
libs=-lnsl -ldl -lm -lc -lposix -lcrypt
libc=, so=so, useshrplib=false, libperl=libperl.a
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynamic'
cccdlflags='-fpic', lddlflags='-shared -L/usr/local/lib'

Locally applied patches:


---
@INC for perl 5.00503:
/home/ivan/perl/i386-linux
/home/ivan/perl
/home/ivan/perl/lib/site_perl/5.005/i386-linux
/home/ivan/perl/lib/site_perl/5.005
/usr/lib/perl5/5.00503/i386-linux
/usr/lib/perl5/5.00503
/usr/lib/perl5/site_perl/5.005/i386-linux
/usr/lib/perl5/site_perl/5.005
.

---
Environment for perl 5.00503:
HOME=/home/ivan
LANG=en_US
LANGUAGE (unset)
LC_ALL=en_US
LD_LIBRARY_PATH=/usr/local/g98/bsd:/usr/local/g98
LOGDIR (unset)
PATH=/usr/bin:/usr/bin:/bin:/home/ivan/bin:/usr/X11R6/bin:/usr/local/g98/bsd:/usr/local/g98:/home/ivan/bin
PERL5LIB=/home/ivan/perl:/home/ivan/perl/lib/site_perl/5.005
PERL_BADLANG (unset)
SHELL=/bin/tcsh


  • Ronald J Kimball at Jul 23, 2002 at 11:27 pm

    On Tue, Jul 23, 2002 at 11:10:03PM -0000, Ivan Tubert wrote:
    > Global matches can match zero characters, leaving "pos" unchanged.
    > However, I have trouble getting two successful zero-character matches
    > in a row.

    This is by design, and as documented in perlre. In order to prevent
    infinite loops when matching zero-length expressions, the match after a
    zero-length match is prohibited from also being a zero-length match.

    For example:

    s/\w??/<$&>/g;

    would otherwise match zero characters at the start of the string over and over.


    This endless loop would not occur in your test program, since you only
    apply each regex once, but the regex engine is not smart enough to make
    that distinction.


    Ronald
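
The effect of the rule Ronald describes can be seen by running his substitution on a short string. A minimal sketch, assuming the sample input 'bar' (the same input perlre uses for this pattern); the variable names are illustrative only.

```perl
use strict;
use warnings;

my $text = 'bar';    # sample input, chosen for illustration

# \w?? prefers to match nothing. After each empty match the engine
# refuses a second empty match at the same position, so it is forced
# to consume one character and the substitution terminates.
(my $marked = $text) =~ s/\w??/<$&>/g;

print "$marked\n";   # prints <><b><><a><><r><>
```

Each empty pair of angle brackets is a zero-length match, and each is followed by a one-character match forced at the same position.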
  • Hv at Jul 24, 2002 at 12:22 am
    Ronald J Kimball wrote:
    :> However, I have trouble getting two successful zero-character matches
    :> in a row.
    :
    :This is by design, and as documented in perlre. In order to prevent
    :infinite loops when matching zero-length expressions, the match after a
    :zero-length match is prohibited from also being a zero-length match.

    Note that the manner in which the information "I've already matched
    a zero-width at this position" is stored means that it is lost on
    assignment. This is probably a bug, and may even get fixed one day
    (though it is difficult), but in the meantime you can probably work
    around the feature by introducing something like C< pos($s) = pos($s) >
    between matches.

    Hugo
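
Hugo's workaround can be sketched as follows. This assumes the behaviour he describes, that assigning to pos() discards the "already matched zero-width here" information; the helper run_matches and its $reset argument are hypothetical names, not part of the original test program.

```perl
use strict;
use warnings;

# Apply the three \G patterns from the report to one string. When
# $reset is true, pos($s) is assigned to itself after every attempt,
# which is the workaround suggested above.
sub run_matches {
    my ($s, $reset) = @_;
    my @out;
    for my $letter (qw(a b c)) {
        my $ok = $s =~ /\G(${letter}?)/g;
        push @out, $ok ? "$letter?: Matched '$1'" : "$letter?: Failed!";
        pos($s) = pos($s) if $reset;    # same offset, zero-length state cleared
    }
    return @out;
}

for my $s (qw(cde ade)) {
    print "\$s = '$s' without the workaround:\n";
    print "  $_\n" for run_matches($s, 0);
    print "\$s = '$s' with pos(\$s) = pos(\$s):\n";
    print "  $_\n" for run_matches($s, 1);
}
```

With the assignment in place, both inputs should report three successful matches. The cost is that the loop protection is switched off for that string, so this is only appropriate when each pattern is applied once, as in the test program.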

Discussion Overview
group: perl5-porters
categories: perl
posted: Jul 23, '02 at 11:15p
active: Jul 24, '02 at 12:22a
posts: 3
users: 3
website: perl.org
