# New Ticket Created by Daniel Șuteu
# Please include the string: [perl #128225]
# in the subject line of all future correspondence about this issue.
# <URL: https://rt.perl.org/Ticket/Display.html?id=128225 >


The following program illustrates the issue:

### BEGIN-CODE ###

my $str = "foo";
$str =~ /(?{ s{}{} })/;

### END-CODE ###

Tested with perl-5.22.2 and perl-5.25.1. No output is produced, but after about a second perl dies with a segmentation fault:

$ time perl bug.pl
[2] 25184 segmentation fault (core dumped) perl bug.pl
perl bug.pl 1.16s user 0.05s system 98% cpu 1.226 total

  • Daniel Șuteu via RT at May 23, 2016 at 6:51 pm
    A somewhat related issue is the following:

    ### BEGIN-CODE ###

    m<
         (?{
             m<(?{
                 'print "Just another Perl hacker\n"'
             })>;

             s//$^R/ee;
         })
    x;
    //;
    //;
    //;
    //;
    //;
    //;
    //;
    //;
    //;

    ### END-CODE ###

    The output is "Just another Perl hacker", printed 10 times: once for the initial match and once for each of the nine empty matches ("//").
  • Daniel Șuteu via RT at May 25, 2016 at 10:34 pm

    On Mon May 23 14:22:37 2016, demerphq wrote:
    > For what it's worth, I believe this is safe if you wrap the s/// in a
    > subcall. Obviously this is just a workaround and we should fix the core
    > cause....

    It seems that s/// is not the issue here, as the following code, where s/// was replaced with eval(), behaves the same:

    ### BEGIN-CODE ###

    m{
         (?{
             m<(?{
                 'print "Just another Perl hacker\n"'
             })>;
             eval $^R;
         })
    }x;

    //;//;//;//;//;

    ### END-CODE ###

    Probably I should open a new ticket for this?

    ---
    via perlbug: queue: perl5 status: open
    https://rt.perl.org/Ticket/Display.html?id=128225
  • Daniel Șuteu via RT at May 25, 2016 at 10:42 pm

    Code simplified to:

    ### BEGIN-CODE ###

    m{
         (?{ print "hi\n" })
    }x;

    //;//;//;//;//;

    ### END-CODE ###

    ---
    via perlbug: queue: perl5 status: open
    https://rt.perl.org/Ticket/Display.html?id=128225
  • Father Chrysostomos via RT at May 25, 2016 at 10:55 pm

    On Wed May 25 15:42:20 2016, trizenx@gmail.com wrote:
    > Code simplified to:
    >
    > ### BEGIN-CODE ###
    >
    > m{
    >      (?{ print "hi\n" })
    > }x;
    >
    > //;//;//;//;//;
    >
    > ### END-CODE ###

    I don’t see what the bug is here. The empty pattern re-uses the last successful match.
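
    As a minimal sketch of that behavior (the variable names and test strings are made up for illustration), an empty pattern that follows a successful /b/ match behaves like /b/, not like a pattern that matches anything:

    ```perl
    use strict;
    use warnings;

    # A successful match makes /b/ the last successful pattern in this scope.
    my $first = ("abc" =~ /b/) ? 1 : 0;

    # An empty pattern does not mean "match anything": it re-runs /b/.
    my $on_xyz = ("xyz" =~ //) ? 1 : 0;   # /b/ cannot match "xyz"
    my $on_bob = ("bob" =~ //) ? 1 : 0;   # /b/ matches "bob"

    # (?:) is empty in effect but not in its source text, so it is
    # compiled as written and matches any string.
    my $noncapturing = ("xyz" =~ /(?:)/) ? 1 : 0;

    print "first=$first on_xyz=$on_xyz on_bob=$on_bob noncapturing=$noncapturing\n";
    ```

    The failed match against "xyz" does not change which pattern counts as the last successful one, which is why the following match against "bob" still uses /b/.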

    --

    Father Chrysostomos


    ---
    via perlbug: queue: perl5 status: open
    https://rt.perl.org/Ticket/Display.html?id=128225
  • Demerphq at May 26, 2016 at 12:25 am

    On 26 May 2016 at 00:55, Father Chrysostomos via RT wrote:
    > On Wed May 25 15:42:20 2016, trizenx@gmail.com wrote:
    >> Code simplified to:
    >>
    >> ### BEGIN-CODE ###
    >>
    >> m{
    >>      (?{ print "hi\n" })
    >> }x;
    >>
    >> //;//;//;//;//;
    >>
    >> ### END-CODE ###
    > I don’t see what the bug is here.

    I don't get it either.

    > The empty pattern re-uses the last successful match.

    I don't get why you bring this up either.

    Yves
    --
    perl -Mre=debug -e "/just|another|perl|hacker/"
  • Daniel Șuteu via RT at May 25, 2016 at 11:33 pm

    On Wed May 25 15:55:20 2016, sprout wrote:
    > On Wed May 25 15:42:20 2016, trizenx@gmail.com wrote:
    >> Code simplified to:
    >>
    >> ### BEGIN-CODE ###
    >>
    >> m{
    >>      (?{ print "hi\n" })
    >> }x;
    >>
    >> //;//;//;//;//;
    >>
    >> ### END-CODE ###
    > I don’t see what the bug is here. The empty pattern re-uses the last
    > successful match.

    I never heard of this behavior before. Is this officially documented?

    Personally, I see it as a security issue. For example, consider the following artificial scenario:

    ### BEGIN-CODE ###

    /(?{ print "sending money\n" })/x;

    print "Insert regex: ";
    chomp(my $regex = <STDIN>); # just press ENTER
    /\Q$regex/; # will send money again

    ### END-CODE ###

    If a user inserts a regular expression that happens to coincide with the last regular expression that successfully matched, but also executed some code in (?{}), the same code will be executed again, which is something that I don't think should happen.

    In the above scenario, a user can take advantage of this behavior and exploit it in his favor, making it a security hole.

    ---
    via perlbug: queue: perl5 status: open
    https://rt.perl.org/Ticket/Display.html?id=128225
  • Father Chrysostomos via RT at May 26, 2016 at 1:03 am

    On Wed May 25 16:33:50 2016, trizenx@gmail.com wrote:
    > On Wed May 25 15:55:20 2016, sprout wrote:
    >> On Wed May 25 15:42:20 2016, trizenx@gmail.com wrote:
    >>> Code simplified to:
    >>>
    >>> ### BEGIN-CODE ###
    >>>
    >>> m{
    >>>      (?{ print "hi\n" })
    >>> }x;
    >>>
    >>> //;//;//;//;//;
    >>>
    >>> ### END-CODE ###
    >> I don’t see what the bug is here. The empty pattern re-uses the last
    >> successful match.
    > I never heard of this behavior before. Is this officially documented?

    perl.git$ ack 'last successful' pod
    pod/perlfunc.pod
    7436:interpretation as the last successful match.

    pod/perlop.pod
    2078:evaluates to the empty string, the last successfully executed regular

    pod/perlretut.pod
    1559:the regexp in the I<last successful match> is used instead. So we have

    > Personally, I see it as a security issue. For example, consider the
    > following artificial scenario:
    >
    > ### BEGIN-CODE ###
    >
    > /(?{ print "sending money\n" })/x;
    >
    > print "Insert regex: ";
    > chomp(my $regex = <STDIN>); # just press ENTER
    > /\Q$regex/; # will send money again
    >
    > ### END-CODE ###

    You have to use (?:) in cases like that:

    /(?{ print "sending money\n" })/x;

    print "Insert regex: ";
    chomp(my $regex = <STDIN>); # just press ENTER
    /(?:\Q$regex\E)/; # will not send money again

    > If a user inserts a regular expression that happens to coincide with
    > the last regular expression that successfully matched, but also
    > executed some code in (?{}), the same code will be executed again,
    > which is something that I don't think should happen.

    Neither do I (at least with /$foo/; with // it should stay as it is), but it is hard to change this because of backward compatibility.

    That’s a separate issue from your original post. If you want to continue discussing this particular point, please open a new ticket.
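
    A small sketch of the (?:) suggestion above, with a counter standing in for the "sending money" side effect. The names $sent, $text and $input are hypothetical, and the empty $input simulates the user just pressing ENTER:

    ```perl
    use strict;
    use warnings;

    my $sent = 0;                  # counts how often the code block runs
    my $text = "order";

    $text =~ /(?{ $sent++ })/;     # stands in for the "sending money" pattern

    my $input = "";                # simulates the user just pressing ENTER

    # Wrapped in (?:...), the pattern text is "(?:)", which is not empty,
    # so the last successful pattern is not reused.
    my $matched = ($text =~ /(?:\Q$input\E)/) ? 1 : 0;

    print "matched=$matched sent=$sent\n";
    ```

    The match still succeeds, since (?:) matches any string, but the code block from the earlier pattern is not run a second time.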

    --

    Father Chrysostomos


    ---
    via perlbug: queue: perl5 status: open
    https://rt.perl.org/Ticket/Display.html?id=128225
  • Demerphq at May 26, 2016 at 4:10 am

    On 25 May 2016 21:03, "Father Chrysostomos via RT" wrote:
    > On Wed May 25 16:33:50 2016, trizenx@gmail.com wrote:
    >> On Wed May 25 15:55:20 2016, sprout wrote:
    >>> On Wed May 25 15:42:20 2016, trizenx@gmail.com wrote:
    >>>> Code simplified to:
    >>>>
    >>>> ### BEGIN-CODE ###
    >>>>
    >>>> m{
    >>>>      (?{ print "hi\n" })
    >>>> }x;
    >>>>
    >>>> //;//;//;//;//;
    >>>>
    >>>> ### END-CODE ###
    >>> I don’t see what the bug is here. The empty pattern re-uses the last
    >>> successful match.
    >> I never heard of this behavior before. Is this officially documented?
    > perl.git$ ack 'last successful' pod
    > pod/perlfunc.pod
    > 7436:interpretation as the last successful match.
    >
    > pod/perlop.pod
    > 2078:evaluates to the empty string, the last successfully executed regular
    >
    > pod/perlretut.pod
    > 1559:the regexp in the I<last successful match> is used instead. So we have
    >> Personally, I see it as a security issue. For example, consider the
    >> following artificial scenario:
    >>
    >> ### BEGIN-CODE ###
    >>
    >> /(?{ print "sending money\n" })/x;
    >>
    >> print "Insert regex: ";
    >> chomp(my $regex = <STDIN>); # just press ENTER
    >> /\Q$regex/; # will send money again
    >>
    >> ### END-CODE ###
    > You have to use (?:) in cases like that:
    >
    > /(?{ print "sending money\n" })/x;
    >
    > print "Insert regex: ";
    > chomp(my $regex = <STDIN>); # just press ENTER
    > /(?:\Q$regex\E)/; # will not send money again
    >> If a user inserts a regular expression that happens to coincide with
    >> the last regular expression that successfully matched, but also
    >> executed some code in (?{}), the same code will be executed again,
    >> which is something that I don't think should happen.
    > Neither do I (at least with /$foo/; with // it should stay as it is), but
    > it is hard to change this because of backward compatibility.

    FWIW, I don't buy the back-compat argument on this one. I have never seen
    this feature deliberately used; most people are unaware of it, and when they
    discover it they consider it a bug, like in this thread. In fact the only
    time I have seen it used is in toy code that I wrote to demonstrate the
    feature. I am convinced that nobody would notice and that the *many* issues
    that have come from it over the years justify removing it entirely.

    > That’s a separate issue from your original post. If you want to continue
    > discussing this particular point, please open a new ticket.

    I agree. But feel free to quote my opinion on it when you do.

    Yves
  • Dan Collins via RT at May 23, 2016 at 7:54 pm
    This appears to be an overflow of the stack caused by infinite recursion, as evidenced by the following repeating stack frames:

    #11330 0x00000000006e43f3 in S_regmatch (reginfo=0x7fffffee9860, startpos=0xaba170 "foo", prog=0xaa37a0) at regexec.c:6731
    #11331 0x00000000006d7f00 in S_regtry (reginfo=0x7fffffee9860, startposp=0x7fffffee96c8) at regexec.c:3615
    #11332 0x00000000006d7957 in Perl_regexec_flags (rx=0xab2ae0, stringarg=0xaba170 "foo", strend=0xaba173 "", strbeg=0xaba170 "foo", minend=0, sv=0xab29c0, data=0x0, flags=1) at regexec.c:3482
    #11333 0x00000000005b8abd in Perl_pp_subst () at pp_hot.c:2982
    #11334 0x0000000000559af3 in Perl_runops_debug () at dump.c:2239

    Here is some GDB output from the very beginning of the backtrace:

    Breakpoint 1, Perl_regexec_flags (rx=0xab2ae0, stringarg=0xaba170 "foo", strend=0xaba173 "", strbeg=0xaba170 "foo", minend=0, sv=0xab29c0, data=0x0, flags=1) at regexec.c:2878
    2878 {
    (gdb) bt
    #0 Perl_regexec_flags (rx=0xab2ae0, stringarg=0xaba170 "foo", strend=0xaba173 "", strbeg=0xaba170 "foo", minend=0, sv=0xab29c0, data=0x0, flags=1) at regexec.c:2878
    #1 0x00000000005b8abd in Perl_pp_subst () at pp_hot.c:2982
    #2 0x0000000000559af3 in Perl_runops_debug () at dump.c:2239
    #3 0x00000000006e43f3 in S_regmatch (reginfo=0x7fffffffe160, startpos=0xaba170 "foo", prog=0xaa37a0) at regexec.c:6731
    #4 0x00000000006d7f00 in S_regtry (reginfo=0x7fffffffe160, startposp=0x7fffffffdfc8) at regexec.c:3615
    #5 0x00000000006d7957 in Perl_regexec_flags (rx=0xab2ae0, stringarg=0xaba170 "foo", strend=0xaba173 "", strbeg=0xaba170 "foo", minend=0, sv=0xab29c0, data=0x0, flags=97) at regexec.c:3482
    #6 0x00000000005afefd in Perl_pp_match () at pp_hot.c:1819
    #7 0x0000000000559af3 in Perl_runops_debug () at dump.c:2239
    #8 0x0000000000462138 in S_run_body (oldscope=1) at perl.c:2517
    #9 0x0000000000461763 in perl_run (my_perl=0xa9c010) at perl.c:2440
    #10 0x000000000041e8f0 in main (argc=4, argv=0x7fffffffe608, env=0x7fffffffe630) at perlmain.c:116
    (gdb) info locals
    prog = 0xab2750
    s = 0xabbaf0 "н\252"
    c = 0x0
    startpos = 0x0
    minlen = 4801052
    dontbother = 11193440
    utf8_target = false
    multiline = 11216720
    progi = 0xab2750
    reginfo_buf = {prog = 0x0, strbeg = 0x3 <error: cannot access memory at address 0x3>, strend = 0x1a <error: cannot access memory at address 0x1a>,
       till = 0x50 <error: cannot access memory at address 0x50>, sv = 0x0, ganch = 0x3000000003 <error: cannot access memory at address 0x3000000003>, cutpoint = 0x0, info_aux = 0x0,
       info_aux_eval = 0x5600000000, poscache_maxiter = 11217344, poscache_iter = 0, poscache_size = 0, intuit = false, is_utf8_pat = false, is_utf8_target = false, warned = false}
    reginfo = 0xa9ef00
    swap = 0x0
    oldsave = 0
    re_debug_flags = 140737488343936
    __PRETTY_FUNCTION__ = "Perl_regexec_flags"
    (gdb) f 3
    #3 0x00000000006e43f3 in S_regmatch (reginfo=0x7fffffffe160, startpos=0xaba170 "foo", prog=0xaa37a0) at regexec.c:6731
    6731 CALLRUNOPS(aTHX); /* Scalar context. */
    (gdb) info locals
    ocurcop = 0xabbed8
    nop = 0xabc048
    newcv = 0xa9e370
    sp = 0xab9e30
    before = 0
    oop = 0xabc198
    ret = 0xab63f0
    re_sv = 0xaa37a0
    startpoint = 0x0
    re = 0xaa37ac
    rei = 0xffffffff
    arg = 0
    utf8_target = false
    uniflags = 1
    rex_sv = 0xab2ae0
    rex = 0xabcc38
    rexi = 0xaa3770
    st = 0xaba318
    scan = 0xaa37a0
    next = 0xaa37ac
    n = 0
    ln = 0
    locinput = 0xaba170 "foo"
    pushinput = 0x200000011 <error: cannot access memory at address 0x200000011>
    nextchr = 102
    result = false
    depth = 0
    nochange_depth = 0
    max_nochange_depth = 10
    yes_state = 0x0
    mark_state = 0x0
    cur_eval = 0x0
    cur_curlyx = 0x0
    state_num = 68
    no_final = false
    do_cutgroup = false
    startpoint = 0xaba170 "foo"
    popmark = 0x0
    sv_commit = 0x0
    sv_yes_mark = 0x0
    lastopen = 0
    has_cutgroup = false
    oreplsv = 0xa9e160
    __PRETTY_FUNCTION__ = "S_regmatch"
    sw = false
    minmod = false
    logical = 0
    last_pad = 0xa9e388
    multicall_cop = 0xabbb30
    multicall_oldcatch = false
    gimme = 2 '\002'
    caller_cv = 0xa9e370
    last_pushed_cv = 0xa9e370
    runops_cp = 16
    maxopenparen = 0
    to_complement = 0
    classnum = _CC_ENUM_WORDCHAR
    is_utf8_pat = false
    match = false
    re_debug_flags = 0
    (gdb) l
    6726 * first op of the block of interest, rather than the
    6727 * first op of the sub. Also, we don't want to free
    6728 * the savestack frame */
    6729 before = (IV)(SP-PL_stack_base);
    6730 PL_op = nop;
    6731 CALLRUNOPS(aTHX); /* Scalar context. */
    6732 SPAGAIN;
    6733 if ((IV)(SP-PL_stack_base) == before)
    6734 ret = &PL_sv_undef; /* protect against empty (?{}) blocks. */
    6735 else {
    (gdb)

    Valgrind agrees that the segfault is caused by the stack overflow:

    ==64578== Stack overflow in thread #1: can't grow stack to 0xffe801000
    ==64578==
    ==64578== Process terminating with default action of signal 11 (SIGSEGV)
    ==64578== Access not within mapped region at address 0xFFE801FF8
    ==64578== Stack overflow in thread #1: can't grow stack to 0xffe801000
    ==64578== at 0x56DE00: Perl_mg_find (mg.c:414)
    ==64578== If you believe this happened as a result of a stack
    ==64578== overflow in your program's main thread (unlikely but
    ==64578== possible), you can try to increase the size of the
    ==64578== main thread stack using the --main-stacksize= flag.
    ==64578== The main thread stack size used in this run was 8388608.
    ==64578== Stack overflow in thread #1: can't grow stack to 0xffe801000
    ==64578==
    ==64578== Process terminating with default action of signal 11 (SIGSEGV)
    ==64578== Access not within mapped region at address 0xFFE801FF0
    ==64578== Stack overflow in thread #1: can't grow stack to 0xffe801000
    ==64578== at 0x4A24690: _vgnU_freeres (vg_preloaded.c:58)
    ==64578== If you believe this happened as a result of a stack
    ==64578== overflow in your program's main thread (unlikely but
    ==64578== possible), you can try to increase the size of the
    ==64578== main thread stack using the --main-stacksize= flag.
    ==64578== The main thread stack size used in this run was 8388608.
    ==64578==
    ==64578== HEAP SUMMARY:
    ==64578== in use at exit: 7,415,854 bytes in 11,349 blocks
    ==64578== total heap usage: 11,469 allocs, 120 frees, 8,271,345 bytes allocated
    ==64578==
    ==64578== LEAK SUMMARY:
    ==64578== definitely lost: 0 bytes in 0 blocks
    ==64578== indirectly lost: 0 bytes in 0 blocks
    ==64578== possibly lost: 0 bytes in 0 blocks
    ==64578== still reachable: 7,415,854 bytes in 11,349 blocks
    ==64578== suppressed: 0 bytes in 0 blocks
    ==64578== Rerun with --leak-check=full to see details of leaked memory
    ==64578==
    ==64578== For counts of detected and suppressed errors, rerun with: -v
    ==64578== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
    Segmentation fault

    **BISECT**

    This error has been present at least since 5.8.0.

    ---
    via perlbug: queue: perl5 status: new
    https://rt.perl.org/Ticket/Display.html?id=128225
  • Dave Mitchell at May 30, 2016 at 12:08 pm

    On Mon, May 23, 2016 at 10:48:26AM -0700, Daniel Șuteu wrote:
    > my $str = "foo";
    > $str =~ /(?{ s{}{} })/;

    As has been pointed out elsewhere in this ticket, an empty pattern
    is interpreted as the "last successful match". The re-eval
    mechanism takes this as being the currently executing pattern,
    and so you get infinite recursion.

    Marking the currently executing pattern as 'last successful'
    is necessary within an re-eval so that things like $1 are visible:

         # prints "[a]"
         "ab" =~ /(.)(?{ print "[$1]\n" })/;

    --
    "You're so sadly neglected, and often ignored.
    A poor second to Belgium, When going abroad."
         -- Monty Python, "Finland"
  • Aristotle Pagaltzis at May 30, 2016 at 1:49 pm

    * Dave Mitchell [2016-05-30 15:09]:
    > As has been pointed out elsewhere in this ticket, an empty pattern is
    > interpreted as the "last successful match". The re-eval mechanism
    > takes this as being the currently executing pattern, and so you get
    > infinite recursion.

    Is it possible to detect this case and die instead? It wouldn’t do what
    the user likely expected it to even if it worked, after all.
  • Dave Mitchell at May 30, 2016 at 8:16 pm

    On Mon, May 30, 2016 at 03:49:06PM +0200, Aristotle Pagaltzis wrote:
    > * Dave Mitchell [2016-05-30 15:09]:
    >> As has been pointed out elsewhere in this ticket, an empty pattern is
    >> interpreted as the "last successful match". The re-eval mechanism
    >> takes this as being the currently executing pattern, and so you get
    >> infinite recursion.
    > Is it possible to detect this case and die instead? It wouldn’t do what
    > the user likely expected it to even if it worked, after all.

    The following seems to detect it for pp_match() while not failing anything
    in the test suite. If no-one objects, I can work it up into a proper fix
    that handles pp_subst() etc, and has tests.

    I'm not sure if there are mutual recursion scenarios which could still
    slip past though.

         $ perl5240 -e'"a" =~ /(?{ m{} })/'
         Segmentation fault (core dumped)
         $ ./perl -e'"a" =~ /(?{ m{} })/'
         panic: XXX curpm recursion
         $



    diff --git a/pp_hot.c b/pp_hot.c
    index 223169b..5292383 100644
    --- a/pp_hot.c
    +++ b/pp_hot.c
    @@ -1767,6 +1767,8 @@ PP(pp_match)
          if (!ReANY(rx)->mother_re && !RX_PRELEN(rx)
           && PL_curpm) {
             pm = PL_curpm;
    +        if (pm == PL_reg_curpm)
    +            Perl_croak(aTHX_ "panic: XXX curpm recursion\n");
             rx = PM_GETRE(pm);
          }


    --
    Overhead, without any fuss, the stars were going out.
         -- Arthur C Clarke
  • Father Chrysostomos via RT at May 30, 2016 at 9:13 pm

    On Mon May 30 13:17:08 2016, davem wrote:
    > The following seems to detect it for pp_match() while not failing anything
    > in the test suite. If no-one objects, I can work it up into a proper fix
    > that handles pp_subst() etc, and has tests. ...
    > $ perl5240 -e'"a" =~ /(?{ m{} })/'
    > Segmentation fault (core dumped)
    > $ ./perl -e'"a" =~ /(?{ m{} })/'
    > panic: XXX curpm recursion
    > $

    I think that’s a good idea, but that it should not be a panic message, since a panic suggests that perl is not functioning correctly. Perhaps ‘Use of last successful match is not supported within regexp code blocks’. But I see from perldiag that we call them ‘eval-groups’ in existing messages.

    --

    Father Chrysostomos


    ---
    via perlbug: queue: perl5 status: open
    https://rt.perl.org/Ticket/Display.html?id=128225
  • Father Chrysostomos via RT at May 30, 2016 at 5:18 pm

    On Mon May 30 05:08:27 2016, davem wrote:
    > On Mon, May 23, 2016 at 10:48:26AM -0700, Daniel Șuteu wrote:
    >> my $str = "foo";
    >> $str =~ /(?{ s{}{} })/;
    > As has been pointed out elsewhere in this ticket, an empty pattern
    > is interpreted as the "last successful match". The re-eval
    > mechanism takes this as being the currently executing pattern,
    > and so you get infinite recursion.
    >
    > Marking the currently executing pattern as 'last successful'
    > is necessary within an re-eval so that things like $1 are visible:
    >
    > # prints "[a]"
    > "ab" =~ /(.)(?{ print "[$1]\n" })/;

    Does the last-successful-match logic use PL_curpm?

    --

    Father Chrysostomos


    ---
    via perlbug: queue: perl5 status: open
    https://rt.perl.org/Ticket/Display.html?id=128225
  • Dave Mitchell at May 30, 2016 at 7:12 pm

    On Mon, May 30, 2016 at 10:18:38AM -0700, Father Chrysostomos via RT wrote:
    > Does the last-successful-match logic use PL_curpm?

    Yes.

    --
    My Dad used to say 'always fight fire with fire', which is probably why
    he got thrown out of the fire brigade.

Discussion Overview
group: perl5-porters @
categories: perl
posted: May 23, '16 at 5:48p
active: May 30, '16 at 9:13p
posts: 16
users: 5
website: perl.org
