I've been giving some thought to the semantics of re_evals, specifically
their behaviour with respect to lexical vars and closures; i.e. when do they
get (re)compiled, and which instances of lexical vars do they capture each
time, etc.

I've worked this up into a big series of TODO tests in the davem/re_eval
branch, and the commit message (shown below) summarises the basic
principles of how I think it should eventually work.

Can anyone see anything wrong with this?

commit e04b85764f6d15d12d2a25b56d117f71c4ce6f51
Author: David Mitchell <davem@iabyn.com>
AuthorDate: Mon Aug 8 17:56:10 2011 +0100
Commit: David Mitchell <davem@iabyn.com>
CommitDate: Tue Aug 9 15:29:22 2011 +0100

re_eval and closures: add lots of TODO tests

re_evals currently almost always do the wrong thing as regards what
lexical variable they refer to. This commit adds lots of TODO tests that
show what behaviour I think there should be. Note that because hardly any
of these tests pass yet, I haven't been able to verify whether they have
any subtle typos etc.

The basic philosophy behind these tests is:

* literal code is compiled once at compile-time and shares the same
lexical environment as its surroundings; i.e.

/A(?{..$x..})B/

is like

/A/ && do {..$x..} && /B/

* qr is treated as a closure: compiling once, but capturing its
environment anew each time it is instantiated; i.e.

for my $x (...) { push @r, qr/A(?{..$x..})B/ }

is like

for my $x (...) { push @r, sub { /A/ && do {..$x..} && /B/ } }

* run-time code is recompiled each time the regex is compiled; literal
code in the same expression isn't recompiled; i.e.

$code = '(?{ BEGIN{$y++} })';
for (1..3) { /(?{ BEGIN{$x++}})$code/ }
# x==1, y==3

* an embedded qr is not stringified, so the qr retains its original
lexical environment; i.e.

$x = 1;
{ my $x = 2; $r = qr/(??{$x})/ }
/A$r/; # matches A2, not A1
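
The "qr is treated as a closure" principle above can be sketched as a runnable comparison. This is only an illustration, not part of the commit: it assumes perl 5.18 or later (where, as far as I know, this re_eval work shipped), and the names @regexes and @subs and the test string "A2B" are made up for the example.

```perl
use strict;
use warnings;

# Assumption: perl 5.18+, where literal code blocks in qr// behave as closures.
my (@regexes, @subs);
for my $x (1 .. 3) {
    # Each evaluation of qr// should capture this iteration's $x.
    push @regexes, qr/\AA(??{ $x })B\z/;

    # Plain-closure analogue, in the spirit of the commit message's
    # "sub { /A/ && do {..$x..} && /B/ }".
    push @subs, sub { my ($s) = @_; $s =~ /\AA/ && substr($s, 1) eq "${x}B" };
}

# Only the regex and the sub built while $x == 2 should accept "A2B".
my @qr_hits  = map { "A2B" =~ $_ ? 1 : 0 } @regexes;
my @sub_hits = map { $_->("A2B") ? 1 : 0 } @subs;
print "qr:  @qr_hits\n";
print "sub: @sub_hits\n";
```

If the two lines of output agree, the qr objects are capturing $x the same way the anonymous subs do.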


Affected files ...

M t/re/pat_re_eval.t


--
It's not that I'm afraid to die, I just don't want to be there when it
happens.
-- Woody Allen


  • Nicholas Clark at Aug 15, 2011 at 2:16 pm

    On Mon, Aug 15, 2011 at 11:13:43AM +0100, Dave Mitchell wrote:

    Can anyone see anything wrong with this?
    I've read it through once and can't see anything wrong.
    * run-time code is recompiled each time the regex is compiled; literal
    code in the same expression isn't recompiled; i.e.

    $code = '(?{ BEGIN{$y++} })';
    for (1..3) { /(?{ BEGIN{$x++}})$code/ }
    # x==1, y==3
    I hadn't considered BEGIN blocks. You're evil. (Or at least, covering all
    the corner cases)
    * an embedded qr is not stringified, so the qr retains its original
    lexical environment; i.e.

    $x = 1;
    { my $x = 2; $r = qr/(??{$x})/ }
    /A$r/; # matches A2, not A1
    I think that this is how it's going to have to be. But I suspect that it's
    going to cause some surprises.

    Nicholas Clark
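
The embedded-qr case discussed in this message can be tried with a small standalone script. It is a sketch under the assumption of perl 5.18 or later; the anchors (\A, \z) and the two result variables are additions for the example, not part of the original test.

```perl
use strict;
use warnings;

my $x = 1;    # outer lexical
my $r;
{
    my $x = 2;               # inner lexical, distinct from the outer $x
    $r = qr/(??{ $x })/;     # the code block should close over the inner $x
}

# $r is interpolated as a qr object, not as a string, so under the proposed
# model its code block keeps referring to the inner $x.
my $matches_A2 = "A2" =~ /\AA$r\z/ ? 1 : 0;
my $matches_A1 = "A1" =~ /\AA$r\z/ ? 1 : 0;
print "A2: $matches_A2, A1: $matches_A1\n";
```

Note that no "use re 'eval'" is needed here, because the code block arrives inside a qr object rather than in an interpolated string.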
  • Ricardo Signes at Aug 15, 2011 at 2:27 pm
    * Dave Mitchell [2011-08-15T06:13:43]
    * literal code is compiled once at compile-time and shares the same
    lexical environment as its surroundings; i.e.
    Yes!
    * qr is treated as a closure: compiling once, but capturing its
    environment anew each time it is instantiated; i.e.
    Yes!
    * run-time code is recompiled each time the regex is compiled; literal
    code in the same expression isn't recompiled; i.e.
    Probably! I think so, but I have a hard time imagining non-contrived cases
    where it's an issue.
    * an embedded qr is not stringified, so the qr retains its original
    lexical environment; i.e.
    Yes!

    --
    rjbs
  • Dave Mitchell at Aug 16, 2011 at 3:11 pm

    On Mon, Aug 15, 2011 at 09:10:02AM -0400, Ricardo Signes wrote:
    * Dave Mitchell [2011-08-15T06:13:43]
    * run-time code is recompiled each time the regex is compiled; literal
    code in the same expression isn't recompiled; i.e.
    Probably! I think so, but I have a hard time imagining non-contrived cases
    where it's an issue.
    The following demonstrates why run-time code should be recompiled each
    time the pattern is compiled:

    use re 'eval';
    my $code = '(??{$x})';
    for my $x (1,2,3) {
        print "match $x\n" if "A$x" =~ /A$code/;
    }


    which currently outputs:

    match 1

    while I want it to match all three times.



    --
    Hofstadter's Law: It always takes longer than you expect, even when you
    take into account Hofstadter's Law.
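
The recompilation rule can also be observed by counting compilations with BEGIN blocks, as in the commit message. The following is a sketch of that idea made self-contained, assuming perl 5.18 or later; the package variables are fully qualified and the match target is an empty string purely to keep the example standalone.

```perl
use strict;
use warnings;
use re 'eval';    # needed because $code reaches the pattern by interpolation

our ($x, $y);     # package variables used as compile counters

# Run-time code: a plain string, compiled only when the pattern is compiled.
my $code = '(?{ BEGIN { $main::y++ } })';

for (1 .. 3) {
    # The literal block is compiled once, along with the surrounding program.
    # The interpolated block should be recompiled on each pass.
    "" =~ /(?{ BEGIN { $main::x++ } })$code/;
}

print "x=$x y=$y\n";
```

Under the proposed model, $x ends up at 1 (one compile of the literal block) and $y at 3 (one compile per iteration).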
  • Ricardo Signes at Aug 16, 2011 at 3:06 pm
    * Dave Mitchell [2011-08-16T08:55:53]
    On Mon, Aug 15, 2011 at 09:10:02AM -0400, Ricardo Signes wrote:
    * Dave Mitchell [2011-08-15T06:13:43]
    * run-time code is recompiled each time the regex is compiled; literal
    code in the same expression isn't recompiled; i.e.
    Probably! I think so, but I have a hard time imagining non-contrived cases
    where it's an issue.
    The following demonstrates why run-time code should be recompiled each
    time the pattern is compiled:
    Thanks, that was a very clear example. If you wrote that in my code, I would
    be very annoyed, but I would at least admit that it should work the way you
    suggested. ;)

    --
    rjbs
  • Chip Salzenberg at Aug 18, 2011 at 5:39 am
    <aol/>
    I would be quite pleased with all these refinements.
  • Father Chrysostomos at Aug 15, 2011 at 4:06 pm

    Dave Mitchell asked:
    Can anyone see anything wrong with this?
    No. What you describe is exactly the model I had in mind when
    I added all those crashing tests.
  • David Nicol at Aug 19, 2011 at 8:12 pm

    On Mon, Aug 15, 2011 at 11:06 AM, Father Chrysostomos wrote:

    Dave Mitchell asked:
    Can anyone see anything wrong with this?
    No. What you describe is exactly the model I had in mind when
    I added all those crashing tests.
    Looks good; me too. The BEGIN semantics seem to follow from the model.

Discussion Overview
group: perl5-porters @
categories: perl
posted: Aug 15, '11 at 10:13a
active: Aug 19, '11 at 8:12p
posts: 8
users: 6
website: perl.org
