
Dave Mitchell (d...@iabyn.com)

User Information

Display Name:Dave Mitchell
Partial Email Address:d...@iabyn.com
Posts:
1045 total
13 in Fedora
1032 in Perl 5 Porters

5 Most Recent

1) Dave Mitchell Re: [perl #72930] failed system() doesn't set $?
Perl 5 Porters
On Thu, Feb 18, 2010 at 10:57:57PM +0100, Rafael Garcia-Suarez wrote:
> On 18 February 2010 17:43, none via RT <perlbug-followup@perl.org> wrote:
> > When the fork of a system() fails, the return value of system() is
> > (correctly) -1, but the value of $? is (arguably incorrectly) still 0.
>
> $? is supposed to return the status from the child process, which in
> that case couldn't be created. So $? can't be trusted, and I'd suggest
> this is a doc bug.

The system() docs since at least 5.000 have been deeply ambiguous about
the relationship between $? and the return value of system(). Since
5.005_03 we've been promising "You can check all the failure possibilities
by inspecting C<$?>", so I suspect there's a fair chunk of code out there
that does
    system(...);
    various_tests_for_failure($?);
and that all miss the rare but possible case that the fork itself fails.

Note that we already fake up the value of $? under some circumstances:
if the fork succeeds but the exec fails, we pass the errno of the exec in
a pipe back to the parent, then set $? to -1 and $! to that errno.
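
A minimal sketch of the exec-failure case described above, assuming a Unix-like system where fork succeeds and only the exec fails. The program path is hypothetical, chosen so that it should not exist; per the description above, both the return value and $? are then expected to be -1, with $! holding the exec's errno.

```perl
use strict;
use warnings;
no warnings 'exec';   # silence the "Can't exec" warning for this demo

# Hypothetical path, chosen so that the exec is expected to fail.
my $prog = '/nonexistent/program-for-demo';

# The block form bypasses the shell, so the failure comes from exec itself.
my $ret    = system { $prog } $prog;
my $status = $?;      # copy at once; later calls may overwrite it
my $errstr = "$!";

printf "system() returned %d, \$? is %d, \$! is '%s'\n",
    $ret, $status, $errstr;
```

Copying $? and $! straight after the call matters, since any later system call or child reap can change them.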

I think system should set $? to -1 on fork failure because:

1) it makes things easier; it is no longer the case that $? and system()
always have the same value *except* for a few rare circumstances

2) We make a lot of code already out there, written against our ambiguous
documentation, actually trap some error conditions that it currently
silently misses.

3) I for one was very confused by the perlmonks thread, and it took me a
long while to think of checking the system() return value as well as $?.
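
As a sketch of the defensive pattern that point 3 alludes to, the helper below looks at both the return value of system() and $?, so a failed fork is caught whichever of the two ends up holding -1. The name describe_system_result is hypothetical and not part of perl; the bit tests follow the perlfunc example quoted further down.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of perl): describe the outcome of a
# system() call, given its return value and the saved value of $?.
sub describe_system_result {
    my ($ret, $status) = @_;

    # Check the return value as well as $?, so that a failed fork is
    # caught even if $? was left at 0.
    return 'failed to start' if $ret == -1 || $status == -1;

    return sprintf('died with signal %d', $status & 127)
        if $status & 127;

    return sprintf('exited with value %d', $status >> 8);
}

# Run a child perl that exits with status 3.
my $ret = system($^X, '-e', 'exit 3');
print describe_system_result($ret, $?), "\n";
```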

Anyway, for info, the history of the system() documentation in perlfunc, as
regards the return value and $?, is roughly as follows:


5.000 has this basic entry:

    The return value is the exit status of the program as
    returned by the wait() call.  To get the actual exit value divide by
    256.


5.005_03: added usage examples:

    @args = ("command", "arg1", "arg2");
    system(@args) == 0
        or die "system @args failed: $?"

    You can check all the failure possibilities by inspecting
    C<$?> like this:

        $exit_value  = $? >> 8;
        $signal_num  = $? & 127;
        $dumped_core = $? & 128;


5.6.0 added:

    Return value of -1 indicates a failure to start the program (inspect $!
    for the reason).


5.10.0 amended return value description and updated the $? processing
example:

    Return value of -1 indicates a failure to start the program or an
    error of the wait(2) system call (inspect $! for the reason).


    if ($? == -1) {
        print "failed to execute: $!\n";
    }
    elsif ($? & 127) {
        printf "child died with signal %d, %s coredump\n",
            ($? & 127),  ($? & 128) ? 'with' : 'without';
    }
    else {
        printf "child exited with value %d\n", $? >> 8;
    }


--
Any [programming] language that doesn't occasionally surprise the
novice will pay for it by continually surprising the expert.
-- Larry Wall
2) Dave Mitchell Re: [perl #72928] POSIX::strftime hangs with pattern %z
Perl 5 Porters
On Thu, Feb 18, 2010 at 08:42:24AM -0800, Krejci, Pavel wrote:
> The POSIX::strftime hangs when the pattern %z is specified:
>
> POSIX::tzset();
> my ($sec, $min, $hour, $day, $mon, $year, $wday,$yday,$isdst) = localtime( time() ) ;
> my $tz_off = POSIX::strftime( "%z", $sec,$min,$hour,$day,$mon,$year);
>
> The strace shows that perl makes infinite loop:
> stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2246, ...}) = 0
> stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2246, ...}) = 0
> stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2246, ...}) = 0
> stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2246, ...}) = 0

This appears to have been fixed in 5.8.9 and 5.10.0

--
It's not that I'm afraid to die, I just don't want to be there when it
happens.
-- Woody Allen
3) Dave Mitchell Re: lexicals in ?{} code blocks in regexps
Perl 5 Porters
On Mon, Feb 15, 2010 at 05:05:37PM +0100, demerphq wrote:
>  But what should happen here:
>
>     my $x;
>     my $qr=qr/(?{ print "REX x=$x\n" })a/;
>     for my $x (1,2) {
>         print "STR x=$x\n";
>         "aa" =~ /a$qr/;
>     }
>
> should it print out:
>
>     STR x=1
>     REX x=
>     STR x=2
>     REX x=

yes

> I guess what I'm trying to say is, should the code in a qr be evaled
> once and then treated as closure,

yes-ish.

In fact, in my future fantasy world, any code blocks in a literal qr//
never get evaled. They get compiled once, at the same point as the
surrounding code is parsed and compiled. Then each time an RE object is
generated (e.g.  push @objs, qr/.../ for 1..10) a closure is created, and
the current instances of any lexical vars are captured. So qr// looks and
feels just like sub{}.
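
For reference, this is the sub{} behaviour the proposal wants qr// to mirror. It is only a sketch of how anonymous subs capture lexicals; it says nothing about how code blocks inside qr// behave in any particular perl release.

```perl
use strict;
use warnings;

my @subs;
for my $n (1 .. 3) {
    # Each pass creates a new closure over this iteration's $n.
    push @subs, sub { "n=$n" };
}

my @seen = map { $_->() } @subs;
print "@seen\n";    # prints "n=1 n=2 n=3"
```

Each closure keeps the instance of $n that was current when it was created, which is the "capture at object creation" rule described above.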

Finally, when a regex object is used at runtime within another, e.g.
    my $re = qr/..../;
    /abc$re/
then the code block parts of it are not stringified and recompiled, but
are embedded in the new RE as-is.

> or should it be rebound to the
> closest context (like an eval block). Both behaviours have something
> to say for them IMO. And really make a difference.

I think my proposal is more consistent with the way closures work, and
thus I prefer it. I always remember being really surprised when I first
discovered that qr//s embedded in another pattern just get stringified
and recompiled.

> Does it matter if the match is something like $_=~m/X$qr/ versus
> $_=~m/$qr/? One might imagine the former rebinds and the latter
> doesnt.

At the moment the second is optimised to not recompile. In an ideal world
the first wouldn't either, although I don't know how viable that is in
general (perhaps it could use a similar mechanism to how (??{...}) is
injected back into the pattern?). But certainly the code parts of $qr
shouldn't be stringified and recompiled.

A few years ago I changed pp_regcomp etc to be list operators, so that

    /abc$d/

is now parsed as

    match("abc",$d)

rather than the former

    match("abc".$d),

so it's possible (in principle) to pass the actual $d object to the re
compiler rather than just the stringification of it, for the re compiler
to make whatever use of it it desires (e.g. pulling out the pre-compiled
codeblocks).
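
A small sketch of why the distinction matters: once a regex object has been concatenated into a string, what is left is a plain string rather than the object. The variable names are illustrative only, and the snippet shows the stringification step in isolation, not what the regex compiler does internally.

```perl
use strict;
use warnings;

my $d = qr/d+/;

# Concatenation stringifies the regex object; the result is a plain string.
my $joined = "abc" . $d;

print "ref(\$d)      = '", ref($d), "'\n";
print "ref(\$joined) = '", ref($joined), "'\n";
```

ref() reports 'Regexp' for the object and an empty string for the concatenation, so anything attached to the object is no longer reachable from the joined value.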

> What happens if the code references itself?

If it's a my var, you get the same thing as with an anon sub: if the var
hasn't been introduced yet, you get the outer var:

    my $x = sub {$x->()}; # doesn't recurse

So you'd have to introduce it first:

    my $qr;
    $qr = qr/....(??{$qr})/

Then it recurses as you'd expect (cf. the standard balanced parenthesis
demo).
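
The anon-sub half of this analogy can be sketched as follows. The factorial is just an illustrative stand-in for any self-referencing closure; note that under "use strict" the one-statement form (my $x = sub { $x->() }) fails to compile unless an outer $x exists.

```perl
use strict;
use warnings;

# Declare first, assign second, so the sub body sees this $fact.
my $fact;
$fact = sub {
    my ($n) = @_;
    return $n <= 1 ? 1 : $n * $fact->($n - 1);
};

print $fact->(5), "\n";    # prints 120
```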

> Another relatively open question is also whether code blocks should
> disable optimisations in the regex engine. We currently disable some
> optimisations, but iirc, not all when we encounter (??{ }) code
> blocks, but I'm not sure about (?{ .. }). I think we have open "bugs"
> relating to there not being a defined rule and I know there are tests
> marked TODO when codeblocks _stop_ disabling optimisations :-). If we
> are going to "fix" codeblocks in regexes we probably should have a
> well defined behavior in this regard as well.

That isn't something I was aware of. I'd hope that issue is somewhat
orthogonal to the fixing of scope/closure behaviour.

--
Modern art:
    "It's easy, I could have done that!"
    "Yes, but you didn't"
4) Dave Mitchell Re: lexicals in ?{} code blocks in regexps
Perl 5 Porters
On Mon, Feb 15, 2010 at 01:30:54PM +0000, Dave Mitchell wrote:
> Well, my opinion is that, as regards scope and closure behaviour,
> these two should be equivalent in their treatment of $x:
>
>     "... $x ..."
>     /(?{... $x ...})/

For example: currently, this code:

    for my $x (1,2) {
        print "STR x=$x\n";
        "a" =~ /(?{ print "REX x=$x\n" })a/;
    }

gives this:

    STR x=1
    REX x=
    STR x=2
    REX x=

--
Modern art:
    "It's easy, I could have done that!"
    "Yes, but you didn't"
5) Dave Mitchell Re: lexicals in ?{} code blocks in regexps
Perl 5 Porters
On Mon, Feb 15, 2010 at 02:19:56PM +0100, demerphq wrote:
> I think it would be good before you put too much effort into this if
> there was some discussion about some of the thornier problems of code
> in regexes.
>
> For instance, what is to happen in a case like:
>
> my ($x,$y,$z);
> my $qr1=qr/X(?{ $x++ })/;
> my $qr2=qr/Y(?{ $y++ })/;
> my $qr3=qr/Z(?{ $z++ })/;
> my @refs=\($x,$y,$z);
> {
>   my ($x,$y,$z);
>   if (/$qr1$qr2$qr3/) {
>         print join("\t",map { $$_ } @refs),"\n";
>         print join("\t",$x,$y,$z),"\n";
>   }
> }
>
> Should it be the inner or outer $x,$y,$z that is updated? Either makes
> sense, which is right? Other thornier questions might come up. Like
> what happens if the code alters the scalar holding the original qr,
> things like that.
>
> It seems to me the real problem here is not so much the question of
> how to implement a solution, but rather what the expected semantics
> are of a useful solution. If we had the latter what was required for
> the former would be to some degree obvious.

Well, my opinion is that, as regards scope and closure behaviour,
these two should be equivalent in their treatment of $x:

    "... $x ..."
    /(?{... $x ...})/

as should these two:

    sub {... $x ...}
    qr/(?{... $x ...})/

i.e. at the point a regex object is created, it captures any lexicals, in
the same manner as when an anon sub is created.

So in your example above, it would be the outer $x,$y,$z.
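
The anon-sub equivalent of that example can be sketched like this. It shows only the sub{} side of the analogy, that is, the behaviour being proposed for qr//, not what any given perl actually does with code blocks in regexes.

```perl
use strict;
use warnings;

my $x   = 0;
my $inc = sub { $x++ };    # captures the outer $x at creation time
my $ref = \$x;

{
    my $x = 0;             # a different, inner $x
    $inc->();
    print "outer=$$ref inner=$x\n";    # prints "outer=1 inner=0"
}
```

The closure updates the variable that was in scope where it was created, regardless of which $x is visible at the call site.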

--
Fire extinguisher (n) a device for holding open fire doors.
