Hello,
RT#45667 points out that the Perl parser and the Regexp parser treat
m/[#]/x differently. The former treats the # as a comment character
and the latter treats it as a literal. I propose that we adopt the
Perl parser's behavior because 1) doing otherwise would require us to
add more regexp-specific knowledge to the Perl parser and 2) other
Regexp engines may not want to treat [#] the same as the current
default engine.
The patch below adds a (currently failing) test to the regexp parser
to enforce this decision. The patch is against rsync 5.10.0.32199
Chris
(P.S. my first post to P5P)
--- t/op/re_tests.orig 2007-10-28 09:26:00.000000000 -0500
+++ t/op/re_tests 2007-10-28 09:47:49.000000000 -0500
@@ -1338,3 +1338,5 @@
.*\z foo\n y - -
^(?:(\d)x)?\d$ 1 y ${\(defined($1)?1:0)} 0
.*?(?:(\w)|(\w))x abx y $1-$2 b-
+
+'foo [#]'x foo# c - Unmatched [
[P5P] [PATCH] New failing test for RT#45667 (m/[#]/x treated inconsistently)
| Tweet |
|
Search Discussions
-
Rafael Garcia-Suarez at Oct 30, 2007 at 12:53 pm ⇧
it doesn't seem that op/regexp.t supports TODO tests. That wouldOn 28/10/2007, Chris Dolan wrote:
Hello,
RT#45667 points out that the Perl parser and the Regexp parser treat
m/[#]/x differently. The former treats the # as a comment character
and the latter treats it as a literal. I propose that we adopt the
Perl parser's behavior because 1) doing otherwise would require us to
add more regexp-specific knowledge to the Perl parser and 2) other
Regexp engines may not want to treat [#] the same as the current
default engine.
The patch below adds a (currently failing) test to the regexp parser
to enforce this decision. The patch is against rsync 5.10.0.32199
probably be a worthwhile addition...
-
Demerphq at Oct 30, 2007 at 2:04 pm ⇧
Probably is. But im not entirely comfortable with the original patch.On 10/30/07, Rafael Garcia-Suarez wrote:On 28/10/2007, Chris Dolan wrote:it doesn't seem that op/regexp.t supports TODO tests. That would
Hello,
RT#45667 points out that the Perl parser and the Regexp parser treat
m/[#]/x differently. The former treats the # as a comment character
and the latter treats it as a literal. I propose that we adopt the
Perl parser's behavior because 1) doing otherwise would require us to
add more regexp-specific knowledge to the Perl parser and 2) other
Regexp engines may not want to treat [#] the same as the current
default engine.
The patch below adds a (currently failing) test to the regexp parser
to enforce this decision. The patch is against rsync 5.10.0.32199
probably be a worthwhile addition...
Its not clear to me that we can change this behaviour anymore. At
least not in the regex engine. Its actually probably easier to change
this in the perl parser, which we long term intend to change anyway in
order to resolve the whole qr// as closure stuff as well as variable
bindings amongst other issues.
Probably a good idea if Dave Mitchell speaks up on this one as to my
knowledge he has the most developed idea of the future of this.
Yves
--
perl -Mre=debug -e "/just|another|perl|hacker/" -
Dave Mitchell at Oct 30, 2007 at 2:15 pm ⇧
The bug's definitely in the perl parser, and I still intend to fix it postOn Tue, Oct 30, 2007 at 03:04:28PM +0100, demerphq wrote:
Probably is. But im not entirely comfortable with the original patch.
Its not clear to me that we can change this behaviour anymore. At
least not in the regex engine. Its actually probably easier to change
this in the perl parser, which we long term intend to change anyway in
order to resolve the whole qr// as closure stuff as well as variable
bindings amongst other issues.
Probably a good idea if Dave Mitchell speaks up on this one as to my
knowledge he has the most developed idea of the future of this.
5.10.0. The fix is kinda orthogonal to the qr// and closure stuff.
Actually, there's a bit of uncertainty as to whether I cans still do that
qr// stuff, now that there's an API been added to the regex parser: I
don't think it can can handle want I would like to pass to it (a mixture
of const strings and snippets of opcode trees), of if I could pass it,
whether other plugged-in engines (sucha as PCRE) wouldn't choke on it.
Anyway, I haven't looked closely at it yet.
--
Gravity is just a theory; teach Intelligent Falling in our schools!
http://www.theonion.com/content/node/39512 -
Demerphq at Oct 30, 2007 at 3:30 pm ⇧
I dont think you should worry about that. I think fixing the generalOn 10/30/07, Dave Mitchell wrote:On Tue, Oct 30, 2007 at 03:04:28PM +0100, demerphq wrote:The bug's definitely in the perl parser, and I still intend to fix it post
Probably is. But im not entirely comfortable with the original patch.
Its not clear to me that we can change this behaviour anymore. At
least not in the regex engine. Its actually probably easier to change
this in the perl parser, which we long term intend to change anyway in
order to resolve the whole qr// as closure stuff as well as variable
bindings amongst other issues.
Probably a good idea if Dave Mitchell speaks up on this one as to my
knowledge he has the most developed idea of the future of this.
5.10.0. The fix is kinda orthogonal to the qr// and closure stuff.
Ok. I thought they were related.
Actually, there's a bit of uncertainty as to whether I cans still do that
qr// stuff, now that there's an API been added to the regex parser: I
don't think it can can handle want I would like to pass to it (a mixture
of const strings and snippets of opcode trees), of if I could pass it,
whether other plugged-in engines (sucha as PCRE) wouldn't choke on it.
Anyway, I haven't looked closely at it yet.
behaviour takes precedence over supporting experimental regex engine
plug ins. I believe that the API has been documented to be a work in
progress anyway, and if it hasnt we should make sure to do so before
5.10 is released. Of course if what you design has an api that makes
it easy for the engine plug ins to adapt it would be a good thing. :-)
Yves
--
perl -Mre=debug -e "/just|another|perl|hacker/" -
Dave Mitchell at Oct 30, 2007 at 4:10 pm ⇧
Well, they're slightly related, but the two fixes can be developedOn Tue, Oct 30, 2007 at 04:30:46PM +0100, demerphq wrote:On 10/30/07, Dave Mitchell wrote:Ok. I thought they were related.On Tue, Oct 30, 2007 at 03:04:28PM +0100, demerphq wrote:The bug's definitely in the perl parser, and I still intend to fix it post
Probably is. But im not entirely comfortable with the original patch.
Its not clear to me that we can change this behaviour anymore. At
least not in the regex engine. Its actually probably easier to change
this in the perl parser, which we long term intend to change anyway in
order to resolve the whole qr// as closure stuff as well as variable
bindings amongst other issues.
Probably a good idea if Dave Mitchell speaks up on this one as to my
knowledge he has the most developed idea of the future of this.
5.10.0. The fix is kinda orthogonal to the qr// and closure stuff.
independently of each other. They both involve making the tokeniser be a
bit more clever when processing literal strings. There's a function that
scans through characters in a literal string until it finds something
"interesting". For example, in the double-quoted string "abc$d", it stops
at the $ and returns "abc" as a token. One fix involves stopping that
function thinking that # is interesting when within [], the other involves
making it think that '(?{' is interesting.Okay :-)Actually, there's a bit of uncertainty as to whether I cans still do thatI dont think you should worry about that. I think fixing the general
qr// stuff, now that there's an API been added to the regex parser: I
don't think it can can handle want I would like to pass to it (a mixture
of const strings and snippets of opcode trees), of if I could pass it,
whether other plugged-in engines (sucha as PCRE) wouldn't choke on it.
Anyway, I haven't looked closely at it yet.
behaviour takes precedence over supporting experimental regex engine
plug ins. I believe that the API has been documented to be a work in
progress anyway, and if it hasnt we should make sure to do so before
5.10 is released. Of course if what you design has an api that makes
it easy for the engine plug ins to adapt it would be a good thing. :-)
--
The warp engines start playing up a bit, but seem to sort themselves out
after a while without any intervention from boy genius Wesley Crusher.
-- Things That Never Happen in "Star Trek" #17 -
Chris Dolan at Oct 31, 2007 at 11:09 am ⇧
OK, fair enough. Consistency is what I seek. :-) In that light,On Oct 30, 2007, at 11:08 AM, Dave Mitchell wrote:
Well, they're slightly related, but the two fixes can be developed
independently of each other. They both involve making the tokeniser
be a
bit more clever when processing literal strings. There's a function
that
scans through characters in a literal string until it finds something
"interesting". For example, in the double-quoted string "abc$d", it
stops
at the $ and returns "abc" as a token. One fix involves stopping that
function thinking that # is interesting when within [], the other
involves
making it think that '(?{' is interesting.
here's an alternative proposed test patch. As of 32199, this patch
yields (abbreviated somewhat):
ok 1342
ok 1343
ok 1344
not ok 1345 () 'foo \[#\]'x:foo[#]:y:$&:foo[#] => `foo[', match=1
not ok 1346 () 'foo [#]':foo#:y:$&:foo# => `'', match=
ok 1347
ok 1348
not ok 1349 () 'foo \[#\]':foo[#]:y:$&:foo[#] => `'', match=
Chris
--- t/op/re_tests.orig 2007-10-28 09:26:00.000000000 -0500
+++ t/op/re_tests 2007-10-30 19:55:16.000000000 -0500
@@ -1338,3 +1338,12 @@
.*\z foo\n y - -
^(?:(\d)x)?\d$ 1 y ${\(defined($1)?1:0)} 0
.*?(?:(\w)|(\w))x abx y $1-$2 b-
+
+'foo [#]'x foo# y $& foo#
+'foo [#'x foo# c - Unmatched [
+'foo \[#]'x foo# n - -
+'foo \[#\]'x foo[#] y $& foo[#]
+'foo [#]' foo# y $& foo#
+'foo [#' foo# c - Unmatched [
+'foo \[#]' foo# n - -
+'foo \[#\]' foo[#] y $& foo[#]
-
Demerphq at Oct 31, 2007 at 11:30 am ⇧
I think there is a way to mark the failing ones as todo tests, youdOn 10/31/07, Chris Dolan wrote:On Oct 30, 2007, at 11:08 AM, Dave Mitchell wrote:OK, fair enough. Consistency is what I seek. :-) In that light,
Well, they're slightly related, but the two fixes can be developed
independently of each other. They both involve making the tokeniser
be a
bit more clever when processing literal strings. There's a function
that
scans through characters in a literal string until it finds something
"interesting". For example, in the double-quoted string "abc$d", it
stops
at the $ and returns "abc" as a token. One fix involves stopping that
function thinking that # is interesting when within [], the other
involves
making it think that '(?{' is interesting.
here's an alternative proposed test patch. As of 32199, this patch
yields (abbreviated somewhat):
ok 1342
ok 1343
ok 1344
not ok 1345 () 'foo \[#\]'x:foo[#]:y:$&:foo[#] => `foo[', match=1
not ok 1346 () 'foo [#]':foo#:y:$&:foo# => `'', match=
ok 1347
ok 1348
not ok 1349 () 'foo \[#\]':foo[#]:y:$&:foo[#] => `'', match=
Chris
--- t/op/re_tests.orig 2007-10-28 09:26:00.000000000 -0500
+++ t/op/re_tests 2007-10-30 19:55:16.000000000 -0500
@@ -1338,3 +1338,12 @@
.*\z foo\n y - -
^(?:(\d)x)?\d$ 1 y ${\(defined($1)?1:0)} 0
.*?(?:(\w)|(\w))x abx y $1-$2 b-
+
+'foo [#]'x foo# y $& foo#
+'foo [#'x foo# c - Unmatched [
+'foo \[#]'x foo# n - -
+'foo \[#\]'x foo[#] y $& foo[#]
+'foo [#]' foo# y $& foo#
+'foo [#' foo# c - Unmatched [
+'foo \[#]' foo# n - -
+'foo \[#\]' foo[#] y $& foo[#]
have to poke around in the re_tests test file to see how, if not just
modify regexp.t to do it somehow. Probably its just as simple as
saying 'TODO' on the test name, if not, well thats how i would do it.
Im hoping to find some perl related tuits soon.
yves
--
perl -Mre=debug -e "/just|another|perl|hacker/"
Related Discussions
Discussion Navigation
| view | thread | post |
Discussion Overview
| group | perl5-porters
|
| categories | perl |
| posted | Oct 28, '07 at 2:59p |
| active | Oct 31, '07 at 11:30a |
| posts | 8 |
| users | 4 |
| website | perl.org |
