I am trying to understand how the 'split' function differs between
Perl5 and Perl6.

Consider this string:

#####
$str = q|This is a string to be split|;
#####

Let's suppose I wish to split this string on the multi-character
delimiter string 'tri'. The results are the same in both languages.

#####
# Case 1
$ perl -e 'my ($str, @rv);$str = q|This is a string to be split|; @rv =
split(q|tri|, $str); print "<$_>" for @rv; print "\n";'
<This is a s><ng to be split>
#####
# Case 2
$ perl6 -e 'my ($str, @rv);$str = q|This is a string to be split|; @rv =
split(q|tri|, $str); print "<$_>" for @rv; print "\n";'
<This is a s><ng to be split>
#####

Now let's suppose that in Perl5 I wish to split the string on a pattern
which is the character class /[tri]/. I get:

#####
# Case 3
$ perl -e 'my ($str, @rv);$str = q|This is a string to be split|; @rv =
split(/[tri]/, $str); print "<$_>" for @rv; print "\n";'
<Th><s ><s a s><><><ng ><o be spl>
#####

The result is a list of strings which do not contain any of 't', 'r' or
'i'. Where two of the delimiters occurred consecutively in the original
string, I get an empty string -- except that empty strings at the end of
the list are dropped.

Now let's run the same code in Perl6:

#####
# Case 4
$ perl6 -e 'my ($str, @rv);$str = q|This is a string to be split|; @rv =
split(/[tri]/, $str); print "<$_>" for @rv; print "\n";'
<This is a s><ng to be split>
#####

I'm surprised to get exactly the same output that both languages gave
when my delimiter was the multi-character string 'tri'. The '[' and ']'
characters do not seem to indicate "character class" at all. It's as if
'/[...]/' magically turns into 'q|...|'. What am I not grasping here?

One more case: When, in Perl6, I surround the brackets with angle
brackets, I get somewhat more expected behavior:

#####
# Case 5
$ perl6 -e 'my ($str, @rv);$str = q|This is a string to be split|; @rv =
split(/<[tri]>/, $str); print "<$_>" for @rv; print "\n";'
<Th><s ><s a s><><><ng ><o be spl><><>
#####

I get something very similar to Case 3, which was written in Perl5,
viz., a list of strings which do not contain any of 't', 'r' or 'i'.
Where two of the delimiters occurred consecutively in the original
string, I get an empty string -- including at the end of the original
string. So does that mean that, in Perl6, to split a string on a
character class, I always have to indicate (via the angle brackets) that
the character class is a list?

Thank you very much.
Jim Keenan

Posted to perl6-users, Feb 28, '16 at 1:34a.