I think this was mentioned in a previous thread, but it's hard to search for.

Is it of any interest (i.e., is it a bug) that this matches:

/\N{LATIN SMALL LIGATURE FF}/ =~ /ff/u

...but this does not...

/\N{LATIN SMALL LIGATURE FF}/ =~ /[f][f]/u

?

I realize that this is a tortured case.

--
rjbs

  • Karl Williamson at Jun 24, 2011 at 3:29 am

    On 06/23/2011 06:01 PM, Ricardo Signes wrote:
    > I think this was mentioned in a previous thread, but it's hard to search for.
    >
    > Is it of any interest (i.e., is it a bug) that this matches:
    >
    > /\N{LATIN SMALL LIGATURE FF}/ =~ /ff/u
    >
    > ...but this does not...
    >
    > /\N{LATIN SMALL LIGATURE FF}/ =~ /[f][f]/u
    >
    > ?
    >
    > I realize that this is a tortured case.

    From perlre:
    There are a number of Unicode characters that match multiple
    characters under "/i". For example, "LATIN SMALL LIGATURE FI"
    should match the sequence "fi". Perl is not currently able to do
    this when the multiple characters are in the pattern and are split
    between groupings, or when one or more are quantified. Thus

    "\N{LATIN SMALL LIGATURE FI}" =~ /fi/i; # Matches
    "\N{LATIN SMALL LIGATURE FI}" =~ /[fi][fi]/i; # Doesn't match!
    "\N{LATIN SMALL LIGATURE FI}" =~ /fi*/i; # Doesn't match!

    # The below doesn't match, and it isn't clear what $1 and $2 would
    # be even if it did!!
    "\N{LATIN SMALL LIGATURE FI}" =~ /(f)(i)/i; # Doesn't match!

    Perl doesn't match multiple characters in an inverted bracketed
    character class, which otherwise could be highly confusing. See
    "Negation" in perlrecharclass.
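
The limitation quoted from perlre concerns multi-character folds inside the pattern. One possible workaround, which is not from the thread, is to case-fold the target string first so that the ligature becomes two ordinary characters before matching. The sketch below assumes Perl 5.16 or later, where the `fc` function is available; the variable names are illustrative only.

```perl
use v5.16;       # enables strict and the 'fc' feature (assumption: Perl 5.16+)
use warnings;

my $ligature = "\N{U+FB00}";    # LATIN SMALL LIGATURE FF

# Full case folding expands the ligature into two separate "f" characters.
my $folded = fc($ligature);

# After folding, each character can be matched by its own bracketed class,
# and the pattern no longer needs /i to reach the multi-character fold.
my $split_match = $folded =~ /\A[f][f]\z/ ? 1 : 0;

print "folded length: ", length($folded), "\n";
print "split classes match: $split_match\n";
```

Folding the string up front moves the multi-character expansion out of the regex engine, so groupings and quantifiers in the pattern see plain characters. The trade-off is that match offsets then refer to the folded string rather than the original one.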

Discussion Overview
group: perl5-porters @
category: perl
posted: Jun 24, '11 at 12:02a
active: Jun 24, '11 at 3:29a
posts: 2
users: 2
website: perl.org

2 users in discussion

Ricardo Signes: 1 post Karl Williamson: 1 post
