FAQ
Context: I'm trying to parse some struct Id headers for a new project. The
Id in one of the structures is {non-empty string of letters} + {dash
symbol} + {non-empty string of numbers}, like "test-15" for instance. I use
it as an index into a map. Anyway, I try to validate the Id as being
"reasonable" with a regexp. However, my regexp was failing, so I
tried to get it to match a simpler case, "test". The regexp I chose for
matching {non-empty string of letters} is "[:alpha:]+", but it's failing to
match "test".
As expected, it matches with "[:alpha:]*", but that's not a legitimate map
index (-999 would fail), so I don't want to have to use that. Any help would
be appreciated.

  Playground is http://play.golang.org/p/douGYV7j--

On a more general note, is there a "RE2 for beginners" document anywhere?
  The regexp -> wiki page was useful but kind of sparse on examples :-)
Thanks.

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


  • Robert at Apr 30, 2014 at 7:48 pm
    You need to enclose them in extra brackets. So, [:alpha:]* => [[:alpha:]]*
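A minimal sketch of the fix suggested above, assuming Go's standard `regexp` package. The helper name `matches` is made up for illustration; it only contrasts the single-bracket and double-bracket forms against "test".

```go
package main

import (
	"fmt"
	"regexp"
)

// matches is a hypothetical helper: it reports whether pattern
// finds a match anywhere in s.
func matches(pattern, s string) bool {
	return regexp.MustCompile(pattern).MatchString(s)
}

func main() {
	// Single brackets: an ordinary class holding ':', 'a', 'l', 'p', 'h'.
	fmt.Println(matches(`[:alpha:]+`, "test")) // false
	// Double brackets: the named ASCII class used inside a bracket expression.
	fmt.Println(matches(`[[:alpha:]]+`, "test")) // true
}
```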
  • Hotei at Apr 30, 2014 at 7:56 pm
    Robert,
    Thanks for the nearly instantaneous reply! It worked. However, I'm still
    in the dark as to why extra brackets were required on the first test but it
    worked without them on the others. Still could use a general reference if
    anyone knows of one.
  • Rui Ueyama at Apr 30, 2014 at 8:30 pm

    On Wed, Apr 30, 2014 at 12:56 PM, Hotei wrote:

    Robert,
    Thanks for the nearly instantaneous reply! It worked. However, I'm still
    in the dark as to why extra brackets were required on the first test but it
    worked without them on the others. Still could use a general reference if
    anyone knows of one.

    So you are asking why /[:alpha:]*/ matches "test".

    It's because [:alpha:] is interpreted as a normal character class that
    matches one of :, a, l, p, or h. It's equivalent to /[:ahlp]*/. That
    matches "test" because * allows zero repetitions, so the pattern matches
    the empty string at the start of "test".
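The equivalence described here can be checked with a short sketch, assuming Go's standard `regexp` package. The helper `firstMatch` is a made-up name; it returns the leftmost match so the empty match on "test" becomes visible.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstMatch is a hypothetical helper: it returns the leftmost match of
// pattern in s, and false if there is no match at all.
func firstMatch(pattern, s string) (string, bool) {
	loc := regexp.MustCompile(pattern).FindStringIndex(s)
	if loc == nil {
		return "", false
	}
	return s[loc[0]:loc[1]], true
}

func main() {
	// Both patterns describe the same set of characters, so each input
	// should give the same result for both.
	for _, p := range []string{`[:alpha:]*`, `[:ahlp]*`} {
		for _, s := range []string{"test", "alpha:", "help"} {
			m, ok := firstMatch(p, s)
			fmt.Printf("%-12s %-9q -> %q %v\n", p, s, m, ok)
		}
	}
}
```

On "test" the match succeeds but is empty; on "help" only the leading "h" is consumed, because "e" is not in the set.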

  • Hotei at Apr 30, 2014 at 8:37 pm
    After skipping past the links to Resident Evil 2, I
    found http://www.regular-expressions.info/ - looks like a good place for me
    to start.

  • Gyepi SAM at Apr 30, 2014 at 8:38 pm

    On Wed, Apr 30, 2014 at 12:56:43PM -0700, Hotei wrote:
    However, I'm still
    in the dark as to why extra brackets were required on the first test but it
    worked without them on the others. Still could use a general reference if
    anyone knows of one.

    Your other examples work, but not quite the way you think (or want).

    "[:alpha:]" is a regex that wants to match one of the characters ':', 'a', 'l', 'p', 'h'.
    Note that it does NOT denote the ASCII character class and, obviously,
    "test" contains none of those characters.

    "[:alpha:]+" fails because no character in "test" is one of the target characters.

    "[:alpha:]*" says to match zero or more times and it matches zero times.

    "[:alpha:]?" says to match once or not at all, and here it matches not at all (the empty string).

    etc, etc.

    Here's a perversely modified version of your tests which succeeds on all
    matches, but should actually fail. http://play.golang.org/p/V1moOE5IIr

    As for documentation, I think you're looking for: https://code.google.com/p/re2/wiki/Syntax
    You may also find regex(7) and perlre(1) useful.

    -Gyepi
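A sketch of the quantifier behaviour described above, followed by one way the original "test-15" Id check could be written. It assumes Go's standard `regexp` package; the names `matchesAnywhere`, `idPattern` and `validID` are made up for illustration, and the anchors `^` and `$` are an added assumption so that the whole string, not just part of it, has to fit the shape. Note that `[[:alpha:]]` and `[[:digit:]]` are ASCII-only.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchesAnywhere is a hypothetical helper: it reports whether pattern
// matches some (possibly empty) substring of s.
func matchesAnywhere(pattern, s string) bool {
	return regexp.MustCompile(pattern).MatchString(s)
}

// idPattern is a hypothetical validator for Ids shaped like "test-15":
// one or more ASCII letters, a dash, one or more ASCII digits.
var idPattern = regexp.MustCompile(`^[[:alpha:]]+-[[:digit:]]+$`)

// validID reports whether id has the letters-dash-digits shape.
func validID(id string) bool {
	return idPattern.MatchString(id)
}

func main() {
	// The single-bracket patterns from the thread, tried against "test".
	// Only + needs at least one character from the set, so only + fails.
	for _, p := range []string{`[:alpha:]+`, `[:alpha:]*`, `[:alpha:]?`} {
		fmt.Println(p, matchesAnywhere(p, "test"))
	}

	// The anchored double-bracket pattern accepts only complete Ids.
	for _, id := range []string{"test-15", "test", "-999", "test-", "x-1y"} {
		fmt.Printf("%q %v\n", id, validID(id))
	}
}
```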

  • Hotei at Apr 30, 2014 at 8:54 pm
    Sam,
    Thanks for the response. Perverse is right :-) That wasn't what I
    expected! The Google re2 wiki isn't much help at all to someone not
    already well versed in the topic, and I burned my last Camel book decades
    ago in frustration over Perl's write-only code, so that's a no-go. But I
    found a source on the web that looks useful. Got to give it a little study.
  • Speter at Sep 4, 2014 at 1:10 pm
    I also got bitten by this, expecting `[:alpha:]` to match any alphabetic
    character. I figured out that I can get it to work if I use double brackets
    (`[[:alpha:]]`), but it seems to me that the docs are not in line with the
    implementation, or at least are very confusing for newcomers. (Note that I
    don't have extensive experience with regexp in any language, so I was just
    relying on godoc at regexp/syntax [1] -- which I believe is based on the
    re2 wiki page referenced above.)

    The reason I thought it should work with single brackets is that the godoc
    page lists `[:alpha:]` under "Single characters", which I interpret as
    meaning that `[:alpha:]` can be used (as it is, without extra brackets) to
    match any single alphabetic character (it cannot -- see [2]). Moreover,
    below under "Named character classes as character class elements", it
    states `[[:name:]]` ... `(== [:name:])` which can also be taken to mean
    that single brackets can be used instead of double brackets (although the
    scope and meaning of `==` is not clearly defined). Based on this, I thought
    that the extra brackets would only be necessary when combining sets of
    characters from different classes into a single class.

    Is this an implementation bug (and it should work with single brackets), a
    documentation bug (and the pattern at "Single characters" should be listed
    as `[[:alpha:]]`), or just a misinterpretation of the documentation? Is it
    worth filing an issue?

    [1] http://golang.org/pkg/regexp/syntax/
    [2] http://play.golang.org/p/krE9e9kigU

    Peter
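A sketch of how the named classes behave as elements of a bracket expression, assuming Go's standard `regexp` package. The helper `findAll` and the sample string are made up for illustration. It shows the single-bracket form next to a lone named class, two named classes combined in one set, and the negated form `[[:^alpha:]]`.

```go
package main

import (
	"fmt"
	"regexp"
)

// findAll is a hypothetical helper: it returns every non-overlapping
// match of pattern in s.
func findAll(pattern, s string) []string {
	return regexp.MustCompile(pattern).FindAllString(s, -1)
}

func main() {
	const s = "ab-12:hp"

	fmt.Println(findAll(`[:alpha:]+`, s))            // literal set ':', 'a', 'h', 'l', 'p'
	fmt.Println(findAll(`[[:alpha:]]+`, s))          // ASCII letters
	fmt.Println(findAll(`[[:alpha:][:digit:]]+`, s)) // ASCII letters or digits
	fmt.Println(findAll(`[[:^alpha:]]+`, s))         // anything except ASCII letters
}
```

The single-bracket pattern picks up "a" and ":hp" here, which shows that it treats the colon as an ordinary member of the set rather than as class syntax.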


  • Speter at Sep 4, 2014 at 1:26 pm
    It seems that there is an issue already:
    https://code.google.com/p/go/issues/detail?id=8505
    Sorry for the noise.

    Peter


Discussion Overview
group: golang-nuts
categories: go
posted: Apr 30, '14 at 7:41p
active: Sep 4, '14 at 1:26p
posts: 9
users: 5
website: golang.org
