Hello

Sometimes I'd like to properly indent and comment my regexps, especially
the more complex ones.

In the past I have used Perl's x flag, which is supported by many engines.
It's a very simple flag that ignores unquoted whitespace and line comments,
but that's enough to allow writing regexps in a very readable way.

Are there any plans to add it to the regexp package? Does anybody know of
any third-party library that fills this purpose? Otherwise I'll just go
ahead and write it myself.

Tobia

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


  • Konstantin Kulikov at Aug 29, 2014 at 4:16 pm
    Try capture groups (?P<name>re)
  • Tobia at Aug 29, 2014 at 4:34 pm

    Konstantin Kulikov wrote:
    Try capture groups (?P<name>re)
    That's not the issue.

    The problem is the difference in readability—and therefore
    maintainability—between this:

    `\b(?:word1\W+(?:\w+\W+){1,6}?word2|word2\W+(?:\w+\W+){1,6}?word1)\b`

    and this:

    `
         \b # beginning of word
         (?:
             word1 # first word
             \W+ # whitespace or punctuation
             (?:
                 \w+ # between 1 and 6 intervening words
                 \W+
             ){1,6}?
             word2 # second word
         |
             word2 # same pattern as above,
             \W+ # with the two words exchanged
             (?:
                 \w+
                 \W+
             ){1,6}?
             word1
         )
         \b # end of word
    `

    Tobia

  • Rui Ueyama at Aug 29, 2014 at 5:31 pm
    It seems easy to me to write a function to remove line comments and
    whitespace characters. After that you can pass the resulting string to
    regexp.Compile. Is there any reason you can't do that?
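
A minimal sketch of such a function, assuming a simple rule set: unescaped whitespace and `#` line comments are dropped, and anything following a backslash is kept. The name `stripVerbose` is hypothetical and not part of the regexp package. As a simplification, character classes get no special treatment, so a literal space or `#` has to be written as `\ ` or `\#` everywhere.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// stripVerbose is a hypothetical helper, not a standard library function.
// It removes unescaped whitespace and '#' line comments from pattern.
// Simplification: character classes are not treated specially, so a literal
// space or '#' must be escaped as `\ ` or `\#`, even inside [...].
func stripVerbose(pattern string) string {
	var b strings.Builder
	inComment := false
	escaped := false
	for _, r := range pattern {
		switch {
		case inComment:
			// Skip everything up to the end of the line.
			if r == '\n' {
				inComment = false
			}
		case escaped:
			// Keep the character that follows a backslash verbatim.
			b.WriteRune(r)
			escaped = false
		case r == '\\':
			b.WriteRune(r)
			escaped = true
		case r == '#':
			inComment = true
		case r == ' ' || r == '\t' || r == '\n' || r == '\r':
			// Drop insignificant whitespace.
		default:
			b.WriteRune(r)
		}
	}
	return b.String()
}

func main() {
	re := regexp.MustCompile(stripVerbose(`
		\b                       # beginning of word
		(?:
			word1 \W+            # first word, then whitespace or punctuation
			(?: \w+ \W+ ){1,6}?  # between 1 and 6 intervening words
			word2                # second word
		|
			word2 \W+            # same pattern with the two words exchanged
			(?: \w+ \W+ ){1,6}?
			word1
		)
		\b                       # end of word
	`))
	fmt.Println(re.String())
	fmt.Println(re.MatchString("word1 and then word2")) // prints true
}
```

The preprocessing costs one extra pass over the pattern string before regexp.MustCompile parses it.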

  • Mateusz Czapliński at Sep 1, 2014 at 10:45 am
    I believe you could try more or less something like below (sorry about
    word1/word2, I didn't really analyze what's going on there, but you should
    get the idea anyway, and hopefully be able to adjust):

         `\b` + // beginning of word
         `(?:` +
             word1 + // first word
             `\W+` + // whitespace or punctuation
             `(?:` +
                 `\w+` + // between 1 and 6 intervening words
                 `\W+` +
             `){1,6}?` +
             word2 + // second word
         `|` +
             word2 + // same pattern as above,
             `\W+` + // with the two words exchanged
             `(?:` +
                 `\w+` +
                 `\W+` +
             `){1,6}?` +
             word1 +
         `)` +
         `\b` // end of word
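
For reference, a runnable version of the concatenation idea might look like the sketch below. It assumes `word1` and `word2` are string constants (the values `foo` and `bar` are placeholders chosen for illustration), and the names `nearPattern` and `nearRE` are hypothetical. The inner groups are joined onto single lines because gofmt normalizes the indentation of continuation lines anyway.

```go
package main

import (
	"fmt"
	"regexp"
)

// Placeholder words for illustration only. If the words could contain regexp
// metacharacters they would need regexp.QuoteMeta, which cannot be used in a
// constant expression.
const (
	word1 = `foo`
	word2 = `bar`
)

// nearPattern is built only from constant string operands, so the compiler
// folds the concatenation into a single string constant; nothing is joined
// at run time.
const nearPattern = `\b` + // beginning of word
	`(?:` +
	word1 + // first word
	`\W+` + // whitespace or punctuation
	`(?:\w+\W+){1,6}?` + // between 1 and 6 intervening words
	word2 + // second word
	`|` +
	word2 + // same pattern as above,
	`\W+` + // with the two words exchanged
	`(?:\w+\W+){1,6}?` +
	word1 +
	`)` +
	`\b` // end of word

var nearRE = regexp.MustCompile(nearPattern)

func main() {
	fmt.Println(nearPattern)
	fmt.Println(nearRE.MatchString("a bar near a foo")) // prints true
}
```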

    /M.

  • Tobia at Sep 1, 2014 at 10:51 am

    On 31/08/2014 09:35, John Souvestre wrote:

    I agree! Perhaps open an issue requesting that it be added to the regexp
    package?
    Yes, I will write a patch that adds something like CompileVerbose() and
    MustCompileVerbose() and submit it as an issue.

    I could write it as a wrapper around Compile, but that would mean parsing
    the regexp twice and duplicating it. I believe it can be done more
    efficiently as a flag of the existing compiler.

    On Monday, September 1, 2014 12:39:50 PM UTC+2, Mateusz Czapliński wrote:

    I believe you could try more or less something like below

    `\b` + // beginning of word
    `(?:` +
    That's not bad, but it's a bit too noisy.

    T.

  • Ingo Oeser at Sep 1, 2014 at 11:30 am
    That sounds like a very good solution:

      * needs no changes
      * allows tools to syntax-highlight the verbose regex correctly
      * doesn't bloat the binary
      * doesn't incur any parsing overhead at runtime
      * allows comment/regex syntax issues to be detected at compile time

  • Jakob Borg at Sep 1, 2014 at 12:46 pm

    2014-09-01 12:39 GMT+02:00 Mateusz Czapliński <czapkofan@gmail.com>:
    I believe you could try more or less something like below

    `\b` + // beginning of word
    `(?:` +
    You could do that, but gofmt would eat the indentation in an instant.
    The "preprocessor function" proposed earlier, while ugly in other
    ways, wouldn't suffer from that at least.

    //jb

  • Chris dollin at Sep 1, 2014 at 1:05 pm

    On 1 September 2014 13:46, Jakob Borg wrote:
    You could do that, but gofmt would eat the indentation in an instant.
    The "preprocessor function" proposed earlier, while ugly in other
    ways, wouldn't suffer from that at least.
    I would [1] just write a `` string -- which gofmt can't munge -- and a
    function to strip the spare spaces and comments from it. Yes, it's
    an extra pass over the expression string, but I wouldn't expect that
    to matter for almost all programs.

    Chris

    [1] And indeed have, somewhere.

    --
    Chris "allusive" Dollin

  • Tobia at Sep 1, 2014 at 1:26 pm

    On Monday, September 1, 2014 3:06:00 PM UTC+2, ehedgehog wrote:
    I would [1] just write a `` string -- which gofmt can't munge -- and a
    function to strip the spare spaces and comments from it. Yes, it's
    an extra pass over the expression string, but I wouldn't expect that
    to matter for almost all programs.
    I've done it too for past projects, but not in a complete, Perl-compatible
    way. Since there's a standard for that (Perl's x flag) and since embedding
    its logic into the existing compiler would require no extra pass and almost
    no overhead, I'll just write a patch that does that.
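
For comparison, here is a rough sketch of the wrapper approach that gets closer to Perl's x rules, where whitespace and `#` inside a character class stay literal. `CompileVerbose` and `MustCompileVerbose` are the names floated in this thread and `expandVerbose` is a hypothetical helper; none of them exist in the standard library. Stated assumptions: `\Q...\E` quoting, `(?#...)` groups, and POSIX classes such as `[[:alpha:]]` nested in a bracket expression are not handled.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
	"unicode"
)

// expandVerbose is a hypothetical preprocessor approximating Perl's /x rules:
// outside a character class, unescaped whitespace and '#' comments are
// dropped; inside [...] and after a backslash, characters are kept as written.
func expandVerbose(pattern string) string {
	var b strings.Builder
	runes := []rune(pattern)
	inClass := false
	for i := 0; i < len(runes); i++ {
		r := runes[i]
		switch {
		case r == '\\' && i+1 < len(runes):
			// Copy the two-character escape sequence untouched.
			b.WriteRune(r)
			i++
			b.WriteRune(runes[i])
		case inClass:
			b.WriteRune(r)
			if r == ']' {
				inClass = false
			}
		case r == '[':
			inClass = true
			b.WriteRune(r)
			// A ']' directly after '[' or '[^' is a literal class member.
			if i+1 < len(runes) && runes[i+1] == '^' {
				i++
				b.WriteRune('^')
			}
			if i+1 < len(runes) && runes[i+1] == ']' {
				i++
				b.WriteRune(']')
			}
		case r == '#':
			// Skip the comment up to, but not including, the newline.
			for i+1 < len(runes) && runes[i+1] != '\n' {
				i++
			}
		case unicode.IsSpace(r):
			// Drop insignificant whitespace.
		default:
			b.WriteRune(r)
		}
	}
	return b.String()
}

// CompileVerbose mirrors regexp.Compile for a verbose pattern.
func CompileVerbose(pattern string) (*regexp.Regexp, error) {
	return regexp.Compile(expandVerbose(pattern))
}

// MustCompileVerbose mirrors regexp.MustCompile: it panics on a bad pattern.
func MustCompileVerbose(pattern string) *regexp.Regexp {
	return regexp.MustCompile(expandVerbose(pattern))
}

func main() {
	re := MustCompileVerbose(`
		^ [ #]+    # a run of literal spaces and hash marks
		\w+ \ \w+  # two words separated by exactly one escaped space
		$
	`)
	fmt.Println(re.String())
	fmt.Println(re.MatchString("# # hello world")) // prints true
}
```

Because the wrapper hands a rewritten string to regexp.Compile, positions in any syntax error refer to the stripped pattern rather than the verbose source.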

    Tobia


  • Rob Pike at Sep 1, 2014 at 5:13 pm
    It can reside in a separate package with almost zero extra trouble. It
    doesn't belong in the standard regexp package.

    -rob


Discussion Overview
group: golang-nuts
categories: go
posted: Aug 29, '14 at 2:30p
active: Sep 1, '14 at 5:13p
posts: 11
users: 8
website: golang.org
