https://github.com/opennota/re2dfa

This command-line tool takes a regular expression, transforms it into a
deterministic finite state machine and outputs Go source code for it.

While not supporting some features of the regexp package, like capturing
groups, the generated FSMs are generally 5x-10x faster than the
corresponding function from the regexp package with the equivalent regular
expression.

The generated functions return the length of the match, or -1 if no match is
found. All patterns are anchored at the beginning of the data, as if the
pattern started with ^. For complex regular expressions with multiple broad
Unicode ranges, generating the FSM can be slow.

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


  • Mateusz Czapliński at Jun 30, 2015 at 11:10 pm

    On Monday, June 22, 2015 at 3:14:04 PM UTC+2, opennota wrote:
    > https://github.com/opennota/re2dfa
    > This command-line tool takes a regular expression, transforms it into a
    > deterministic finite state machine and outputs Go source code for it.

    Could you possibly expose an interface that allows feeding such an FSM
    char-by-char, just until a match is confirmed?

    > While not supporting some features of the regexp package, like capturing
    > groups, the generated FSMs are generally 5x-10x faster than the
    > corresponding function from the regexp package with the equivalent regular
    > expression.

    Ouch, no captures? That's a pity.

    Could you possibly explain exactly what is and what is not supported, in
    your docs and possibly the README file?

    tx
    /M.

  • Opennota at Jul 2, 2015 at 3:54 pm

    On Wednesday, July 1, 2015, Mateusz Czapliński wrote:

    > Could you possibly expose an interface allowing to feed such a FSM
    > char-by-char, just until a match is confirmed?

    It shouldn't be too difficult. You can write your own function for code
    generation, using codegen.GoGenerate() as an example. I'm curious, what is
    the use case for this?

    >> While not supporting some features of the regexp package, like capturing
    >> groups, the generated FSMs are generally 5x-10x faster than the
    >> corresponding function from the regexp package with the equivalent regular
    >> expression.
    >
    > Ouch, no captures? That's a pity.

    Again, it's not impossible. I've already filed an issue about this.

    > Could you possibly explain exactly what is, and what is not supported in
    > your docs and possibly the readme file?

    A generated FSM is a function that takes a single parameter of type string
    or []byte and returns the length of the match at the beginning of the
    supplied data, or -1 if there is no match.

    You can use it like regexp.Match()/MatchString(), except that you also get
    the length of the match.

  • Mateusz Czapliński at Jul 5, 2015 at 12:12 am

    On Thursday, July 2, 2015 at 5:54:10 PM UTC+2, opennota wrote:
    > On Wednesday, July 1, 2015, Mateusz Czapliński wrote:
    >
    >> Could you possibly expose an interface allowing to feed such a FSM
    >> char-by-char, just until a match is confirmed?
    >
    > It shouldn't be too difficult. You can write your own function for code
    > generation, using codegen.GoGenerate() as an example. I'm curious, what is
    > the use case for this?

    Matching regexes on streams (e.g. io.Reader), and trying to match multiple
    regexes at once, checking which one matches first.

    >> Could you possibly explain exactly what is, and what is not supported in
    >> your docs and possibly the readme file?
    >
    > A generated FSM is a function that takes a single parameter of type string
    > or []byte and returns the length of the match at the beginning of the
    > supplied data or -1 in case there's no match.
    > You can use it like regexp.Match()/MatchString() except that you also have
    > the length of the match.

    I believe you have misunderstood my question; I'm interested in *what regex
    features* are supported (or not) by your package. Is it the full
    https://github.com/google/re2/wiki/Syntax? Is it the same as pkg regexp
    (i.e. https://github.com/google/re2/wiki/Syntax without "\C")? Or is it
    something else? You already mentioned no captures. Is anything else
    missing? There are many dialects of "regexp" around the world, and it would
    be nice to state clearly and explicitly which dialect your package supports,
    so that users know what to expect. And it would be nice if you said that in
    the README and/or the docs.

    /M.

  • Opennota at Jul 5, 2015 at 3:16 am

    On Sunday, July 5, 2015, Mateusz Czapliński wrote:
    > I'm interested in *what regex features* are supported (or not) by your
    > package.

    Ah. You can use in your regular expressions anything that is understood by
    the regexp/syntax package, I believe. Including capture groups.

