This post [1], while ostensibly lame with regard to Go, left me scratching
my head, as indeed neither the strings nor the unicode/utf8 standard package
seems to contain anything resembling "substring" functionality.
To make it clear (the post I'm referring to is, again, lame in this
respect), I mean taking a substring with regard to characters (or runes, if
you want) and not bytes.

My own simplistic stab at it is [2], but I'd like to hear opinions on this
matter: namely, improvements on my solution and discussion of why I failed
to find anything as simple as direct support for substringing in the
standard library.

Maybe someone with a G+ account will be able to comment constructively on
[1] as well ;-)

Thanks!

1. http://blog.surgut.co.uk/2015/08/go-enjoy-python3.html
2. http://play.golang.org/p/C8fKqkXdJA

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

  • Staven at Aug 27, 2015 at 4:19 pm
    s := "some string"
    r := []rune(s)
    substr := string(r[2:5])
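
A minimal runnable sketch of the rune-slice conversion above, wrapped in a helper. The name substrRunes and the bounds clamping are assumptions added for illustration, not part of the original reply. Note that converting to []rune copies the whole string.

```go
package main

import "fmt"

// substrRunes returns the runes of s in the half-open range [start, end).
// Hypothetical helper: the []rune conversion allocates and copies s.
func substrRunes(s string, start, end int) string {
	r := []rune(s)
	// Clamp the bounds so out-of-range arguments do not panic.
	if start < 0 {
		start = 0
	}
	if end > len(r) {
		end = len(r)
	}
	if start >= end {
		return ""
	}
	return string(r[start:end])
}

func main() {
	fmt.Printf("%q\n", substrRunes("some string", 2, 5)) // prints "me "
	fmt.Printf("%q\n", substrRunes("héllo wörld", 1, 4)) // prints "éll"
}
```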

  • Chris Kastorff at Aug 27, 2015 at 4:48 pm
    I came up with this, iterating over each byte in the string and using
    utf8.RuneStart to find Unicode code point locations.

    https://play.golang.org/p/_5nExNWQPz

    It seems like this approach would be the most efficient, as it doesn't
    do any allocations.
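
The playground code is not reproduced in the thread, so the following is only a sketch of what a byte-scanning approach of this kind might look like. The name substrBytes is hypothetical, and the sketch assumes s is valid UTF-8. It returns a slice of the original string, so no string data is copied.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// substrBytes returns runes [start, end) of s by scanning bytes and counting
// rune starts with utf8.RuneStart. Hypothetical helper; assumes valid UTF-8.
func substrBytes(s string, start, end int) string {
	if start < 0 {
		start = 0
	}
	if start >= end {
		return ""
	}
	from, to := len(s), len(s) // defaults when an index is past the last rune
	count := 0                 // number of rune starts seen before byte i
	for i := 0; i < len(s); i++ {
		if !utf8.RuneStart(s[i]) {
			continue // continuation byte of a multi-byte rune
		}
		if count == start {
			from = i
		}
		if count == end {
			to = i
			break
		}
		count++
	}
	return s[from:to]
}

func main() {
	fmt.Printf("%q\n", substrBytes("some string", 2, 5)) // prints "me "
	fmt.Printf("%q\n", substrBytes("héllo wörld", 1, 4)) // prints "éll"
}
```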
  • Nick Craig-Wood at Aug 27, 2015 at 4:59 pm

    Ranging through the string gives neater code:

    https://play.golang.org/p/kVIsfh3P8z

    --
    Nick Craig-Wood <nick@craig-wood.com> -- http://www.craig-wood.com/nick
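
The linked playground code is not reproduced in the thread. As a rough sketch only, a range-based version might look like the following; substrRange is a hypothetical name, and the bounds handling is an assumption. The index variable of a range loop over a string is the byte offset at which each rune starts, which is what the final slice needs.

```go
package main

import "fmt"

// substrRange returns runes [start, end) of s, using a range loop to find
// the byte offsets of the start and end runes. Hypothetical helper.
func substrRange(s string, start, end int) string {
	if start < 0 {
		start = 0
	}
	if start >= end {
		return ""
	}
	from, to := len(s), len(s) // defaults when an index is past the last rune
	n := 0                     // index of the current rune
	for i := range s {         // i is the byte offset where rune n starts
		if n == start {
			from = i
		}
		if n == end {
			to = i
			break
		}
		n++
	}
	return s[from:to]
}

func main() {
	fmt.Printf("%q\n", substrRange("héllo wörld", 1, 4))  // prints "éll"
	fmt.Printf("%q\n", substrRange("héllo wörld", 6, 11)) // prints "wörld"
}
```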

    --
    You received this message because you are subscribed to the Google Groups "golang-nuts" group.
    To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscribe@googlegroups.com.
    For more options, visit https://groups.google.com/d/optout.
  • Chris Kastorff at Aug 27, 2015 at 5:03 pm
    Ranging over the string gives incorrect results. The range operator on
    strings iterates over decoded UTF-8 code points, not bytes.
    Unfortunately, there's no easy way to reverse that into byte offsets,
    which are needed for the string slicing operation at the end.
  • Chris Kastorff at Aug 27, 2015 at 5:22 pm
    Correction: it seems I was wrong about this. While ranging over the
    string does give runes, the indexes returned by the range are indeed
    the *byte* offsets at which each code point starts.

    I mistakenly assumed the indexes in main() were the same...
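
A small sketch illustrating the correction: the index yielded by a range loop over a string is a byte offset, not a rune count. The helper name runeOffsets and the sample string are assumptions chosen for illustration.

```go
package main

import "fmt"

// runeOffsets returns the byte offset at which each rune of s starts,
// as reported by the index variable of a range loop.
func runeOffsets(s string) []int {
	var offsets []int
	for i := range s {
		offsets = append(offsets, i)
	}
	return offsets
}

func main() {
	s := "aé世b" // runes of 1, 2, 3 and 1 bytes
	fmt.Println(runeOffsets(s))         // prints [0 1 3 6]
	fmt.Println(len(s), len([]rune(s))) // prints 7 4
}
```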
  • Rob Pike at Aug 27, 2015 at 8:57 pm
    Be careful: A rune is not necessarily a character. Please see the blog
    posts on the subject: https://blog.golang.org/strings and
    https://blog.golang.org/normalization.

    -rob
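
To make the caveat concrete, here is a small sketch showing that one user-perceived character can span more than one rune. The helper firstRunes is hypothetical. The example uses "é" written as "e" followed by U+0301 COMBINING ACUTE ACCENT, and a rune-based cut separates the letter from its accent.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// firstRunes returns the first n runes of s (hypothetical helper).
func firstRunes(s string, n int) string {
	r := []rune(s)
	if n < 0 {
		n = 0
	}
	if n > len(r) {
		n = len(r)
	}
	return string(r[:n])
}

func main() {
	precomposed := "\u00e9" // "é" as a single code point
	combining := "e\u0301"  // "e" plus a combining acute accent

	fmt.Println(utf8.RuneCountInString(precomposed)) // prints 1
	fmt.Println(utf8.RuneCountInString(combining))   // prints 2

	// Both strings display as one character, but cutting the second
	// after one rune leaves a bare "e" without its accent.
	fmt.Printf("%q\n", firstRunes(combining, 1)) // prints "e"
}
```

Handling such sequences correctly is a normalization or text-segmentation question, which the blog posts linked above discuss.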


Discussion Overview
group: golang-nuts
categories: go
posted: Aug 27, '15 at 4:09p
active: Aug 27, '15 at 8:57p
posts: 7
users: 5
website: golang.org