Reviewers: golang-dev@googlegroups.com

Hello golang-dev@googlegroups.com,

I'd like you to review this change to

spec: ignore BOMS outside of string and rune literals.
Happy Birthday UTF-8.

Please review this at http://codereview.appspot.com/6506083/

Affected files:
M doc/go_spec.html

Index: doc/go_spec.html
--- a/doc/go_spec.html
+++ b/doc/go_spec.html
@@ -1,6 +1,6 @@
"Title": "The Go Programming Language Specification",
- "Subtitle": "Version of September 4, 2012",
+ "Subtitle": "Version of September 6, 2012",
"Path": "/ref/spec"

@@ -99,6 +99,12 @@
Implementation restriction: For compatibility with other tools, a
compiler may disallow the NUL character (U+0000) in the source text.
+Implementation restriction: For compatibility with other tools, a
+compiler may ignore any UTF-8-encoded Unicode byte order mark
+(U+FEFF) in the source text outside of <a href="#String_literals">string</a>
+and <a href="#Rune_literals">rune</a> literals.

<h3 id="Characters">Characters</h3>

  • R at Sep 6, 2012 at 4:48 pm
    Note added for reference:

    This is a spec change. The current implementations reject source code
    containing the (aptly named) BOM. However, that behavior violates the
    Unicode spec, which requires us to accept, harmlessly, a BOM at the
    beginning of a file, thank you Notepad.

    We could permit (and ignore) an initial BOM only and that would satisfy
    Unicode. However, the issue will probably come up again if two
    BOM-marked files are concatenated with a Unicode-unaware tool, or if
    some nonconformant program writes it out. The RFC for syslog encourages
    putting a BOM on every message, which is not only wrong it's crazy, but
    that tells us there are people out there who think BOMs provide
    seasoning to the data and we'll probably see a Go source file with extra
    flavor one day.

    Therefore, in the hope that we will never have to bother with these
    aBOMinations again, we have decided to ignore them wherever they appear,
    with the curious but important exception of string and rune literals,
    where they could conceivably be placed on purpose.
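
As an illustration of the rule described above, here is a minimal sketch in Go of a pre-pass that drops every UTF-8-encoded BOM except those inside string or rune literals. It is not the compiler's implementation: the function name `stripBOMs` and the state names are hypothetical, and the scanner is deliberately simplified (it tracks comments only so that a stray quote in a comment is not mistaken for the start of a literal).

```go
package main

import "bytes"

// bom is the UTF-8 encoding of U+FEFF.
var bom = []byte{0xEF, 0xBB, 0xBF}

// Scanner states (hypothetical names, for illustration only).
const (
	code         = iota // ordinary source text
	lineComment         // after "//", up to the newline
	blockComment        // between "/*" and "*/"
	quoted              // inside "..." or '...'; backslash escapes apply
	raw                 // inside `...`; no escapes
)

// stripBOMs returns a copy of src with every UTF-8-encoded BOM removed,
// except BOMs that appear inside string or rune literals.
func stripBOMs(src []byte) []byte {
	out := make([]byte, 0, len(src))
	state := code
	// quote is the delimiter of the current quoted literal; prev is the
	// last byte emitted that may still combine with the next one
	// (0 means "none").
	var quote, prev byte
	for i := 0; i < len(src); i++ {
		inLiteral := state == quoted || state == raw
		if !inLiteral && bytes.HasPrefix(src[i:], bom) {
			i += len(bom) - 1 // the loop's i++ skips the final BOM byte
			continue
		}
		c := src[i]
		out = append(out, c)
		next := c // value prev takes after this byte
		switch state {
		case code:
			switch {
			case c == '"' || c == '\'':
				state, quote, next = quoted, c, 0
			case c == '`':
				state, next = raw, 0
			case c == '/' && prev == '/':
				state, next = lineComment, 0
			case c == '*' && prev == '/':
				state, next = blockComment, 0
			}
		case lineComment:
			if c == '\n' {
				state, next = code, 0
			}
		case blockComment:
			if c == '/' && prev == '*' {
				state, next = code, 0
			}
		case quoted:
			switch {
			case prev == '\\':
				next = 0 // escaped byte: cannot close the literal or start an escape
			case c == quote || c == '\n':
				state, next = code, 0 // closing quote, or unterminated literal
			}
		case raw:
			if c == '`' {
				state, next = code, 0
			}
		}
		prev = next
	}
	return out
}
```

Under these assumptions, a BOM at the start of a file, between tokens, or inside a comment is dropped, while one written inside a string such as `"a\uFEFFb"` (as raw bytes in the source) is preserved.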

  • Russ Cox at Sep 6, 2012 at 5:20 pm
  • Jan Mercl at Sep 6, 2012 at 5:34 pm
    Sad pragmatism.

  • R at Sep 6, 2012 at 5:37 pm
    *** Submitted as
    http://code.google.com/p/go/source/detail?r=4d451585dd02 ***

    spec: ignore BOMS outside of string and rune literals.
    Happy Birthday UTF-8.

    R=golang-dev, rsc, 0xjnml

  • David Symonds at Sep 6, 2012 at 10:26 pm

Discussion Overview
group: golang-dev
posted: Sep 6, '12 at 4:25p
active: Sep 6, '12 at 10:26p