On 24.08.14 03:11, Greg Ewing wrote:
> Isaac Morland wrote:
>> In HTML 5 it allows non-ASCII-compatible encodings as long as U+FEFF
>> (byte order mark) is used:
>>
>> http://www.w3.org/TR/html-markup/syntax.html#encoding-declaration
>>
>> Not sure about XML.
>
> According to Appendix F here:
>
> http://www.w3.org/TR/xml/#sec-guessing
>
> an XML parser needs to be prepared to try all the encodings it
> supports until it finds one that works well enough to decode
> the XML declaration, then it can find out the exact encoding
> used.

That's not what this section says. Instead, it says that
you need to auto-detect UCS-4, UTF-16, and UTF-8 from the BOM,
or, when there is no BOM, guess them or EBCDIC from how '<?' is
encoded. This should be enough to actually parse the encoding
declaration. Other non-ASCII-compatible encodings can only be
used if declared in an upper-level protocol (such as HTTP).


The parser is not expected to try out all encodings it supports.


Regards,
Martin
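
The detection described above can be sketched in Python. This is only
an illustrative reading of Appendix F of the XML specification, not code
from any parser: the function name guess_xml_encoding_family and the
returned labels are made up for this example, and the result is only an
encoding family that lets a parser read the XML declaration, which then
names the exact encoding.

```python
def guess_xml_encoding_family(data: bytes) -> str:
    """Guess an encoding family from the first four bytes of an XML entity.

    Hypothetical helper; labels are illustrative, not official names.
    """
    head = data[:4]

    # With a byte order mark. The 4-byte UCS-4 patterns are tested first,
    # because FF FE 00 00 would otherwise look like a UTF-16-LE BOM.
    if head == b"\x00\x00\xfe\xff":
        return "utf-32-be"
    if head == b"\xff\xfe\x00\x00":
        return "utf-32-le"
    if head in (b"\x00\x00\xff\xfe", b"\xfe\xff\x00\x00"):
        return "ucs-4-unusual-order"
    if head[:2] == b"\xfe\xff":
        return "utf-16-be"
    if head[:2] == b"\xff\xfe":
        return "utf-16-le"
    if head[:3] == b"\xef\xbb\xbf":
        return "utf-8-sig"

    # Without a BOM: look at how '<?' (or '<?xm') is encoded.
    if head == b"\x00\x00\x00<":
        return "utf-32-be"
    if head == b"<\x00\x00\x00":
        return "utf-32-le"
    if head in (b"\x00\x00<\x00", b"\x00<\x00\x00"):
        return "ucs-4-unusual-order"
    if head == b"\x00<\x00?":
        return "utf-16-be"
    if head == b"<\x00?\x00":
        return "utf-16-le"
    if head == b"<?xm":
        # UTF-8, ASCII, ISO 8859-x, ...: read the declaration to decide.
        return "ascii-compatible"
    if head == b"\x4c\x6f\xa7\x94":
        # '<?xm' in EBCDIC; the exact code page comes from the declaration.
        return "ebcdic"

    # No BOM and no recognizable declaration: UTF-8 is the default.
    return "utf-8"
```

Note that the function inspects a fixed, small set of byte patterns and
never loops over the codecs the parser happens to support.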
