On Tue, Sep 14, 2010 at 10:32 PM, Itagaki Takahiro wrote:
Here is a patch for basic JSON support. It adds only those features:
* Add "json" data type, that is binary-compatible with text.
* Syntax checking on text to JSON conversion.
* json_pretty() -- print JSON tree with indentation.
We have "JSON datatype (WIP) 01" item:https://commitfest.postgresql.org/action/patch_view?id=351
but it seems too complex for me to apply all of the feature
at once, especially JSON-Path support. So, I'd like to submit
only basic and minimal JSON datatype support at first.
The most different point of my patch is that JSON parser is
reimplemented with flex and bison to make maintenance easier.
Note that the JSON parser accept top-level scalar values
intensionally, because of requirement when we add functions
to extract values from JSON array/object. For example,
CREATE FUNCTION json_to_array(json) RETURNS json
might return naked scalar values in the result array.
Sent via pgsql-hackers mailing list (email@example.com)
To make changes to your subscription:http://www.postgresql.org/mailpref/pgsql-hackers
I have written a patch that amends the basic_json-20100915.patch . It
does the following:
* Fixes bugs in the lexer (namely, ' -0 '::JSON was not accepted, and
' """ '::JSON was accepted).
* Adds a json_validate() function.
* Adds test strings from original JSON patch along with some new ones.
* Tweaks the error message for invalid JSON input.
* Fixes the Makefile so json.c won't fail to build because
json_parser.h and json_scanner.h haven't been built yet.
In the lexer, the following regular expression for parsing a number
This regex doesn't accept '-0', but that's a valid JSON number
according to the standard.
Also, the exclusive state <str> didn't have an <<EOF>> rule, causing
the lexer to treat all characters from an unclosed quote to EOF as
nothing at all.
A less important issue with the lexer is that it doesn't allow the
control character \x7F to appear in strings unescaped. The JSON RFC
doesn't mention \x7F as a control character, and many implementations
I went ahead and added json_validate() now because it's useful for
testing (my test strings use it).
I made the error message say "ERROR: invalid input syntax for JSON"
rather than the redundant "ERROR: syntax error in json: syntax
error". However, the specific parsing error (we haven't defined any
yet) is simply ignored by json_validate and company.
As for the Makefile, I think the reason you weren't getting build
problems but I was is because I normally build with make -j2 (run 2
processes at once), which tends to test makefile rule correctness.
Here's one thing I'm worried about: the bison/flex code in your patch
looks rather similar to the code inhttp://www.jsonlint.com/bin/jsonval.tgz
, which is licensed under the
GPL. In particular, the incorrect number regex I discussed above can
also be found in jsonval verbatim. However, because there are a lot
of differences in both the bison and flex code now, I'm not sure
they're close enough to be "copied", but I am not a lawyer. It might
be a good idea to contact Ben Spencer and ask him for permission to
license our modified version of the code under PostgreSQL's more
relaxed license, just to be on the safe side.
Finally, could you post a patch with your latest work (with or without
my contribution) so we can tinker with it more? Thanks!