Re: JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ? - Mailing list pgsql-hackers
From:           Hannu Krosing
Subject:        Re: JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?
Date:
Msg-id:         1335950969.3106.438.camel@hvost
In response to: Re: JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ? (Tom Lane <tgl@sss.pgh.pa.us>)
List:           pgsql-hackers
On Tue, 2012-05-01 at 19:11 -0400, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
> > On Tue, May 1, 2012 at 9:56 AM, Joey Adams <joeyadams3.14159@gmail.com> wrote:
> >> No, the RFC says (emphasis mine):
> >>
> >>     A JSON *text* is a serialized object or array.
> >>
> >> If we let the JSON type correspond to a *value* instead, this
> >> restriction does not apply, and the JSON type has a useful recursive
> >> definition.
>
> > I think you're playing with words. But in any case, the RFC says this
> > regarding generators:
>
> >     5. Generators
> >
> >     A JSON generator produces JSON text. The resulting text MUST
> >     strictly conform to the JSON grammar.
>
> I read over the RFC, and I think the only reason why they restricted
> JSON texts to represent just a subset of JSON values is this cute
> little hack in section 3 (Encoding):
>
>    Since the first two characters of a JSON text will always be ASCII
>    characters [RFC0020], it is possible to determine whether an octet
>    stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking
>    at the pattern of nulls in the first four octets.
>
>            00 00 00 xx  UTF-32BE
>            00 xx 00 xx  UTF-16BE
>            xx 00 00 00  UTF-32LE
>            xx 00 xx 00  UTF-16LE
>            xx xx xx xx  UTF-8
>
> They need a guaranteed 2 ASCII characters to make that work, and
> they won't necessarily get that many with a bare string literal.
>
> Since for our purposes there is not, and never will be, any need to
> figure out whether a JSON input string is encoded in UTF16 or UTF32,
> I find myself agreeing with the camp that says we might as well consider
> that our JSON type corresponds to JSON values not JSON texts.  I also
> notice that json_in() seems to believe that already.
>
> However, that doesn't mean I'm sold on the idea of getting rid of
> array_to_json and row_to_json in favor of a universal "to_json()"
> function.  In particular, both of those have optional "pretty_bool"
> arguments that don't fit nicely at all in a generic conversion
> function.
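As an aside, the null-byte sniffing trick quoted above is simple to sketch; here is a minimal Python illustration of that detection table (my own sketch, not anything in PostgreSQL):

```python
def detect_json_encoding(octets: bytes) -> str:
    """Sketch of the RFC 4627 section 3 trick: because a JSON text begins
    with two ASCII characters, the pattern of NUL bytes in the first four
    octets identifies the Unicode encoding."""
    # Pad with a non-NUL sentinel so short inputs fall through to UTF-8.
    b = (octets + b'\x01\x01\x01\x01')[:4]
    if b[0] == 0 and b[1] == 0 and b[2] == 0:    # 00 00 00 xx
        return 'UTF-32BE'
    if b[0] == 0 and b[2] == 0:                  # 00 xx 00 xx
        return 'UTF-16BE'
    if b[1] == 0 and b[2] == 0 and b[3] == 0:    # xx 00 00 00
        return 'UTF-32LE'
    if b[1] == 0 and b[3] == 0:                  # xx 00 xx 00
        return 'UTF-16LE'
    return 'UTF-8'                               # xx xx xx xx
```

Note the order of the tests matters: 00 00 00 xx also matches the UTF-16BE pattern, so the UTF-32 cases must be checked first.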
> The meaning of that flag is very closely tied to the
> input being an array or record respectively.

The flags probably should not be tied to a specific type: JSON is
recursive, and as such I think the current one-top-level-element-per-row
is quite a limited form of pretty-printing.

I have a table with a field whose type is an array of the type of another
table, and what I currently get with pretty=true is

hannu=# select row_to_json(test3, true) from test3;
-[ RECORD 1 ]--------------------------------------------------------------
row_to_json | {"id":1,
            |  "data3":[{"id":1,"data2":{"id":1,"data":"0.262814193032682","tstamp":"2012-04-05 13:21:03.235204"},"tstamp":"2012-04-05 13:25:03.644497"},{"id":2,"data2":{"id":2,"data":"0.157406373415142","tstamp":"2012-04-05 13:21:05.2033"},"tstamp":"2012-04-05 13:25:03.644497"}],
            |  "tstamp":"2012-04-16 14:40:15.795947"}

What I would like to get is what python's pprint does for the same json:

>>> pprint(row)
{'id': 1,
 'data3': [{'data2': {'data': '0.262814193032682',
                      'id': 1,
                      'tstamp': '2012-04-05 13:21:03.235204'},
            'id': 1,
            'tstamp': '2012-04-05 13:25:03.644497'},
           {'data2': {'data': '0.157406373415142',
                      'id': 2,
                      'tstamp': '2012-04-05 13:21:05.2033'},
            'id': 2,
            'tstamp': '2012-04-05 13:25:03.644497'}],
 'tstamp': '2012-04-16 14:40:15.795947'}

If we have a pretty flag, why not make it work all the way down the
structure?

> I'm inclined to leave these functions as they are, and consider
> adding a universal "to_json(anyelement)" (with no options) later.
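For comparison, Python's json module already gives this fully recursive pretty-printing through its indent parameter; a small sketch using data shaped like the row above (second array element omitted for brevity):

```python
import json

# Data shaped like the test3 row shown above (values abbreviated).
row = {
    "id": 1,
    "data3": [
        {"id": 1,
         "data2": {"id": 1, "data": "0.262814193032682",
                   "tstamp": "2012-04-05 13:21:03.235204"},
         "tstamp": "2012-04-05 13:25:03.644497"},
    ],
    "tstamp": "2012-04-16 14:40:15.795947",
}

# indent applies at every nesting level, not just the top one --
# the "all the way down" behavior being asked for here.
print(json.dumps(row, indent=2))
```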
To achieve recursive pretty-printing, the better way is to have a
universal to_json(anyelement) with a pretty-printing option

    to_json(datum anyelement, indent int)

with the behavior that if indent is NULL or a negative integer, no
pretty-printing is done; if it is 0, printing starts at the left margin;
and if it is a positive integer, that number of spaces is added to the
left for each row (except the first one) of the JSON representation.

And it would be overridable for specific types, so that hstore could
provide its own to_json(datum hstore, indent int) which would do the
correct pretty-printing for the hstore-as-json_object representation.

> Because it would not have options, it would not be meant to cover
> cases where there's value in formatting or conversion options;
> so it wouldn't render the existing functions entirely obsolete,
> nor would it mean there would be no need for other specialized
> conversion functions.

I don't object to row_to_json() and array_to_json() functions being
there as a convenience and as the two "official" functions guaranteed
to return "JSON text".

> 		regards, tom lane

-- 
-------
Hannu Krosing
PostgreSQL Unlimited Scalability and Performance Consultant
2ndQuadrant Nordic
PG Admin Book: http://www.2ndQuadrant.com/books/
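The proposed indent semantics can be modeled in a few lines of Python (a sketch of the suggested behavior only, not the PostgreSQL implementation; the function name simply mirrors the proposal):

```python
import json

def to_json(datum, indent=None):
    """Model of the proposed to_json(datum anyelement, indent int):
    NULL (None here) or a negative indent means no pretty-printing;
    0 or a positive value pretty-prints recursively, adding that many
    spaces to the left per nesting level."""
    if indent is None or indent < 0:
        # Compact, single-line output with no extra whitespace.
        return json.dumps(datum, separators=(',', ':'))
    # Multi-line output; indent=0 starts every row at the left margin.
    return json.dumps(datum, indent=indent)
```

With indent=0 the output is multi-line but flush with the left margin, matching the proposed meaning of 0; with indent=2 each nesting level is shifted two spaces.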