Thread: json casts
I've been on the receiving end of a couple of mumbles about the fact that the JSON rendering code ignores casts of builtin types to JSON. This was originally done as an optimization to avoid doing cache lookups for casts for things we knew quite well how to turn into JSON values (unlike, say, hstore). However, there is at least one concrete case where this has some possibly undesirable consequences, namely timestamps. Many JSON processors, especially JavaScript/ECMAScript processors, require timestamp values to be in ISO 8601 format, with a 'T' between the date part and the time part, and thus they barf on the output we produce for such values. I'm therefore wondering if we should just remove this optimization. I don't have any performance figures, but it seems unlikely to be terribly expensive. Perhaps in 9.5 we should look at caching a lot of the info the json rendering code gathers, but that's something we can't look at now. cheers andrew
On 05/27/2014 10:53 PM, Andrew Dunstan wrote: > I've been on the receiving end of a couple of mumbles about the fact > that the JSON rendering code ignores casts of builtin types to JSON. > This was originally done as an optimization to avoid doing cache lookups > for casts for things we knew quite well how to turn into JSON values > (unlike, say, hstore). However, there is at least one concrete case > where this has some possibly undesirable consequences, namely > timestamps. Many JSON processors, especially JavaScript/ECMAScript > processors, require timestamp values to be in ISO 8601 format, with a > 'T' between the date part and the time part, and thus they barf on the > output we produce for such values. I don't understand what ignoring casts of builtin types to JSON means. Can you give an example? - Heikki
On 05/27/2014 03:57 PM, Heikki Linnakangas wrote: > On 05/27/2014 10:53 PM, Andrew Dunstan wrote: >> I've been on the receiving end of a couple of mumbles about the fact >> that the JSON rendering code ignores casts of builtin types to JSON. >> This was originally done as an optimization to avoid doing cache lookups >> for casts for things we knew quite well how to turn into JSON values >> (unlike, say, hstore). However, there is at least one concrete case >> where this has some possibly undesirable consequences, namely >> timestamps. Many JSON processors, especially JavaScript/ECMAScript >> processors, require timestamp values to be in ISO 8601 format, with a >> 'T' between the date part and the time part, and thus they barf on the >> output we produce for such values. > > I don't understand what ignoring casts of builtin types to JSON means. > Can you give an example? See src/backend/utils/adt/json.c:json_categorize_type() lines 1280-1300. When rendering some value as part of a json string, if a cast exists from the data type to json, then the cast function is used to render the json instead of the type's normal output function, but only if it's not a builtin type. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > On 05/27/2014 03:57 PM, Heikki Linnakangas wrote: >> On 05/27/2014 10:53 PM, Andrew Dunstan wrote: >>> I've been on the receiving end of a couple of mumbles about the fact >>> that the JSON rendering code ignores casts of builtin types to JSON. >>> This was originally done as an optimization to avoid doing cache lookups >>> for casts for things we knew quite well how to turn into JSON values >>> (unlike, say, hstore). However, there is at least one concrete case >>> where this has some possibly undesirable consequences, namely >>> timestamps. Many JSON processors, especially JavaScript/ECMAScript >>> processors, require timestamp values to be in ISO 8601 format, with a >>> 'T' between the date part and the time part, and thus they barf on the >>> output we produce for such values. >> I don't understand what ignoring casts of builtin types to JSON means. >> Can you give an example? > See src/backend/utils/adt/json.c:json_categorize_type() lines 1280-1300. > When rendering some value as part of a json string, if a cast exists > from the data type to json, then the cast function is used to render the > json instead of the type's normal output function, but only if it's not > a builtin type. How exactly would disabling that code have any effect on timestamp rendering? There's no cast to json from timestamps (nor any other builtin type, except jsonb). I'd be inclined to think a more useful answer to this issue would be to make json.c special-case timestamps, as it already does for numerics. regards, tom lane
On 05/27/2014 11:00 PM, Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: >> On 05/27/2014 03:57 PM, Heikki Linnakangas wrote: >>> On 05/27/2014 10:53 PM, Andrew Dunstan wrote: >>>> I've been on the receiving end of a couple of mumbles about the fact >>>> that the JSON rendering code ignores casts of builtin types to JSON. >>>> This was originally done as an optimization to avoid doing cache lookups >>>> for casts for things we knew quite well how to turn into JSON values >>>> (unlike, say, hstore). However, there is at least one concrete case >>>> where this has some possibly undesirable consequences, namely >>>> timestamps. Many JSON processors, especially JavaScript/ECMAScript >>>> processors, require timestamp values to be in ISO 8601 format, with a >>>> 'T' between the date part and the time part, and thus they barf on the >>>> output we produce for such values. >>> I don't understand what ignoring casts of builtin types to JSON means. >>> Can you give an example? >> See src/backend/utils/adt/json.c:json_categorize_type() lines 1280-1300. >> When rendering some value as part of a json string, if a cast exists >> from the data type to json, then the cast function is used to render the >> json instead of the type's normal output function, but only if it's not >> a builtin type. > How exactly would disabling that code have any effect on timestamp > rendering? There's no cast to json from timestamps (nor any other > builtin type, except jsonb). I think Andrews idea was, that if cast were used, one could fix the above problem by defining a correct cast. > > I'd be inclined to think a more useful answer to this issue would be to > make json.c special-case timestamps, as it already does for numerics. > > regards, tom lane But I agree that special-casing the code to use the de-facto json standard of using ISO 8601 date representation is a better solution. Just make sure you get the TZ part right - this is another place where PostgreSQL often differs from other systems' understanding of ISO timestamps. Cheers -- Hannu Krosing PostgreSQL Consultant Performance, Scalability and High Availability 2ndQuadrant Nordic OÜ
On 05/27/2014 05:43 PM, Hannu Krosing wrote: > On 05/27/2014 11:00 PM, Tom Lane wrote: >>> >>> See src/backend/utils/adt/json.c:json_categorize_type() lines 1280-1300. >>> When rendering some value as part of a json string, if a cast exists >>> from the data type to json, then the cast function is used to render the >>> json instead of the type's normal output function, but only if it's not >>> a builtin type. >> How exactly would disabling that code have any effect on timestamp >> rendering? There's no cast to json from timestamps (nor any other >> builtin type, except jsonb). > I think Andrews idea was, that if cast were used, one could fix the above > problem by defining a correct cast. Right. > > >> I'd be inclined to think a more useful answer to this issue would be to >> make json.c special-case timestamps, as it already does for numerics. >> >> OK, that's another approach. > But I agree that special-casing the code to use the de-facto json standard > of using ISO 8601 date representation is a better solution. > > Just make sure you get the TZ part right - this is another place where > PostgreSQL often differs from other systems' understanding of ISO > timestamps. The target would be something like: to_char($1,'\"YYYY-MM-DD"T"HH:MI:SS.USOF\"') AIUI that should be legal ISO 8601. But I'm happy to defer to experts. Given that this would be a hard coded behaviour change, is it too late to do this for 9.4? cheers andrew
* Andrew Dunstan (andrew@dunslane.net) wrote: > >>I'd be inclined to think a more useful answer to this issue would be to > >>make json.c special-case timestamps, as it already does for numerics. > >> > >> > > OK, that's another approach. I'm all for doing this for JSON, but it'd sure be nice to have the option to get actual ISO 8601 from the built-ins too... > Given that this would be a hard coded behaviour change, is it too > late to do this for 9.4? No, for my 2c. Thanks, Stephen
Stephen Frost <sfrost@snowman.net> writes: > * Andrew Dunstan (andrew@dunslane.net) wrote: >> Given that this would be a hard coded behaviour change, is it too >> late to do this for 9.4? > No, for my 2c. If we do it by adding casts then it'd require an initdb, so I'd vote against that for 9.4. If we just change behavior in json.c then that objection doesn't apply, so I wouldn't complain. regards, tom lane
On 05/27/2014 07:17 PM, Tom Lane wrote: > Stephen Frost <sfrost@snowman.net> writes: >> * Andrew Dunstan (andrew@dunslane.net) wrote: >>> Given that this would be a hard coded behaviour change, is it too >>> late to do this for 9.4? >> No, for my 2c. > If we do it by adding casts then it'd require an initdb, so I'd vote > against that for 9.4. If we just change behavior in json.c then that > objection doesn't apply, so I wouldn't complain. > > I wasn't proposing to add a cast, just to allow users to add one if they wanted. But I'm quite happy to go with special-casing timestamps, and leave the larger question for another time. cheers andrew
On Tue, May 27, 2014 at 5:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I'd be inclined to think a more useful answer to this issue would be to > make json.c special-case timestamps, as it already does for numerics. I wonder if anyone besides me is nervous about changing the semantics here. It seems like the sort of backward-compatibility break we normally avoid. If we do make such a compatibility break, it should certainly be release-noted. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 05/27/2014 11:55 PM, Robert Haas wrote: > On Tue, May 27, 2014 at 5:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I'd be inclined to think a more useful answer to this issue would be to >> make json.c special-case timestamps, as it already does for numerics. > I wonder if anyone besides me is nervous about changing the semantics > here. It seems like the sort of backward-compatibility break we > normally avoid. If we do make such a compatibility break, it should > certainly be release-noted. > Yes, certainly it should be. cheers andrew
On 05/27/2014 07:25 PM, Andrew Dunstan wrote: > > On 05/27/2014 07:17 PM, Tom Lane wrote: >> Stephen Frost <sfrost@snowman.net> writes: >>> * Andrew Dunstan (andrew@dunslane.net) wrote: >>>> Given that this would be a hard coded behaviour change, is it too >>>> late to do this for 9.4? >>> No, for my 2c. >> If we do it by adding casts then it'd require an initdb, so I'd vote >> against that for 9.4. If we just change behavior in json.c then that >> objection doesn't apply, so I wouldn't complain. >> >> > > > I wasn't proposing to add a cast, just to allow users to add one if > they wanted. But I'm quite happy to go with special-casing timestamps, > and leave the larger question for another time. > > Here's a draft patch. I'm still checking to see if there are other places that need to be fixed, but I think this has the main one. cheers andrew
Attachment
On 5/28/14, 6:48 PM, Andrew Dunstan wrote: > > On 05/27/2014 07:25 PM, Andrew Dunstan wrote: >> >> On 05/27/2014 07:17 PM, Tom Lane wrote: >>> Stephen Frost <sfrost@snowman.net> writes: >>>> * Andrew Dunstan (andrew@dunslane.net) wrote: >>>>> Given that this would be a hard coded behaviour change, is it too >>>>> late to do this for 9.4? >>>> No, for my 2c. >>> If we do it by adding casts then it'd require an initdb, so I'd vote >>> against that for 9.4. If we just change behavior in json.c then that >>> objection doesn't apply, so I wouldn't complain. >>> >>> >> >> >> I wasn't proposing to add a cast, just to allow users to add one if >> they wanted. But I'm quite happy to go with special-casing timestamps, >> and leave the larger question for another time. >> >> > > > Here's a draft patch. I'm still checking to see if there are other > places that need to be fixed, but I think this has the main one. This was solved back in the day for the xml type, which has essentially the same requirement, by adding an ISO-8601-compatible output option to EncodeDateTime(). See map_sql_value_to_xml_value() in xml.c. You ought to be able to reuse that. Seems easier than routing through to_char().
Peter Eisentraut <peter_e@gmx.net> writes: > This was solved back in the day for the xml type, which has essentially > the same requirement, by adding an ISO-8601-compatible output option to > EncodeDateTime(). See map_sql_value_to_xml_value() in xml.c. You ought > to be able to reuse that. Seems easier than routing through to_char(). I was wondering if we didn't have another way to do that. to_char() is squirrely enough that I really hate having any other core functionality depending on it. +1 for changing this. regards, tom lane
On 06/03/2014 04:45 PM, Tom Lane wrote: > Peter Eisentraut <peter_e@gmx.net> writes: >> This was solved back in the day for the xml type, which has essentially >> the same requirement, by adding an ISO-8601-compatible output option to >> EncodeDateTime(). See map_sql_value_to_xml_value() in xml.c. You ought >> to be able to reuse that. Seems easier than routing through to_char(). > I was wondering if we didn't have another way to do that. to_char() is > squirrely enough that I really hate having any other core functionality > depending on it. +1 for changing this. > > It's a bit of a pity neither of you spoke up in the 6 days since I published the draft patch. And honestly, some of the other code invoked by the XML code looks a bit squirrely too. But, OK, I will look at it. I guess I can assume that the output won't contain anything that needs escaping, so I can just add the leading and trailing quote marks without needing to call escape_json(). cheers andrew