Re: to_json(NULL) should to return JSON null instead NULL - Mailing list pgsql-hackers

From David G. Johnston
Subject Re: to_json(NULL) should to return JSON null instead NULL
Date
Msg-id CAKFQuwaVZLfmkE_07Qx0gwVZLuEsdtGdpAOnPyKxMph+gF6C=w@mail.gmail.com
Whole thread Raw
In response to Re: to_json(NULL) should to return JSON null instead NULL  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sat, Aug 29, 2015 at 4:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Jim Nasby <Jim.Nasby@BlueTreble.com> writes:
> On 8/29/15 12:29 PM, Pavel Stehule wrote:
>> what is correct from JSON perspective? All fields with NULL

> ISTM that the whole purpose of to_json is to properly jsonify something,
> and the proper json form for "undefined" is 'null', is it not?

What's not entirely clear is what we should do with cases like

regression=#  select array_to_json(null::int[]);
 array_to_json
---------------

(1 row)

regression=#  select row_to_json(null::record);
 row_to_json
-------------

(1 row)

If we leave those alone (and in the latter case, in particular, there is
not enough information available to do much else) then it's not so clear
that changing to_json() is really improving consistency overall.
For instance, do we really want row_to_json(null::record) and
to_json(null::record) giving different results?  Or if we make them
both return "null", that breaks the previous invariant that row_to_json
always yields a JSON object.

An advantage of leaving these things as strict is that the user can easily
substitute whatever specific behavior she wants for NULLs via coalesce(),
as was shown upthread.  If we put in a different behavior, then the
only way to override it would be with a CASE, which is tedious and creates
multiple-evaluation issues.

​Wouldn't NULLIF(any, any) be another means of overriding?​


I'm not necessarily against changing it --- but it doesn't seem entirely
black-and-white to me, and we do now have a couple of versions worth
of precedent we'd be breaking with.

If we do vote to change it, I'd want to do so now (ie in 9.5) rather than
create yet another year's worth of precedent.


​I don't see COALESCE as being particularly problematic since we are going to JSON which means that in (nearly?) all situations all of the potential input values will be homogeneous and a single coalesce can apply the relevant logic at the point of conversion.  Since row_to_json and record_to_json effectively deal with heterogeneous input types ​the same opportunity is not available to them and simply incorporating null into the output for that position is the only reasonable thing to do.

I'm tending to favor removing the strict modifier and having NULL => json 'null' with the advice that the old behavior can be obtained by writing "NULLIF(to_json(...), json 'null')".

If we go this route we should also consider returning json 'null' for (record|array)_to_json(NULL) at the same time.

My experience with the json feature is has strictly been the json_to_record direction...but it seems like a wasted opportunity to be useful in the default case when JSON provides a null concept that is semantically similar to the concept in SQL.  While I'm hesitant to change this without representative use cases to work from the theory argument holds enough weight to strongly consider making the change.

David J.


pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: WIP: About CMake v2
Next
From: Jeff Janes
Date:
Subject: Re: buffer README is out of date