Re: to_json(NULL) should to return JSON null instead NULL - Mailing list pgsql-hackers

From Jim Nasby
Subject Re: to_json(NULL) should to return JSON null instead NULL
Date
Msg-id 55E21A64.4090600@BlueTreble.com
Whole thread Raw
In response to Re: to_json(NULL) should to return JSON null instead NULL  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 8/29/15 3:27 PM, Tom Lane wrote:
> Jim Nasby <Jim.Nasby@BlueTreble.com> writes:
>> On 8/29/15 12:29 PM, Pavel Stehule wrote:
>>> what is correct from JSON perspective? All fields with NULL
>
>> ISTM that the whole purpose of to_json is to properly jsonify something,
>> and the proper json form for "undefined" is 'null', is it not?
>
> What's not entirely clear is what we should do with cases like
>
> regression=#  select array_to_json(null::int[]);
>   array_to_json
> ---------------
>
> (1 row)
>
> regression=#  select row_to_json(null::record);
>   row_to_json
> -------------
>
> (1 row)
>
> If we leave those alone (and in the latter case, in particular, there is
> not enough information available to do much else) then it's not so clear
> that changing to_json() is really improving consistency overall.
> For instance, do we really want row_to_json(null::record) and
> to_json(null::record) giving different results?  Or if we make them
> both return "null", that breaks the previous invariant that row_to_json
> always yields a JSON object.

The tricky part is that if you're calling any of those functions to find 
the value to stick inside a JSON array or object then NULL is definitely 
incorrect. IE: if you were trying to turn the results of this

create table t(a int, b text);
insert into t(a) values(1);

Into a JSON object, you'd want '{"a":1,"b":null}', not NULL. Of course 
you'd just use row_to_json() for something that simple, but if you were 
doing something more complex you might have to code a transform yourself.

> An advantage of leaving these things as strict is that the user can easily
> substitute whatever specific behavior she wants for NULLs via coalesce(),
> as was shown upthread.  If we put in a different behavior, then the
> only way to override it would be with a CASE, which is tedious and creates
> multiple-evaluation issues.

Certainly true.

The downside to leaving it alone is this will probably be hard to debug 
if you're using it to build a complex JSON object. One NULL ends up in 
the right place and suddenly your whole output becomes NULL. I think 
this is why %s works the way it does in format as well.

If we do change it I think it best to add an argument to control what it 
does with a NULL so you can get whichever you need. It might be worth 
adding to (array|record)_to_json as well, though I think the use case 
for those is markedly different than for plain to_json, so maybe not.

> I'm not necessarily against changing it --- but it doesn't seem entirely
> black-and-white to me, and we do now have a couple of versions worth
> of precedent we'd be breaking with.
>
> If we do vote to change it, I'd want to do so now (ie in 9.5) rather than
> create yet another year's worth of precedent.

I wonder how much people are actually using to_json(). I've done some 
amount of JSON massaging and don't recall needing it yet.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com



pgsql-hackers by date:

Previous
From: Jim Nasby
Date:
Subject: Re: buffer README is out of date
Next
From: Tom Lane
Date:
Subject: Re: 9.4 broken on alpha