Thread: json function question

json function question

From: Dan S
Hi!

I'm running "PostgreSQL 9.5.1 on i686-pc-linux-gnu, compiled by gcc-4.4.real (Debian 4.4.5-8) 4.4.5, 32-bit"

I'm trying out json functions and stumbled on a problem with json_populate_record.
To try out the function I decided to take records from table test, convert them to json, and immediately repopulate them back into records of type test, but it fails with this message:

ERROR:  malformed array literal: "["abc","def","fgh"]"
DETAIL:  "[" must introduce explicitly-specified array dimensions.
********** Error **********

ERROR: malformed array literal: "["abc","def","fgh"]"
SQL state: 22P02
Detail: "[" must introduce explicitly-specified array dimensions.

Is it a bug, or how am I supposed to use the populate function?
If I try an equivalent example with hstore it works well.
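
For reference, the hstore round trip that works looks like this (using
hstore(record) and populate_record from the hstore extension):

select h, populate_record(null::test, h)
from
(
    select hstore(t) as h from test t
) r;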

I have this table, data and query:

create table test
(
    id int,
    txt text,
    txt_arr text[],
    f float
);

insert into test
values (1,'jkl','{abc,def,fgh}',3.14159),(2,'hij','{abc,def,fgh}',3.14159),(2,null,null,null),(3,'def',null,0);

select j, json_populate_record(null::test, j)
from
(
    select to_json(t) as j from test t
) r;
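
For row 1 the to_json(t) in the subquery comes out as:

{"id":1,"txt":"jkl","txt_arr":["abc","def","fgh"],"f":3.14159}

so it is the JSON array form of txt_arr that the array input parser
rejects.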


Best Regards
Dan S

Re: json function question

From: "David G. Johnston"
On Tue, Feb 23, 2016 at 12:12 PM, Dan S <strd911@gmail.com> wrote:
Hi!

I'm running "PostgreSQL 9.5.1 on i686-pc-linux-gnu, compiled by gcc-4.4.real (Debian 4.4.5-8) 4.4.5, 32-bit"

I'm trying out json functions and stumbled on a problem with json_populate_record.
To try out the function I decided to take records from table test, convert them to json, and immediately repopulate them back into records of type test, but it fails with this message:

ERROR:  malformed array literal: "["abc","def","fgh"]"
DETAIL:  "[" must introduce explicitly-specified array dimensions.
********** Error **********

ERROR: malformed array literal: "["abc","def","fgh"]"
SQL state: 22P02
Detail: "[" must introduce explicitly-specified array dimensions.

Is it a bug, or how am I supposed to use the populate function?

Personally, I'd call it a bug - whether by design or by implementation doesn't matter to me.

What json_populate_record seems to be looking for when faced with a "text[]" typed field is something of the form:

{"key":"{\"abc\",\"def\",\"ghi\"}"}

IOW, this works:

select j, json_populate_record(null::test, j)
from
(
    select '{"id":1,"txt":"jkl","txt_arr":"{\"abc\",\"def\",\"fgh\"}","f":3.14159}'::json AS j
) r

Namely a scalar literal that looks like a PostgreSQL array - as opposed to an actual JSON array.

The literal text is sent through the input function for text[] and gets parsed into a PostgreSQL text array.
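
For example, that embedded literal parses fine on its own:

select '{"abc","def","fgh"}'::text[];  -- yields {abc,def,fgh}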

Given that this presently errors, I would suggest we fix this case so that both forms are acceptable to the parser.  In other words, try harder in our effort to coerce between the two formats.
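
In the meantime, unpacking the array by hand along these lines should
work on 9.4 and up (an untested sketch; the json_typeof guard is there
because json_array_elements_text raises an error on a JSON null):

select (j->>'id')::int  as id,
       j->>'txt'        as txt,
       case when json_typeof(j->'txt_arr') = 'array'
            then (select array_agg(e)
                  from json_array_elements_text(j->'txt_arr') e)
       end              as txt_arr,
       (j->>'f')::float as f
from (select to_json(t) as j from test t) r;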

David J.



Re: json function question

From: Tom Lane
Dan S <strd911@gmail.com> writes:
> I have this table, data and query:

> create table test
> (
>     id int,
>     txt text,
>     txt_arr text[],
>     f float
> );

> insert into test
> values
> (1,'jkl','{abc,def,fgh}',3.14159),(2,'hij','{abc,def,fgh}',3.14159),(2,null,null,null),(3,'def',null,0);

> select j, json_populate_record(null::test, j)
> from
> (
>     select to_json(t) as j from test t
> ) r;

> ERROR:  malformed array literal: "["abc","def","fgh"]"
> DETAIL:  "[" must introduce explicitly-specified array dimensions.

> Is it a bug, or how am I supposed to use the populate function?

AFAICS, json_populate_record has no intelligence about nested container
situations.  It'll basically just push the JSON text representation of any
field of the top-level object at the input converter for the corresponding
composite-type column.  That doesn't work if you're trying to convert a
JSON array to a Postgres array, and it wouldn't work for sub-object to
composite column either, because of syntax discrepancies.

Ideally this would work for arbitrarily-deeply-nested array+record
structures, but it looks like a less than trivial amount of work to make
that happen.

> If I try an equivalent example with hstore it works well.

hstore hasn't got any concept of substructure in its field values, so
it's hard to see how you'd create an "equivalent" situation.

One problem with fixing this is avoiding backwards-compatibility breakage,
but I think we could do that by saying that we only change behavior when
(a) json sub-value is an array and target Postgres type is an array type,
or (b) json sub-value is an object and target Postgres type is a composite
type.  In both cases, current code would fail outright, so there's no
existing use-cases to protect.  For other target Postgres types, we'd
continue to do it as today, so for example conversion to a JSON column
type would continue to work as it does now.
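
(E.g., given a hypothetical table jt with a single json column j,
json_populate_record(null::jt, '{"j":[1,2,3]}'::json) succeeds today -
the sub-array's text is simply fed through json's input function - and
that behavior would be preserved.)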

I'm not sure if anything besides json[b]_populate_record needs to change
similarly, but we ought to look at all those conversion functions with
the thought of nested containers in mind.

            regards, tom lane

PS: I'm not volunteering to do the work here, but it seems like a good
change to make.


Re: json function question

From: "David G. Johnston"
On Tue, Feb 23, 2016 at 12:54 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Dan S <strd911@gmail.com> writes:
> I have this table, data and query:

> create table test
> (
>     id int,
>     txt text,
>     txt_arr text[],
>     f float
> );

> insert into test
> values
> (1,'jkl','{abc,def,fgh}',3.14159),(2,'hij','{abc,def,fgh}',3.14159),(2,null,null,null),(3,'def',null,0);

> select j, json_populate_record(null::test, j)
> from
> (
>     select to_json(t) as j from test t
> ) r;

> ERROR:  malformed array literal: "["abc","def","fgh"]"
> DETAIL:  "[" must introduce explicitly-specified array dimensions.

> Is it a bug, or how am I supposed to use the populate function?

AFAICS, json_populate_record has no intelligence about nested container
situations.  It'll basically just push the JSON text representation of any
field of the top-level object at the input converter for the corresponding
composite-type column.  That doesn't work if you're trying to convert a
JSON array to a Postgres array, and it wouldn't work for sub-object to
composite column either, because of syntax discrepancies.

Ideally this would work for arbitrarily-deeply-nested array+record
structures, but it looks like a less than trivial amount of work to make
that happen.

> If I try an equivalent example with hstore it works well.

hstore hasn't got any concept of substructure in its field values, so
it's hard to see how you'd create an "equivalent" situation.

Equivalent in the "ability to round-trip" sense.  Since hstore doesn't have nested containers, internal serialization of a record to hstore is forced to "stringify" the array, which can then be fed back in as-is.  But the [row_]to_json logic converts the PostgreSQL arrays to JSON arrays, and then we fail to handle them on the return portion of the trip.

Arrays are likely to be a much more common scenario, but I agree that dealing with arbitrary depths and objects would make the feature complete.

And yes, back-patching (and, ideally, any behavior change) should only occur for situations that today raise errors - as the example does.

David J.

Re: json function question

From: Dan S
Yes, I meant equivalence in the round-trip conversion sense.

And of course the "feature complete" solution which can handle deep structures would be really nice to have.

Best Regards
Dan S

2016-02-23 21:11 GMT+01:00 David G. Johnston <david.g.johnston@gmail.com>:
> If I try an equivalent example with hstore it works well.

hstore hasn't got any concept of substructure in its field values, so
it's hard to see how you'd create an "equivalent" situation.

Equivalent in the "ability to round-trip" sense.  Since hstore doesn't have nested containers, internal serialization of a record to hstore is forced to "stringify" the array, which can then be fed back in as-is.  But the [row_]to_json logic converts the PostgreSQL arrays to JSON arrays, and then we fail to handle them on the return portion of the trip.

Arrays are likely to be a much more common scenario, but I agree that dealing with arbitrary depths and objects would make the feature complete.

And yes, back-patching (and, ideally, any behavior change) should only occur for situations that today raise errors - as the example does.

David J.

Re: json function question

From: Merlin Moncure
On Tue, Feb 23, 2016 at 1:54 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Dan S <strd911@gmail.com> writes:
>> I have this table, data and query:
>
>> create table test
>> (
>>     id int,
>>     txt text,
>>     txt_arr text[],
>>     f float
>> );
>
>> insert into test
>> values
>> (1,'jkl','{abc,def,fgh}',3.14159),(2,'hij','{abc,def,fgh}',3.14159),(2,null,null,null),(3,'def',null,0);
>
>> select j, json_populate_record(null::test, j)
>> from
>> (
>>     select to_json(t) as j from test t
>> ) r;
>
>> ERROR:  malformed array literal: "["abc","def","fgh"]"
>> DETAIL:  "[" must introduce explicitly-specified array dimensions.
>
>> Is it a bug, or how am I supposed to use the populate function?
>
> AFAICS, json_populate_record has no intelligence about nested container
> situations.  It'll basically just push the JSON text representation of any
> field of the top-level object at the input converter for the corresponding
> composite-type column.  That doesn't work if you're trying to convert a
> JSON array to a Postgres array, and it wouldn't work for sub-object to
> composite column either, because of syntax discrepancies.
>
> Ideally this would work for arbitrarily-deeply-nested array+record
> structures, but it looks like a less than trivial amount of work to make
> that happen.
>
>> If I try an equivalent example with hstore it works well.
>
> hstore hasn't got any concept of substructure in its field values, so
> it's hard to see how you'd create an "equivalent" situation.
>
> One problem with fixing this is avoiding backwards-compatibility breakage,
> but I think we could do that by saying that we only change behavior when
> (a) json sub-value is an array and target Postgres type is an array type,
> or (b) json sub-value is an object and target Postgres type is a composite
> type.  In both cases, current code would fail outright, so there's no
> existing use-cases to protect.  For other target Postgres types, we'd
> continue to do it as today, so for example conversion to a JSON column
> type would continue to work as it does now.

I hope so.  When we debated these interfaces the current behavior was
accepted on the principle that nested structures could be deserialized
at some point in the future.  I think the endgame here is to be able
to do foo::json[b]::foo for just about any postgres type.
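
For the test table upthread, that amounts to wanting a round-trip check
like this to hold (today it just reproduces the reported error, because
of the text[] column):

select json_populate_record(null::test, to_json(t)) is not distinct from t
from test t;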

merlin


Re: json function question

From: Andrew Dunstan

On 02/23/2016 02:54 PM, Tom Lane wrote:
> Dan S <strd911@gmail.com> writes:
>> I have this table, data and query:
>> create table test
>> (
>>      id int,
>>      txt text,
>>      txt_arr text[],
>>      f float
>> );
>> insert into test
>> values
>> (1,'jkl','{abc,def,fgh}',3.14159),(2,'hij','{abc,def,fgh}',3.14159),(2,null,null,null),(3,'def',null,0);
>> select j, json_populate_record(null::test, j)
>> from
>> (
>>      select to_json(t) as j from test t
>> ) r;
>> ERROR:  malformed array literal: "["abc","def","fgh"]"
>> DETAIL:  "[" must introduce explicitly-specified array dimensions.
>> Is it a bug, or how am I supposed to use the populate function?
> AFAICS, json_populate_record has no intelligence about nested container
> situations.  It'll basically just push the JSON text representation of any
> field of the top-level object at the input converter for the corresponding
> composite-type column.  That doesn't work if you're trying to convert a
> JSON array to a Postgres array, and it wouldn't work for sub-object to
> composite column either, because of syntax discrepancies.
>
> Ideally this would work for arbitrarily-deeply-nested array+record
> structures, but it looks like a less than trivial amount of work to make
> that happen.
>
>> If I try an equivalent example with hstore it works well.
> hstore hasn't got any concept of substructure in its field values, so
> it's hard to see how you'd create an "equivalent" situation.
>
> One problem with fixing this is avoiding backwards-compatibility breakage,
> but I think we could do that by saying that we only change behavior when
> (a) json sub-value is an array and target Postgres type is an array type,
> or (b) json sub-value is an object and target Postgres type is a composite
> type.  In both cases, current code would fail outright, so there's no
> existing use-cases to protect.  For other target Postgres types, we'd
> continue to do it as today, so for example conversion to a JSON column
> type would continue to work as it does now.
>
> I'm not sure if anything besides json[b]_populate_record needs to change
> similarly, but we ought to look at all those conversion functions with
> the thought of nested containers in mind.
>
>             regards, tom lane
>
> PS: I'm not volunteering to do the work here, but it seems like a good
> change to make.
>


Historically, we had row_to_json before we had json_populate_record, and
a complete round-trip wasn't part of the design anyway AFAIR. Handling
nested composites and arrays would be a fairly large piece of work, and
I'm not available to do it either.

A much simpler way to get some round-trip-ability would be to have a row
to json converter that would stringify instead of decomposing nested
complex objects, much as hstore does. That would be fairly simple to do,
and the results should be able to be fed straight back into
json(b)_populate_record. I'm not volunteering to do that either, but the
work involved would probably be measured in hours rather than days or
weeks. Of course, the json produced by this would be ugly and the
stringified complex objects would be opaque to other json processors.
OTOH, many round-trip applications don't need to process the serialized
object on the way around. So this wouldn't be a cure-all but it might
meet some needs.
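
You could get that effect by hand today with something like this sketch
(using json_build_object, available since 9.4, with the array column
stringified via a cast):

select json_populate_record(null::test,
           json_build_object('id', t.id,
                             'txt', t.txt,
                             'txt_arr', t.txt_arr::text,  -- stringified, hstore-style
                             'f', t.f))
from test t;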

Having json(b)_populate_record recursively process nested complex
objects would be a large undertaking. One thing to consider is that json
arrays are quite different from Postgres arrays: they are essentially
one-dimensional heterogeneous lists, not multi-dimensional homogeneous
matrices. So while a Postgres array that's been converted to a json
array should in principle be convertible back, an arbitrary json array
could easily not be.

cheers

andrew


Re: json function question

From: "David G. Johnston"
On Wednesday, February 24, 2016, Andrew Dunstan <andrew@dunslane.net> wrote:

Having json(b)_populate_record recursively process nested complex objects would be a large undertaking. One thing to consider is that json arrays are quite different from Postgres arrays: they are essentially one-dimensional heterogeneous lists, not multi-dimensional homogeneous matrices. So while a Postgres array that's been converted to a json array should in principle be convertible back, an arbitrary json array could easily not be.


An arbitrary json array should be one-dimensional and homogeneous - seems like that should be easy to import.  The true concern is that not all PostgreSQL arrays are capable of being represented in json. 

I'd be happy with just accepting json arrays and ignoring complex and nested types.  While round-trip is nice, externally supplied json that uses arrays should be something we can import directly.

David J.

Re: json function question

From: Andrew Dunstan

On 02/24/2016 09:11 AM, David G. Johnston wrote:
> On Wednesday, February 24, 2016, Andrew Dunstan <andrew@dunslane.net> wrote:
>
>
>     Having json(b)_populate_record recursively process nested complex
>     objects would be a large undertaking. One thing to consider is
>     that json arrays are quite different from Postgres arrays: they
>     are essentially one-dimensional heterogeneous lists, not
>     multi-dimensional homogeneous matrices. So while a Postgres array
>     that's been converted to a json array should in principle be
>     convertible back, an arbitrary json array could easily not be.
>
>
> An arbitrary json array should be one-dimensional and homogeneous -
> seems like that should be easy to import.  The true concern is that
> not all PostgreSQL arrays are capable of being represented in json.



Neither of these things is true AFAIK.

1. The following is a 100% legal json array, about as heterogeneous as
can be:

    [ "a" , 1, true, null, [2,false], {"b": null} ]


2. Having implemented the routines to convert Postgres arrays to json,
I'm not aware of any which can't be converted. Please supply an example
of one that can't.


cheers

andrew




Re: json function question

From: Tom Lane
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> On Wednesday, February 24, 2016, Andrew Dunstan <andrew@dunslane.net> wrote:
>> Having json(b)_populate_record recursively process nested complex objects
>> would be a large undertaking. One thing to consider is that json arrays are
>> quite different from Postgres arrays: they are essentially one-dimensional
>> heterogeneous lists, not multi-dimensional homogeneous matrices. So while a
>> Postgres array that's been converted to a json array should in principle be
>> convertible back, an arbitrary json array could easily not be.

> An arbitrary json array should be one-dimensional and homogeneous - seems
> like that should be easy to import.  The true concern is that not all
> PostgreSQL arrays are capable of being represented in json.

I think we can put it on the user's head that the target Postgres array
type specified in json(b)_populate_record's arguments must be capable of
absorbing all elements of the matching JSON array.

Andrew raises a larger point: if the goal is that
json_populate_record(row_to_json()) be an identity with "deep" conversion
of container types, that puts constraints on row_to_json's behavior, which
we could not change without creating backwards-compatibility issues.
However, it looks to me like row_to_json already does pretty much the
right thing with nested array/record types:

regression=# select row_to_json(row(1,array[2,3],'(0,1)'::int8_tbl,array[(1,2),(3,4)]::int8_tbl[]));
                                   row_to_json
---------------------------------------------------------------------------------
 {"f1":1,"f2":[2,3],"f3":{"q1":0,"q2":1},"f4":[{"q1":1,"q2":2},{"q1":3,"q2":4}]}
(1 row)

So the complaint here is that json_populate_record fails to be an inverse
of row_to_json.

I'm not sure about Andrew's estimate that it'd be a large amount of work
to fix this.  It would definitely require some restructuring of the code
to make populate_record_worker (or some portion thereof) recursive, and
probably some entirely new code for array conversion; and making
json_populate_recordset behave similarly might take refactoring too.

            regards, tom lane


Re: json function question

From: "David G. Johnston"
On Wednesday, February 24, 2016, Andrew Dunstan <andrew@dunslane.net> wrote:


Neither of these things is true AFAIK.

1. The following is a 100% legal json array, about as heterogeneous as can be:

   [ "a" , 1, true, null, [2,false], {"b": null} ]


2. Having implemented the routines to convert Postgres arrays to json, I'm not aware of any which can't be converted. Please supply an example of one that can't.


Yeah, I confused heterogeneous and homogeneous.  That said, we already only promise best effort, so if presented with a non-complex array it would be nice to import it as part of that effort, even if something more complex cannot be and continues to fail.

David J.

Re: json function question

From: Andrew Dunstan

On 02/24/2016 09:41 AM, Tom Lane wrote:

> However, it looks to me like row_to_json already does pretty much the
> right thing with nested array/record types:
>
> regression=# select row_to_json(row(1,array[2,3],'(0,1)'::int8_tbl,array[(1,2),(3,4)]::int8_tbl[]));
>                                     row_to_json
> ---------------------------------------------------------------------------------
>   {"f1":1,"f2":[2,3],"f3":{"q1":0,"q2":1},"f4":[{"q1":1,"q2":2},{"q1":3,"q2":4}]}
> (1 row)
>
> So the complaint here is that json_populate_record fails to be an inverse
> of row_to_json.


Right.


>
> I'm not sure about Andrew's estimate that it'd be a large amount of work
> to fix this.  It would definitely require some restructuring of the code
> to make populate_record_worker (or some portion thereof) recursive, and
> probably some entirely new code for array conversion; and making
> json_populate_recordset behave similarly might take refactoring too.
>
>


One possible shortcut if we were just handling arrays and not nested
composites would be to mangle the json array to produce a Postgres array
literal. But if we're handling nested composites as well that probably
won't pass muster and we would need to decompose all the objects fully
and reassemble them into Postgres objects. Maybe it won't take as long
as I suspect. If anyone actually does it I'll be interested to find out
how long it took them :-)
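
(For the simple flat case the mangling can even be done in SQL, e.g.

    select translate('["abc","def","fgh"]', '[]', '{}')::text[];  -- yields {abc,def,fgh}

though that falls apart as soon as an element contains a bracket or
needs quoting.)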

cheers

andrew