Thread: Bypassing NULL elements in row_to_json function
I'm looking at the possibility of using JSON as a data exchange format with some apps running on both PCs and Macs. The table I would be exporting has a lot of NULL values in it. Is there any way to skip the NULL values in the row_to_json function and include only the fields that are non-null?
--
Mike Nolan
nolan@tssi.com
On 08/04/2016 16:31, Michael Nolan wrote:
> I'm looking at the possibility of using JSON as a data exchange format
> with some apps running on both PCs and Macs.
>
> The table I would be exporting has a lot of NULL values in it. Is
> there any way to skip the NULL values in the row_to_json function and
> include only the fields that are non-null?

You could use a CTE to filter out the nulls (not tested - I haven't used
JSON in PG (yet!)):

with no_nulls as (
    select ... from my_table
    where whatever is not null
)
select row_to_json(....) from no_nulls;

Ray.
--
Raymond O'Donnell :: Galway :: Ireland
rod@iol.ie
One of us is confused. I'm reading this as "I want a row_to_json() call to generate objects with different keys depending on whether a given key would have a null value - in which case, exclude the key." I think one would have to simply allow row_to_json() to populate the keys with null values and then post-process them away:

json_strip_nulls(row_to_json(...))
David J.
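[Editorial note: on versions without json_strip_nulls(), the same transformation is easy to do client-side after exporting. A minimal Python sketch of the behavior - this mirrors what the server function does, it is not PostgreSQL code:]

```python
import json

def strip_nulls(obj):
    """Drop object fields whose value is null, recursively, mirroring
    json_strip_nulls(). Like the server function, nulls that are array
    elements (rather than object values) are kept."""
    if isinstance(obj, dict):
        return {k: strip_nulls(v) for k, v in obj.items() if v is not None}
    if isinstance(obj, list):
        return [strip_nulls(v) for v in obj]
    return obj

row = {"id": "10000001", "name": "MISNER, J NATHAN", "supp": None, "rrtg": None}
print(json.dumps(strip_nulls(row)))  # -> {"id": "10000001", "name": "MISNER, J NATHAN"}
```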
On 04/08/2016 08:31 AM, Michael Nolan wrote:
> I'm looking at the possibility of using JSON as a data exchange format
> with some apps running on both PCs and Macs.
>
> The table I would be exporting has a lot of NULL values in it. Is
> there any way to skip the NULL values in the row_to_json function and
> include only the fields that are non-null?

I guess it depends on your data. Are the NULLs all in one field or scattered across fields? Imagining this scenario:

fld_1     fld_2     fld_3
'val1_1'  NULL      'val1_3'
NULL      'val2_2'  'val2_3'
'val3_1'  'val3_2'  NULL

How do you deal with the holes (NULL) on the receiving end?

--
Adrian Klaver
adrian.klaver@aklaver.com
It looks like json_strip_nulls() may be what I need. I'm currently on 9.3, which doesn't have that function, but I may be in a position to upgrade to 9.5 this summer. I think the apps that would be receiving the data can deal with any resulting 'holes' in the data set by just setting them to null.
--
Mike Nolan
I was able to try it on a test server: the combination of row_to_json() and json_strip_nulls() worked exactly as I had hoped. Stripping nulls reduced the JSON file by over 50%. (The data I needed to export has around 900,000 rows, so it gets quite large.)
I've got a test file I can make available to app developers.
My next task is to find out if validating and importing a JSON file into a table is as easy as exporting a table in JSON turned out to be. Thanks for the help.
--
Mike Nolan
2nd Followup: It turns out that loading a table from a JSON string is more complicated than going from a table to JSON, perhaps for good reason. There does not appear to be a direct inverse to the row_to_json() function, but it wasn't difficult for me to write a PHP program that takes the JSON file I created the other day and converts it back to a series of inserts, recreating the original table.
Of course this simple program does NO validation (not that this file needed any), so if the JSON string is not well-formed for any of a number of reasons, or if it is not properly mapped to the table into which the inserts are made, an insert could fail or result in incorrect data.
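[Editorial note: a rough Python sketch of that kind of converter - the original program was PHP, and the table/column names here are illustrative, not taken from it. It shares the limitation described above: malformed JSON simply raises, and nothing checks that values fit the target columns:]

```python
import json

def json_lines_to_inserts(lines, table, columns):
    """Turn newline-delimited JSON objects into (sql, params) pairs
    suitable for a parameterized executemany(). Keys stripped as nulls
    on export simply come back as None here."""
    placeholders = ", ".join(["%s"] * len(columns))
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    for line in lines:
        rec = json.loads(line)  # raises ValueError if not well-formed
        yield sql, tuple(rec.get(col) for col in columns)

cols = ["id", "name", "st", "exp", "sts"]
lines = ['{"id":"10000001","name":"MISNER, J NATHAN","st":"NY","exp":"2012-05-31","sts":"A"}']
for sql, params in json_lines_to_inserts(lines, "goldmast_test", cols):
    print(sql, params)
```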
--
Mike Nolan
See: http://www.postgresql.org/docs/9.5/interactive/functions-json.html
json_populate_record(base anyelement, from_json json)
json_populate_recordset(base anyelement, from_json json)
Exists in 9.3 too... though if you are going heavy on JSON I'd suggest doing whatever you can to keep up with the recent releases.
David J.
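[Editorial note: the semantics of those functions - fill a base record's columns from matching JSON keys, ignore keys that don't match - can be sketched client-side. A rough Python analogue, not the server implementation:]

```python
import json

def populate_record(base, from_json):
    """Client-side analogue of json_populate_record(base, from_json):
    start from the base record, overwrite any field whose key appears in
    the JSON object, ignore JSON keys with no matching column, and leave
    absent fields at their base value."""
    rec = dict(base)
    for k, v in json.loads(from_json).items():
        if k in rec:
            rec[k] = v
    return rec

base = {"id": None, "name": None, "st": None, "exp": None, "sts": None}
print(populate_record(base, '{"id":"10000002","name":"MISNER, JUDY"}'))
```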
On Sun, Apr 10, 2016 at 2:30 AM, David G. Johnston <david.g.johnston@gmail.com> wrote:
> json_populate_record(base anyelement, from_json json)
> json_populate_recordset(base anyelement, from_json json)
>
> Exists in 9.3 too... though if you are going heavy on JSON I'd suggest
> doing whatever you can to keep up with the recent releases.
If there's a way to use the json_populate_record() or json_populate_recordset() functions to load a table from a JSON file (e.g., using COPY), it would be nice if it were better documented. I did find a tool that loads a JSON file into a table (pgfutter), and even loaded one row from that table into another table using json_populate_record(), but the 'subquery returned multiple rows' issue wouldn't let me do the entire table.
But that still doesn't deal with validating individual fields or checking that the JSON is complete and consistent with the table to be loaded.
--
Mike Nolan
On 04/10/2016 07:49 AM, Michael Nolan wrote:
> If there's a way to use the json_populate_record() or
> json_populate_recordset() functions to load a table from a JSON file
> (eg, using copy), it would be nice if it was better documented. I did
> find a tool that loads a JSON file into a table (pgfutter), but the
> 'subquery returned multiple rows' issue wouldn't let me do the entire
> table.
It isn't that involved once you've learned generally how to call normal record functions and also set-returning functions (the latter must be in the FROM clause of the query). If you provide what you attempted, it becomes easier to explain your misunderstanding.
It doesn't work with COPY. You have to write an explicit INSERT+SELECT query where the text of the JSON is a parameter. Your client library should let you do this. If you are using "psql", which doesn't support parameters, you end up having to store the JSON in a psql variable and reference that in the function.
The function ensures that column order is consistent so "INSERT INTO %I" is all you need to write.
Data validation is why we invented CHECK constraints - if you need more functionality than the simple mechanical conversion from a json object to a table row you will need to write code somewhere to do the additional work. All json_populate_record(set) promises is that the above command will work.
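[Editorial note: a client-side sketch of that kind of field-level validation, in Python rather than SQL. The per-column checks below are invented for illustration; real rules would come from the target table's CHECK constraints:]

```python
import json
from datetime import date

def is_iso_date(v):
    """True if v parses as an ISO date (YYYY-MM-DD)."""
    try:
        date.fromisoformat(v)
        return True
    except (TypeError, ValueError):
        return False

# Hypothetical per-column checks; NULL (absent key) is allowed everywhere,
# mirroring columns without NOT NULL constraints.
CHECKS = {
    "id":  lambda v: v is None or (len(v) <= 8 and v.isdigit()),
    "sts": lambda v: v is None or v in ("A", "I"),
    "exp": lambda v: v is None or is_iso_date(v),
}

def validate(line):
    """Return the list of column names that fail their check.
    Raises ValueError if the line is not well-formed JSON."""
    rec = json.loads(line)
    return [k for k, check in CHECKS.items() if not check(rec.get(k))]

print(validate('{"id":"10000001","sts":"A","exp":"2012-05-31"}'))  # -> []
```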
I suppose the way you'd write your attempt that failed would be similar to:

INSERT INTO %I
SELECT rec.*
FROM src_table_with_json,
  LATERAL json_populate_record(null::%I, src_table_with_json.json_column) AS rec;
Again, seeing what you actually did would be helpful - I'm having trouble imagining what you did to provoke that particular error.
David J.
On 04/10/2016 07:49 AM, Michael Nolan wrote:
> If there's a way to use the json_populate_record() or
> json_populate_recordset() functions to load a table from a JSON file
> (eg, using copy), it would be nice if it was better documented. I did
> find a tool that loads a JSON file into a table (pgfutter), and even
> loaded one row from that table into another table using
> json_populate_record(), but the 'subquery returned multiple rows' issue
> wouldn't let me do the entire table.

Does the receiving table have the same structure as the sending table? Is the receiving table already populated with data?

> But that still doesn't deal with validating individual fields or
> checking that the JSON is complete and consistent with the table to be
> loaded.

Well, you know the JSON is not complete, as you dropped all the fields in each row that had NULL values. Validation is a more complex subject and honestly something I do not think could be accomplished in straight SQL. In other words, it would need to be run through some ETL tool. I use Python, so as an example:

https://petl.readthedocs.org/en/latest/

In particular:

https://petl.readthedocs.org/en/latest/io.html#json-files
https://petl.readthedocs.org/en/latest/transform.html#validation

--
Adrian Klaver
adrian.klaver@aklaver.com
Here's what I did:
\d gold1604_test
Table "uscf.gold1604_test"
Column | Type | Modifiers
--------+------+-----------
data | json |
Some sample data:
{"id":"10000001","name":"MISNER, J NATHAN","st":"NY","exp":"2012-05-31","sts":"A"}
{"id":"10000002","name":"MISNER, JUDY","st":"TN","exp":"2007-07-31","sts":"I"}
{"id":"10000003","name":"MISNER, J AMSCHEL","st":"NY","exp":"2007-05-31","sts":"A"}
uscf-> \d goldmast_test
Table "uscf.goldmast_test"
Column | Type | Modifiers
--------+-----------------------+-----------
id | character varying(8) |
name | character varying(40) |
st | character varying(2) |
exp | date |
sts | character(1) |
supp | date |
rrtg | character varying(8) |
qrtg | character varying(8) |
brtg | character varying(8) |
oqrtg | character varying(8) |
obrtg | character varying(8) |
fid | character varying(12) |
insert into goldmast_test select * from json_populate_record(NULL::"goldmast_test", (select * from gold1604_test limit 1) )
produces:
uscf=> select * from goldmast_test;
    id    |       name       | st |    exp     | sts | supp | rrtg | qrtg | brtg | oqrtg | obrtg | fid
----------+------------------+----+------------+-----+------+------+------+------+-------+-------+-----
 10000001 | MISNER, J NATHAN | NY | 2012-05-31 | A   |      |      |      |      |       |       |
(1 row)
The fact that the null values were stripped out is not an issue here.
But,
uscf=> insert into goldmast_test select * from json_populate_record(NULL::"goldmast_test", (select * from gold1604_test limit 2) )
uscf-> \g
ERROR: more than one row returned by a subquery used as an expression
Is there a way to get around the one row per subquery issue?
--
Mike Nolan
On 04/10/2016 08:39 AM, Michael Nolan wrote:
> uscf=> insert into goldmast_test select * from
> json_populate_record(NULL::"goldmast_test", (select * from gold1604_test
> limit 2) )
> uscf-> \g
> ERROR: more than one row returned by a subquery used as an expression
>
> Is there a way to get around the one row per subquery issue?

Per David's post:

http://www.postgresql.org/docs/9.5/interactive/functions-json.html

json_populate_recordset(base anyelement, from_json json)

Expands the outermost array of objects in from_json to a set of rows whose columns match the record type defined by base (see note below).

select * from json_populate_recordset(null::myrowtype, '[{"a":1,"b":2},{"a":3,"b":4}]')

--
Adrian Klaver
adrian.klaver@aklaver.com
On Sun, Apr 10, 2016 at 8:39 AM, Michael Nolan wrote:
> Here's what I did:
>
> \d gold1604_test
> Table "uscf.gold1604_test"
> Column | Type | Modifiers
> --------+------+-----------
> data | json |
>
> Some sample data:
> {"id":"10000001","name":"MISNER, J NATHAN","st":"NY","exp":"2012-05-31","sts":"A"}
> {"id":"10000002","name":"MISNER, JUDY","st":"TN","exp":"2007-07-31","sts":"I"}
> {"id":"10000003","name":"MISNER, J AMSCHEL","st":"NY","exp":"2007-05-31","sts":"A"}
(I think) PostgreSQL assumes that there is only a single top-level json element, whether it be an array or an object. The first thing you'd have to do is split on the newline and create a PostgreSQL text array.
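[Editorial note: a client-side sketch of that splitting step - Python standing in for string_to_array(blob, E'\n') followed by a per-element ::json cast; the sample blob is illustrative:]

```python
import json

def split_json_lines(blob):
    """Split a newline-delimited blob of JSON objects into parsed rows,
    skipping blank lines -- the analogue of string_to_array(blob, E'\\n')
    with each element then cast to json."""
    return [json.loads(line) for line in blob.split("\n") if line.strip()]

blob = '{"id":"10000001","sts":"A"}\n{"id":"10000002","sts":"I"}\n'
rows = split_json_lines(blob)
print(len(rows), rows[0]["id"])  # -> 2 10000001
```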
> But,
> uscf=> insert into goldmast_test select * from
> json_populate_record(NULL::"goldmast_test", (select * from gold1604_test
> limit 2) )
> uscf-> \g
> ERROR: more than one row returned by a subquery used as an expression
>
> Is there a way to get around the one row per subquery issue?
Yes, use LATERAL.
Something like the following should work (not tested):
INSERT INTO goldmast_test
SELECT jpr.*
FROM gold1604_test,
  LATERAL json_populate_record(null::goldmast_test, data) AS jpr;
Ideally you could just do (not tested):

INSERT INTO goldmast_test
SELECT jpr.*
FROM (SELECT t::json AS j
      FROM unnest(string_to_array(?::text, E'\n')) AS un (t)) AS src,
  LATERAL json_populate_record(null::goldmast_test, src.j) AS jpr;

Where the "?::text" is a placeholder for the textual JSON being handed to the query, thus avoiding the temporary "gold1604_test" table.
David J.
In case it wasn't clear, the sample data was 3 rows of data. (There are actually around 890K rows in the table pgfutter built from the JSON file.)
--
Mike Nolan
On 04/10/2016 09:24 AM, David G. Johnston wrote:
> Something like the following should work (not tested):
>
> INSERT INTO goldmast_test
> SELECT jpr.*
> FROM gold1604_test
> LATERAL json_populate_record(null::goldmast_test", data) AS jpr

I can confirm this works after a little clean up:

test=> INSERT INTO goldmast_test SELECT jpr.* FROM gold1604_test, LATERAL json_populate_record(null::"goldmast_test", data) AS jpr;
INSERT 0 3

test=> select * from goldmast_test;
    id    |       name        | st |    exp     | sts | supp | rrtg | qrtg | brtg | oqrtg | obrtg | fid
----------+-------------------+----+------------+-----+------+------+------+------+-------+-------+-----
 10000001 | MISNER, J NATHAN  | NY | 2012-05-31 | A   |      |      |      |      |       |       |
 10000002 | MISNER, JUDY      | TN | 2007-07-31 | I   |      |      |      |      |       |       |
 10000003 | MISNER, J AMSCHEL | NY | 2007-05-31 | A   |      |      |      |      |       |       |
(3 rows)

While trying to figure out how it works I discovered the LATERAL is not necessary:

test=> INSERT INTO goldmast_test SELECT jpr.* FROM gold1604_test, json_populate_record(null::"goldmast_test", data) AS jpr;
INSERT 0 3

test=> select * from goldmast_test;
    id    |       name        | st |    exp     | sts | supp | rrtg | qrtg | brtg | oqrtg | obrtg | fid
----------+-------------------+----+------------+-----+------+------+------+------+-------+-------+-----
 10000001 | MISNER, J NATHAN  | NY | 2012-05-31 | A   |      |      |      |      |       |       |
 10000002 | MISNER, JUDY      | TN | 2007-07-31 | I   |      |      |      |      |       |       |
 10000003 | MISNER, J AMSCHEL | NY | 2007-05-31 | A   |      |      |      |      |       |       |
 10000001 | MISNER, J NATHAN  | NY | 2012-05-31 | A   |      |      |      |      |       |       |
 10000002 | MISNER, JUDY      | TN | 2007-07-31 | I   |      |      |      |      |       |       |
 10000003 | MISNER, J AMSCHEL | NY | 2007-05-31 | A   |      |      |      |      |       |       |
(6 rows)

--
Adrian Klaver
adrian.klaver@aklaver.com