Thread: Conflict between JSON_AGG and COPY
Dear all,
I've found one case. I don't know this is a bug or I config/query some things wrong.
Let I describe it. I have a table with structure and data is:
id | username | fullname
----+-------------+---------------
1 | john | John
2 | anna | Anna
3 | sussi | Sussi
4 | david | David Beckham
5 | wayne | Wayne Rooney
----+-------------+---------------
1 | john | John
2 | anna | Anna
3 | sussi | Sussi
4 | david | David Beckham
5 | wayne | Wayne Rooney
I want to export it to a file in JSON format so I run the query as below:
COPY (SELECT row_to_json(t) FROM (SELECT json_agg(t1) AS "RECORDS" FROM test_table t1) t) TO '/home/postgres/test1.json';
But the result I got will include "\n" in the result:
{"RECORDS":[{"id":1,"username":"john","fullname":"John"}, \n {"id":2,"username":"anna","fullname":"Anna"}, \n {"id":3,"username":"sussi","fullname":"Sussi"}, \n {"id":4,"username":"david","fullname":"David Beckham"}, \n {"id":5,"username":"wayne","fullname":"Wayne Rooney"}]}
Then, I try to get the same data in the other way:
COPY (WITH t2 AS (select row_to_json(t) as js from test_table t),
t1 AS (SELECT ARRAY_TO_JSON(array_agg(js)) as "RECORDS" FROM t2)
SELECT row_to_json(t1) FROM t1)
TO '/home/postgres/test2.json';
And the result I got is quite match what I expect.
{"RECORDS":[{"id":1,"username":"john","fullname":"John"},{"id":2,"username":"anna","fullname":"Anna"},{"id":3,"username":"sussi","fullname":"Sussi"},{"id":4,"username":"david","fullname":"David Beckham"},{"id":5,"username":"wayne","fullname":"Wayne Rooney"}]}
I think the COPY command does not the `\n` character for pretty in `json_agg` command.
Please help me give me your idea. Am I wrong or this is really a bug?
Thank you and best regards,
Đỗ Ngọc Trí Cường (Mr.) Software Development Dept. | Mobile: +84 9 3800 3394 Phone: +84 28 3715 6322 Email: dntcuong@digi-texx.vn |
DIGI-TEXX | a global BPO provider Address: Anna Building, Quang Trung Software City, District. 12, Ho Chi Minh City, Vietnam Website: www.digi-texx.vn | |
IMPORTANT NOTICE: *This e-mail and any attachments may contain confidential and/or privileged information. If you are not the intended recipient, please delete it and notify the sender immediately. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden. *Please consider the environment before printing. |
On 04/07/2018 03:44 AM, Đỗ Ngọc Trí Cường wrote: > Dear all, > > I've found one case. I don't know this is a bug or I config/query some > things wrong. > > Let I describe it. I have a table with structure and data is: > > id | username | fullname > ----+-------------+--------------- > 1 | john | John > 2 | anna | Anna > 3 | sussi | Sussi > 4 | david | David Beckham > 5 | wayne | Wayne Rooney > > I want to export it to a file in JSON format so I run the query as below: > > COPY (SELECT row_to_json(t) FROM (SELECT json_agg(t1) AS "RECORDS" > FROM test_table t1) t) TO '/home/postgres/test1.json'; > > But the result I got will include "\n" in the result: > > {"RECORDS":[{"id":1,"username":"john","fullname":"John"}, \n > {"id":2,"username":"anna","fullname":"Anna"}, \n > {"id":3,"username":"sussi","fullname":"Sussi"}, \n > {"id":4,"username":"david","fullname":"David Beckham"}, \n > {"id":5,"username":"wayne","fullname":"Wayne Rooney"}]} > > Then, I try to get the same data in the other way: > > COPY (WITH t2 AS (select row_to_json(t) as js from test_table t), > t1 AS (SELECT ARRAY_TO_JSON(array_agg(js)) as "RECORDS" > FROM t2) > SELECT row_to_json(t1) FROM t1) > TO '/home/postgres/test2.json'; CREATE TABLE test_table (id integer, username varchar, fullname varchar); INSERT INTO test_table VALUES (1, 'john', 'John'), (2, 'anna', 'Anna'), (3, 'sussi', 'Sussi'), (4, 'david', 'David Beckham'), (5, 'wayne', 'Wayne Rooney'); This can be shortened to: COPY (select array_to_json(array_agg(row_to_json(t))) FROM test_table AS t) TO '/home/postgres/test2.json'; > > And the result I got is quite match what I expect. > > {"RECORDS":[{"id":1,"username":"john","fullname":"John"},{"id":2,"username":"anna","fullname":"Anna"},{"id":3,"username":"sussi","fullname":"Sussi"},{"id":4,"username":"david","fullname":"David > Beckham"},{"id":5,"username":"wayne","fullname":"Wayne Rooney"}]} > > > I think the COPY command does not the `\n` character for pretty in > `json_agg` command. Well in the first case you are starting by concatenating the 5 rows in the table into a single row with the table rows separated by new lines: SELECT json_agg(t1) AS "RECORDS" FROM test_table t1; RECORDS ----------------------------------------------------------- [{"id":1,"username":"john","fullname":"John"}, + {"id":2,"username":"anna","fullname":"Anna"}, + {"id":3,"username":"sussi","fullname":"Sussi"}, + {"id":4,"username":"david","fullname":"David Beckham"}, + {"id":5,"username":"wayne","fullname":"Wayne Rooney"}] (1 row) In the second case you start by maintaining the separate table rows: select row_to_json(t) as js from test_table t; js -------------------------------------------------------- {"id":1,"username":"john","fullname":"John"} {"id":2,"username":"anna","fullname":"Anna"} {"id":3,"username":"sussi","fullname":"Sussi"} {"id":4,"username":"david","fullname":"David Beckham"} {"id":5,"username":"wayne","fullname":"Wayne Rooney"} (5 rows) and then keeping that as an array of arrays: select array_agg(row_to_json(t)) from test_table t; {"{\"id\":1,\"username\":\"john\",\"fullname\":\"John\"}","{\"id\":2,\"username\":\"anna\",\"fullname\":\"Anna\"}","{\"id\":3,\"username\":\"sussi\",\"fullname\":\"Sussi\"}","{\"id\":4,\"username\":\"david\",\"fullname\":\"David Beckham\"}","{\"id\":5,\"username\":\"wayne\",\"fullname\":\"Wayne Rooney\"}"} (1 row) which then gets turned back into JSON: select ARRAY_TO_JSON(array_agg(row_to_json(t))) from test_table t; [{"id":1,"username":"john","fullname":"John"},{"id":2,"username":"anna","fullname":"Anna"},{"id":3,"username":"sussi","fullname":"Sussi"},{"id":4,"username":"david","fullname":"David Beckham"},{"id":5,"username":"wayne","fullname":"Wayne Rooney"}] > > Please help me give me your idea. Am I wrong or this is really a bug? > > Thank you and best regards, > > Đỗ Ngọc Trí*Cường*(Mr.) > > *Software Development Dept.* > > > > Mobile: +84 9 3800 3394 <tel:+84917220706> > > Phone: +84 28 3715 6322 <callto:+84%208%203715%205325> > > Email: dntcuong@digi-texx.vn > > DIGI-TEXX | a global BPO provider > > Address: Anna Building, Quang Trung Software City, > > District. 12, Ho Chi Minh City, Vietnam > > Website: www.digi-texx.vn <http://www.digi-texx.vn/> > > // > > /IMPORTANT NOTICE:/ > > /*This e-mail and any attachments may contain confidential and/or > privileged information. If you are not the intended recipient, please > delete it and notify the sender immediately. Any unauthorized copying, > disclosure or distribution of the material in this e-mail is strictly > forbidden./ > > /*Please consider the environment before printing./ > > -- Adrian Klaver adrian.klaver@aklaver.com
Dear Arian Klaver,
I think there is a misunderstood here.
I think that I quite understand how is the second query run.
The question I asked here is why exporting data, which is generated by a query "JSON_AGG" , with command "COPY". The data contain "\n" as 2 characters instead of "new line" character.
The second query is for the old version of PostgreSQL (9.3 and previous) cause of they don't have JSON_AGG aggregate function. Since 9.4, we have "JSON_AGG" already. So I want to rewrite and reduce the length of the query. But it is don't work as I expected with command COPY.
Thank you and best regards,
Đỗ Ngọc Trí Cường (Mr.) | Software Development Department | +84 28 3715 5325
From: "Adrian Klaver" <adrian.klaver@aklaver.com>
To: "Đỗ Ngọc Trí Cường" <dntcuong@digi-texx.vn>, "pgsql-general" <pgsql-general@lists.postgresql.org>
Sent: Monday, April 9, 2018 12:59:44 AM
Subject: Re: Conflict between JSON_AGG and COPY
To: "Đỗ Ngọc Trí Cường" <dntcuong@digi-texx.vn>, "pgsql-general" <pgsql-general@lists.postgresql.org>
Sent: Monday, April 9, 2018 12:59:44 AM
Subject: Re: Conflict between JSON_AGG and COPY
On 04/07/2018 03:44 AM, Đỗ Ngọc Trí Cường wrote:
Dear all,
I've found one case. I don't know this is a bug or I config/query some
things wrong.
Let I describe it. I have a table with structure and data is:
id | username | fullname
----+-------------+---------------
1 | john | John
2 | anna | Anna
3 | sussi | Sussi
4 | david | David Beckham
5 | wayne | Wayne Rooney
I want to export it to a file in JSON format so I run the query as below:
COPY (SELECT row_to_json(t) FROM (SELECT json_agg(t1) AS "RECORDS"
FROM test_table t1) t) TO '/home/postgres/test1.json';
But the result I got will include "\n" in the result:
{"RECORDS":[{"id":1,"username":"john","fullname":"John"}, \n
{"id":2,"username":"anna","fullname":"Anna"}, \n
{"id":3,"username":"sussi","fullname":"Sussi"}, \n
{"id":4,"username":"david","fullname":"David Beckham"}, \n
{"id":5,"username":"wayne","fullname":"Wayne Rooney"}]}
Then, I try to get the same data in the other way:
COPY (WITH t2 AS (select row_to_json(t) as js from test_table t),
t1 AS (SELECT ARRAY_TO_JSON(array_agg(js)) as "RECORDS"
FROM t2)
SELECT row_to_json(t1) FROM t1)
TO '/home/postgres/test2.json';
CREATE TABLE test_table (id integer, username varchar, fullname varchar);
INSERT INTO
test_table
VALUES
(1, 'john', 'John'),
(2, 'anna', 'Anna'),
(3, 'sussi', 'Sussi'),
(4, 'david', 'David Beckham'),
(5, 'wayne', 'Wayne Rooney');
This can be shortened to:
COPY
(select array_to_json(array_agg(row_to_json(t))) FROM test_table AS t)
TO '/home/postgres/test2.json';
And the result I got is quite match what I expect.
{"RECORDS":[{"id":1,"username":"john","fullname":"John"},{"id":2,"username":"anna","fullname":"Anna"},{"id":3,"username":"sussi","fullname":"Sussi"},{"id":4,"username":"david","fullname":"David
Beckham"},{"id":5,"username":"wayne","fullname":"Wayne Rooney"}]}
I think the COPY command does not the `\n` character for pretty in
`json_agg` command.
Well in the first case you are starting by concatenating the 5 rows in
the table into a single row with the table rows separated by new lines:
SELECT json_agg(t1) AS "RECORDS" FROM test_table t1;
RECORDS
-----------------------------------------------------------
[{"id":1,"username":"john","fullname":"John"}, +
{"id":2,"username":"anna","fullname":"Anna"}, +
{"id":3,"username":"sussi","fullname":"Sussi"}, +
{"id":4,"username":"david","fullname":"David Beckham"}, +
{"id":5,"username":"wayne","fullname":"Wayne Rooney"}]
(1 row)
In the second case you start by maintaining the separate table rows:
select row_to_json(t) as js from test_table t;
js
--------------------------------------------------------
{"id":1,"username":"john","fullname":"John"}
{"id":2,"username":"anna","fullname":"Anna"}
{"id":3,"username":"sussi","fullname":"Sussi"}
{"id":4,"username":"david","fullname":"David Beckham"}
{"id":5,"username":"wayne","fullname":"Wayne Rooney"}
(5 rows)
and then keeping that as an array of arrays:
select array_agg(row_to_json(t)) from test_table t;
{"{\"id\":1,\"username\":\"john\",\"fullname\":\"John\"}","{\"id\":2,\"username\":\"anna\",\"fullname\":\"Anna\"}","{\"id\":3,\"username\":\"sussi\",\"fullname\":\"Sussi\"}","{\"id\":4,\"username\":\"david\",\"fullname\":\"David
Beckham\"}","{\"id\":5,\"username\":\"wayne\",\"fullname\":\"Wayne
Rooney\"}"}
(1 row)
which then gets turned back into JSON:
select ARRAY_TO_JSON(array_agg(row_to_json(t))) from test_table t;
[{"id":1,"username":"john","fullname":"John"},{"id":2,"username":"anna","fullname":"Anna"},{"id":3,"username":"sussi","fullname":"Sussi"},{"id":4,"username":"david","fullname":"David
Beckham"},{"id":5,"username":"wayne","fullname":"Wayne Rooney"}]
Please help me give me your idea. Am I wrong or this is really a bug?
Thank you and best regards,
Đỗ Ngọc Trí*Cường*(Mr.)
*Software Development Dept.*
Mobile: +84 9 3800 3394 <tel:+84917220706>
Phone: +84 28 3715 6322 <callto:+84%208%203715%205325>
Email: dntcuong@digi-texx.vn
DIGI-TEXX | a global BPO provider
Address: Anna Building, Quang Trung Software City,
District. 12, Ho Chi Minh City, Vietnam
Website: www.digi-texx.vn <http://www.digi-texx.vn/>
//
/IMPORTANT NOTICE:/
/*This e-mail and any attachments may contain confidential and/or
privileged information. If you are not the intended recipient, please
delete it and notify the sender immediately. Any unauthorized copying,
disclosure or distribution of the material in this e-mail is strictly
forbidden./
/*Please consider the environment before printing./
--
Adrian Klaver
adrian.klaver@aklaver.com
On 04/08/2018 08:44 PM, Đỗ Ngọc Trí Cường wrote: > Dear Arian Klaver, > > I think there is a misunderstood here. > > I think that I quite understand how is the second query run. > > The question I asked here is why exporting data, which is generated by a > query "JSON_AGG" , with command "COPY". The data contain "\n" as 2 > characters instead of "new line" character. https://www.postgresql.org/docs/10/static/sql-copy.html File Formats Text Format " \n Newline (ASCII 10) Presently, COPY TO will never emit an octal or hex-digits backslash sequence, but it does use the other sequences listed above for those control characters. ... COPY TO will terminate each row with a Unix-style newline (“\n”). " > > The second query is for the old version of PostgreSQL (9.3 and previous) > cause of they don't have JSON_AGG aggregate function. Since 9.4, we have > "JSON_AGG" already. So I want to rewrite and reduce the length of the > query. But it is don't work as I expected with command COPY. > > Thank you and best regards, > > > Đỗ Ngọc Trí*Cường*(Mr.) | *Software Development Department*| +84 28 3715 > 5325 -- Adrian Klaver adrian.klaver@aklaver.com
On 04/08/2018 08:44 PM, Đỗ Ngọc Trí Cường wrote:
Presently, COPY TO will never emit an octal or hex-digits backslash sequence, but it does use the other sequences listed above for those control characters.
The second query is for the old version of PostgreSQL (9.3 and previous) cause of they don't have JSON_AGG aggregate function. Since 9.4, we have "JSON_AGG" already. So I want to rewrite and reduce the length of the query. But it is don't work as I expected with command COPY.
Actually, COPY FROM will interpret the newlines just fine. The issue here is that most other programs that injects csv are not as intelligent and will look at the \n as content as opposed to a control character. Those programs will, however, recognize an actual newline within the quoted field as being valid data: but it doesn't appear that COPY is capable outputting such.
David J.
Đỗ Ngọc Trí Cường wrote: > I want to export it to a file in JSON format so I run the query as below: > COPY (SELECT...) TO '/home/postgres/test1.json' COPY TO applies its own format rules on top of the data, which include among other things, quoting the newline characters. What you seem to expect is to copy a single-row, single-column piece of data verbatim to the output file, but COPY does not offer that. Such an option has already been discussed among developers, for instance in this thread as a 'RAW' format: https://www.postgresql.org/message-id/CAFj8pRAfer%2Bip3JCMTnvzra2QK7W9hk0hw1YqE8%2B%2BPZA1OqSJw%40mail.gmail.com but eventually it was not implemented. The unformatted output can be obtained client-side without COPY: psql -Atc "SELECT row_to_json(...)" > /path/to/file If you really need it server-side, a function that can write a result to a file is required. I don't think it exists as such out of the box, but you may either: - write your own function in any untrusted language to do just that (open file, write the piece of text into it, close it). - or see pg_file_write() from the adminpack contrib module https://www.postgresql.org/docs/current/static/adminpack.html With that, you may call, as a superuser: select pg_file_write('path/to/file.json', (select row_to_json(...))::text, false); But note that the path must be relative to the PGDATA directory. - or export the data as an intermediary large object at the cost of some data churn in the large objects. And in that case, the path is not constrained to postgres data directory. do $$ declare id oid; j json; begin j := (select row_to_json(t) from <your query here>); id := lo_from_bytea(0, convert_to(j::text, 'UTF-8')); perform lo_export(id, '/path/to/file.json'); perform lo_unlink(id); end; $$ language plpgsql; For all these server-side methods, you need to be superuser, just like for COPY TO file anyway. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite
On 04/08/2018 08:44 PM, Đỗ Ngọc Trí Cường wrote: > Dear Arian Klaver, > > I think there is a misunderstood here. > > I think that I quite understand how is the second query run. > > The question I asked here is why exporting data, which is generated by a > query "JSON_AGG" , with command "COPY". The data contain "\n" as 2 > characters instead of "new line" character. > > The second query is for the old version of PostgreSQL (9.3 and previous) > cause of they don't have JSON_AGG aggregate function. Since 9.4, we have > "JSON_AGG" already. So I want to rewrite and reduce the length of the > query. But it is don't work as I expected with command COPY. COPY (select json_build_object('RECORDS', ARRAY_AGG(t) ) from test_table t) TO '/home/postgres/test2.json'; cat test2.json {"RECORDS" : [{"id":1,"username":"john","fullname":"John"},{"id":2,"username":"anna","fullname":"Anna"},{"id":3,"username":"sussi","fullname":"Sussi"},{"id":4,"username":"david","fullname":"David Beckham"},{"id":5,"username":"wayne","fullname":"Wayne Rooney"}]} > > Thank you and best regards, > > > Đỗ Ngọc Trí*Cường*(Mr.) | *Software Development Department*| +84 28 3715 > 5325 -- Adrian Klaver adrian.klaver@aklaver.com