Re: COPY JSON: use trailing commas in FORCE_ARRAY output - Mailing list pgsql-hackers

From Chao Li
Subject Re: COPY JSON: use trailing commas in FORCE_ARRAY output
Date
Msg-id F08C3C8E-9769-4A96-B392-DA93E11AC3A5@gmail.com
Whole thread
In response to Re: COPY JSON: use trailing commas in FORCE_ARRAY output  (Daniel Gustafsson <daniel@yesql.se>)
List pgsql-hackers

> On May 6, 2026, at 16:23, Daniel Gustafsson <daniel@yesql.se> wrote:
>
>> On 6 May 2026, at 08:40, Chao Li <li.evan.chao@gmail.com> wrote:
>
>> I was surprised by the comma placement. It is valid JSON, but it looks quite uncommon.
>
> It might look uncommon, but for very wide lines it's IMHO preferrable to not
> have to scroll all the way to the end of the line to know that the line is part
> of an array.
>
>> For comparison, the existing json_agg() places commas at the end of the line:
>
> That's true, but json_agg() and COPY TO in ndjson format have different use
> cases.
>
>> ..it should not have any performance impact.
>
> It does add branches though, and in one branch use a non-inlined function where
> previously it would unconditionally use an inline function.  ISTM it would
> still be valuable to do performance testing given that COPY is commonly used in
> performance sensitive settings.
>

Make sense. I just did a test to compare the performance between master and the patch:

For the data setup, since the patch only changes where the comma is emitted, I intentionally used a table with only one
column,to minimize the cost of formatting each row: 
```
DROP TABLE IF EXISTS copy_json_force_array_perf;

CREATE UNLOGGED TABLE copy_json_force_array_perf(id int);

INSERT INTO copy_json_force_array_perf
SELECT g
FROM generate_series(1, 10000000) AS g;

VACUUM ANALYZE copy_json_force_array_perf;

\timing on
```

On master:
```
evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY);
COPY 10000000
Time: 1208.694 ms (00:01.209)
evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY);
COPY 10000000
Time: 1200.203 ms (00:01.200)
evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY);
COPY 10000000
Time: 1238.639 ms (00:01.239)
evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY);
COPY 10000000
Time: 1211.344 ms (00:01.211)
evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY);
COPY 10000000
Time: 1252.197 ms (00:01.252)
evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY);
COPY 10000000
Time: 1223.510 ms (00:01.224)
evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY);
COPY 10000000
Time: 1212.378 ms (00:01.212)
```
Average: ~1221 ms

With the patch:
```
evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY);
COPY 10000000
Time: 1218.580 ms (00:01.219)
evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY);
COPY 10000000
Time: 1212.913 ms (00:01.213)
evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY);
COPY 10000000
Time: 1204.350 ms (00:01.204)
evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY);
COPY 10000000
Time: 1205.276 ms (00:01.205)
evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY);
COPY 10000000
Time: 1202.088 ms (00:01.202)
evantest=# COPY copy_json_force_array_perf TO '/dev/null' WITH (FORMAT json, FORCE_ARRAY);
COPY 10000000
Time: 1222.390 ms (00:01.222)
```
Average: ~1211 ms

The difference doesn't look quite meaningful from this test. I built with debug and asserts disabled, and compiled with
-O2.The output was written to /dev/null to avoid client/network overhead. The tests ran on my MacBook M4. 

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/







pgsql-hackers by date:

Previous
From: Zhongpu Chen
Date:
Subject: Re: Proposal: tighten validation for legacy EUC encodings or document that accepted byte sequences may be unconvertible to UTF8
Next
From: Kirill Reshke
Date:
Subject: Re: support create index on virtual generated column.