Re: Emitting JSON to file using COPY TO - Mailing list pgsql-hackers

From Andrew Dunstan
Subject Re: Emitting JSON to file using COPY TO
Date
Msg-id 964a3377-d957-4a06-ad02-8228aefa65fc@dunslane.net
Whole thread Raw
In response to Re: Emitting JSON to file using COPY TO  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: Emitting JSON to file using COPY TO
List pgsql-hackers


On 2026-03-16 Mo 2:24 PM, Masahiko Sawada wrote:
On Sun, Mar 8, 2026 at 8:49 PM jian he <jian.universality@gmail.com> wrote:
On Mon, Mar 9, 2026 at 3:44 AM Andrew Dunstan <andrew@dunslane.net> wrote:
Hmm. But should we be scribbling on slot->tts_tupleDescriptor like that?
How about something like this?:

-        * Full table or query without column list.  Ensure the slot uses
-        * cstate->tupDesc so that the datum is stamped with the right type;
-        * for queries output type is RECORDOID this must be the blessed
-        * descriptor so that composite_to_json can look it up via
-        * lookup_rowtype_tupdesc.
+        * Full table or query without column list.  For queries, the slot's
+        * TupleDesc may carry RECORDOID, which is not registered in the
type
+        * cache and would cause composite_to_json's lookup_rowtype_tupdesc
+        * call to fail.  Build a HeapTuple stamped with the blessed
+        * descriptor so the type can be looked up correctly.          */         if (!cstate->rel && slot->tts_tupleDescriptor->tdtypeid ==
RECORDOID)
-           slot->tts_tupleDescriptor = cstate->queryDesc->tupDesc;
+       {
+           HeapTuple   tup;

-       rowdata = ExecFetchSlotHeapTupleDatum(slot);
+           tup = heap_form_tuple(cstate->tupDesc,
+                                 slot->tts_values,
+                                 slot->tts_isnull);
+           rowdata = HeapTupleGetDatum(tup);
+       }
+       else
+       {
+           rowdata = ExecFetchSlotHeapTupleDatum(slot);
+       }

This is better. I've tried to get rid of json_projvalues and json_projnulls.
Just using heap_form_tuple, but it won't work.

I incorporated the v28-0004 COPY column list into v9-0002.
With this patch set, we added four fields to the struct CopyToStateData.

+    StringInfo    json_buf;        /* reusable buffer for JSON output,
+                                 * initialized in BeginCopyTo */
+    TupleDesc    tupDesc;        /* Descriptor for JSON output; for a column
+                                 * list this is a projected descriptor */
+    Datum       *json_projvalues;    /* pre-allocated projection values, or
+                                     * NULL */
+    bool       *json_projnulls; /* pre-allocated projection nulls, or NULL */

Using the script in
https://www.postgresql.org/message-id/CACJufxFFZqxC3p4WjpTEi4riaJm%3DpADX%2Bpy0yQ0%3DRWTn5cqK3Q%40mail.gmail.com
I tested it again on macOS and Linux, and there are no regressions for
COPY TO with the TEXT and CSV formats.
I've reviewed the patch and have some comments:

---
I got a SEGV in the following scenario:

postgres(1:1197708)=# create table test (a int, b text, c jsonb);
CREATE TABLE
postgres(1:1197708)=# copy test(a, b) to stdout with (format 'json' );
TRAP: failed Assert("tupdesc->firstNonCachedOffsetAttr >= 0"), File:
"execTuples.c", Line: 2328, PID: 1197708
postgres: masahiko postgres [local] COPY(ExceptionalCondition+0x9e) [0xbebe48]
postgres: masahiko postgres [local] COPY(BlessTupleDesc+0x2b) [0x729b50]
postgres: masahiko postgres [local] COPY(BeginCopyTo+0xc94) [0x637bdf]
postgres: masahiko postgres [local] COPY(DoCopy+0xb68) [0x62afbc]
postgres: masahiko postgres [local]
COPY(standard_ProcessUtility+0xa22) [0xa0ba48]
postgres: masahiko postgres [local] COPY(ProcessUtility+0x10e) [0xa0b01f]
postgres: masahiko postgres [local] COPY() [0xa09872]
postgres: masahiko postgres [local] COPY() [0xa09acf]
postgres: masahiko postgres [local] COPY(PortalRun+0x2c8) [0xa0901d]
postgres: masahiko postgres [local] COPY() [0xa02055]
postgres: masahiko postgres [local] COPY(PostgresMain+0xaf1) [0xa0724e]
postgres: masahiko postgres [local] COPY() [0x9fdab9]
postgres: masahiko postgres [local]
COPY(postmaster_child_launch+0x165) [0x905378]
postgres: masahiko postgres [local] COPY() [0x90b600]
postgres: masahiko postgres [local] COPY() [0x908e6a]
postgres: masahiko postgres [local] COPY(PostmasterMain+0x14fe) [0x90880c]
postgres: masahiko postgres [local] COPY(main+0x340) [0x7a1f9c]

It seems to forget to call TupleDescFinalize(). And I think we need
some regression tests for this case.

---
+       if (cstate->opts.format == COPY_FORMAT_JSON)
+       {
+               /*
+                * If FORCE_ARRAY has been specified, send the opening bracket.
+                */
+               if (cstate->opts.force_array)
+               {
+                       CopySendChar(cstate, '[');
+                       CopySendTextLikeEndOfRow(cstate);
+               }
+       }

We can conjunct the two if statement conditions.


Here's a v30 set that I hope fixes these issues.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Attachment

pgsql-hackers by date:

Previous
From: Daniil Davydov
Date:
Subject: Re: POC: Parallel processing of indexes in autovacuum
Next
From: Andrei Lepikhov
Date:
Subject: Re: Read-only connection mode for AI workflows.