Re: Emitting JSON to file using COPY TO - Mailing list pgsql-hackers
| From | Andrew Dunstan |
|---|---|
| Subject | Re: Emitting JSON to file using COPY TO |
| Date | |
| Msg-id | 964a3377-d957-4a06-ad02-8228aefa65fc@dunslane.net Whole thread Raw |
| In response to | Re: Emitting JSON to file using COPY TO (Masahiko Sawada <sawada.mshk@gmail.com>) |
| Responses |
Re: Emitting JSON to file using COPY TO
|
| List | pgsql-hackers |
On 2026-03-16 Mo 2:24 PM, Masahiko Sawada wrote:
On Sun, Mar 8, 2026 at 8:49 PM jian he <jian.universality@gmail.com> wrote:On Mon, Mar 9, 2026 at 3:44 AM Andrew Dunstan <andrew@dunslane.net> wrote:Hmm. But should we be scribbling on slot->tts_tupleDescriptor like that? How about something like this?: - * Full table or query without column list. Ensure the slot uses - * cstate->tupDesc so that the datum is stamped with the right type; - * for queries output type is RECORDOID this must be the blessed - * descriptor so that composite_to_json can look it up via - * lookup_rowtype_tupdesc. + * Full table or query without column list. For queries, the slot's + * TupleDesc may carry RECORDOID, which is not registered in the type + * cache and would cause composite_to_json's lookup_rowtype_tupdesc + * call to fail. Build a HeapTuple stamped with the blessed + * descriptor so the type can be looked up correctly. */ if (!cstate->rel && slot->tts_tupleDescriptor->tdtypeid == RECORDOID) - slot->tts_tupleDescriptor = cstate->queryDesc->tupDesc; + { + HeapTuple tup; - rowdata = ExecFetchSlotHeapTupleDatum(slot); + tup = heap_form_tuple(cstate->tupDesc, + slot->tts_values, + slot->tts_isnull); + rowdata = HeapTupleGetDatum(tup); + } + else + { + rowdata = ExecFetchSlotHeapTupleDatum(slot); + }This is better. I've tried to get rid of json_projvalues and json_projnulls. Just using heap_form_tuple, but it won't work. I incorporated the v28-0004 COPY column list into v9-0002. With this patch set, we added four fields to the struct CopyToStateData. + StringInfo json_buf; /* reusable buffer for JSON output, + * initialized in BeginCopyTo */ + TupleDesc tupDesc; /* Descriptor for JSON output; for a column + * list this is a projected descriptor */ + Datum *json_projvalues; /* pre-allocated projection values, or + * NULL */ + bool *json_projnulls; /* pre-allocated projection nulls, or NULL */ Using the script in https://www.postgresql.org/message-id/CACJufxFFZqxC3p4WjpTEi4riaJm%3DpADX%2Bpy0yQ0%3DRWTn5cqK3Q%40mail.gmail.com I tested it again on macOS and Linux, and there are no regressions for COPY TO with the TEXT and CSV formats.I've reviewed the patch and have some comments: --- I got a SEGV in the following scenario: postgres(1:1197708)=# create table test (a int, b text, c jsonb); CREATE TABLE postgres(1:1197708)=# copy test(a, b) to stdout with (format 'json' ); TRAP: failed Assert("tupdesc->firstNonCachedOffsetAttr >= 0"), File: "execTuples.c", Line: 2328, PID: 1197708 postgres: masahiko postgres [local] COPY(ExceptionalCondition+0x9e) [0xbebe48] postgres: masahiko postgres [local] COPY(BlessTupleDesc+0x2b) [0x729b50] postgres: masahiko postgres [local] COPY(BeginCopyTo+0xc94) [0x637bdf] postgres: masahiko postgres [local] COPY(DoCopy+0xb68) [0x62afbc] postgres: masahiko postgres [local] COPY(standard_ProcessUtility+0xa22) [0xa0ba48] postgres: masahiko postgres [local] COPY(ProcessUtility+0x10e) [0xa0b01f] postgres: masahiko postgres [local] COPY() [0xa09872] postgres: masahiko postgres [local] COPY() [0xa09acf] postgres: masahiko postgres [local] COPY(PortalRun+0x2c8) [0xa0901d] postgres: masahiko postgres [local] COPY() [0xa02055] postgres: masahiko postgres [local] COPY(PostgresMain+0xaf1) [0xa0724e] postgres: masahiko postgres [local] COPY() [0x9fdab9] postgres: masahiko postgres [local] COPY(postmaster_child_launch+0x165) [0x905378] postgres: masahiko postgres [local] COPY() [0x90b600] postgres: masahiko postgres [local] COPY() [0x908e6a] postgres: masahiko postgres [local] COPY(PostmasterMain+0x14fe) [0x90880c] postgres: masahiko postgres [local] COPY(main+0x340) [0x7a1f9c] It seems to forget to call TupleDescFinalize(). And I think we need some regression tests for this case. --- + if (cstate->opts.format == COPY_FORMAT_JSON) + { + /* + * If FORCE_ARRAY has been specified, send the opening bracket. + */ + if (cstate->opts.force_array) + { + CopySendChar(cstate, '['); + CopySendTextLikeEndOfRow(cstate); + } + } We can conjunct the two if statement conditions.
Here's a v30 set that I hope fixes these issues.
cheers
andrew
-- Andrew Dunstan EDB: https://www.enterprisedb.com
Attachment
pgsql-hackers by date: