Re: Make COPY format extendable: Extract COPY TO format implementations - Mailing list pgsql-hackers

From Sutou Kouhei
Subject Re: Make COPY format extendable: Extract COPY TO format implementations
Date
Msg-id 20250814.153654.91221343186154858.kou@clear-code.com
Whole thread Raw
In response to Re: Make COPY format extendable: Extract COPY TO format implementations  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
Hi,

In <CAD21AoBa0Wm3C2H12jaqkvLidP2zEhsC+gf=3w7XiA4LQnvx0g@mail.gmail.com>
  "Re: Make COPY format extendable: Extract COPY TO format implementations" on Mon, 28 Jul 2025 22:19:36 -0700,
  Masahiko Sawada <sawada.mshk@gmail.com> wrote:

> The fields in 1 are mostly static fields, and the fields in 2 and 3
> are likely to be accessed in hot functions during COPY FROM. Would it
> be a good idea to restructure these fields so that we can hide the
> fields in 1 from callback functions and having the fields in 3 in a
> separate format-specific struct that can be accessed via an opaque
> pointer? But could the latter change potentially cause performance
> overheads?

Yes. It changes memory layout (1 continuous memory chunk ->
2 separated memory chunks) and introduces indirect member
accesses (x->y -> x->z->y). They may not have performance
impact but we need to measure it if we want to use this
approach.


BTW, how about the following approach?

copyapi.h:

typedef struct CopyToStateData
{
    /* public members */
    /* ... */
} CopyToStateData;

copyto.c:

typedef struct CopyToStateInternalData
{
    CopyToStateData parent;

    /* private members */
    /* ... */
} CopyToStateInternalData;

We export CopyToStateData only with public members. We don't
export CopyToStateInternalData that has members only for
built-in formats.

CopyToStateInternalData has the same memory layout as
CopyToStateData. So we can use CopyToStateInternalData as
CopyToStateData.

We use CopyToStateData not CopyToStateInternalData in public
API. We cast CopyToStateData to CopyToStateInternalData when
we need to use private members:

static void
CopySendData(CopyToState cstate, const void *databuf, int datasize)
{
    CopyToStateInternal cstate_internal = (CopyToStateInternal) cstate;
    appendBinaryStringInfo(cstate_internal->fe_msgbuf, databuf, datasize);
}

It's still direct member access.


With this approach, we can keep the same memory layout (1
continuous memory chunk) and direct member access. I think
that this approach doesn't have performance impact.

See the attached patch for PoC of this approach.

Drawback: This approach always allocates
CopyToStateInternalData not CopyToStateData. So we need to
allocate needless memories for extensions. But this will
prevent performance regression of built-in formats. Is it
acceptable?


Thanks,
-- 
kou
From 4f1ee841677774cdc36091066020674d300714db Mon Sep 17 00:00:00 2001
From: Sutou Kouhei <kou@clear-code.com>
Date: Thu, 14 Aug 2025 15:21:50 +0900
Subject: [PATCH] Split CopyToStateData to CopyToStateData and
 CopyToStateInternalData

---
 src/backend/commands/copyto.c  | 133 +++++++++++++++++----------------
 src/include/commands/copyapi.h |  21 ++++++
 2 files changed, 91 insertions(+), 63 deletions(-)

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 67b94b91cae..8c58cb18a4d 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -21,7 +21,6 @@
 #include "access/tableam.h"
 #include "commands/copyapi.h"
 #include "commands/progress.h"
-#include "executor/execdesc.h"
 #include "executor/executor.h"
 #include "executor/tuptable.h"
 #include "libpq/libpq.h"
@@ -62,40 +61,27 @@ typedef enum CopyDest
  * invoke pg_encoding_mblen() to skip over them. encoding_embeds_ascii is true
  * when we have to do it the hard way.
  */
-typedef struct CopyToStateData
+typedef struct CopyToStateInternalData
 {
-    /* format-specific routines */
-    const CopyToRoutine *routine;
+    struct CopyToStateData parent;
 
     /* low-level state data */
     CopyDest    copy_dest;        /* type of copy source/destination */
     FILE       *copy_file;        /* used if copy_dest == COPY_FILE */
     StringInfo    fe_msgbuf;        /* used for all dests during COPY TO */
 
-    int            file_encoding;    /* file or remote side's character encoding */
     bool        need_transcoding;    /* file encoding diff from server? */
     bool        encoding_embeds_ascii;    /* ASCII can be non-first byte? */
 
-    /* parameters from the COPY command */
-    Relation    rel;            /* relation to copy to */
-    QueryDesc  *queryDesc;        /* executable query to copy from */
-    List       *attnumlist;        /* integer list of attnums to copy */
-    char       *filename;        /* filename, or NULL for STDOUT */
-    bool        is_program;        /* is 'filename' a program to popen? */
-    copy_data_dest_cb data_dest_cb; /* function for writing data */
-
-    CopyFormatOptions opts;
-    Node       *whereClause;    /* WHERE condition (or NULL) */
-
     /*
      * Working state
      */
     MemoryContext copycontext;    /* per-copy execution context */
 
-    FmgrInfo   *out_functions;    /* lookup info for output functions */
     MemoryContext rowcontext;    /* per-row evaluation context */
     uint64        bytes_processed;    /* number of bytes processed so far */
-} CopyToStateData;
+}            CopyToStateInternalData;
+typedef struct CopyToStateInternalData *CopyToStateInternal;
 
 /* DestReceiver for COPY (query) TO */
 typedef struct
@@ -189,11 +175,13 @@ CopyToGetRoutine(const CopyFormatOptions *opts)
 static void
 CopyToTextLikeStart(CopyToState cstate, TupleDesc tupDesc)
 {
+    CopyToStateInternal cstate_internal = (CopyToStateInternal) cstate;
+
     /*
      * For non-binary copy, we need to convert null_print to file encoding,
      * because it will be sent directly with CopySendString.
      */
-    if (cstate->need_transcoding)
+    if (cstate_internal->need_transcoding)
         cstate->opts.null_print_client = pg_server_to_any(cstate->opts.null_print,
                                                           cstate->opts.null_print_len,
                                                           cstate->file_encoding);
@@ -390,6 +378,7 @@ CopyToBinaryEnd(CopyToState cstate)
 static void
 SendCopyBegin(CopyToState cstate)
 {
+    CopyToStateInternal cstate_internal = (CopyToStateInternal) cstate;
     StringInfoData buf;
     int            natts = list_length(cstate->attnumlist);
     int16        format = (cstate->opts.binary ? 1 : 0);
@@ -401,14 +390,14 @@ SendCopyBegin(CopyToState cstate)
     for (i = 0; i < natts; i++)
         pq_sendint16(&buf, format); /* per-column formats */
     pq_endmessage(&buf);
-    cstate->copy_dest = COPY_FRONTEND;
+    cstate_internal->copy_dest = COPY_FRONTEND;
 }
 
 static void
 SendCopyEnd(CopyToState cstate)
 {
     /* Shouldn't have any unsent data */
-    Assert(cstate->fe_msgbuf->len == 0);
+    Assert(((CopyToStateInternal) cstate)->fe_msgbuf->len == 0);
     /* Send Copy Done message */
     pq_putemptymessage(PqMsg_CopyDone);
 }
@@ -426,32 +415,39 @@ SendCopyEnd(CopyToState cstate)
 static void
 CopySendData(CopyToState cstate, const void *databuf, int datasize)
 {
-    appendBinaryStringInfo(cstate->fe_msgbuf, databuf, datasize);
+    CopyToStateInternal cstate_internal = (CopyToStateInternal) cstate;
+
+    appendBinaryStringInfo(cstate_internal->fe_msgbuf, databuf, datasize);
 }
 
 static void
 CopySendString(CopyToState cstate, const char *str)
 {
-    appendBinaryStringInfo(cstate->fe_msgbuf, str, strlen(str));
+    CopyToStateInternal cstate_internal = (CopyToStateInternal) cstate;
+
+    appendBinaryStringInfo(cstate_internal->fe_msgbuf, str, strlen(str));
 }
 
 static void
 CopySendChar(CopyToState cstate, char c)
 {
-    appendStringInfoCharMacro(cstate->fe_msgbuf, c);
+    CopyToStateInternal cstate_internal = (CopyToStateInternal) cstate;
+
+    appendStringInfoCharMacro(cstate_internal->fe_msgbuf, c);
 }
 
 static void
 CopySendEndOfRow(CopyToState cstate)
 {
-    StringInfo    fe_msgbuf = cstate->fe_msgbuf;
+    CopyToStateInternal cstate_internal = (CopyToStateInternal) cstate;
+    StringInfo    fe_msgbuf = cstate_internal->fe_msgbuf;
 
-    switch (cstate->copy_dest)
+    switch (cstate_internal->copy_dest)
     {
         case COPY_FILE:
             if (fwrite(fe_msgbuf->data, fe_msgbuf->len, 1,
-                       cstate->copy_file) != 1 ||
-                ferror(cstate->copy_file))
+                       cstate_internal->copy_file) != 1 ||
+                ferror(cstate_internal->copy_file))
             {
                 if (cstate->is_program)
                 {
@@ -492,8 +488,8 @@ CopySendEndOfRow(CopyToState cstate)
     }
 
     /* Update the progress */
-    cstate->bytes_processed += fe_msgbuf->len;
-    pgstat_progress_update_param(PROGRESS_COPY_BYTES_PROCESSED, cstate->bytes_processed);
+    cstate_internal->bytes_processed += fe_msgbuf->len;
+    pgstat_progress_update_param(PROGRESS_COPY_BYTES_PROCESSED, cstate_internal->bytes_processed);
 
     resetStringInfo(fe_msgbuf);
 }
@@ -505,7 +501,9 @@ CopySendEndOfRow(CopyToState cstate)
 static inline void
 CopySendTextLikeEndOfRow(CopyToState cstate)
 {
-    switch (cstate->copy_dest)
+    CopyToStateInternal cstate_internal = (CopyToStateInternal) cstate;
+
+    switch (cstate_internal->copy_dest)
     {
         case COPY_FILE:
             /* Default line termination depends on platform */
@@ -561,11 +559,12 @@ CopySendInt16(CopyToState cstate, int16 val)
 static void
 ClosePipeToProgram(CopyToState cstate)
 {
+    CopyToStateInternal cstate_internal = (CopyToStateInternal) cstate;
     int            pclose_rc;
 
     Assert(cstate->is_program);
 
-    pclose_rc = ClosePipeStream(cstate->copy_file);
+    pclose_rc = ClosePipeStream(cstate_internal->copy_file);
     if (pclose_rc == -1)
         ereport(ERROR,
                 (errcode_for_file_access(),
@@ -586,13 +585,15 @@ ClosePipeToProgram(CopyToState cstate)
 static void
 EndCopy(CopyToState cstate)
 {
+    CopyToStateInternal cstate_internal = (CopyToStateInternal) cstate;
+
     if (cstate->is_program)
     {
         ClosePipeToProgram(cstate);
     }
     else
     {
-        if (cstate->filename != NULL && FreeFile(cstate->copy_file))
+        if (cstate->filename != NULL && FreeFile(cstate_internal->copy_file))
             ereport(ERROR,
                     (errcode_for_file_access(),
                      errmsg("could not close file \"%s\": %m",
@@ -601,7 +602,7 @@ EndCopy(CopyToState cstate)
 
     pgstat_progress_end_command();
 
-    MemoryContextDelete(cstate->copycontext);
+    MemoryContextDelete(cstate_internal->copycontext);
     pfree(cstate);
 }
 
@@ -631,6 +632,7 @@ BeginCopyTo(ParseState *pstate,
             List *options)
 {
     CopyToState cstate;
+    CopyToStateInternal cstate_internal;
     bool        pipe = (filename == NULL && data_dest_cb == NULL);
     TupleDesc    tupDesc;
     int            num_phys_attrs;
@@ -687,17 +689,18 @@ BeginCopyTo(ParseState *pstate,
 
 
     /* Allocate workspace and zero all fields */
-    cstate = (CopyToStateData *) palloc0(sizeof(CopyToStateData));
+    cstate_internal = (CopyToStateInternal) palloc0(sizeof(CopyToStateInternalData));
+    cstate = (CopyToState) cstate_internal;
 
     /*
      * We allocate everything used by a cstate in a new memory context. This
      * avoids memory leaks during repeated use of COPY in a query.
      */
-    cstate->copycontext = AllocSetContextCreate(CurrentMemoryContext,
-                                                "COPY",
-                                                ALLOCSET_DEFAULT_SIZES);
+    cstate_internal->copycontext = AllocSetContextCreate(CurrentMemoryContext,
+                                                         "COPY",
+                                                         ALLOCSET_DEFAULT_SIZES);
 
-    oldcontext = MemoryContextSwitchTo(cstate->copycontext);
+    oldcontext = MemoryContextSwitchTo(cstate_internal->copycontext);
 
     /* Extract options from the statement node tree */
     ProcessCopyOptions(pstate, &cstate->opts, false /* is_from */ , options);
@@ -895,19 +898,19 @@ BeginCopyTo(ParseState *pstate,
      */
     if (cstate->file_encoding == GetDatabaseEncoding() ||
         cstate->file_encoding == PG_SQL_ASCII)
-        cstate->need_transcoding = false;
+        cstate_internal->need_transcoding = false;
     else
-        cstate->need_transcoding = true;
+        cstate_internal->need_transcoding = true;
 
     /* See Multibyte encoding comment above */
-    cstate->encoding_embeds_ascii = PG_ENCODING_IS_CLIENT_ONLY(cstate->file_encoding);
+    cstate_internal->encoding_embeds_ascii = PG_ENCODING_IS_CLIENT_ONLY(cstate->file_encoding);
 
-    cstate->copy_dest = COPY_FILE;    /* default */
+    cstate_internal->copy_dest = COPY_FILE; /* default */
 
     if (data_dest_cb)
     {
         progress_vals[1] = PROGRESS_COPY_TYPE_CALLBACK;
-        cstate->copy_dest = COPY_CALLBACK;
+        cstate_internal->copy_dest = COPY_CALLBACK;
         cstate->data_dest_cb = data_dest_cb;
     }
     else if (pipe)
@@ -916,7 +919,7 @@ BeginCopyTo(ParseState *pstate,
 
         Assert(!is_program);    /* the grammar does not allow this */
         if (whereToSendOutput != DestRemote)
-            cstate->copy_file = stdout;
+            cstate_internal->copy_file = stdout;
     }
     else
     {
@@ -926,8 +929,8 @@ BeginCopyTo(ParseState *pstate,
         if (is_program)
         {
             progress_vals[1] = PROGRESS_COPY_TYPE_PROGRAM;
-            cstate->copy_file = OpenPipeStream(cstate->filename, PG_BINARY_W);
-            if (cstate->copy_file == NULL)
+            cstate_internal->copy_file = OpenPipeStream(cstate->filename, PG_BINARY_W);
+            if (cstate_internal->copy_file == NULL)
                 ereport(ERROR,
                         (errcode_for_file_access(),
                          errmsg("could not execute command \"%s\": %m",
@@ -952,14 +955,14 @@ BeginCopyTo(ParseState *pstate,
             oumask = umask(S_IWGRP | S_IWOTH);
             PG_TRY();
             {
-                cstate->copy_file = AllocateFile(cstate->filename, PG_BINARY_W);
+                cstate_internal->copy_file = AllocateFile(cstate->filename, PG_BINARY_W);
             }
             PG_FINALLY();
             {
                 umask(oumask);
             }
             PG_END_TRY();
-            if (cstate->copy_file == NULL)
+            if (cstate_internal->copy_file == NULL)
             {
                 /* copy errno because ereport subfunctions might change it */
                 int            save_errno = errno;
@@ -973,7 +976,7 @@ BeginCopyTo(ParseState *pstate,
                                  "You may want a client-side facility such as psql's \\copy.") : 0));
             }
 
-            if (fstat(fileno(cstate->copy_file), &st))
+            if (fstat(fileno(cstate_internal->copy_file), &st))
                 ereport(ERROR,
                         (errcode_for_file_access(),
                          errmsg("could not stat file \"%s\": %m",
@@ -991,7 +994,7 @@ BeginCopyTo(ParseState *pstate,
                                   cstate->rel ? RelationGetRelid(cstate->rel) : InvalidOid);
     pgstat_progress_update_multi_param(2, progress_cols, progress_vals);
 
-    cstate->bytes_processed = 0;
+    cstate_internal->bytes_processed = 0;
 
     MemoryContextSwitchTo(oldcontext);
 
@@ -1025,6 +1028,7 @@ EndCopyTo(CopyToState cstate)
 uint64
 DoCopyTo(CopyToState cstate)
 {
+    CopyToStateInternal cstate_internal = (CopyToStateInternal) cstate;
     bool        pipe = (cstate->filename == NULL && cstate->data_dest_cb == NULL);
     bool        fe_copy = (pipe && whereToSendOutput == DestRemote);
     TupleDesc    tupDesc;
@@ -1043,7 +1047,7 @@ DoCopyTo(CopyToState cstate)
     cstate->opts.null_print_client = cstate->opts.null_print;    /* default */
 
     /* We use fe_msgbuf as a per-row buffer regardless of copy_dest */
-    cstate->fe_msgbuf = makeStringInfo();
+    cstate_internal->fe_msgbuf = makeStringInfo();
 
     /* Get info about the columns we need to process. */
     cstate->out_functions = (FmgrInfo *) palloc(num_phys_attrs * sizeof(FmgrInfo));
@@ -1062,9 +1066,9 @@ DoCopyTo(CopyToState cstate)
      * datatype output routines, and should be faster than retail pfree's
      * anyway.  (We don't need a whole econtext as CopyFrom does.)
      */
-    cstate->rowcontext = AllocSetContextCreate(CurrentMemoryContext,
-                                               "COPY TO",
-                                               ALLOCSET_DEFAULT_SIZES);
+    cstate_internal->rowcontext = AllocSetContextCreate(CurrentMemoryContext,
+                                                        "COPY TO",
+                                                        ALLOCSET_DEFAULT_SIZES);
 
     cstate->routine->CopyToStart(cstate, tupDesc);
 
@@ -1107,7 +1111,7 @@ DoCopyTo(CopyToState cstate)
 
     cstate->routine->CopyToEnd(cstate);
 
-    MemoryContextDelete(cstate->rowcontext);
+    MemoryContextDelete(cstate_internal->rowcontext);
 
     if (fe_copy)
         SendCopyEnd(cstate);
@@ -1121,10 +1125,11 @@ DoCopyTo(CopyToState cstate)
 static inline void
 CopyOneRowTo(CopyToState cstate, TupleTableSlot *slot)
 {
+    CopyToStateInternal cstate_internal = (CopyToStateInternal) cstate;
     MemoryContext oldcontext;
 
-    MemoryContextReset(cstate->rowcontext);
-    oldcontext = MemoryContextSwitchTo(cstate->rowcontext);
+    MemoryContextReset(cstate_internal->rowcontext);
+    oldcontext = MemoryContextSwitchTo(cstate_internal->rowcontext);
 
     /* Make sure the tuple is fully deconstructed */
     slot_getallattrs(slot);
@@ -1146,12 +1151,13 @@ CopyOneRowTo(CopyToState cstate, TupleTableSlot *slot)
 static void
 CopyAttributeOutText(CopyToState cstate, const char *string)
 {
+    CopyToStateInternal cstate_internal = (CopyToStateInternal) cstate;
     const char *ptr;
     const char *start;
     char        c;
     char        delimc = cstate->opts.delim[0];
 
-    if (cstate->need_transcoding)
+    if (cstate_internal->need_transcoding)
         ptr = pg_server_to_any(string, strlen(string), cstate->file_encoding);
     else
         ptr = string;
@@ -1170,7 +1176,7 @@ CopyAttributeOutText(CopyToState cstate, const char *string)
      * it's worth making two copies of it to get the IS_HIGHBIT_SET() test out
      * of the normal safe-encoding path.
      */
-    if (cstate->encoding_embeds_ascii)
+    if (cstate_internal->encoding_embeds_ascii)
     {
         start = ptr;
         while ((c = *ptr) != '\0')
@@ -1300,6 +1306,7 @@ static void
 CopyAttributeOutCSV(CopyToState cstate, const char *string,
                     bool use_quote)
 {
+    CopyToStateInternal cstate_internal = (CopyToStateInternal) cstate;
     const char *ptr;
     const char *start;
     char        c;
@@ -1312,7 +1319,7 @@ CopyAttributeOutCSV(CopyToState cstate, const char *string,
     if (!use_quote && strcmp(string, cstate->opts.null_print) == 0)
         use_quote = true;
 
-    if (cstate->need_transcoding)
+    if (cstate_internal->need_transcoding)
         ptr = pg_server_to_any(string, strlen(string), cstate->file_encoding);
     else
         ptr = string;
@@ -1342,7 +1349,7 @@ CopyAttributeOutCSV(CopyToState cstate, const char *string,
                     use_quote = true;
                     break;
                 }
-                if (IS_HIGHBIT_SET(c) && cstate->encoding_embeds_ascii)
+                if (IS_HIGHBIT_SET(c) && cstate_internal->encoding_embeds_ascii)
                     tptr += pg_encoding_mblen(cstate->file_encoding, tptr);
                 else
                     tptr++;
@@ -1366,7 +1373,7 @@ CopyAttributeOutCSV(CopyToState cstate, const char *string,
                 CopySendChar(cstate, escapec);
                 start = ptr;    /* we include char in next run */
             }
-            if (IS_HIGHBIT_SET(c) && cstate->encoding_embeds_ascii)
+            if (IS_HIGHBIT_SET(c) && cstate_internal->encoding_embeds_ascii)
                 ptr += pg_encoding_mblen(cstate->file_encoding, ptr);
             else
                 ptr++;
diff --git a/src/include/commands/copyapi.h b/src/include/commands/copyapi.h
index 2a2d2f9876b..b818e13cd1b 100644
--- a/src/include/commands/copyapi.h
+++ b/src/include/commands/copyapi.h
@@ -15,6 +15,7 @@
 #define COPYAPI_H
 
 #include "commands/copy.h"
+#include "executor/execdesc.h"
 
 /*
  * API structure for a COPY TO format implementation. Note this must be
@@ -54,6 +55,26 @@ typedef struct CopyToRoutine
     void        (*CopyToEnd) (CopyToState cstate);
 } CopyToRoutine;
 
+typedef struct CopyToStateData
+{
+    /* format-specific routines */
+    const CopyToRoutine *routine;
+
+    /* parameters from the COPY command */
+    Relation    rel;            /* relation to copy to */
+    QueryDesc  *queryDesc;        /* executable query to copy from */
+    List       *attnumlist;        /* integer list of attnums to copy */
+    char       *filename;        /* filename, or NULL for STDOUT */
+    bool        is_program;        /* is 'filename' a program to popen? */
+    copy_data_dest_cb data_dest_cb; /* function for writing data */
+
+    int            file_encoding;    /* file or remote side's character encoding */
+
+    CopyFormatOptions opts;
+
+    FmgrInfo   *out_functions;    /* lookup info for output functions */
+} CopyToStateData;
+
 /*
  * API structure for a COPY FROM format implementation. Note this must be
  * allocated in a server-lifetime manner, typically as a static const struct.
-- 
2.50.0


pgsql-hackers by date:

Previous
From: Kirill Reshke
Date:
Subject: Re: VM corruption on standby
Next
From: "赵宇鹏(宇彭)"
Date:
Subject: memory leak in logical WAL sender with pgoutput's cachectx