Re: Make COPY format extendable: Extract COPY TO format implementations - Mailing list pgsql-hackers

From David G. Johnston
Subject Re: Make COPY format extendable: Extract COPY TO format implementations
Msg-id CAKFQuwaMAFMHqxDXR=SxA0mDjdmntrwxZd2w=nSruLNFH-OzLw@mail.gmail.com
In response to Re: Make COPY format extendable: Extract COPY TO format implementations  (Sutou Kouhei <kou@clear-code.com>)
Responses Re: Make COPY format extendable: Extract COPY TO format implementations
List pgsql-hackers
On Tue, Mar 18, 2025 at 7:56 PM Sutou Kouhei <kou@clear-code.com> wrote:
> And could someone help (take over if possible) writing a
> document for this feature? I'm not good at writing a
> document in English... 0009 in the attached v37 patch set
> has a draft of it. It's based on existing documents in
> doc/src/sgml/ and *.h.


I haven't touched the innards of the structs, aside from changing programlisting to synopsis and redoing the two section-opening paragraphs so they integrate better with the content in the chapter opening.

The rest I kinda went to town on...

David J.

diff --git a/doc/src/sgml/copy-handler.sgml b/doc/src/sgml/copy-handler.sgml
index f602debae6..9d2897a104 100644
--- a/doc/src/sgml/copy-handler.sgml
+++ b/doc/src/sgml/copy-handler.sgml
@@ -10,56 +10,72 @@
  <para>
   <productname>PostgreSQL</productname> supports
   custom <link linkend="sql-copy"><literal>COPY</literal></link>
-  handlers. The <literal>COPY</literal> handlers can use different copy format
-  instead of built-in <literal>text</literal>, <literal>csv</literal>
-  and <literal>binary</literal>.
+  handlers, which add <replaceable>format_name</replaceable> options
+  to the <literal>FORMAT</literal> clause.
  </para>
 
  <para>
-  At the SQL level, a table sampling method is represented by a single SQL
-  function, typically implemented in C, having the signature
-<programlisting>
-format_name(internal) RETURNS copy_handler
-</programlisting>
-  The name of the function is the same name appearing in
-  the <literal>FORMAT</literal> option. The <type>internal</type> argument is
-  a dummy that simply serves to prevent this function from being called
-  directly from an SQL command. The real argument is <literal>bool
-  is_from</literal>. If the handler is used by <literal>COPY FROM</literal>,
-  it's <literal>true</literal>. If the handler is used by <literal>COPY
-  FROM</literal>, it's <literal>false</literal>.
+  At the SQL level, a copy handler is represented by a single SQL
+  function (see <xref linkend="sql-createfunction"/>), typically implemented in
+  C, having the signature
+<synopsis>
+<replaceable>format_name</replaceable>(internal) RETURNS <literal>copy_handler</literal>
+</synopsis>
+  The function's name is then accepted as a valid <replaceable>format_name</replaceable>.
+  The return pseudo-type <literal>copy_handler</literal> informs the system that
+  this function needs to be registered as a copy handler.
+  The <type>internal</type> argument is a dummy that prevents
+  this function from being called directly from an SQL command.  Because the
+  handler implementation must be server-lifetime immutable, this SQL function's
+  volatility should be marked immutable.  The <literal>link_symbol</literal>
+  for this function is the name of the implementation function, described next.
  </para>
 
  <para>
-  The function must return <type>CopyFromRoutine *</type> when
-  the <literal>is_from</literal> argument is <literal>true</literal>.
-  The function must return <type>CopyToRoutine *</type> when
-  the <literal>is_from</literal> argument is <literal>false</literal>.
+  The implementation function signature expected for the function named
+  in the <literal>link_symbol</literal> is:
+<synopsis>
+Datum
+<replaceable>copy_format_handler</replaceable>(PG_FUNCTION_ARGS)
+</synopsis>
+  The convention for the name is to replace the word
+  <replaceable>format</replaceable> in the placeholder above with the value given
+  to <replaceable>format_name</replaceable> in the SQL function.
+  The first argument is a <type>boolean</type>. If <literal>true</literal>, the handler
+  must provide a pointer to its implementation of <literal>COPY FROM</literal>
+  (a <type>CopyFromRoutine *</type>). If <literal>false</literal>, the handler
+  must provide a pointer to its implementation of <literal>COPY TO</literal>
+  (a <type>CopyToRoutine *</type>).  These structs are declared in
+  <filename>src/include/commands/copyapi.h</filename>.
  </para>
 
  <para>
-  The <type>CopyFromRoutine</type> and <type>CopyToRoutine</type> struct types
-  are declared in <filename>src/include/commands/copyapi.h</filename>,
-  which see for additional details.
+  The structs hold pointers to implementation functions for
+  initializing, starting, processing rows, and ending a copy operation.
+  The specific structures vary a bit between <literal>COPY FROM</literal> and
+  <literal>COPY TO</literal>, so the next two sections describe each
+  in detail.
  </para>
 
  <sect1 id="copy-handler-from">
   <title>Copy From Handler</title>
 
   <para>
-   The <literal>COPY</literal> handler function for <literal>COPY
-   FROM</literal> returns a <type>CopyFromRoutine</type> struct containing
-   pointers to the functions described below. All functions are required.
+   The opening to this chapter describes how the executor will call the
+   main handler function with, in this case,
+   a <type>boolean</type> <literal>true</literal>, and expect to receive a
+   <type>CopyFromRoutine *</type> <type>Datum</type>.  This section describes
+   the components of the <type>CopyFromRoutine</type> struct.
   </para>
 
   <para>
-<programlisting>
+<synopsis>
 void
 CopyFromInFunc(CopyFromState cstate,
                Oid atttypid,
                FmgrInfo *finfo,
                Oid *typioparam);
-</programlisting>
+</synopsis>
 
    This sets input function information for the
    given <literal>atttypid</literal> attribute. This function is called once
@@ -110,11 +126,11 @@ CopyFromInFunc(CopyFromState cstate,
   </para>
 
   <para>
-<programlisting>
+<synopsis>
 void
 CopyFromStart(CopyFromState cstate,
               TupleDesc tupDesc);
-</programlisting>
+</synopsis>
 
    This starts a <literal>COPY FROM</literal>. This function is called once at
    the beginning of <literal>COPY FROM</literal>.
@@ -144,13 +160,13 @@ CopyFromStart(CopyFromState cstate,
   </para>
 
   <para>
-<programlisting>
+<synopsis>
 bool
 CopyFromOneRow(CopyFromState cstate,
                ExprContext *econtext,
                Datum *values,
                bool *nulls);
-</programlisting>
+</synopsis>
 
    This reads one row from the source and fill <literal>values</literal>
    and <literal>nulls</literal>. If there is one or more tuples to be read,
@@ -202,10 +218,10 @@ CopyFromOneRow(CopyFromState cstate,
   </para>
 
   <para>
-<programlisting>
+<synopsis>
 void
 CopyFromEnd(CopyFromState cstate);
-</programlisting>
+</synopsis>
 
    This ends a <literal>COPY FROM</literal>. This function is called once at
    the end of <literal>COPY FROM</literal>.
@@ -232,18 +248,20 @@ CopyFromEnd(CopyFromState cstate);
   <title>Copy To Handler</title>
 
   <para>
-   The <literal>COPY</literal> handler function for <literal>COPY
-   TO</literal> returns a <type>CopyToRoutine</type> struct containing
-   pointers to the functions described below. All functions are required.
+   The opening to this chapter describes how the executor will call the
+   main handler function with, in this case,
+   a <type>boolean</type> <literal>false</literal>, and expect to receive a
+   <type>CopyToRoutine *</type> <type>Datum</type>.  This section describes
+   the components of the <type>CopyToRoutine</type> struct.
   </para>
 
   <para>
-<programlisting>
+<synopsis>
 void
 CopyToOutFunc(CopyToState cstate,
               Oid atttypid,
               FmgrInfo *finfo);
-</programlisting>
+</synopsis>
 
    This sets output function information for the
    given <literal>atttypid</literal> attribute. This function is called once
@@ -284,11 +302,11 @@ CopyToOutFunc(CopyToState cstate,
   </para>
 
   <para>
-<programlisting>
+<synopsis>
 void
 CopyToStart(CopyToState cstate,
             TupleDesc tupDesc);
-</programlisting>
+</synopsis>
 
    This starts a <literal>COPY TO</literal>. This function is called once at
    the beginning of <literal>COPY TO</literal>.
@@ -316,11 +334,11 @@ CopyToStart(CopyToState cstate,
   </para>
 
   <para>
-<programlisting>
+<synopsis>
 bool
 CopyToOneRow(CopyToState cstate,
              TupleTableSlot *slot);
-</programlisting>
+</synopsis>
 
    This writes one row stored in <literal>slot</literal> to the destination.
 
@@ -347,10 +365,10 @@ CopyToOneRow(CopyToState cstate,
   </para>
 
   <para>
-<programlisting>
+<synopsis>
 void
 CopyToEnd(CopyToState cstate);
-</programlisting>
+</synopsis>
 
    This ends a <literal>COPY TO</literal>. This function is called once at
    the end of <literal>COPY TO</literal>.
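
For anyone skimming the diff, here is a rough stand-alone sketch of the dispatch the rewritten opening paragraphs describe: one handler takes a boolean and hands back a table of function pointers for either COPY FROM or COPY TO, and the caller then drives start, per-row, and end callbacks. It deliberately does not include any PostgreSQL headers. Every name in it (ModelCopyState, ModelCopyFromRoutine, ModelCopyToRoutine, model_copy_handler, and the model_run_* drivers) is a made-up stand-in, not the real structs in src/include/commands/copyapi.h, and the per-column setup callbacks are left out to keep it short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the copy state; NOT CopyFromState/CopyToState. */
typedef struct ModelCopyState
{
    int  rows_seen;   /* rows processed so far */
    int  rows_total;  /* rows available in the pretend source */
    bool started;
    bool ended;
} ModelCopyState;

/* Hypothetical, simplified callback tables; NOT the copyapi.h definitions. */
typedef struct ModelCopyFromRoutine
{
    void (*start)(ModelCopyState *state);
    bool (*one_row)(ModelCopyState *state); /* false once the source is exhausted */
    void (*end)(ModelCopyState *state);
} ModelCopyFromRoutine;

typedef struct ModelCopyToRoutine
{
    void (*start)(ModelCopyState *state);
    void (*one_row)(ModelCopyState *state);
    void (*end)(ModelCopyState *state);
} ModelCopyToRoutine;

static void model_start(ModelCopyState *state)
{
    state->started = true;
    state->rows_seen = 0;
}

static bool model_from_one_row(ModelCopyState *state)
{
    if (state->rows_seen >= state->rows_total)
        return false;           /* nothing left to read */
    state->rows_seen++;
    return true;
}

static void model_to_one_row(ModelCopyState *state)
{
    state->rows_seen++;
}

static void model_end(ModelCopyState *state)
{
    state->ended = true;
}

static const ModelCopyFromRoutine model_from_routine = {
    model_start, model_from_one_row, model_end
};

static const ModelCopyToRoutine model_to_routine = {
    model_start, model_to_one_row, model_end
};

/* Stand-in for the handler: is_from selects which callback table is returned. */
const void *model_copy_handler(bool is_from)
{
    if (is_from)
        return &model_from_routine;
    return &model_to_routine;
}

/* Drive a pretend COPY FROM; returns the number of rows read. */
int model_run_copy_from(int rows_total)
{
    const ModelCopyFromRoutine *routine = model_copy_handler(true);
    ModelCopyState state = { 0, rows_total, false, false };

    routine->start(&state);
    while (routine->one_row(&state))
        ;                       /* keep reading until the source is exhausted */
    routine->end(&state);
    assert(state.started && state.ended);
    return state.rows_seen;
}

/* Drive a pretend COPY TO; returns the number of rows written. */
int model_run_copy_to(int rows_total)
{
    const ModelCopyToRoutine *routine = model_copy_handler(false);
    ModelCopyState state = { 0, rows_total, false, false };
    int i;

    routine->start(&state);
    for (i = 0; i < rows_total; i++)
        routine->one_row(&state);
    routine->end(&state);
    assert(state.started && state.ended);
    return state.rows_seen;
}
```

In this model, model_run_copy_from(3) calls start once, gets three successful row reads followed by one exhausted read, and then calls end once. The real handler returns its pointer wrapped in a Datum and is reached through the SQL function, which this sketch does not attempt to reproduce.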
