Re: COPY CSV header line feature - Mailing list pgsql-patches

From Bruce Momjian
Subject Re: COPY CSV header line feature
Date
Msg-id 200505060343.j463h0w07376@candle.pha.pa.us
Whole thread Raw
In response to Re: COPY CSV header line feature  (Andrew Dunstan <andrew@dunslane.net>)
List pgsql-patches
Here is an updated version of this patch, with documentation changes.

I have already updated the gram.y comment you suggested.

---------------------------------------------------------------------------

Andrew Dunstan wrote:
>
> ammended patch attached. sorry for the oversight. I agree with Tom's
> remark - it's far too easy to miss this.
>
> cheers
>
> andrew
>
> Alvaro Herrera wrote:
>
> >On Tue, Mar 15, 2005 at 08:55:36PM -0600, Andrew Dunstan wrote:
> >
> >
> >>Alvaro Herrera said:
> >>
> >>
> >>>On Sun, Mar 13, 2005 at 06:32:20PM -0500, Andrew Dunstan wrote:
> >>>
> >>>
> >>>
> >>>>The attached patch implements the previously discussed header line
> >>>>feature for CSV mode COPY. It is triggered by the keyword HEADER
> >>>>(blame Bruce - he chose it ;-) ).
> >>>>
> >>>>
> >>>I think you should add the new reserved keyword to the
> >>>unreserved_keywords list or some other.
> >>>
> >>>
> >>Please be more specific. I'll be happy to add in anything I've missed.
> >>
> >>
> >
> >The Postgres grammar classifies keywords in one of several lists, in
> >order to make them available as names to users (column names, function
> >names, etc).  So each time you create a new keyword and add it to the
> >keywords.c list, you have to add it to one of the lists on gram.y too.
> >See gram.y line 7669 ff.
> >
> >I'd add a comment on this on gram.y:
> >
> >Index: gram.y
> >===================================================================
> >RCS file: /home/alvherre/cvs/pgsql/src/backend/parser/gram.y,v
> >retrieving revision 2.484
> >diff -c -w -b -B -c -r2.484 gram.y
> >*** gram.y    14 Mar 2005 00:19:36 -0000    2.484
> >--- gram.y    16 Mar 2005 03:12:48 -0000
> >***************
> >*** 327,333 ****
> >  /*
> >   * If you make any token changes, update the keyword table in
> >   * parser/keywords.c and add new keywords to the appropriate one of
> >!  * the reserved-or-not-so-reserved keyword lists, below.
> >   */
> >
> >  /* ordinary key words in alphabetical order */
> >--- 327,334 ----
> >  /*
> >   * If you make any token changes, update the keyword table in
> >   * parser/keywords.c and add new keywords to the appropriate one of
> >!  * the reserved-or-not-so-reserved keyword lists, below; search this
> >!  * file for "Name classification hierarchy."
> >   */
> >
> >  /* ordinary key words in alphabetical order */
> >
> >
> >


>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Index: doc/src/sgml/ref/copy.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/ref/copy.sgml,v
retrieving revision 1.63
diff -c -c -r1.63 copy.sgml
*** doc/src/sgml/ref/copy.sgml    4 Jan 2005 00:39:53 -0000    1.63
--- doc/src/sgml/ref/copy.sgml    6 May 2005 03:36:30 -0000
***************
*** 24,34 ****
  COPY <replaceable class="parameter">tablename</replaceable> [ ( <replaceable class="parameter">column</replaceable>
[,...] ) ] 
      FROM { '<replaceable class="parameter">filename</replaceable>' | STDIN }
      [ [ WITH ]
!           [ BINARY ]
            [ OIDS ]
            [ DELIMITER [ AS ] '<replaceable class="parameter">delimiter</replaceable>' ]
            [ NULL [ AS ] '<replaceable class="parameter">null string</replaceable>' ]
!           [ CSV [ QUOTE [ AS ] '<replaceable class="parameter">quote</replaceable>' ]
                  [ ESCAPE [ AS ] '<replaceable class="parameter">escape</replaceable>' ]
                  [ FORCE NOT NULL <replaceable class="parameter">column</replaceable> [, ...] ]

--- 24,35 ----
  COPY <replaceable class="parameter">tablename</replaceable> [ ( <replaceable class="parameter">column</replaceable>
[,...] ) ] 
      FROM { '<replaceable class="parameter">filename</replaceable>' | STDIN }
      [ [ WITH ]
!           [ BINARY ]
            [ OIDS ]
            [ DELIMITER [ AS ] '<replaceable class="parameter">delimiter</replaceable>' ]
            [ NULL [ AS ] '<replaceable class="parameter">null string</replaceable>' ]
!           [ CSV [ HEADER ]
!                 [ QUOTE [ AS ] '<replaceable class="parameter">quote</replaceable>' ]
                  [ ESCAPE [ AS ] '<replaceable class="parameter">escape</replaceable>' ]
                  [ FORCE NOT NULL <replaceable class="parameter">column</replaceable> [, ...] ]

***************
*** 36,45 ****
      TO { '<replaceable class="parameter">filename</replaceable>' | STDOUT }
      [ [ WITH ]
            [ BINARY ]
            [ OIDS ]
            [ DELIMITER [ AS ] '<replaceable class="parameter">delimiter</replaceable>' ]
            [ NULL [ AS ] '<replaceable class="parameter">null string</replaceable>' ]
!           [ CSV [ QUOTE [ AS ] '<replaceable class="parameter">quote</replaceable>' ]
                  [ ESCAPE [ AS ] '<replaceable class="parameter">escape</replaceable>' ]
                  [ FORCE QUOTE <replaceable class="parameter">column</replaceable> [, ...] ]
  </synopsis>
--- 37,48 ----
      TO { '<replaceable class="parameter">filename</replaceable>' | STDOUT }
      [ [ WITH ]
            [ BINARY ]
+           [ HEADER ]
            [ OIDS ]
            [ DELIMITER [ AS ] '<replaceable class="parameter">delimiter</replaceable>' ]
            [ NULL [ AS ] '<replaceable class="parameter">null string</replaceable>' ]
!           [ CSV [ HEADER ]
!                 [ QUOTE [ AS ] '<replaceable class="parameter">quote</replaceable>' ]
                  [ ESCAPE [ AS ] '<replaceable class="parameter">escape</replaceable>' ]
                  [ FORCE QUOTE <replaceable class="parameter">column</replaceable> [, ...] ]
  </synopsis>
***************
*** 192,197 ****
--- 195,211 ----
     </varlistentry>

     <varlistentry>
+     <term><literal>HEADER</literal></term>
+     <listitem>
+      <para>
+       Specifies the file contains a header line with the names of each
+       column in the file.  On output, the first line contains the column
+       names from the table, and on input, the first line is ignored.
+      </para>
+     </listitem>
+    </varlistentry>
+
+    <varlistentry>
      <term><replaceable class="parameter">quote</replaceable></term>
      <listitem>
       <para>
Index: src/backend/commands/copy.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/commands/copy.c,v
retrieving revision 1.242
diff -c -c -r1.242 copy.c
*** src/backend/commands/copy.c    6 May 2005 02:56:42 -0000    1.242
--- src/backend/commands/copy.c    6 May 2005 03:36:31 -0000
***************
*** 130,142 ****
  /* non-export function prototypes */
  static void DoCopyTo(Relation rel, List *attnumlist, bool binary, bool oids,
           char *delim, char *null_print, bool csv_mode, char *quote,
!          char *escape, List *force_quote_atts, bool fe_copy);
  static void CopyTo(Relation rel, List *attnumlist, bool binary, bool oids,
   char *delim, char *null_print, bool csv_mode, char *quote, char *escape,
!        List *force_quote_atts);
  static void CopyFrom(Relation rel, List *attnumlist, bool binary, bool oids,
   char *delim, char *null_print, bool csv_mode, char *quote, char *escape,
!          List *force_notnull_atts);
  static bool CopyReadLine(char * quote, char * escape);
  static char *CopyReadAttribute(const char *delim, const char *null_print,
                    CopyReadResult *result, bool *isnull);
--- 130,142 ----
  /* non-export function prototypes */
  static void DoCopyTo(Relation rel, List *attnumlist, bool binary, bool oids,
           char *delim, char *null_print, bool csv_mode, char *quote,
!          char *escape, List *force_quote_atts, bool header_line, bool fe_copy);
  static void CopyTo(Relation rel, List *attnumlist, bool binary, bool oids,
   char *delim, char *null_print, bool csv_mode, char *quote, char *escape,
!        List *force_quote_atts, bool header_line);
  static void CopyFrom(Relation rel, List *attnumlist, bool binary, bool oids,
   char *delim, char *null_print, bool csv_mode, char *quote, char *escape,
!          List *force_notnull_atts, bool header_line);
  static bool CopyReadLine(char * quote, char * escape);
  static char *CopyReadAttribute(const char *delim, const char *null_print,
                    CopyReadResult *result, bool *isnull);
***************
*** 694,699 ****
--- 694,700 ----
      bool        binary = false;
      bool        oids = false;
      bool        csv_mode = false;
+     bool        header_line = false;
      char       *delim = NULL;
      char       *quote = NULL;
      char       *escape = NULL;
***************
*** 751,756 ****
--- 752,765 ----
                           errmsg("conflicting or redundant options")));
              csv_mode = intVal(defel->arg);
          }
+         else if (strcmp(defel->defname, "header") == 0)
+         {
+             if (header_line)
+                 ereport(ERROR,
+                         (errcode(ERRCODE_SYNTAX_ERROR),
+                          errmsg("conflicting or redundant options")));
+             header_line = intVal(defel->arg);
+         }
          else if (strcmp(defel->defname, "quote") == 0)
          {
              if (quote)
***************
*** 824,829 ****
--- 833,844 ----
                  (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                   errmsg("COPY delimiter must be a single character")));

+       /* Check header */
+     if (!csv_mode && header_line)
+         ereport(ERROR,
+                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                  errmsg("COPY HEADER available only in CSV mode")));
+
      /* Check quote */
      if (!csv_mode && quote != NULL)
          ereport(ERROR,
***************
*** 1014,1020 ****
              }
          }
          CopyFrom(rel, attnumlist, binary, oids, delim, null_print, csv_mode,
!                  quote, escape, force_notnull_atts);
      }
      else
      {                            /* copy from database to file */
--- 1029,1035 ----
              }
          }
          CopyFrom(rel, attnumlist, binary, oids, delim, null_print, csv_mode,
!                  quote, escape, force_notnull_atts, header_line);
      }
      else
      {                            /* copy from database to file */
***************
*** 1078,1084 ****
          }

          DoCopyTo(rel, attnumlist, binary, oids, delim, null_print, csv_mode,
!                  quote, escape, force_quote_atts, fe_copy);
      }

      if (!pipe)
--- 1093,1099 ----
          }

          DoCopyTo(rel, attnumlist, binary, oids, delim, null_print, csv_mode,
!                  quote, escape, force_quote_atts, header_line, fe_copy);
      }

      if (!pipe)
***************
*** 1110,1116 ****
  static void
  DoCopyTo(Relation rel, List *attnumlist, bool binary, bool oids,
           char *delim, char *null_print, bool csv_mode, char *quote,
!          char *escape, List *force_quote_atts, bool fe_copy)
  {
      PG_TRY();
      {
--- 1125,1131 ----
  static void
  DoCopyTo(Relation rel, List *attnumlist, bool binary, bool oids,
           char *delim, char *null_print, bool csv_mode, char *quote,
!          char *escape, List *force_quote_atts, bool header_line, bool fe_copy)
  {
      PG_TRY();
      {
***************
*** 1118,1124 ****
              SendCopyBegin(binary, list_length(attnumlist));

          CopyTo(rel, attnumlist, binary, oids, delim, null_print, csv_mode,
!                quote, escape, force_quote_atts);

          if (fe_copy)
              SendCopyEnd(binary);
--- 1133,1139 ----
              SendCopyBegin(binary, list_length(attnumlist));

          CopyTo(rel, attnumlist, binary, oids, delim, null_print, csv_mode,
!                quote, escape, force_quote_atts, header_line);

          if (fe_copy)
              SendCopyEnd(binary);
***************
*** 1142,1148 ****
  static void
  CopyTo(Relation rel, List *attnumlist, bool binary, bool oids,
         char *delim, char *null_print, bool csv_mode, char *quote,
!        char *escape, List *force_quote_atts)
  {
      HeapTuple    tuple;
      TupleDesc    tupDesc;
--- 1157,1163 ----
  static void
  CopyTo(Relation rel, List *attnumlist, bool binary, bool oids,
         char *delim, char *null_print, bool csv_mode, char *quote,
!        char *escape, List *force_quote_atts, bool header_line)
  {
      HeapTuple    tuple;
      TupleDesc    tupDesc;
***************
*** 1225,1230 ****
--- 1240,1269 ----
              null_print = (char *)
                  pg_server_to_client((unsigned char *) null_print,
                                      strlen(null_print));
+
+         /* if a header has been requested send the line */
+         if (header_line)
+         {
+             bool hdr_delim = false;
+             char *colname;
+
+             foreach(cur, attnumlist)
+             {
+                 int            attnum = lfirst_int(cur);
+
+                 if (hdr_delim)
+                     CopySendChar(delim[0]);
+                 hdr_delim = true;
+
+                 colname = NameStr(attr[attnum - 1]->attname);
+
+                 CopyAttributeOutCSV(colname, delim, quote, escape,
+                                     strcmp(colname, null_print) == 0);
+             }
+
+             CopySendEndOfRow(binary);
+
+         }
      }

      scandesc = heap_beginscan(rel, ActiveSnapshot, 0, NULL);
***************
*** 1426,1432 ****
  static void
  CopyFrom(Relation rel, List *attnumlist, bool binary, bool oids,
           char *delim, char *null_print, bool csv_mode, char *quote,
!          char *escape, List *force_notnull_atts)
  {
      HeapTuple    tuple;
      TupleDesc    tupDesc;
--- 1465,1471 ----
  static void
  CopyFrom(Relation rel, List *attnumlist, bool binary, bool oids,
           char *delim, char *null_print, bool csv_mode, char *quote,
!          char *escape, List *force_notnull_atts, bool header_line)
  {
      HeapTuple    tuple;
      TupleDesc    tupDesc;
***************
*** 1652,1657 ****
--- 1691,1703 ----
      errcontext.previous = error_context_stack;
      error_context_stack = &errcontext;

+     /* on input just throw the header line away */
+     if (header_line)
+     {
+         copy_lineno++;
+         done = CopyReadLine(quote, escape) ;
+     }
+
      while (!done)
      {
          bool        skip_tuple;
Index: src/backend/parser/gram.y
===================================================================
RCS file: /cvsroot/pgsql/src/backend/parser/gram.y,v
retrieving revision 2.489
diff -c -c -r2.489 gram.y
*** src/backend/parser/gram.y    28 Apr 2005 21:47:14 -0000    2.489
--- src/backend/parser/gram.y    6 May 2005 03:36:36 -0000
***************
*** 361,367 ****

      GLOBAL GRANT GROUP_P

!     HANDLER HAVING HOLD HOUR_P

      ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P INCLUDING INCREMENT
      INDEX INHERITS INITIALLY INNER_P INOUT INPUT_P
--- 361,367 ----

      GLOBAL GRANT GROUP_P

!     HANDLER HAVING HEADER HOLD HOUR_P

      ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P INCLUDING INCREMENT
      INDEX INHERITS INITIALLY INNER_P INOUT INPUT_P
***************
*** 1443,1448 ****
--- 1443,1452 ----
                  {
                      $$ = makeDefElem("csv", (Node *)makeInteger(TRUE));
                  }
+             | HEADER
+                 {
+                     $$ = makeDefElem("header", (Node *)makeInteger(TRUE));
+                 }
              | QUOTE opt_as Sconst
                  {
                      $$ = makeDefElem("quote", (Node *)makeString($3));
***************
*** 7786,7791 ****
--- 7790,7796 ----
              | FUNCTION
              | GLOBAL
              | HANDLER
+             | HEADER
              | HOLD
              | HOUR_P
              | IMMEDIATE
Index: src/backend/parser/keywords.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/parser/keywords.c,v
retrieving revision 1.154
diff -c -c -r1.154 keywords.c
*** src/backend/parser/keywords.c    31 Dec 2004 22:00:27 -0000    1.154
--- src/backend/parser/keywords.c    6 May 2005 03:36:36 -0000
***************
*** 148,153 ****
--- 148,154 ----
      {"group", GROUP_P},
      {"handler", HANDLER},
      {"having", HAVING},
+     {"header", HEADER},
      {"hold", HOLD},
      {"hour", HOUR_P},
      {"ilike", ILIKE},
Index: src/bin/psql/copy.c
===================================================================
RCS file: /cvsroot/pgsql/src/bin/psql/copy.c,v
retrieving revision 1.56
diff -c -c -r1.56 copy.c
*** src/bin/psql/copy.c    22 Feb 2005 04:40:54 -0000    1.56
--- src/bin/psql/copy.c    6 May 2005 03:36:39 -0000
***************
*** 66,71 ****
--- 66,72 ----
      bool        binary;
      bool        oids;
      bool        csv_mode;
+     bool        header;
      char       *delim;
      char       *null;
      char       *quote;
***************
*** 289,294 ****
--- 290,297 ----
                  result->oids = true;
              else if (pg_strcasecmp(token, "csv") == 0)
                  result->csv_mode = true;
+             else if (pg_strcasecmp(token, "header") == 0)
+                 result->header = true;
              else if (pg_strcasecmp(token, "delimiter") == 0)
              {
                  token = strtokx(NULL, whitespace, NULL, "'",
***************
*** 481,486 ****
--- 484,492 ----
      if (options->csv_mode)
          appendPQExpBuffer(&query, " CSV");

+     if (options->header)
+         appendPQExpBuffer(&query, " HEADER");
+
      if (options->quote)
      {
          if (options->quote[0] == '\'')
Index: src/bin/psql/tab-complete.c
===================================================================
RCS file: /cvsroot/pgsql/src/bin/psql/tab-complete.c,v
retrieving revision 1.126
diff -c -c -r1.126 tab-complete.c
*** src/bin/psql/tab-complete.c    4 May 2005 14:25:24 -0000    1.126
--- src/bin/psql/tab-complete.c    6 May 2005 03:36:40 -0000
***************
*** 1040,1046 ****
                pg_strcasecmp(prev3_wd, "TO") == 0))
          {
              static const char *const list_CSV[] =
!             {"QUOTE", "ESCAPE", "FORCE QUOTE", NULL};

              COMPLETE_WITH_LIST(list_CSV);
          }
--- 1040,1046 ----
                pg_strcasecmp(prev3_wd, "TO") == 0))
          {
              static const char *const list_CSV[] =
!             {"HEADER", "QUOTE", "ESCAPE", "FORCE QUOTE", NULL};

              COMPLETE_WITH_LIST(list_CSV);
          }

pgsql-patches by date:

Previous
From: Tom Lane
Date:
Subject: Re: Cleaning up unreferenced table files
Next
From: Emmanuel Bernard
Date:
Subject: Blob .getBytes position should start at 1