Thread: CSV patch applied

CSV patch applied

From

Bruce Momjian

Date:

19 April 2004, 14:36:39

I have applied the attached patch that complete TODO item:

        o -Allow dump/load of CSV format

This adds new keywords to COPY and \copy:

        CSV - enable CSV mode
        QUOTE - specify quote character
        ESCAPE - specify escape character
        FORCE - force quoting of specified columns
        LITERAL - prevent NULL checks for specific columns

If folks have better ideas for naming of those keywords, I am all ears!

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Index: doc/src/sgml/ref/copy.sgml
===================================================================
RCS file: /cvsroot/pgsql-server/doc/src/sgml/ref/copy.sgml,v
retrieving revision 1.55
diff -c -c -r1.55 copy.sgml
*** doc/src/sgml/ref/copy.sgml    13 Dec 2003 23:59:07 -0000    1.55
--- doc/src/sgml/ref/copy.sgml    19 Apr 2004 17:07:42 -0000
***************
*** 26,32 ****
            [ BINARY ]
            [ OIDS ]
            [ DELIMITER [ AS ] '<replaceable class="parameter">delimiter</replaceable>' ]
!           [ NULL [ AS ] '<replaceable class="parameter">null string</replaceable>' ] ]

  COPY <replaceable class="parameter">tablename</replaceable> [ ( <replaceable class="parameter">column</replaceable>
[,...] ) ] 
      TO { '<replaceable class="parameter">filename</replaceable>' | STDOUT }
--- 26,35 ----
            [ BINARY ]
            [ OIDS ]
            [ DELIMITER [ AS ] '<replaceable class="parameter">delimiter</replaceable>' ]
!           [ NULL [ AS ] '<replaceable class="parameter">null string</replaceable>' ]
!           [ CSV [ QUOTE [ AS ] '<replaceable class="parameter">quote</replaceable>' ]
!                 [ ESCAPE [ AS ] '<replaceable class="parameter">escape</replaceable>' ]
!                 [ LITERAL <replaceable class="parameter">column</replaceable> [, ...] ]

  COPY <replaceable class="parameter">tablename</replaceable> [ ( <replaceable class="parameter">column</replaceable>
[,...] ) ] 
      TO { '<replaceable class="parameter">filename</replaceable>' | STDOUT }
***************
*** 34,40 ****
            [ BINARY ]
            [ OIDS ]
            [ DELIMITER [ AS ] '<replaceable class="parameter">delimiter</replaceable>' ]
!           [ NULL [ AS ] '<replaceable class="parameter">null string</replaceable>' ] ]
  </synopsis>
   </refsynopsisdiv>

--- 37,46 ----
            [ BINARY ]
            [ OIDS ]
            [ DELIMITER [ AS ] '<replaceable class="parameter">delimiter</replaceable>' ]
!           [ NULL [ AS ] '<replaceable class="parameter">null string</replaceable>' ]
!           [ CSV [ QUOTE [ AS ] '<replaceable class="parameter">quote</replaceable>' ]
!                 [ ESCAPE [ AS ] '<replaceable class="parameter">escape</replaceable>' ]
!                 [ FORCE <replaceable class="parameter">column</replaceable> [, ...] ]
  </synopsis>
   </refsynopsisdiv>

***************
*** 146,152 ****
      <listitem>
       <para>
        The single character that separates columns within each row
!       (line) of the file.  The default is a tab character.
       </para>
      </listitem>
     </varlistentry>
--- 152,159 ----
      <listitem>
       <para>
        The single character that separates columns within each row
!       (line) of the file.  The default is a tab character in text mode,
!       a comma in <literal>CSV</> mode.
       </para>
      </listitem>
     </varlistentry>
***************
*** 156,175 ****
      <listitem>
       <para>
        The string that represents a null value. The default is
!       <literal>\N</literal> (backslash-N). You might prefer an empty
!       string, for example.
       </para>

       <note>
        <para>
!        On a <command>COPY FROM</command>, any data item that matches
         this string will be stored as a null value, so you should make
         sure that you use the same string as you used with
         <command>COPY TO</command>.
        </para>
       </note>
      </listitem>
     </varlistentry>
    </variablelist>
   </refsect1>

--- 163,248 ----
      <listitem>
       <para>
        The string that represents a null value. The default is
!       <literal>\N</literal> (backslash-N) in text mode, and a empty
!       value with no quotes in <literal>CSV</> mode. You might prefer an
!       empty string even in text mode for cases where you don't want to
!       distinguish nulls from empty strings.
       </para>

       <note>
        <para>
!        When using <command>COPY FROM</command>, any data item that matches
         this string will be stored as a null value, so you should make
         sure that you use the same string as you used with
         <command>COPY TO</command>.
        </para>
       </note>
+
      </listitem>
     </varlistentry>
+
+    <varlistentry>
+     <term><literal>CSV</literal></term>
+     <listitem>
+      <para>
+       Enables Comma Separated Variable (<literal>CSV</>) mode.  (Also called
+       Comma Separated Value).  It sets the default <literal>DELIMITER</> to
+       comma, and <literal>QUOTE</> and <literal>ESCAPE</> values to
+       double-quote.
+      </para>
+     </listitem>
+    </varlistentry>
+
+    <varlistentry>
+     <term><replaceable class="parameter">quote</replaceable></term>
+     <listitem>
+      <para>
+       Specifies the quotation character in <literal>CSV</> mode.
+       The default is double-quote.
+      </para>
+     </listitem>
+    </varlistentry>
+
+    <varlistentry>
+     <term><replaceable class="parameter">escape</replaceable></term>
+     <listitem>
+      <para>
+       Specifies the character that should appear before a <literal>QUOTE</>
+       data character value in <literal>CSV</> mode.  The default is the
+       <literal>QUOTE</> value (usually double-quote).
+      </para>
+     </listitem>
+    </varlistentry>
+
+    <varlistentry>
+     <term><literal>FORCE</></term>
+     <listitem>
+      <para>
+       In <literal>CSV</> <command>COPY TO</> mode, forces quoting
+       to be used for all non-<literal>NULL</> values in each specified
+       column.  <literal>NULL</> output is never quoted.
+      </para>
+     </listitem>
+    </varlistentry>
+
+    <varlistentry>
+     <term><literal>LITERAL</></term>
+     <listitem>
+      <para>
+       In <literal>CSV</> <command>COPY FROM</> mode, for each column specified,
+       do not do a <literal>null string</> comparison;  instead load the value
+       literally.  <literal>QUOTE</> and <literal>ESCAPE</> processing are still
+       performed.
+      </para>
+      <para>
+       If the <literal>null string</> is <literal>''</> (the default
+       in <literal>CSV</> mode), a missing input value (<literal>delimiter,
+       delimiter</>), will load as a zero-length string.  <literal>Delimiter, quote,
+       quote, delimiter</> is always treated as a zero-length string on input.
+      </para>
+     </listitem>
+    </varlistentry>
+
    </variablelist>
   </refsect1>

***************
*** 234,239 ****
--- 307,323 ----
     </para>

     <para>
+     <command>COPY</command> input and output is affected by
+     <varname>DateStyle </varname>. For portability with other
+     <productname>PostgreSQL</productname> installations which might use
+     non-default <varname>DateStyle</varname> settings,
+     <varname>DateStyle</varname> should be set to <literal>ISO</> before
+     using <command>COPY</>. In <literal>CSV</> mode, use <literal>ISO</>
+     or a <varname>DateStyle</varname> setting appropriate for the
+     external application.
+    </para>
+
+    <para>
      <command>COPY</command> stops operation at the first error. This
      should not lead to problems in the event of a <command>COPY
      TO</command>, but the target table will already have received
***************
*** 253,259 ****

     <para>
      When <command>COPY</command> is used without the <literal>BINARY</literal> option,
!     the data read or written is a text file with one line per table row.
      Columns in a row are separated by the delimiter character.
      The column values themselves are strings generated by the
      output function, or acceptable to the input function, of each
--- 337,344 ----

     <para>
      When <command>COPY</command> is used without the <literal>BINARY</literal> option,
!     the data read or written is a text file with one line per table row,
!     unless <literal>CSV</> mode is used.
      Columns in a row are separated by the delimiter character.
      The column values themselves are strings generated by the
      output function, or acceptable to the input function, of each
***************
*** 377,382 ****
--- 462,524 ----
      meant as data, <command>COPY FROM</command> will complain if the line
      endings in the input are not all alike.
     </para>
+   </refsect2>
+
+   <refsect2>
+    <title>CSV Format</title>
+
+    <para>
+     This format is used for importing and exporting the Comma
+     Separated Variable (<literal>CSV</>) file format used by many other
+     programs, such as spreadsheets. Instead of the escaping used by
+     <productname>PostgreSQL</productname>'s standard text mode, it
+     produces and recognises the common CSV escaping mechanism.
+    </para>
+
+    <para>
+     The values in each record are separated by the <literal>DELIMITER</>
+     character. If the value contains the delimiter character, the
+     <literal>QUOTE</> character, the <literal>NULL</> string, a carriage
+     return, or line feed character, then the whole value is prefixed and
+     suffixed by the <literal>QUOTE</> character, and any occurrence
+     within the value of a <literal>QUOTE</> character or the
+     <literal>ESCAPE</> character is preceded by the escape character.
+     You can also use <literal>FORCE</> to force quotes when outputting
+     non-<literal>NULL</> values in specific columns.
+    </para>
+
+    <para>
+     In general, the <literal>CSV</> format has no way to distinguish a
+     <literal>NULL</> from an empty string.
+     <productname>PostgreSQL</productname>'s COPY handles this by
+     quoting. A <literal>NULL</> is output as the <literal>NULL</> string
+     and is not quoted, while a data value matching the <literal>NULL</> string
+     is quoted. Therefore, using the default settings, a <literal>NULL</> is
+     written as an unquoted empty string, while an empty string is
+     written with double quotes (<literal>""</>). Reading values follows
+     similar rules. You can use <literal>LITERAL</> to prevent <literal>NULL</>
+     input comparisons for specific columns.
+    </para>
+
+    <note>
+     <para>
+      CSV mode will both recognize and produce CSV files with quoted
+      values containing embedded carriage returns and line feeds. Thus
+      the files are not strictly one line per table row like text-mode
+      files.
+     </para>
+    </note>
+
+    <note>
+     <para>
+      Many programs produce strange and occasionally perverse CSV files,
+      so the file format is more a convention than a standard. Thus you
+      might encounter some files that cannot be imported using this
+      mechanism, and <command>COPY</> might produce files that other
+      programs can not process.
+     </para>
+    </note>
+
    </refsect2>

    <refsect2>
Index: doc/src/sgml/ref/psql-ref.sgml
===================================================================
RCS file: /cvsroot/pgsql-server/doc/src/sgml/ref/psql-ref.sgml,v
retrieving revision 1.110
diff -c -c -r1.110 psql-ref.sgml
*** doc/src/sgml/ref/psql-ref.sgml    12 Apr 2004 15:58:52 -0000    1.110
--- doc/src/sgml/ref/psql-ref.sgml    19 Apr 2004 17:07:44 -0000
***************
*** 711,716 ****
--- 711,720 ----
              [ <literal>oids</literal> ]
              [ <literal>delimiter [as] </literal> '<replaceable class="parameter">character</replaceable>' ]
              [ <literal>null [as] </literal> '<replaceable class="parameter">string</replaceable>' ]</literal>
+             [ <literal>csv [ quote [as] </literal> '<replaceable class="parameter">character</replaceable>' ]
+                            [ <literal>escape [as] </literal> '<replaceable class="parameter">character</replaceable>'
]
+                            [ <literal>force</> <replaceable class="parameter">column_list</replaceable> ]
+                            [ <literal>literal</> <replaceable class="parameter">column_list</replaceable> ] ]
          </term>

          <listitem>
Index: src/backend/commands/copy.c
===================================================================
RCS file: /cvsroot/pgsql-server/src/backend/commands/copy.c,v
retrieving revision 1.220
diff -c -c -r1.220 copy.c
*** src/backend/commands/copy.c    15 Apr 2004 22:36:03 -0000    1.220
--- src/backend/commands/copy.c    19 Apr 2004 17:07:46 -0000
***************
*** 70,76 ****
  typedef enum CopyReadResult
  {
      NORMAL_ATTR,
!     END_OF_LINE
  } CopyReadResult;

  /*
--- 70,77 ----
  typedef enum CopyReadResult
  {
      NORMAL_ATTR,
!     END_OF_LINE,
!     UNTERMINATED_FIELD
  } CopyReadResult;

  /*
***************
*** 130,144 ****

  /* non-export function prototypes */
  static void CopyTo(Relation rel, List *attnumlist, bool binary, bool oids,
!        char *delim, char *null_print);
  static void CopyFrom(Relation rel, List *attnumlist, bool binary, bool oids,
!          char *delim, char *null_print);
  static bool CopyReadLine(void);
  static char *CopyReadAttribute(const char *delim, const char *null_print,
                                 CopyReadResult *result, bool *isnull);
  static Datum CopyReadBinaryAttribute(int column_no, FmgrInfo *flinfo,
                          Oid typelem, bool *isnull);
  static void CopyAttributeOut(char *string, char *delim);
  static List *CopyGetAttnums(Relation rel, List *attnamelist);
  static void limit_printout_length(StringInfo buf);

--- 131,152 ----

  /* non-export function prototypes */
  static void CopyTo(Relation rel, List *attnumlist, bool binary, bool oids,
!        char *delim, char *null_print, bool csv_mode, char *quote, char *escape,
!        List *force_atts);
  static void CopyFrom(Relation rel, List *attnumlist, bool binary, bool oids,
!          char *delim, char *null_print, bool csv_mode, char *quote, char *escape,
!          List *literal_atts);
  static bool CopyReadLine(void);
  static char *CopyReadAttribute(const char *delim, const char *null_print,
                                 CopyReadResult *result, bool *isnull);
+ static char *CopyReadAttributeCSV(const char *delim, const char *null_print,
+                                char *quote, char *escape,
+                                CopyReadResult *result, bool *isnull);
  static Datum CopyReadBinaryAttribute(int column_no, FmgrInfo *flinfo,
                          Oid typelem, bool *isnull);
  static void CopyAttributeOut(char *string, char *delim);
+ static void CopyAttributeOutCSV(char *string, char *delim, char *quote,
+                                 char *escape, bool force_quote);
  static List *CopyGetAttnums(Relation rel, List *attnamelist);
  static void limit_printout_length(StringInfo buf);

***************
*** 682,689 ****
--- 690,704 ----
      List       *attnumlist;
      bool        binary = false;
      bool        oids = false;
+     bool        csv_mode = false;
      char       *delim = NULL;
+     char       *quote = NULL;
+     char       *escape = NULL;
      char       *null_print = NULL;
+     List       *force = NIL;
+     List       *literal = NIL;
+     List       *force_atts = NIL;
+     List       *literal_atts = NIL;
      Relation    rel;
      AclMode        required_access = (is_from ? ACL_INSERT : ACL_SELECT);
      AclResult    aclresult;
***************
*** 725,730 ****
--- 740,785 ----
                           errmsg("conflicting or redundant options")));
              null_print = strVal(defel->arg);
          }
+         else if (strcmp(defel->defname, "csv") == 0)
+         {
+             if (csv_mode)
+                 ereport(ERROR,
+                         (errcode(ERRCODE_SYNTAX_ERROR),
+                          errmsg("conflicting or redundant options")));
+             csv_mode = intVal(defel->arg);
+         }
+         else if (strcmp(defel->defname, "quote") == 0)
+         {
+             if (quote)
+                 ereport(ERROR,
+                         (errcode(ERRCODE_SYNTAX_ERROR),
+                          errmsg("conflicting or redundant options")));
+             quote = strVal(defel->arg);
+         }
+         else if (strcmp(defel->defname, "escape") == 0)
+         {
+             if (escape)
+                 ereport(ERROR,
+                         (errcode(ERRCODE_SYNTAX_ERROR),
+                          errmsg("conflicting or redundant options")));
+             escape = strVal(defel->arg);
+         }
+         else if (strcmp(defel->defname, "force") == 0)
+         {
+             if (force)
+                 ereport(ERROR,
+                         (errcode(ERRCODE_SYNTAX_ERROR),
+                          errmsg("conflicting or redundant options")));
+             force = (List *)defel->arg;
+         }
+         else if (strcmp(defel->defname, "literal") == 0)
+         {
+             if (literal)
+                 ereport(ERROR,
+                         (errcode(ERRCODE_SYNTAX_ERROR),
+                          errmsg("conflicting or redundant options")));
+             literal = (List *)defel->arg;
+         }
          else
              elog(ERROR, "option \"%s\" not recognized",
                   defel->defname);
***************
*** 735,740 ****
--- 790,800 ----
                  (errcode(ERRCODE_SYNTAX_ERROR),
                   errmsg("cannot specify DELIMITER in BINARY mode")));

+     if (binary && csv_mode)
+         ereport(ERROR,
+                 (errcode(ERRCODE_SYNTAX_ERROR),
+                  errmsg("cannot specify CSV in BINARY mode")));
+
      if (binary && null_print)
          ereport(ERROR,
                  (errcode(ERRCODE_SYNTAX_ERROR),
***************
*** 742,751 ****

      /* Set defaults */
      if (!delim)
!         delim = "\t";
!
      if (!null_print)
!         null_print = "\\N";

      /*
       * Open and lock the relation, using the appropriate lock type.
--- 802,893 ----

      /* Set defaults */
      if (!delim)
!         delim = csv_mode ? "," : "\t";
!
      if (!null_print)
!         null_print = csv_mode ? "" : "\\N";
!
!     if (csv_mode)
!     {
!         if (!quote)
!             quote = "\"";
!         if (!escape)
!             escape = quote;
!     }
!
!     /*
!      * Only single-character delimiter strings are supported.
!      */
!     if (strlen(delim) != 1)
!         ereport(ERROR,
!                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
!                  errmsg("COPY delimiter must be a single character")));
!
!     /*
!      * Check quote
!      */
!     if (!csv_mode && quote != NULL)
!         ereport(ERROR,
!                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
!                  errmsg("COPY quote available only in CSV mode")));
!
!     if (csv_mode && strlen(quote) != 1)
!         ereport(ERROR,
!                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
!                  errmsg("COPY quote must be a single character")));
!
!     /*
!      * Check escape
!      */
!     if (!csv_mode && escape != NULL)
!         ereport(ERROR,
!                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
!                  errmsg("COPY escape available only in CSV mode")));
!
!     if (csv_mode && strlen(escape) != 1)
!         ereport(ERROR,
!                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
!                  errmsg("COPY escape must be a single character")));
!
!     /*
!      * Check force
!      */
!     if (!csv_mode && force != NIL)
!         ereport(ERROR,
!                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
!                  errmsg("COPY force available only in CSV mode")));
!     if (force != NIL && is_from)
!         ereport(ERROR,
!                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
!                  errmsg("COPY force only available using COPY TO")));
!
!     /*
!      * Check literal
!      */
!     if (!csv_mode && literal != NIL)
!         ereport(ERROR,
!                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
!                  errmsg("COPY literal available only in CSV mode")));
!     if (literal != NIL && !is_from)
!         ereport(ERROR,
!                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
!                  errmsg("COPY literal only available using COPY FROM")));
!
!     /*
!      * Don't allow the delimiter to appear in the null string.
!      */
!     if (strchr(null_print, delim[0]) != NULL)
!         ereport(ERROR,
!                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
!                  errmsg("COPY delimiter must not appear in the NULL specification")));
!
!     /*
!      * Don't allow the csv quote char to appear in the null string.
!      */
!     if (csv_mode && strchr(null_print, quote[0]) != NULL)
!         ereport(ERROR,
!                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
!                  errmsg("CSV quote character must not appear in the NULL specification")));

      /*
       * Open and lock the relation, using the appropriate lock type.
***************
*** 772,793 ****
                         "psql's \\copy command also works for anyone.")));

      /*
-      * Presently, only single-character delimiter strings are supported.
-      */
-     if (strlen(delim) != 1)
-         ereport(ERROR,
-                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-                  errmsg("COPY delimiter must be a single character")));
-
-     /*
-      * Don't allow the delimiter to appear in the null string.
-      */
-     if (strchr(null_print, delim[0]) != NULL)
-         ereport(ERROR,
-                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-                  errmsg("COPY delimiter must not appear in the NULL specification")));
-
-     /*
       * Don't allow COPY w/ OIDs to or from a table without them
       */
      if (oids && !rel->rd_rel->relhasoids)
--- 914,919 ----
***************
*** 802,807 ****
--- 928,979 ----
      attnumlist = CopyGetAttnums(rel, attnamelist);

      /*
+      * Check that FORCE references valid COPY columns
+      */
+     if (force)
+     {
+         TupleDesc    tupDesc = RelationGetDescr(rel);
+         Form_pg_attribute *attr = tupDesc->attrs;
+         List       *cur;
+
+         force_atts = CopyGetAttnums(rel, force);
+
+         foreach(cur, force_atts)
+         {
+             int            attnum = lfirsti(cur);
+
+             if (!intMember(attnum, attnumlist))
+                 ereport(ERROR,
+                         (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+                          errmsg("FORCE column \"%s\" not referenced by COPY",
+                                 NameStr(attr[attnum - 1]->attname))));
+         }
+     }
+
+     /*
+      * Check that LITERAL references valid COPY columns
+      */
+     if (literal)
+     {
+         List       *cur;
+         TupleDesc    tupDesc = RelationGetDescr(rel);
+         Form_pg_attribute *attr = tupDesc->attrs;
+
+         literal_atts = CopyGetAttnums(rel, literal);
+
+         foreach(cur, literal_atts)
+         {
+             int            attnum = lfirsti(cur);
+
+             if (!intMember(attnum, attnumlist))
+                 ereport(ERROR,
+                         (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+                          errmsg("LITERAL column \"%s\" not referenced by COPY",
+                                 NameStr(attr[attnum - 1]->attname))));
+         }
+     }
+
+     /*
       * Set up variables to avoid per-attribute overhead.
       */
      initStringInfo(&attribute_buf);
***************
*** 864,870 ****
                           errmsg("\"%s\" is a directory", filename)));
              }
          }
!         CopyFrom(rel, attnumlist, binary, oids, delim, null_print);
      }
      else
      {                            /* copy from database to file */
--- 1036,1043 ----
                           errmsg("\"%s\" is a directory", filename)));
              }
          }
!         CopyFrom(rel, attnumlist, binary, oids, delim, null_print, csv_mode,
!                  quote, escape, literal_atts);
      }
      else
      {                            /* copy from database to file */
***************
*** 926,932 ****
                           errmsg("\"%s\" is a directory", filename)));
              }
          }
!         CopyTo(rel, attnumlist, binary, oids, delim, null_print);
      }

      if (!pipe)
--- 1099,1106 ----
                           errmsg("\"%s\" is a directory", filename)));
              }
          }
!         CopyTo(rel, attnumlist, binary, oids, delim, null_print, csv_mode,
!                 quote, escape, force_atts);
      }

      if (!pipe)
***************
*** 958,964 ****
   */
  static void
  CopyTo(Relation rel, List *attnumlist, bool binary, bool oids,
!        char *delim, char *null_print)
  {
      HeapTuple    tuple;
      TupleDesc    tupDesc;
--- 1132,1139 ----
   */
  static void
  CopyTo(Relation rel, List *attnumlist, bool binary, bool oids,
!        char *delim, char *null_print, bool csv_mode, char *quote,
!        char *escape, List *force_atts)
  {
      HeapTuple    tuple;
      TupleDesc    tupDesc;
***************
*** 967,972 ****
--- 1142,1148 ----
      int            attr_count;
      Form_pg_attribute *attr;
      FmgrInfo   *out_functions;
+     bool       *force_quote;
      Oid           *elements;
      bool       *isvarlena;
      char       *string;
***************
*** 988,998 ****
      out_functions = (FmgrInfo *) palloc((num_phys_attrs + 1) * sizeof(FmgrInfo));
      elements = (Oid *) palloc((num_phys_attrs + 1) * sizeof(Oid));
      isvarlena = (bool *) palloc((num_phys_attrs + 1) * sizeof(bool));
      foreach(cur, attnumlist)
      {
          int            attnum = lfirsti(cur);
          Oid            out_func_oid;
!
          if (binary)
              getTypeBinaryOutputInfo(attr[attnum - 1]->atttypid,
                                      &out_func_oid, &elements[attnum - 1],
--- 1164,1175 ----
      out_functions = (FmgrInfo *) palloc((num_phys_attrs + 1) * sizeof(FmgrInfo));
      elements = (Oid *) palloc((num_phys_attrs + 1) * sizeof(Oid));
      isvarlena = (bool *) palloc((num_phys_attrs + 1) * sizeof(bool));
+     force_quote = (bool *) palloc((num_phys_attrs + 1) * sizeof(bool));
      foreach(cur, attnumlist)
      {
          int            attnum = lfirsti(cur);
          Oid            out_func_oid;
!
          if (binary)
              getTypeBinaryOutputInfo(attr[attnum - 1]->atttypid,
                                      &out_func_oid, &elements[attnum - 1],
***************
*** 1002,1007 ****
--- 1179,1189 ----
                                &out_func_oid, &elements[attnum - 1],
                                &isvarlena[attnum - 1]);
          fmgr_info(out_func_oid, &out_functions[attnum - 1]);
+
+         if (intMember(attnum, force_atts))
+             force_quote[attnum - 1] = true;
+         else
+             force_quote[attnum - 1] = false;
      }

      /*
***************
*** 1051,1057 ****
      while ((tuple = heap_getnext(scandesc, ForwardScanDirection)) != NULL)
      {
          bool        need_delim = false;
-
          CHECK_FOR_INTERRUPTS();

          MemoryContextReset(mycontext);
--- 1233,1238 ----
***************
*** 1113,1119 ****
                                                             value,
                                    ObjectIdGetDatum(elements[attnum - 1]),
                              Int32GetDatum(attr[attnum - 1]->atttypmod)));
!                     CopyAttributeOut(string, delim);
                  }
                  else
                  {
--- 1294,1308 ----
                                                             value,
                                    ObjectIdGetDatum(elements[attnum - 1]),
                              Int32GetDatum(attr[attnum - 1]->atttypmod)));
!                     if (csv_mode)
!                     {
!                         CopyAttributeOutCSV(string, delim, quote, escape,
!                                             (strcmp(string, null_print) == 0 ||
!                                             force_quote[attnum - 1]));
!                     }
!                     else
!                         CopyAttributeOut(string, delim);
!
                  }
                  else
                  {
***************
*** 1148,1153 ****
--- 1337,1343 ----
      pfree(out_functions);
      pfree(elements);
      pfree(isvarlena);
+     pfree(force_quote);
  }


***************
*** 1243,1249 ****
   */
  static void
  CopyFrom(Relation rel, List *attnumlist, bool binary, bool oids,
!          char *delim, char *null_print)
  {
      HeapTuple    tuple;
      TupleDesc    tupDesc;
--- 1433,1440 ----
   */
  static void
  CopyFrom(Relation rel, List *attnumlist, bool binary, bool oids,
!          char *delim, char *null_print, bool csv_mode, char *quote,
!          char *escape, List *literal_atts)
  {
      HeapTuple    tuple;
      TupleDesc    tupDesc;
***************
*** 1256,1264 ****
      Oid           *elements;
      Oid            oid_in_element;
      ExprState **constraintexprs;
      bool        hasConstraints = false;
-     int            i;
      int            attnum;
      List       *cur;
      Oid            in_func_oid;
      Datum       *values;
--- 1447,1456 ----
      Oid           *elements;
      Oid            oid_in_element;
      ExprState **constraintexprs;
+     bool       *literal_nullstr;
      bool        hasConstraints = false;
      int            attnum;
+     int            i;
      List       *cur;
      Oid            in_func_oid;
      Datum       *values;
***************
*** 1317,1322 ****
--- 1509,1515 ----
      defmap = (int *) palloc((num_phys_attrs + 1) * sizeof(int));
      defexprs = (ExprState **) palloc((num_phys_attrs + 1) * sizeof(ExprState *));
      constraintexprs = (ExprState **) palloc0((num_phys_attrs + 1) * sizeof(ExprState *));
+     literal_nullstr = (bool *) palloc((num_phys_attrs + 1) * sizeof(bool));

      for (attnum = 1; attnum <= num_phys_attrs; attnum++)
      {
***************
*** 1333,1338 ****
--- 1526,1536 ----
                               &in_func_oid, &elements[attnum - 1]);
          fmgr_info(in_func_oid, &in_functions[attnum - 1]);

+         if (intMember(attnum, literal_atts))
+             literal_nullstr[attnum - 1] = true;
+         else
+             literal_nullstr[attnum - 1] = false;
+
          /* Get default info if needed */
          if (!intMember(attnum, attnumlist))
          {
***************
*** 1389,1397 ****
      ExecBSInsertTriggers(estate, resultRelInfo);

      if (!binary)
-     {
          file_has_oids = oids;    /* must rely on user to tell us this... */
-     }
      else
      {
          /* Read and verify binary header */
--- 1587,1593 ----
***************
*** 1500,1505 ****
--- 1696,1702 ----

              if (file_has_oids)
              {
+                 /* can't be in CSV mode here */
                  string = CopyReadAttribute(delim, null_print,
                                             &result, &isnull);

***************
*** 1538,1551 ****
                               errmsg("missing data for column \"%s\"",
                                      NameStr(attr[m]->attname))));

!                 string = CopyReadAttribute(delim, null_print,
!                                            &result, &isnull);
!
!                 if (isnull)
                  {
!                     /* we read an SQL NULL, no need to do anything */
                  }
                  else
                  {
                      copy_attname = NameStr(attr[m]->attname);
                      values[m] = FunctionCall3(&in_functions[m],
--- 1735,1761 ----
                               errmsg("missing data for column \"%s\"",
                                      NameStr(attr[m]->attname))));

!                 if (csv_mode)
                  {
!                     string = CopyReadAttributeCSV(delim, null_print, quote,
!                                                   escape, &result, &isnull);
!                     if (result == UNTERMINATED_FIELD)
!                         ereport(ERROR,
!                                 (errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
!                                  errmsg("unterminated CSV quoted field")));
                  }
                  else
+                     string = CopyReadAttribute(delim, null_print,
+                                                &result, &isnull);
+
+                 if (csv_mode && isnull && literal_nullstr[m])
+                 {
+                     string = null_print;    /* set to NULL string */
+                     isnull = false;
+                 }
+
+                  /* we read an SQL NULL, no need to do anything */
+                 if (!isnull)
                  {
                      copy_attname = NameStr(attr[m]->attname);
                      values[m] = FunctionCall3(&in_functions[m],
***************
*** 1732,1742 ****
      pfree(values);
      pfree(nulls);

!     if (!binary)
!     {
!         pfree(in_functions);
!         pfree(elements);
!     }

      ExecDropTupleTable(tupleTable, true);

--- 1942,1953 ----
      pfree(values);
      pfree(nulls);

!     pfree(in_functions);
!     pfree(elements);
!     pfree(defmap);
!     pfree(defexprs);
!     pfree(constraintexprs);
!     pfree(literal_nullstr);

      ExecDropTupleTable(tupleTable, true);

***************
*** 2070,2075 ****
--- 2281,2432 ----
      return attribute_buf.data;
  }

+
+ /*
+  * Read the value of a single attribute in CSV mode,
+  * performing de-escaping as needed. Escaping does not follow the normal
+  * PostgreSQL text mode, but instead "standard" (i.e. common) CSV usage.
+  *
+  * Quoted fields can span lines, in which case the line end is embedded
+  * in the returned string.
+  *
+  * null_print is the null marker string.  Note that this is compared to
+  * the pre-de-escaped input string (thus if it is quoted it is not a NULL).
+  *
+  * *result is set to indicate what terminated the read:
+  *        NORMAL_ATTR:    column delimiter
+  *        END_OF_LINE:    end of line
+  *      UNTERMINATED_FIELD no quote detected at end of a quoted field
+  *
+  * In any case, the string read up to the terminator (or end of file)
+  * is returned.
+  *
+  * *isnull is set true or false depending on whether the input matched
+  * the null marker.  Note that the caller cannot check this since the
+  * returned string will be the post-de-escaping equivalent, which may
+  * look the same as some valid data string.
+  *----------
+  */
+
+ static char *
+ CopyReadAttributeCSV(const char *delim, const char *null_print, char *quote,
+                      char *escape, CopyReadResult *result, bool *isnull)
+ {
+     char        delimc = delim[0];
+     char        quotec = quote[0];
+     char        escapec = escape[0];
+     char        c;
+     int            start_cursor = line_buf.cursor;
+     int            end_cursor = start_cursor;
+     int            input_len;
+     bool        in_quote = false;
+     bool        saw_quote = false;
+
+     /* reset attribute_buf to empty */
+     attribute_buf.len = 0;
+     attribute_buf.data[0] = '\0';
+
+     /* set default status */
+     *result = END_OF_LINE;
+
+     for (;;)
+     {
+         /* handle multiline quoted fields */
+         if (in_quote && line_buf.cursor >= line_buf.len)
+         {
+             bool done;
+
+             switch(eol_type)
+             {
+                 case EOL_NL:
+                     appendStringInfoString(&attribute_buf,"\n");
+                     break;
+                 case EOL_CR:
+                     appendStringInfoString(&attribute_buf,"\r");
+                     break;
+                 case EOL_CRNL:
+                     appendStringInfoString(&attribute_buf,"\r\n");
+                     break;
+                 case EOL_UNKNOWN:
+                     /* shouldn't happen - just keep going */
+                     break;
+             }
+
+             copy_lineno++;
+             done = CopyReadLine();
+             if (done && line_buf.len == 0)
+                 break;
+             start_cursor = line_buf.cursor;
+         }
+
+         end_cursor = line_buf.cursor;
+         if (line_buf.cursor >= line_buf.len)
+             break;
+         c = line_buf.data[line_buf.cursor++];
+         /*
+          * unquoted field delimiter
+          */
+         if (!in_quote && c == delimc)
+         {
+             *result = NORMAL_ATTR;
+             break;
+         }
+         /*
+          * start of quoted field (or part of field)
+          */
+         if (!in_quote && c == quotec)
+         {
+             saw_quote = true;
+             in_quote = true;
+             continue;
+         }
+         /*
+          * escape within a quoted field
+          */
+         if (in_quote && c == escapec)
+         {
+             /*
+              * peek at the next char if available, and escape it if it
+              * is an escape char or a quote char
+              */
+             if (line_buf.cursor <= line_buf.len)
+             {
+                 char nextc = line_buf.data[line_buf.cursor];
+                 if (nextc == escapec || nextc == quotec)
+                 {
+                     appendStringInfoCharMacro(&attribute_buf, nextc);
+                     line_buf.cursor++;
+                     continue;
+                 }
+             }
+         }
+         /*
+          * end of quoted field.
+          * Must do this test after testing for escape in case quote char
+          * and escape char are the same (which is the common case).
+          */
+         if (in_quote && c == quotec)
+         {
+             in_quote = false;
+             continue;
+         }
+         appendStringInfoCharMacro(&attribute_buf, c);
+     }
+
+     if (in_quote)
+         *result = UNTERMINATED_FIELD;
+
+     /* check whether raw input matched null marker */
+     input_len = end_cursor - start_cursor;
+     if (!saw_quote && input_len == strlen(null_print) &&
+         strncmp(&line_buf.data[start_cursor], null_print, input_len) == 0)
+         *isnull = true;
+     else
+         *isnull = false;
+
+     return attribute_buf.data;
+ }
+
  /*
   * Read a binary attribute
   */
***************
*** 2193,2198 ****
--- 2550,2622 ----
                  break;
          }
      }
+ }
+
+ /*
+  * Send CSV representation of one attribute, with conversion and
+  * CSV type escaping
+  */
+ static void
+ CopyAttributeOutCSV(char *server_string, char *delim, char *quote,
+                     char *escape, bool force_quote)
+ {
+     char       *string;
+     char        c;
+     char        delimc = delim[0];
+     char        quotec = quote[0];
+      char        escapec = escape[0];
+     bool        need_quote = force_quote;
+     char        *test_string;
+     bool        same_encoding;
+     int            mblen;
+     int            i;
+
+     same_encoding = (server_encoding == client_encoding);
+     if (!same_encoding)
+         string = (char *) pg_server_to_client((unsigned char *) server_string,
+                                               strlen(server_string));
+     else
+         string = server_string;
+
+     /* have to run through the string twice,
+      * first time to see if it needs quoting, second to actually send it
+      */
+
+     for(test_string = string;
+         !need_quote && (c = *test_string) != '\0';
+         test_string += mblen)
+     {
+         if (c == delimc || c == quotec || c == '\n' || c == '\r')
+             need_quote = true;
+         if (!same_encoding)
+             mblen = pg_encoding_mblen(client_encoding, test_string);
+         else
+             mblen = 1;
+     }
+
+     if (need_quote)
+         CopySendChar(quotec);
+
+     for (; (c = *string) != '\0'; string += mblen)
+     {
+         if (c == quotec || c == escapec)
+             CopySendChar(escapec);
+
+         CopySendChar(c);
+
+         if (!same_encoding)
+         {
+             /* send additional bytes of the char, if any */
+             mblen = pg_encoding_mblen(client_encoding, string);
+             for (i = 1; i < mblen; i++)
+                 CopySendChar(string[i]);
+         }
+         else
+             mblen = 1;
+     }
+
+     if (need_quote)
+         CopySendChar(quotec);
  }

  /*
Index: src/backend/parser/gram.y
===================================================================
RCS file: /cvsroot/pgsql-server/src/backend/parser/gram.y,v
retrieving revision 2.450
diff -c -c -r2.450 gram.y
*** src/backend/parser/gram.y    5 Apr 2004 03:07:26 -0000    2.450
--- src/backend/parser/gram.y    19 Apr 2004 17:07:50 -0000
***************
*** 343,349 ****
      CHARACTER CHARACTERISTICS CHECK CHECKPOINT CLASS CLOSE
      CLUSTER COALESCE COLLATE COLUMN COMMENT COMMIT
      COMMITTED CONSTRAINT CONSTRAINTS CONVERSION_P CONVERT COPY CREATE CREATEDB
!     CREATEUSER CROSS CURRENT_DATE CURRENT_TIME
      CURRENT_TIMESTAMP CURRENT_USER CURSOR CYCLE

      DATABASE DAY_P DEALLOCATE DEC DECIMAL_P DECLARE DEFAULT DEFAULTS
--- 343,349 ----
      CHARACTER CHARACTERISTICS CHECK CHECKPOINT CLASS CLOSE
      CLUSTER COALESCE COLLATE COLUMN COMMENT COMMIT
      COMMITTED CONSTRAINT CONSTRAINTS CONVERSION_P CONVERT COPY CREATE CREATEDB
!     CREATEUSER CROSS CSV CURRENT_DATE CURRENT_TIME
      CURRENT_TIMESTAMP CURRENT_USER CURSOR CYCLE

      DATABASE DAY_P DEALLOCATE DEC DECIMAL_P DECLARE DEFAULT DEFAULTS
***************
*** 370,376 ****
      KEY

      LANCOMPILER LANGUAGE LARGE_P LAST_P LEADING LEFT LEVEL LIKE LIMIT
!     LISTEN LOAD LOCAL LOCALTIME LOCALTIMESTAMP LOCATION
      LOCK_P

      MATCH MAXVALUE MINUTE_P MINVALUE MODE MONTH_P MOVE
--- 370,376 ----
      KEY

      LANCOMPILER LANGUAGE LARGE_P LAST_P LEADING LEFT LEVEL LIKE LIMIT
!     LISTEN LITERAL LOAD LOCAL LOCALTIME LOCALTIMESTAMP LOCATION
      LOCK_P

      MATCH MAXVALUE MINUTE_P MINVALUE MODE MONTH_P MOVE
***************
*** 386,391 ****
--- 386,393 ----
      PRECISION PRESERVE PREPARE PRIMARY
      PRIOR PRIVILEGES PROCEDURAL PROCEDURE

+     QUOTE
+
      READ REAL RECHECK REFERENCES REINDEX RELATIVE_P RENAME REPEATABLE REPLACE
      RESET RESTART RESTRICT RETURNS REVOKE RIGHT ROLLBACK ROW ROWS
      RULE
***************
*** 1360,1365 ****
--- 1362,1387 ----
                  {
                      $$ = makeDefElem("null", (Node *)makeString($3));
                  }
+             | CSV
+                 {
+                     $$ = makeDefElem("csv", (Node *)makeInteger(TRUE));
+                 }
+             | QUOTE opt_as Sconst
+                 {
+                     $$ = makeDefElem("quote", (Node *)makeString($3));
+                 }
+             | ESCAPE opt_as Sconst
+                 {
+                     $$ = makeDefElem("escape", (Node *)makeString($3));
+                 }
+             | FORCE columnList
+                 {
+                     $$ = makeDefElem("force", (Node *)$2);
+                 }
+             | LITERAL columnList
+                 {
+                     $$ = makeDefElem("literal", (Node *)$2);
+                 }
          ;

  /* The following exist for backward compatibility */
***************
*** 7420,7425 ****
--- 7442,7448 ----
              | COPY
              | CREATEDB
              | CREATEUSER
+             | CSV
              | CURSOR
              | CYCLE
              | DATABASE
***************
*** 7473,7478 ****
--- 7496,7502 ----
              | LAST_P
              | LEVEL
              | LISTEN
+             | LITERAL
              | LOAD
              | LOCAL
              | LOCATION
***************
*** 7507,7512 ****
--- 7531,7537 ----
              | PRIVILEGES
              | PROCEDURAL
              | PROCEDURE
+             | QUOTE
              | READ
              | RECHECK
              | REINDEX
Index: src/backend/parser/keywords.c
===================================================================
RCS file: /cvsroot/pgsql-server/src/backend/parser/keywords.c,v
retrieving revision 1.147
diff -c -c -r1.147 keywords.c
*** src/backend/parser/keywords.c    11 Mar 2004 01:47:40 -0000    1.147
--- src/backend/parser/keywords.c    19 Apr 2004 17:07:50 -0000
***************
*** 90,95 ****
--- 90,96 ----
      {"createdb", CREATEDB},
      {"createuser", CREATEUSER},
      {"cross", CROSS},
+     {"csv", CSV},
      {"current_date", CURRENT_DATE},
      {"current_time", CURRENT_TIME},
      {"current_timestamp", CURRENT_TIMESTAMP},
***************
*** 186,191 ****
--- 187,193 ----
      {"like", LIKE},
      {"limit", LIMIT},
      {"listen", LISTEN},
+     {"literal", LITERAL},
      {"load", LOAD},
      {"local", LOCAL},
      {"localtime", LOCALTIME},
***************
*** 248,253 ****
--- 250,256 ----
      {"privileges", PRIVILEGES},
      {"procedural", PROCEDURAL},
      {"procedure", PROCEDURE},
+     {"quote", QUOTE},
      {"read", READ},
      {"real", REAL},
      {"recheck", RECHECK},
Index: src/backend/tcop/fastpath.c
===================================================================
RCS file: /cvsroot/pgsql-server/src/backend/tcop/fastpath.c,v
retrieving revision 1.71
diff -c -c -r1.71 fastpath.c
*** src/backend/tcop/fastpath.c    7 Jan 2004 18:56:27 -0000    1.71
--- src/backend/tcop/fastpath.c    19 Apr 2004 17:07:51 -0000
***************
*** 154,161 ****
              bool        typisvarlena;
              char       *outputstr;

!             getTypeOutputInfo(rettype,
!                               &typoutput, &typelem, &typisvarlena);
              outputstr = DatumGetCString(OidFunctionCall3(typoutput,
                                                           retval,
                                                 ObjectIdGetDatum(typelem),
--- 154,160 ----
              bool        typisvarlena;
              char       *outputstr;

!             getTypeOutputInfo(rettype, &typoutput, &typelem, &typisvarlena);
              outputstr = DatumGetCString(OidFunctionCall3(typoutput,
                                                           retval,
                                                 ObjectIdGetDatum(typelem),
Index: src/bin/psql/copy.c
===================================================================
RCS file: /cvsroot/pgsql-server/src/bin/psql/copy.c,v
retrieving revision 1.43
diff -c -c -r1.43 copy.c
*** src/bin/psql/copy.c    12 Apr 2004 15:58:52 -0000    1.43
--- src/bin/psql/copy.c    19 Apr 2004 17:07:51 -0000
***************
*** 66,73 ****
--- 66,78 ----
      bool        from;
      bool        binary;
      bool        oids;
+     bool        csv_mode;
      char       *delim;
      char       *null;
+     char       *quote;
+     char       *escape;
+     char        *force_list;
+     char        *literal_list;
  };


***************
*** 81,86 ****
--- 86,95 ----
      free(ptr->file);
      free(ptr->delim);
      free(ptr->null);
+     free(ptr->quote);
+     free(ptr->escape);
+     free(ptr->force_list);
+     free(ptr->literal_list);
      free(ptr);
  }

***************
*** 272,282 ****
--- 281,299 ----

          while (token)
          {
+             bool fetch_next;
+
+             fetch_next = true;
+
              /* someday allow BINARY here */
              if (strcasecmp(token, "oids") == 0)
              {
                  result->oids = true;
              }
+             else if (strcasecmp(token, "csv") == 0)
+             {
+                 result->csv_mode = true;
+             }
              else if (strcasecmp(token, "delimiter") == 0)
              {
                  token = strtokx(NULL, whitespace, NULL, "'",
***************
*** 301,311 ****
                  else
                      goto error;
              }
              else
                  goto error;

!             token = strtokx(NULL, whitespace, NULL, NULL,
!                             0, false, pset.encoding);
          }
      }

--- 318,395 ----
                  else
                      goto error;
              }
+             else if (strcasecmp(token, "quote") == 0)
+             {
+                 token = strtokx(NULL, whitespace, NULL, "'",
+                                 '\\', false, pset.encoding);
+                 if (token && strcasecmp(token, "as") == 0)
+                     token = strtokx(NULL, whitespace, NULL, "'",
+                                     '\\', false, pset.encoding);
+                 if (token)
+                     result->quote = pg_strdup(token);
+                 else
+                     goto error;
+             }
+             else if (strcasecmp(token, "escape") == 0)
+             {
+                 token = strtokx(NULL, whitespace, NULL, "'",
+                                 '\\', false, pset.encoding);
+                 if (token && strcasecmp(token, "as") == 0)
+                     token = strtokx(NULL, whitespace, NULL, "'",
+                                     '\\', false, pset.encoding);
+                 if (token)
+                     result->escape = pg_strdup(token);
+                 else
+                     goto error;
+             }
+             else if (strcasecmp(token, "force") == 0)
+             {
+                 /* handle column list */
+                 fetch_next = false;
+                 for (;;)
+                 {
+                     token = strtokx(NULL, whitespace, ",", "\"",
+                                     0, false, pset.encoding);
+                     if (!token || strchr(",", token[0]))
+                         goto error;
+                     if (!result->force_list)
+                         result->force_list = pg_strdup(token);
+                     else
+                         xstrcat(&result->force_list, token);
+                     token = strtokx(NULL, whitespace, ",", "\"",
+                                     0, false, pset.encoding);
+                     if (!token || token[0] != ',')
+                         break;
+                     xstrcat(&result->force_list, token);
+                 }
+             }
+             else if (strcasecmp(token, "literal") == 0)
+             {
+                 /* handle column list */
+                 fetch_next = false;
+                 for (;;)
+                 {
+                     token = strtokx(NULL, whitespace, ",", "\"",
+                                     0, false, pset.encoding);
+                     if (!token || strchr(",", token[0]))
+                         goto error;
+                     if (!result->literal_list)
+                         result->literal_list = pg_strdup(token);
+                     else
+                         xstrcat(&result->literal_list, token);
+                     token = strtokx(NULL, whitespace, ",", "\"",
+                                     0, false, pset.encoding);
+                     if (!token || token[0] != ',')
+                         break;
+                     xstrcat(&result->literal_list, token);
+                 }
+             }
              else
                  goto error;

!             if (fetch_next)
!                 token = strtokx(NULL, whitespace, NULL, NULL,
!                                 0, false, pset.encoding);
          }
      }

***************
*** 340,346 ****
      PGresult   *result;
      bool        success;
      struct stat st;
!
      /* parse options */
      options = parse_slash_copy(args);

--- 424,430 ----
      PGresult   *result;
      bool        success;
      struct stat st;
!
      /* parse options */
      options = parse_slash_copy(args);

***************
*** 379,390 ****
--- 463,506 ----
                                options->delim);
      }

+     /* There is no backward-compatible CSV syntax */
      if (options->null)
      {
          if (options->null[0] == '\'')
              appendPQExpBuffer(&query, " WITH NULL AS %s", options->null);
          else
              appendPQExpBuffer(&query, " WITH NULL AS '%s'", options->null);
+     }
+
+     if (options->csv_mode)
+     {
+         appendPQExpBuffer(&query, " CSV");
+     }
+
+     if (options->quote)
+     {
+         if (options->quote[0] == '\'')
+             appendPQExpBuffer(&query, " QUOTE AS %s", options->quote);
+         else
+             appendPQExpBuffer(&query, " QUOTE AS '%s'", options->quote);
+     }
+
+     if (options->escape)
+     {
+         if (options->escape[0] == '\'')
+             appendPQExpBuffer(&query, " ESCAPE AS %s", options->escape);
+         else
+             appendPQExpBuffer(&query, " ESCAPE AS '%s'", options->escape);
+     }
+
+     if (options->force_list)
+     {
+         appendPQExpBuffer(&query, " FORCE %s", options->force_list);
+     }
+
+     if (options->literal_list)
+     {
+         appendPQExpBuffer(&query, " LITERAL %s", options->literal_list);
      }

      if (options->from)

Re: CSV patch applied

From

Peter Eisentraut

Date:

20 April 2004, 07:05:34

Bruce Momjian wrote:
> I have applied the attached patch that complete TODO item:
>
>         o -Allow dump/load of CSV format
>
> This adds new keywords to COPY and \copy:
>
>         CSV - enable CSV mode
>         QUOTE - specify quote character
>         ESCAPE - specify escape character
>         FORCE - force quoting of specified columns

FORCE QUOTE

>         LITERAL - prevent NULL checks for specific columns

NO NULL CHECK

> If folks have better ideas for naming of those keywords, I am all
> ears!

Re: CSV patch applied

From

Andrew Dunstan

Date:

20 April 2004, 10:39:00

Peter Eisentraut wrote:

>Bruce Momjian wrote:
>
>
>>I have applied the attached patch that complete TODO item:
>>
>>        o -Allow dump/load of CSV format
>>
>>This adds new keywords to COPY and \copy:
>>
>>        CSV - enable CSV mode
>>        QUOTE - specify quote character
>>        ESCAPE - specify escape character
>>        FORCE - force quoting of specified columns
>>
>>
>
>FORCE QUOTE
>
>
>
>>        LITERAL - prevent NULL checks for specific columns
>>
>>
>
>NO NULL CHECK
>
>
>
>>If folks have better ideas for naming of those keywords, I am all
>>ears!
>>
>>

Bruce and I tossed this around quite a bit.

The problem is that using QUOTE or NULL in these phrases might look
confusing, e.g.
   COPY mytable TO 'mytable.csv' CSV QUOTE '"' FORCE QUOTE field1,field2;
   COPY mytable FROM 'mytable.csv' NULL '' CSV NO NULL CHECK field1,field2;

I also don't think NO NULL CHECK actually matches the use case for this
very well (and I'm dubious about LITERAL too).  They both describe the
actual behaviour, but not what you are trying to achieve. Essentially,
this would be used when you have a field with a NOT NULL constraint, but
the input CSV data stream has what would otherwise be considered nulls.
(COPY itself will never produce such a CSV, as non-null values that
resemble null are always quoted, but third party programs well might.)
So an alternative might be FORCE NOT NULL, but for the previous
consideration. Perhaps use of an optional preposition might make things
slightly clearer, e.g.:

   COPY mytable TO 'mytable.csv' CSV QUOTE '"' FORCE QUOTE  IN
field1,field2;
   COPY mytable FROM 'mytable.csv' NULL '' CSV FORCE NOT NULL IN
field1,field2;

But it does start to look a little too much like COBOL :-).

So I'm interested to see if there are any other inspirations people have.

cheers

andrew

Re: CSV patch applied

From

Fabien COELHO

Date:

20 April 2004, 12:32:13

> >>        CSV - enable CSV mode
> >>        QUOTE - specify quote character
> >>        ESCAPE - specify escape character
> >>        FORCE - force quoting of specified columns
> >
> >FORCE QUOTE

QUOTING col1,col2?
QUOTED col1,col2?
IN QUOTES col1,cold

> >>        LITERAL - prevent NULL checks for specific columns
> >
> >NO NULL CHECK

QUOTED (meaning 'as quoted')?

From a language design point of view, I think it may be better to stick
to one word versions?

--
Fabien Coelho - coelho@cri.ensmp.fr

Re: CSV patch applied

From

Bruce Momjian

Date:

20 April 2004, 12:42:54

Fabien COELHO wrote:
>
> > >>        CSV - enable CSV mode
> > >>        QUOTE - specify quote character
> > >>        ESCAPE - specify escape character
> > >>        FORCE - force quoting of specified columns
> > >
> > >FORCE QUOTE
>
> QUOTING col1,col2?
> QUOTED col1,col2?
> IN QUOTES col1,cold
>
> > >>        LITERAL - prevent NULL checks for specific columns
> > >
> > >NO NULL CHECK
>
> QUOTED (meaning 'as quoted')?
>
> From a language design point of view, I think it may be better to stick
> to one word versions?

I understand, and we tried that.  The confusion is that FORCE forces
quotes on non-NULL values, while LITERAL forces quotes on NULL values,
so while both deal with quoting, their functionality is different for
input and output.  They same single keyword just seemed confusing.

This is why the two-keyword idea sounds good to me --- we use the word
QUOTE (already a keyword), and then say FORCE for output, and NULL for
input:

    FORCE QUOTE
    QUOTE NULL

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073