COPY for CSV documentation - Mailing list pgsql-patches
| From | Andrew Dunstan |
|---|---|
| Subject | COPY for CSV documentation |
| Date | |
| Msg-id | 40781FB7.3090103@dunslane.net Whole thread Raw |
| Responses |
Re: COPY for CSV documentation
|
| List | pgsql-patches |
Attached is a patch with documentation for the CSV mode of COPY patch
submitted yesterday.
cheers
andrew
Index: doc/src/sgml/ref/copy.sgml
===================================================================
RCS file: /projects/cvsroot/pgsql-server/doc/src/sgml/ref/copy.sgml,v
retrieving revision 1.55
diff -c -r1.55 copy.sgml
*** doc/src/sgml/ref/copy.sgml 13 Dec 2003 23:59:07 -0000 1.55
--- doc/src/sgml/ref/copy.sgml 10 Apr 2004 16:22:23 -0000
***************
*** 136,142 ****
<para>
Specifies copying the OID for each row. (An error is raised if
<literal>OIDS</literal> is specified for a table that does not
! have OIDs.)
</para>
</listitem>
</varlistentry>
--- 136,143 ----
<para>
Specifies copying the OID for each row. (An error is raised if
<literal>OIDS</literal> is specified for a table that does not
! have OIDs, or if CSV mode is selected by specifying
! <literal>DELIMITER</literal> longer then one character.)
</para>
</listitem>
</varlistentry>
***************
*** 145,152 ****
<term><replaceable class="parameter">delimiter</replaceable></term>
<listitem>
<para>
! The single character that separates columns within each row
! (line) of the file. The default is a tab character.
</para>
</listitem>
</varlistentry>
--- 146,162 ----
<term><replaceable class="parameter">delimiter</replaceable></term>
<listitem>
<para>
! A one- to three-character string containing characters used in importing
! or exporting Text or Comma Separated Variable (CSV) files.
! With only one character, Text file format is used.
! If there are two or three characters, CSV format is used.
! The first character for either file format is the delimiter character
! used between values for a single row.
! The second character is the CSV quote character,
! and the third character, if present, is the CSV escape character
! used inside quoted values, and defaults to the quote character.
! The default for this parameter is a single tab character. (Thus TEXT
! file mode is the default.)
</para>
</listitem>
</varlistentry>
***************
*** 156,163 ****
<listitem>
<para>
The string that represents a null value. The default is
! <literal>\N</literal> (backslash-N). You might prefer an empty
! string, for example.
</para>
<note>
--- 166,174 ----
<listitem>
<para>
The string that represents a null value. The default is
! <literal>\N</literal> (backslash-N), unless a CSV is being processed,
! in which case it is an empty string. You might prefer an empty string
! in any case.
</para>
<note>
***************
*** 168,173 ****
--- 179,197 ----
<command>COPY TO</command>.
</para>
</note>
+
+ <note>
+ <para>
+ If you don't want anything used as null when using
+ <command>COPY FROM</command>, you can specify some value that is very
+ unlikely to appear in the file, such as <literal>frobnitz</literal> or
+ <literal>d5f4074b254c76cd8ae37bf1731f4aed</literal> (which is
+ <literal>md5('frobnitz')</literal>). This could be especially useful
+ when importing a CSV file into a table with <literal>NOT NULL</>
+ columns.
+ </para>
+ </note>
+
</listitem>
</varlistentry>
</variablelist>
***************
*** 252,259 ****
<title>Text Format</title>
<para>
! When <command>COPY</command> is used without the <literal>BINARY</literal> option,
! the data read or written is a text file with one line per table row.
Columns in a row are separated by the delimiter character.
The column values themselves are strings generated by the
output function, or acceptable to the input function, of each
--- 276,285 ----
<title>Text Format</title>
<para>
! When <command>COPY</command> is used without the <literal>BINARY</literal>
! option, the data read or written is a text file with one line per table
! row unless <literal>DELIMITER</literal> is longer than one character,
! in which case CSV format is used.
Columns in a row are separated by the delimiter character.
The column values themselves are strings generated by the
output function, or acceptable to the input function, of each
***************
*** 380,385 ****
--- 406,459 ----
</refsect2>
<refsect2>
+ <title>CSV format</title>
+
+ <para>
+ This format is used for importing from and exporting to the Comma
+ Separated Variable (CSV) file format used by many other programs,
+ such as spreadsheets. Instead of escaping in
+ <productname>PostgreSQL</productname>'s standard text mode, it produces
+ and recognises the common CSV escaping mechanism.
+ </para>
+
+ <para>
+ The values in each record are separated by the delimiter character, which
+ is the first character in the <literal>DELIMITER</literal> parameter.
+ If the value contains the delimiter character, the quote character (the
+ second character in the <literal>DELIMITER</literal> parameter), or a
+ carriage return or line feed character, then the whole value is prefixed
+ and suffixed by the quote character, and any occurrence within the value
+ of a quote character or the escape character (the third character in the
+ <literal>DELIMITER</literal> parameter if present, or else just the quote
+ character itself) is preceded by the escape character.
+ </para>
+
+ <para>
+ In the most common cases, simply specifying <literal>DELIMITER ',"'</>
+ or <literal>DELIMITER ',\''</> should suffice, both for
+ <command>COPY FROM</> and <command>COPY TO</>.
+ </para>
+
+ <note>
+ <para>
+ CSV mode will both recognise and produce CSV files with quoted values
+ containing embedded carriage returns and line feeds. Thus the files are
+ not strictly one line per table row, as TEXT files are.
+ </para>
+ </note>
+
+ <note>
+ <para>
+ Many programs produce strange and occasionally perverse CSV files, and
+ the file format is a convention rather than a standard. Thus you might
+ encounter some files that cannot be imported using this mechanism, and
+ you might produce files that other programs fail to recognise properly.
+ </para>
+ </note>
+
+ </refsect2>
+
+ <refsect2>
<title>Binary Format</title>
<para>
***************
*** 532,537 ****
--- 606,620 ----
using the vertical bar (<literal>|</literal>) as the field delimiter:
<programlisting>
COPY country TO STDOUT WITH DELIMITER '|';
+ </programlisting>
+ </para>
+
+ <para>
+ To do the same thing but instead produce a standard CSV file, using
+ <literal>,</> as the delimiter and <literal>"</> as both the quote and
+ escape characters:
+ <programlisting>
+ COPY country TO STDOUT WITH DELIMITER ',"';
</programlisting>
</para>
pgsql-patches by date: