COPY for CSV documentation - Mailing list pgsql-patches
From | Andrew Dunstan |
---|---|
Subject | COPY for CSV documentation |
Date | |
Msg-id | 40781FB7.3090103@dunslane.net Whole thread Raw |
Responses |
Re: COPY for CSV documentation
|
List | pgsql-patches |
Attached is a patch with documentation for the CSV mode of COPY patch submitted yesterday. cheers andrew Index: doc/src/sgml/ref/copy.sgml =================================================================== RCS file: /projects/cvsroot/pgsql-server/doc/src/sgml/ref/copy.sgml,v retrieving revision 1.55 diff -c -r1.55 copy.sgml *** doc/src/sgml/ref/copy.sgml 13 Dec 2003 23:59:07 -0000 1.55 --- doc/src/sgml/ref/copy.sgml 10 Apr 2004 16:22:23 -0000 *************** *** 136,142 **** <para> Specifies copying the OID for each row. (An error is raised if <literal>OIDS</literal> is specified for a table that does not ! have OIDs.) </para> </listitem> </varlistentry> --- 136,143 ---- <para> Specifies copying the OID for each row. (An error is raised if <literal>OIDS</literal> is specified for a table that does not ! have OIDs, or if CSV mode is selected by specifying ! <literal>DELIMITER</literal> longer then one character.) </para> </listitem> </varlistentry> *************** *** 145,152 **** <term><replaceable class="parameter">delimiter</replaceable></term> <listitem> <para> ! The single character that separates columns within each row ! (line) of the file. The default is a tab character. </para> </listitem> </varlistentry> --- 146,162 ---- <term><replaceable class="parameter">delimiter</replaceable></term> <listitem> <para> ! A one- to three-character string containing characters used in importing ! or exporting Text or Comma Separated Variable (CSV) files. ! With only one character, Text file format is used. ! If there are two or three characters, CSV format is used. ! The first character for either file format is the delimiter character ! used between values for a single row. ! The second character is the CSV quote character, ! and the third character, if present, is the CSV escape character ! used inside quoted values, and defaults to the quote character. ! The default for this parameter is a single tab character. (Thus TEXT ! file mode is the default.) </para> </listitem> </varlistentry> *************** *** 156,163 **** <listitem> <para> The string that represents a null value. The default is ! <literal>\N</literal> (backslash-N). You might prefer an empty ! string, for example. </para> <note> --- 166,174 ---- <listitem> <para> The string that represents a null value. The default is ! <literal>\N</literal> (backslash-N), unless a CSV is being processed, ! in which case it is an empty string. You might prefer an empty string ! in any case. </para> <note> *************** *** 168,173 **** --- 179,197 ---- <command>COPY TO</command>. </para> </note> + + <note> + <para> + If you don't want anything used as null when using + <command>COPY FROM</command>, you can specify some value that is very + unlikely to appear in the file, such as <literal>frobnitz</literal> or + <literal>d5f4074b254c76cd8ae37bf1731f4aed</literal> (which is + <literal>md5('frobnitz')</literal>). This could be especially useful + when importing a CSV file into a table with <literal>NOT NULL</> + columns. + </para> + </note> + </listitem> </varlistentry> </variablelist> *************** *** 252,259 **** <title>Text Format</title> <para> ! When <command>COPY</command> is used without the <literal>BINARY</literal> option, ! the data read or written is a text file with one line per table row. Columns in a row are separated by the delimiter character. The column values themselves are strings generated by the output function, or acceptable to the input function, of each --- 276,285 ---- <title>Text Format</title> <para> ! When <command>COPY</command> is used without the <literal>BINARY</literal> ! option, the data read or written is a text file with one line per table ! row unless <literal>DELIMITER</literal> is longer than one character, ! in which case CSV format is used. Columns in a row are separated by the delimiter character. The column values themselves are strings generated by the output function, or acceptable to the input function, of each *************** *** 380,385 **** --- 406,459 ---- </refsect2> <refsect2> + <title>CSV format</title> + + <para> + This format is used for importing from and exporting to the Comma + Separated Variable (CSV) file format used by many other programs, + such as spreadsheets. Instead of escaping in + <productname>PostgreSQL</productname>'s standard text mode, it produces + and recognises the common CSV escaping mechanism. + </para> + + <para> + The values in each record are separated by the delimiter character, which + is the first character in the <literal>DELIMITER</literal> parameter. + If the value contains the delimiter character, the quote character (the + second character in the <literal>DELIMITER</literal> parameter), or a + carriage return or line feed character, then the whole value is prefixed + and suffixed by the quote character, and any occurrence within the value + of a quote character or the escape character (the third character in the + <literal>DELIMITER</literal> parameter if present, or else just the quote + character itself) is preceded by the escape character. + </para> + + <para> + In the most common cases, simply specifying <literal>DELIMITER ',"'</> + or <literal>DELIMITER ',\''</> should suffice, both for + <command>COPY FROM</> and <command>COPY TO</>. + </para> + + <note> + <para> + CSV mode will both recognise and produce CSV files with quoted values + containing embedded carriage returns and line feeds. Thus the files are + not strictly one line per table row, as TEXT files are. + </para> + </note> + + <note> + <para> + Many programs produce strange and occasionally perverse CSV files, and + the file format is a convention rather than a standard. Thus you might + encounter some files that cannot be imported using this mechanism, and + you might produce files that other programs fail to recognise properly. + </para> + </note> + + </refsect2> + + <refsect2> <title>Binary Format</title> <para> *************** *** 532,537 **** --- 606,620 ---- using the vertical bar (<literal>|</literal>) as the field delimiter: <programlisting> COPY country TO STDOUT WITH DELIMITER '|'; + </programlisting> + </para> + + <para> + To do the same thing but instead produce a standard CSV file, using + <literal>,</> as the delimiter and <literal>"</> as both the quote and + escape characters: + <programlisting> + COPY country TO STDOUT WITH DELIMITER ',"'; </programlisting> </para>
pgsql-patches by date: