Re: patches for items from TODO list - Mailing list pgsql-hackers

From Andrew Dunstan
Subject Re: patches for items from TODO list
Msg-id 428E57BA.8000803@dunslane.net
In response to Re: patches for items from TODO list  ("Sergey Ten" <sergey@sourcelabs.com>)
List pgsql-hackers
I've been reviewing this patch and some of the following discussion.

First, PostgreSQL patches are usually sent as context diffs. I don't
object to unidiffs myself, but you should do what everybody else does.

Second, it's best not to combine features in one patch. The \x escape 
piece should be broken out.
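
For reference, the \x piece is small and self-contained: after `\x` it consumes one or two hex digits and yields a single byte. Below is a minimal standalone sketch of that logic, not the patch's code. The names `hex_digit_value` and `decode_hex_escape` are made up for illustration and do not exist in copy.c, and the "return 0 when no hex digit follows" convention is an assumption of this sketch.

```c
#include <ctype.h>
#include <stddef.h>

/* Value of one hex digit; the caller must have checked isxdigit() first. */
int
hex_digit_value(char hex)
{
    unsigned char uc = (unsigned char) hex;

    if (isdigit(uc))
        return uc - '0';
    return tolower(uc) - 'a' + 10;
}

/*
 * Decode the one or two hex digits that follow "\x", starting at src[*pos].
 * On success, stores the byte in *out, advances *pos past the digits, and
 * returns 1.  Returns 0 and leaves *pos unchanged if no hex digit follows.
 */
int
decode_hex_escape(const char *src, size_t len, size_t *pos, unsigned char *out)
{
    size_t      p = *pos;
    int         val;

    if (p >= len || !isxdigit((unsigned char) src[p]))
        return 0;

    val = hex_digit_value(src[p++]);

    /* An optional second digit extends the value to a full byte. */
    if (p < len && isxdigit((unsigned char) src[p]))
        val = (val << 4) + hex_digit_value(src[p++]);

    *out = (unsigned char) val;
    *pos = p;
    return 1;
}
```

With the regression input from the patch, `\x41` would decode to the byte 0x41 and `\x1` followed by a tab to the byte 0x01.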

I'm also rather worried about COPY producing output which it can't
itself parse. We can read our own binary, text and CSV formats, and I
think that's a useful validation tool. I know the TODO item only
mentions output, but I believe we should rethink that. In any case, if
it's valid for us to hand XML to other programs, why shouldn't we accept
it too? This is all about playing nicely in the playground.
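
To give a feel for what the input side would minimally involve, here is a rough sketch of the inverse of the patch's escaping: decoding the five predefined entities and single-byte numeric character references. It is an illustration under stated assumptions, not a proposed implementation. `xml_unescape` is a hypothetical name, it handles character data only (no tags or attributes), and it rejects anything above one byte.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Decode XML character data from "in" into "out" (capacity outsz bytes,
 * including the terminating NUL).  Returns the decoded length, or -1 on
 * malformed input or insufficient space.
 */
long
xml_unescape(const char *in, char *out, size_t outsz)
{
    static const struct { const char *name; char ch; } ents[] = {
        {"lt;", '<'}, {"gt;", '>'}, {"amp;", '&'},
        {"apos;", '\''}, {"quot;", '"'}
    };
    const size_t nents = sizeof(ents) / sizeof(ents[0]);
    size_t      n = 0;

    if (outsz == 0)
        return -1;

    while (*in)
    {
        char        c = *in++;

        if (c == '&' && *in == '#')
        {
            /* Numeric reference: "&#NN;" is decimal, "&#xNN;" is hex. */
            int         base = 10;
            int         ndigits = 0;
            unsigned long v = 0;

            in++;                       /* skip '#' */
            if (*in == 'x')
            {
                base = 16;
                in++;
            }
            for (;; in++)
            {
                unsigned char uc = (unsigned char) *in;
                int         d;

                if (isdigit(uc))
                    d = uc - '0';
                else if (base == 16 && isxdigit(uc))
                    d = tolower(uc) - 'a' + 10;
                else
                    break;
                v = v * base + d;
                ndigits++;
                if (v > 255)
                    return -1;          /* sketch handles single bytes only */
            }
            if (ndigits == 0 || *in != ';' || v == 0)
                return -1;
            in++;                       /* skip ';' */
            c = (char) v;
        }
        else if (c == '&')
        {
            size_t      i;

            for (i = 0; i < nents; i++)
            {
                size_t      len = strlen(ents[i].name);

                if (strncmp(in, ents[i].name, len) == 0)
                {
                    c = ents[i].ch;
                    in += len;
                    break;
                }
            }
            if (i == nents)
                return -1;              /* unknown or unterminated entity */
        }

        if (n + 1 >= outsz)
            return -1;                  /* no room for this byte plus NUL */
        out[n++] = c;
    }
    out[n] = '\0';
    return (long) n;
}
```

A real COPY FROM would of course also need to parse the element structure and the name and null attributes, which is where most of the work lies.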

One advantage of XML is that, being hierarchical, it can easily express 
nested composites (records, arrays) in a way that our present text and 
CSV formats really can't. But unless I missed something this patch 
doesn't in fact do anything to break out nested composites.
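
As a sketch of what breaking out a nested value could look like, the following writes an array-valued column as child elements instead of one flat string. Everything here is hypothetical: the `<elem>` element name, the helper names, and the fixed-size output buffer are assumptions made for illustration, and none of it comes from the patch.

```c
#include <stdio.h>
#include <string.h>

/* Append s to the NUL-terminated buf (capacity cap), truncating if needed. */
static void
append(char *buf, size_t cap, const char *s)
{
    size_t      used = strlen(buf);

    if (used < cap)
        snprintf(buf + used, cap - used, "%s", s);
}

/* Append s with the characters &, <, > and ' replaced by entities. */
static void
append_escaped(char *buf, size_t cap, const char *s)
{
    for (; *s; s++)
    {
        char        one[2] = {*s, '\0'};

        switch (*s)
        {
            case '&':  append(buf, cap, "&amp;");  break;
            case '<':  append(buf, cap, "&lt;");   break;
            case '>':  append(buf, cap, "&gt;");   break;
            case '\'': append(buf, cap, "&apos;"); break;
            default:   append(buf, cap, one);      break;
        }
    }
}

/*
 * Write one array-valued column as nested elements.  A NULL entry in
 * elems[] becomes an element marked null='y', mirroring the convention
 * the patch uses for columns.
 */
void
emit_array_column(char *buf, size_t cap, const char *colname,
                  const char *const *elems, size_t nelems)
{
    size_t      i;

    append(buf, cap, "<col name='");
    append_escaped(buf, cap, colname);
    append(buf, cap, "' null='n'>");
    for (i = 0; i < nelems; i++)
    {
        append(buf, cap, elems[i] ? "<elem null='n'>" : "<elem null='y'>");
        if (elems[i])
            append_escaped(buf, cap, elems[i]);
        append(buf, cap, "</elem>");
    }
    append(buf, cap, "</col>");
}
```

The same recursion would apply to records, with one child element per field. The point is only that the hierarchy comes for free once values are not flattened to text first.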

Finally, I don't know if there is a standard on this, or even a 
convention. What do other DBs do? I'm not keen on us just inventing our 
own XML dialect for something that should after all be most useful in 
data exchange.

Bottom line, much as I would like to see XML input/output, I think this 
needs lots more thought and discussion.

cheers

andrew

Sergey Ten wrote:

>Hello all,
>
>Thank you to all who replied for suggestions and help. Enclosed please find
>code changes for the following items:
>- Allow COPY to understand \x as a hex byte, and
>- Add XML output to COPY
>The changes include implementation of the features as well as modification
>of the copy regression test.
>
>After careful consideration we decided to
>- put the XML implementation in the backend and
>- use the XML format described below, with justification of our decision.
>
>The XML schema used by the COPY TO command was designed for ease of use and
>to avoid the problem of column names appearing in XML element names. 
>XML doesn't allow spaces and punctuation in element names but Postgres does
>allow these characters in column names; therefore, a direct mapping would be
>problematic.
>
>The solution selected places the column names into attribute fields where
>any special characters they contain can be properly escaped using XML
>entities.  An additional attribute is used to distinguish null fields from
>empty ones.
>
>The example below is taken from the test suite.  It demonstrates some basic
>XML escaping in row 2.  Row 3 demonstrates the difference between an empty
>string (in col1) and a null string (in col2).  If a field is null it will
>always be empty, but a field which is empty may or may not be null.
>Always check the value of the 'null' attribute to be sure when a field is
>truly null.
>
><?xml version='1.0'?>
><table>
>    <row>
>        <col name='col1' null='n'>Jackson, Sam</col>
>        <col name='col2' null='n'>\h</col>
>    </row>
>    <row>
>        <col name='col1' null='n'>It is &quot;perfect&quot;.</col>
>        <col name='col2' null='n'>&#09;</col>
>    </row>
>    <row>
>        <col name='col1' null='n'></col>
>        <col name='col2' null='y'></col>
>    </row>
></table>
>
>Please let us know about any concerns or objections the proposed change may
>cause.
>
>Best regards,
>Jason Lucas, Sergey Ten
>SourceLabs
>
>  
>
>>-----Original Message-----
>>From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]
>>Sent: Wednesday, May 11, 2005 7:11 PM
>>To: Sergey Ten
>>Cc: pgsql-hackers@postgresql.org; jason@sourcelabs.com
>>Subject: Re: [HACKERS] patches for items from TODO list
>>
>>Sergey Ten wrote:
>>    
>>
>>>Hello all,
>>>
>>>We would like to contribute to the Postgresql community by implementing
>>>the following items from the TODO list
>>>(http://developer.postgresql.org/todo.php):
>>>. Allow COPY to understand \x as a hex byte
>>>. Allow COPY to optionally include column headings in the first line
>>>. Add XML output to COPY
>>>
>>>The changes are straightforward and include implementation of the
>>>features as well as modification of the regression tests and
>>>documentation.
>>>
>>>Before sending a diff file with the changes, we would like to know if
>>>these features have been already implemented.
>>>      
>>>
>>Please check the web site version.  Someone has already implemented
>>"Allow COPY to optionally include column headings in the first line".
>>
>>As for XML, there has been discussion on where that should be done:
>>in the backend, libpq, or psql.  It will need discussion on hackers.  I
>>assume you have read the developer's FAQ too.
>>
>>--
>>  Bruce Momjian                        |  http://candle.pha.pa.us
>>  pgman@candle.pha.pa.us               |  (610) 359-1001
>>  +  If your life is a hard drive,     |  13 Roberts Road
>>  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
>>    
>>
>>------------------------------------------------------------------------
>>
>>Index: src/backend/commands/copy.c
>>===================================================================
>>RCS file: /projects/cvsroot/pgsql/src/backend/commands/copy.c,v
>>retrieving revision 1.244
>>diff -u -r1.244 copy.c
>>--- src/backend/commands/copy.c    7 May 2005 02:22:46 -0000    1.244
>>+++ src/backend/commands/copy.c    13 May 2005 22:21:00 -0000
>>@@ -84,6 +84,16 @@
>>     EOL_CRNL
>> } EolType;
>>
>>+/*
>>+ *    Represents the format of the file to be read or written
>>+ */
>>+typedef enum CopyFmt
>>+{
>>+    FMT_TXT,
>>+    FMT_BIN,
>>+    FMT_CSV,
>>+    FMT_XML
>>+} CopyFmt;
>> 
>> static const char BinarySignature[11] = "PGCOPY\n\377\r\n\0";
>> 
>>@@ -129,14 +139,14 @@
>> static bool line_buf_converted;
>> 
>> /* non-export function prototypes */
>>-static void DoCopyTo(Relation rel, List *attnumlist, bool binary, bool oids,
>>-         char *delim, char *null_print, bool csv_mode, char *quote,
>>+static void DoCopyTo(Relation rel, List *attnumlist, CopyFmt fmt, bool oids,
>>+         char *delim, char *null_print, char *quote,
>>          char *escape, List *force_quote_atts, bool header_line, bool fe_copy);
>>-static void CopyTo(Relation rel, List *attnumlist, bool binary, bool oids,
>>- char *delim, char *null_print, bool csv_mode, char *quote, char *escape,
>>+static void CopyTo(Relation rel, List *attnumlist, CopyFmt fmt, bool oids,
>>+ char *delim, char *null_print, char *quote, char *escape,
>>        List *force_quote_atts, bool header_line);
>>-static void CopyFrom(Relation rel, List *attnumlist, bool binary, bool oids,
>>- char *delim, char *null_print, bool csv_mode, char *quote, char *escape,
>>+static void CopyFrom(Relation rel, List *attnumlist, CopyFmt fmt, bool oids,
>>+ char *delim, char *null_print, char *quote, char *escape,
>>          List *force_notnull_atts, bool header_line);
>> static bool CopyReadLine(char * quote, char * escape);
>> static char *CopyReadAttribute(const char *delim, const char *null_print,
>>@@ -171,6 +181,11 @@
>> static void CopySendInt16(int16 val);
>> static int16 CopyGetInt16(void);
>> 
>>+static int GetDecimalFromHex(char hex);
>>+
>>+static void CopyAttributeOutXML (char *colname, char *string);
>>+static void CopySendStringXML(char *string);
>>+static char *CopyGetXMLEntity(char c, char *buf);
>> 
>> /*
>>  * Send copy start/stop messages for frontend copies.  These have changed
>>@@ -692,10 +707,9 @@
>>     List       *attnamelist = stmt->attlist;
>>     List       *attnumlist;
>>     bool        fe_copy = false;
>>-    bool        binary = false;
>>     bool        oids = false;
>>-    bool        csv_mode = false;
>>-    bool        header_line = false;
>>+    bool        header_line = false;
>>+    CopyFmt        fmt = FMT_TXT;
>>     char       *delim = NULL;
>>     char       *quote = NULL;
>>     char       *escape = NULL;
>>@@ -715,11 +729,11 @@
>> 
>>         if (strcmp(defel->defname, "binary") == 0)
>>         {
>>-            if (binary)
>>+            if (fmt != FMT_TXT)
>>                 ereport(ERROR,
>>                         (errcode(ERRCODE_SYNTAX_ERROR),
>>                          errmsg("conflicting or redundant options")));
>>-            binary = intVal(defel->arg);
>>+            fmt = FMT_BIN;
>>         }
>>         else if (strcmp(defel->defname, "oids") == 0)
>>         {
>>@@ -747,11 +761,19 @@
>>         }
>>         else if (strcmp(defel->defname, "csv") == 0)
>>         {
>>-            if (csv_mode)
>>+            if (fmt != FMT_TXT)
>>                 ereport(ERROR,
>>                         (errcode(ERRCODE_SYNTAX_ERROR),
>>                          errmsg("conflicting or redundant options")));
>>-            csv_mode = intVal(defel->arg);
>>+            fmt = FMT_CSV;
>>+        }
>>+        else if (strcmp(defel->defname, "xml") == 0)
>>+        {
>>+            if (fmt != FMT_TXT)
>>+                ereport(ERROR,
>>+                        (errcode(ERRCODE_SYNTAX_ERROR),
>>+                         errmsg("conflicting or redundant options")));
>>+            fmt = FMT_XML;
>>         }
>>         else if (strcmp(defel->defname, "header") == 0)
>>         {
>>@@ -798,29 +820,39 @@
>>                  defel->defname);
>>     }
>> 
>>-    if (binary && delim)
>>+    if (fmt == FMT_BIN && delim)
>>         ereport(ERROR,
>>                 (errcode(ERRCODE_SYNTAX_ERROR),
>>                  errmsg("cannot specify DELIMITER in BINARY mode")));
>> 
>>-    if (binary && csv_mode)
>>+    if (fmt == FMT_BIN && null_print)
>>         ereport(ERROR,
>>                 (errcode(ERRCODE_SYNTAX_ERROR),
>>-                 errmsg("cannot specify CSV in BINARY mode")));
>>+                 errmsg("cannot specify NULL in BINARY mode")));
>> 
>>-    if (binary && null_print)
>>+    if (fmt == FMT_XML && is_from)
>>         ereport(ERROR,
>>                 (errcode(ERRCODE_SYNTAX_ERROR),
>>-                 errmsg("cannot specify NULL in BINARY mode")));
>>+                 errmsg("XML mode is not available in COPY FROM")));
>>+
>>+    if (fmt == FMT_XML && delim)
>>+        ereport(ERROR,
>>+                (errcode(ERRCODE_SYNTAX_ERROR),
>>+                 errmsg("cannot specify DELIMITER in XML mode")));
>>+
>>+    if (fmt == FMT_XML && null_print)
>>+        ereport(ERROR,
>>+                (errcode(ERRCODE_SYNTAX_ERROR),
>>+                 errmsg("cannot specify NULL in XML mode")));
>> 
>>     /* Set defaults */
>>     if (!delim)
>>-        delim = csv_mode ? "," : "\t";
>>+        delim = (fmt == FMT_CSV) ? "," : "\t";
>> 
>>     if (!null_print)
>>-        null_print = csv_mode ? "" : "\\N";
>>+        null_print = (fmt == FMT_CSV) ? "" : "\\N";
>> 
>>-    if (csv_mode)
>>+    if (fmt == FMT_CSV)
>>     {
>>         if (!quote)
>>             quote = "\"";
>>@@ -835,35 +867,35 @@
>>                  errmsg("COPY delimiter must be a single character")));
>> 
>>       /* Check header */
>>-    if (!csv_mode && header_line)
>>+    if (fmt != FMT_CSV && header_line)
>>         ereport(ERROR,
>>                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>>                  errmsg("COPY HEADER available only in CSV mode")));
>> 
>>     /* Check quote */
>>-    if (!csv_mode && quote != NULL)
>>+    if (fmt != FMT_CSV && quote != NULL)
>>         ereport(ERROR,
>>                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>>                  errmsg("COPY quote available only in CSV mode")));
>> 
>>-    if (csv_mode && strlen(quote) != 1)
>>+    if (fmt == FMT_CSV && strlen(quote) != 1)
>>         ereport(ERROR,
>>                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>>                  errmsg("COPY quote must be a single character")));
>> 
>>     /* Check escape */
>>-    if (!csv_mode && escape != NULL)
>>+    if (fmt != FMT_CSV && escape != NULL)
>>         ereport(ERROR,
>>                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>>                  errmsg("COPY escape available only in CSV mode")));
>> 
>>-    if (csv_mode && strlen(escape) != 1)
>>+    if (fmt == FMT_CSV && strlen(escape) != 1)
>>         ereport(ERROR,
>>                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>>                  errmsg("COPY escape must be a single character")));
>> 
>>     /* Check force_quote */
>>-    if (!csv_mode && force_quote != NIL)
>>+    if (fmt != FMT_CSV && force_quote != NIL)
>>         ereport(ERROR,
>>                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>>                  errmsg("COPY force quote available only in CSV mode")));
>>@@ -873,7 +905,7 @@
>>                errmsg("COPY force quote only available using COPY TO")));
>> 
>>     /* Check force_notnull */
>>-    if (!csv_mode && force_notnull != NIL)
>>+    if (fmt != FMT_CSV && force_notnull != NIL)
>>         ereport(ERROR,
>>                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>>               errmsg("COPY force not null available only in CSV mode")));
>>@@ -889,7 +921,7 @@
>>                  errmsg("COPY delimiter must not appear in the NULL specification")));
>> 
>>     /* Don't allow the csv quote char to appear in the null string. */
>>-    if (csv_mode && strchr(null_print, quote[0]) != NULL)
>>+    if (fmt == FMT_CSV && strchr(null_print, quote[0]) != NULL)
>>         ereport(ERROR,
>>                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>>                  errmsg("CSV quote character must not appear in the NULL specification")));
>>@@ -1004,7 +1036,7 @@
>>         if (pipe)
>>         {
>>             if (whereToSendOutput == Remote)
>>-                ReceiveCopyBegin(binary, list_length(attnumlist));
>>+                ReceiveCopyBegin(fmt == FMT_BIN, list_length(attnumlist));
>>             else
>>                 copy_file = stdin;
>>         }
>>@@ -1029,7 +1061,7 @@
>>                          errmsg("\"%s\" is a directory", filename)));
>>             }
>>         }
>>-        CopyFrom(rel, attnumlist, binary, oids, delim, null_print, csv_mode,
>>+        CopyFrom(rel, attnumlist, fmt, oids, delim, null_print,
>>                  quote, escape, force_notnull_atts, header_line);
>>     }
>>     else
>>@@ -1093,7 +1125,7 @@
>>             }
>>         }
>> 
>>-        DoCopyTo(rel, attnumlist, binary, oids, delim, null_print, csv_mode,
>>+        DoCopyTo(rel, attnumlist, fmt, oids, delim, null_print,
>>                  quote, escape, force_quote_atts, header_line, fe_copy);
>>     }
>> 
>>@@ -1124,20 +1156,20 @@
>>  * so we don't need to plaster a lot of variables with "volatile".
>>  */
>> static void
>>-DoCopyTo(Relation rel, List *attnumlist, bool binary, bool oids,
>>-         char *delim, char *null_print, bool csv_mode, char *quote,
>>+DoCopyTo(Relation rel, List *attnumlist, CopyFmt fmt, bool oids,
>>+         char *delim, char *null_print, char *quote,
>>          char *escape, List *force_quote_atts, bool header_line, bool fe_copy)
>> {
>>     PG_TRY();
>>     {
>>         if (fe_copy)
>>-            SendCopyBegin(binary, list_length(attnumlist));
>>+            SendCopyBegin(fmt == FMT_BIN, list_length(attnumlist));
>> 
>>-        CopyTo(rel, attnumlist, binary, oids, delim, null_print, csv_mode,
>>+        CopyTo(rel, attnumlist, fmt, oids, delim, null_print,
>>                quote, escape, force_quote_atts, header_line);
>> 
>>         if (fe_copy)
>>-            SendCopyEnd(binary);
>>+            SendCopyEnd(fmt == FMT_BIN);
>>     }
>>     PG_CATCH();
>>     {
>>@@ -1156,8 +1188,8 @@
>>  * Copy from relation TO file.
>>  */
>> static void
>>-CopyTo(Relation rel, List *attnumlist, bool binary, bool oids,
>>-       char *delim, char *null_print, bool csv_mode, char *quote,
>>+CopyTo(Relation rel, List *attnumlist, CopyFmt fmt, bool oids,
>>+       char *delim, char *null_print, char *quote,
>>        char *escape, List *force_quote_atts, bool header_line)
>> {
>>     HeapTuple    tuple;
>>@@ -1187,7 +1219,7 @@
>>         Oid            out_func_oid;
>>         bool        isvarlena;
>> 
>>-        if (binary)
>>+        if (fmt == FMT_BIN)
>>             getTypeBinaryOutputInfo(attr[attnum - 1]->atttypid,
>>                                     &out_func_oid,
>>                                     &isvarlena);
>>@@ -1215,7 +1247,7 @@
>>                                       ALLOCSET_DEFAULT_INITSIZE,
>>                                       ALLOCSET_DEFAULT_MAXSIZE);
>> 
>>-    if (binary)
>>+    if (fmt == FMT_BIN)
>>     {
>>         /* Generate header for a binary copy */
>>         int32        tmp;
>>@@ -1233,6 +1265,14 @@
>>     }
>>     else
>>     {
>>+        if (fmt == FMT_XML)
>>+        {
>>+            CopySendString("<?xml version='1.0'?>");
>>+            CopySendEndOfRow(false);
>>+            CopySendString("<table>");
>>+            CopySendEndOfRow(false);
>>+        }
>>+
>>         /*
>>          * For non-binary copy, we need to convert null_print to client
>>          * encoding, because it will be sent directly with CopySendString.
>>@@ -1262,7 +1302,7 @@
>>                                     strcmp(colname, null_print) == 0);
>>             }
>> 
>>-            CopySendEndOfRow(binary);
>>+            CopySendEndOfRow(fmt == FMT_BIN);
>> 
>>         }
>>     }
>>@@ -1278,7 +1318,7 @@
>>         MemoryContextReset(mycontext);
>>         oldcontext = MemoryContextSwitchTo(mycontext);
>> 
>>-        if (binary)
>>+        if (fmt == FMT_BIN)
>>         {
>>             /* Binary per-tuple header */
>>             CopySendInt16(attr_count);
>>@@ -1294,25 +1334,34 @@
>>         }
>>         else
>>         {
>>+            if (fmt == FMT_XML)
>>+                CopySendString("<row>");
>>+
>>             /* Text format has no per-tuple header, but send OID if wanted */
>>             if (oids)
>>             {
>>                 string = DatumGetCString(DirectFunctionCall1(oidout,
>>                               ObjectIdGetDatum(HeapTupleGetOid(tuple))));
>>-                CopySendString(string);
>>+
>>+                if (fmt == FMT_XML)
>>+                    CopyAttributeOutXML("oid", string);
>>+                else
>>+                    CopySendString(string);
>>+
>>                 need_delim = true;
>>             }
>>         }
>> 
>>         foreach(cur, attnumlist)
>>         {
>>-            int            attnum = lfirst_int(cur);
>>+            int        attnum = lfirst_int(cur);
>>             Datum        value;
>>             bool        isnull;
>>+            char        *colname = NameStr(attr[attnum - 1]->attname);
>> 
>>             value = heap_getattr(tuple, attnum, tupDesc, &isnull);
>> 
>>-            if (!binary)
>>+            if (fmt == FMT_TXT || fmt == FMT_CSV)
>>             {
>>                 if (need_delim)
>>                     CopySendChar(delim[0]);
>>@@ -1321,53 +1370,71 @@
>> 
>>             if (isnull)
>>             {
>>-                if (!binary)
>>-                    CopySendString(null_print); /* null indicator */
>>-                else
>>-                    CopySendInt32(-1);    /* null marker */
>>+                switch (fmt)
>>+                {
>>+                    case FMT_BIN:
>>+                        CopySendInt32(-1);    /* null marker */
>>+                        break;
>>+                    case FMT_XML:
>>+                        CopyAttributeOutXML(colname, NULL); /* null entity */
>>+                        break;
>>+                    default:
>>+                        CopySendString(null_print); /* null indicator */
>>+                        break;
>>+                }
>>+
>>             }
>>             else
>>             {
>>-                if (!binary)
>>+                if (fmt == FMT_BIN)
>>                 {
>>-                    string = DatumGetCString(FunctionCall1(&out_functions[attnum - 1],
>>-                                                           value));
>>-                    if (csv_mode)
>>-                    {
>>-                        CopyAttributeOutCSV(string, delim, quote, escape,
>>-                                      (strcmp(string, null_print) == 0 ||
>>-                                       force_quote[attnum - 1]));
>>-                    }
>>-                    else
>>-                        CopyAttributeOut(string, delim);
>>-
>>+                    bytea    *outputbytes =
>>+                        DatumGetByteaP(FunctionCall1(&out_functions[attnum - 1], value));
>>+                    /* We assume the result will not have been toasted */
>>+                    CopySendInt32(VARSIZE(outputbytes) - VARHDRSZ);
>>+                    CopySendData(VARDATA(outputbytes), VARSIZE(outputbytes) - VARHDRSZ);
>>                 }
>>                 else
>>                 {
>>-                    bytea       *outputbytes;
>>-
>>-                    outputbytes = DatumGetByteaP(FunctionCall1(&out_functions[attnum - 1],
>>-                                                               value));
>>-                    /* We assume the result will not have been toasted */
>>-                    CopySendInt32(VARSIZE(outputbytes) - VARHDRSZ);
>>-                    CopySendData(VARDATA(outputbytes),
>>-                                 VARSIZE(outputbytes) - VARHDRSZ);
>>+                    string = DatumGetCString(FunctionCall1(&out_functions[attnum - 1], value));
>>+                    switch (fmt)
>>+                    {
>>+                        case FMT_CSV:
>>+                            CopyAttributeOutCSV(string, delim, quote, escape,
>>+                                (strcmp(string, null_print) == 0
>>+                                || force_quote[attnum - 1]));
>>+                            break;
>>+                        case FMT_XML:
>>+                            CopyAttributeOutXML(colname, string);
>>+                            break;
>>+                        default:
>>+                            CopyAttributeOut(string, delim);
>>+                            break;
>>+                    }
>>                 }
>>             }
>>         }
>> 
>>-        CopySendEndOfRow(binary);
>>+        if (fmt == FMT_XML)
>>+            CopySendString("</row>");
>>+
>>+        CopySendEndOfRow(fmt == FMT_BIN);
>> 
>>         MemoryContextSwitchTo(oldcontext);
>>     }
>> 
>>     heap_endscan(scandesc);
>> 
>>-    if (binary)
>>+    if (fmt == FMT_BIN)
>>     {
>>         /* Generate trailer for a binary copy */
>>         CopySendInt16(-1);
>>     }
>>+    else if (fmt == FMT_XML)
>>+    {
>>+        CopySendString("</table>");
>>+        CopySendEndOfRow(false);
>>+    }
>> 
>>     MemoryContextDelete(mycontext);
>> 
>>@@ -1464,8 +1531,8 @@
>>  * Copy FROM file to relation.
>>  */
>> static void
>>-CopyFrom(Relation rel, List *attnumlist, bool binary, bool oids,
>>-         char *delim, char *null_print, bool csv_mode, char *quote,
>>+CopyFrom(Relation rel, List *attnumlist, CopyFmt fmt, bool oids,
>>+         char *delim, char *null_print, char *quote,
>>          char *escape, List *force_notnull_atts, bool header_line)
>> {
>>     HeapTuple    tuple;
>>@@ -1549,7 +1616,7 @@
>>             continue;
>> 
>>         /* Fetch the input function and typioparam info */
>>-        if (binary)
>>+        if (fmt == FMT_BIN)
>>             getTypeBinaryInputInfo(attr[attnum - 1]->atttypid,
>>                                  &in_func_oid, &typioparams[attnum - 1]);
>>         else
>>@@ -1620,7 +1687,7 @@
>>      */
>>     ExecBSInsertTriggers(estate, resultRelInfo);
>> 
>>-    if (!binary)
>>+    if (fmt != FMT_BIN)
>>         file_has_oids = oids;    /* must rely on user to tell us this... */
>>     else
>>     {
>>@@ -1663,7 +1730,7 @@
>>         }
>>     }
>> 
>>-    if (file_has_oids && binary)
>>+    if (file_has_oids && fmt == FMT_BIN)
>>     {
>>         getTypeBinaryInputInfo(OIDOID,
>>                                &in_func_oid, &oid_typioparam);
>>@@ -1681,7 +1748,7 @@
>>     /* Initialize static variables */
>>     fe_eof = false;
>>     eol_type = EOL_UNKNOWN;
>>-    copy_binary = binary;
>>+    copy_binary = (fmt == FMT_BIN);
>>     copy_relname = RelationGetRelationName(rel);
>>     copy_lineno = 0;
>>     copy_attname = NULL;
>>@@ -1718,14 +1785,14 @@
>>         MemSet(values, 0, num_phys_attrs * sizeof(Datum));
>>         MemSet(nulls, 'n', num_phys_attrs * sizeof(char));
>> 
>>-        if (!binary)
>>+        if (fmt != FMT_BIN)
>>         {
>>             CopyReadResult result = NORMAL_ATTR;
>>             char       *string;
>>             ListCell   *cur;
>> 
>>             /* Actually read the line into memory here */
>>-            done = csv_mode ? 
>>+            done = (fmt == FMT_CSV) ? 
>>                 CopyReadLine(quote, escape) : CopyReadLine(NULL, NULL);
>> 
>>             /*
>>@@ -1776,7 +1843,7 @@
>>                              errmsg("missing data for column \"%s\"",
>>                                     NameStr(attr[m]->attname))));
>> 
>>-                if (csv_mode)
>>+                if (fmt == FMT_CSV)
>>                 {
>>                     string = CopyReadAttributeCSV(delim, null_print, quote,
>>                                                escape, &result, &isnull);
>>@@ -1789,7 +1856,7 @@
>>                     string = CopyReadAttribute(delim, null_print,
>>                                                &result, &isnull);
>> 
>>-                if (csv_mode && isnull && force_notnull[m])
>>+                if (fmt == FMT_CSV && isnull && force_notnull[m])
>>                 {
>>                     string = null_print;        /* set to NULL string */
>>                     isnull = false;
>>@@ -2275,6 +2342,27 @@
>> }
>> 
>> /*----------
>>+ * Returns decimal value for a hexadecimal digit.
>>+*----------
>>+ */
>>+static int GetDecimalFromHex(char hex)
>>+{
>>+    if (isdigit(hex))
>>+    {
>>+        // If it is a digit
>>+        return hex - '0';
>>+    }
>>+    if (hex < 'a')
>>+    {
>>+        return hex - 'A' + 10;
>>+    }
>>+    else
>>+    {
>>+        return hex - 'a' + 10;
>>+    }
>>+}
>>+
>>+/*----------
>>  * Read the value of a single attribute, performing de-escaping as needed.
>>  *
>>  * delim is the column delimiter string (must be just one byte for now).
>>@@ -2378,6 +2466,29 @@
>>                 case 'v':
>>                     c = '\v';
>>                     break;
>>+                case 'x':
>>+                case 'X':
>>+                    if (line_buf.cursor < line_buf.len)
>>+                    {
>>+                        char hexchar = line_buf.data[line_buf.cursor];
>>+                        if (isxdigit(hexchar))
>>+                        {
>>+                            int val = GetDecimalFromHex(hexchar);
>>+                            line_buf.cursor++;
>>+                            if (line_buf.cursor < line_buf.len)
>>+                            {
>>+                                hexchar = line_buf.data[line_buf.cursor];
>>+                                if (isxdigit(hexchar))
>>+                                {
>>+                                    line_buf.cursor++;
>>+                                    val = (val << 4) + GetDecimalFromHex(hexchar);
>>+                                }
>>+                            }
>>+
>>+                            c = val & 0xff;
>>+                        }
>>+                    }
>>+                    break;
>> 
>>                     /*
>>                      * in all other cases, take the char after '\'
>>@@ -2760,3 +2871,84 @@
>> 
>>     return attnums;
>> }
>>+
>>+/*
>>+ * Send XML representation of one attribute, with element tagging, null
>>+ * marking, and entity escaping.
>>+ */
>>+
>>+static void
>>+CopyAttributeOutXML (char *colname, char *string)
>>+{
>>+    CopySendString("<col name='");
>>+    CopySendStringXML(colname);
>>+    CopySendString("' null='");
>>+    CopySendChar((string == NULL) ? 'y' : 'n');
>>+    CopySendString("'>");
>>+
>>+    if (string != NULL)
>>+        CopySendStringXML(string);
>>+
>>+    CopySendString("</col>");
>>+}
>>+
>>+/*
>>+ * Sends a string with entity escaping.
>>+ */
>>+
>>+static void
>>+CopySendStringXML (char *string)
>>+{
>>+    char    *csr;
>>+    for (csr = string; *csr; ++csr)
>>+    {
>>+        char buf[10];
>>+        char *entity = CopyGetXMLEntity(*csr, buf);
>>+        if (entity)
>>+            CopySendString(entity);
>>+        else
>>+            CopySendChar(*csr);
>>+    }
>>+}
>>+
>>+/*
>>+ * Locates or creates an XML entity for the given character.
>>+ * If that character doesn't require an entity, then the
>>+ * function returns NULL.
>>+ */
>>+
>>+static char *
>>+CopyGetXMLEntity (char c, char *buf)
>>+{
>>+    char *entity;
>>+
>>+    switch (c)
>>+    {
>>+        case '<':
>>+            entity = "&lt;";
>>+            break;
>>+        case '>':
>>+            entity = "&gt;";
>>+            break;
>>+        case '&':
>>+            entity = "&amp;";
>>+            break;
>>+        case '\'':
>>+            entity = "&apos;";
>>+            break;
>>+        case '"':
>>+            entity = "&quot;";
>>+            break;
>>+        default:
>>+            if (!isgraph(c) && c != ' ')
>>+            {
>>+                sprintf(buf, "&#%02x;", (unsigned char)c);
>>+                entity = buf;
>>+            }
>>+            else
>>+                entity = NULL;
>>+            break;
>>+    }
>>+
>>+    return entity;
>>+}
>>Index: src/backend/parser/gram.y
>>===================================================================
>>RCS file: /projects/cvsroot/pgsql/src/backend/parser/gram.y,v
>>retrieving revision 2.491
>>diff -u -r2.491 gram.y
>>--- src/backend/parser/gram.y    7 May 2005 02:22:46 -0000    2.491
>>+++ src/backend/parser/gram.y    13 May 2005 22:21:01 -0000
>>@@ -413,6 +413,8 @@
>>
>>     WHEN WHERE WITH WITHOUT WORK WRITE
>> 
>>+    XML
>>+
>>     YEAR_P
>> 
>>     ZONE
>>@@ -1448,6 +1450,10 @@
>>                 {
>>                     $$ = makeDefElem("header", (Node *)makeInteger(TRUE));
>>                 }
>>+            | XML
>>+                {
>>+                    $$ = makeDefElem("xml", (Node *)makeInteger(TRUE));
>>+                }
>>             | QUOTE opt_as Sconst
>>                 {
>>                     $$ = makeDefElem("quote", (Node *)makeString($3));
>>Index: src/backend/parser/keywords.c
>>===================================================================
>>RCS file: /projects/cvsroot/pgsql/src/backend/parser/keywords.c,v
>>retrieving revision 1.155
>>diff -u -r1.155 keywords.c
>>--- src/backend/parser/keywords.c    7 May 2005 02:22:47 -0000    1.155
>>+++ src/backend/parser/keywords.c    13 May 2005 22:21:01 -0000
>>@@ -342,6 +342,7 @@
>>     {"without", WITHOUT},
>>     {"work", WORK},
>>     {"write", WRITE},
>>+    {"xml", XML},
>>     {"year", YEAR_P},
>>     {"zone", ZONE},
>> };
>>Index: src/test/regress/expected/copy2.out
>>===================================================================
>>RCS file: /projects/cvsroot/pgsql/src/test/regress/expected/copy2.out,v
>>retrieving revision 1.21
>>diff -u -r1.21 copy2.out
>>--- src/test/regress/expected/copy2.out    13 May 2005 06:33:40 -0000    1.21
>>+++ src/test/regress/expected/copy2.out    13 May 2005 22:21:01 -0000
>>@@ -194,6 +194,28 @@
>> --test that we read consecutive LFs properly
>> CREATE TEMP TABLE testnl (a int, b text, c int);
>> COPY testnl FROM stdin CSV;
>>-DROP TABLE x, y;
>>+CREATE TABLE z (
>>+    col1 text,
>>+    col2 text
>>+);
>>+COPY z from stdin;
>>+COPY z TO stdout;
>>+Jackson, Sam    \\h
>>+ABC    \\\\\t
>>+It is "perfect".    \t
>>+    NULL
>>+COPY z TO stdout WITH CSV;
>>+"Jackson, Sam",\h
>>+ABC,\\    
>>+"It is ""perfect"".",    
>>+"",NULL
>>+COPY y TO stdout WITH XML;
>>+<?xml version='1.0'?>
>>+<table>
>>+<row><col name='col1' null='n'>Jackson, Sam</col><col name='col2' null='n'>\h</col></row>
>>+<row><col name='col1' null='n'>It is &quot;perfect&quot;.</col><col name='col2' null='n'>&#09;</col></row>
>>+<row><col name='col1' null='n'></col><col name='col2' null='y'></col></row>
>>+</table>
>>+DROP TABLE x, y, z;
>> DROP FUNCTION fn_x_before();
>> DROP FUNCTION fn_x_after();
>>Index: src/test/regress/sql/copy2.sql
>>===================================================================
>>RCS file: /projects/cvsroot/pgsql/src/test/regress/sql/copy2.sql,v
>>retrieving revision 1.12
>>diff -u -r1.12 copy2.sql
>>--- src/test/regress/sql/copy2.sql    13 May 2005 06:33:40 -0000    1.12
>>+++ src/test/regress/sql/copy2.sql    13 May 2005 22:21:01 -0000
>>@@ -139,7 +139,22 @@
>> inside",2
>> \.
>> 
>>+CREATE TABLE z (
>>+    col1 text,
>>+    col2 text
>>+);
>> 
>>-DROP TABLE x, y;
>>+COPY z from stdin;
>>+Jackson, Sam    \\h
>>+\x41\x42\x43\xa0\x1    \x5c\x5c\x9
>>+It is "perfect".    \t
>>+    NULL
>>+\.
>>+
>>+COPY z TO stdout;
>>+COPY z TO stdout WITH CSV;
>>+COPY y TO stdout WITH XML;
>>+
>>+DROP TABLE x, y, z;
>> DROP FUNCTION fn_x_before();
>> DROP FUNCTION fn_x_after();
>>    
>>
>>------------------------------------------------------------------------
>>
>>
>>---------------------------(end of broadcast)---------------------------
>>TIP 6: Have you searched our list archives?
>>
>>               http://archives.postgresql.org
>>    
>>
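
One detail in CopyGetXMLEntity is worth a second look. The format string "&#%02x;" prints hex digits, but in XML a reference of the form `&#NN;` is read as decimal, and hex needs the `&#xNN;` form. The two happen to agree for a tab (09) and differ from 0x0a upwards. A minimal standalone sketch of the escaping step with that adjusted follows. `xml_entity_for` is a hypothetical name, and like the patch it treats each non-graphic byte on its own, which would not be right for multibyte encodings.

```c
#include <ctype.h>
#include <stdio.h>

/*
 * Return the replacement text for c, or NULL if c can be sent unchanged.
 * buf is scratch space for numeric references and must hold at least
 * 7 bytes ("&#xNN;" plus the terminating NUL).
 */
const char *
xml_entity_for(char c, char *buf, size_t bufsz)
{
    unsigned char uc = (unsigned char) c;

    switch (c)
    {
        case '<':
            return "&lt;";
        case '>':
            return "&gt;";
        case '&':
            return "&amp;";
        case '\'':
            return "&apos;";
        case '"':
            return "&quot;";
        default:
            if (!isgraph(uc) && c != ' ')
            {
                /* Hex character reference; note the 'x' after "&#". */
                snprintf(buf, bufsz, "&#x%02x;", uc);
                return buf;
            }
            return NULL;
    }
}
```

Note also that XML 1.0 allows only tab, line feed and carriage return among the control characters, even when written as references, so a byte such as 0x01 cannot be represented this way at all and would need a separate decision.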

