Thread: pgsql-server/ oc/src/sgml/ref/copy.sgml rc/bac ...

pgsql-server/ oc/src/sgml/ref/copy.sgml rc/bac ...

From
momjian@postgresql.org (Bruce Momjian - CVS)
Date:
CVSROOT:    /cvsroot
Module name:    pgsql-server
Changes by:    momjian@postgresql.org    03/04/19 15:55:37

Modified files:
    doc/src/sgml/ref: copy.sgml
    src/backend/commands: copy.c

Log message:
    Add pipe parameter to COPY function to allow proper line termination.


Re: pgsql-server/ oc/src/sgml/ref/copy.sgml rc/bac ...

From
Neil Conway
Date:
On Sat, 2003-04-19 at 15:55, Bruce Momjian - CVS wrote:
>     Add pipe parameter to COPY function to allow proper line termination.
>

    <para>
-    Note that the end of each row is marked by a Unix-style newline
-    (<quote><literal>\n</></>).  Presently, <command>COPY FROM</command> will not behave as
-    desired if given a file containing DOS- or Mac-style newlines.
-    This is expected to change in future releases.
+    <command>COPY TO</command> will terminate each row with a Unix-style
+    newline (<quote><literal>\n</></>),  or carriage return/newline
+    ("\r\n") on  MS Windows.  <command>COPY FROM</command> can handle lines
+    ending with newlines, carriage returns, or carriage return/newlines.
    </para>
   </refsect2>

You might want to clarify that it's the OS of the server (I assume) that
determines the line endings used by COPY TO.

Cheers,

Neil


Re: pgsql-server/ oc/src/sgml/ref/copy.sgml rc/bac ...

From
Bruce Momjian
Date:
Neil Conway wrote:
> On Sat, 2003-04-19 at 15:55, Bruce Momjian - CVS wrote:
> >     Add pipe parameter to COPY function to allow proper line termination.
> >
>
>     <para>
> -    Note that the end of each row is marked by a Unix-style newline
> -    (<quote><literal>\n</></>).  Presently, <command>COPY FROM</command> will not behave as
> -    desired if given a file containing DOS- or Mac-style newlines.
> -    This is expected to change in future releases.
> +    <command>COPY TO</command> will terminate each row with a Unix-style
> +    newline (<quote><literal>\n</></>),  or carriage return/newline
> +    ("\r\n") on  MS Windows.  <command>COPY FROM</command> can handle lines
> +    ending with newlines, carriage returns, or carriage return/newlines.
>     </para>
>    </refsect2>
>
> You might want to clarify that it's the OS of the server (I assume) that
> determines the line endings used by COPY TO.

Yes, I did think of that.  The issue is that COPY only writes files on
the server machine, so I figured it was clear, but you are right, it
is a good idea to make it clear in the docs.

COPY to STDOUT/STDIN will be controlled by the client end-of-line
because those files are opened in text mode by the client, I think.

Doc patch attached.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Index: copy.sgml
===================================================================
RCS file: /cvsroot/pgsql-server/doc/src/sgml/ref/copy.sgml,v
retrieving revision 1.43
diff -c -c -r1.43 copy.sgml
*** copy.sgml    19 Apr 2003 19:55:37 -0000    1.43
--- copy.sgml    20 Apr 2003 01:50:08 -0000
***************
*** 363,370 ****
     <para>
      <command>COPY TO</command> will terminate each row with a Unix-style
      newline (<quote><literal>\n</></>),  or carriage return/newline
!     ("\r\n") on  MS Windows.  <command>COPY FROM</command> can handle lines
!     ending with newlines, carriage returns, or carriage return/newlines.
     </para>
    </refsect2>

--- 363,371 ----
     <para>
      <command>COPY TO</command> will terminate each row with a Unix-style
      newline (<quote><literal>\n</></>),  or carriage return/newline
!     ("\r\n") for servers running MS Windows.
!     <command>COPY FROM</command> can handle lines ending with newlines,
!     carriage returns, or carriage return/newlines.
     </para>
    </refsect2>
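
To illustrate the COPY FROM behaviour this patch documents, here is a minimal Python sketch that splits input on any of the three terminators. It is not the logic in copy.c: split_copy_lines is a hypothetical helper, and it assumes data values contain no raw line terminators.

```python
import re


def split_copy_lines(data: str) -> list[str]:
    """Split text into rows, accepting LF, CR, or CR/LF as the row terminator."""
    # Try \r\n first so a CR/LF pair counts as one terminator, not two.
    rows = re.split(r"\r\n|\r|\n", data)
    # A terminator after the last row leaves one empty trailing element.
    if rows and rows[-1] == "":
        rows.pop()
    return rows
```

A file that mixes Unix, DOS, and Mac-style endings splits into the same rows under this rule.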


Re: pgsql-server/ oc/src/sgml/ref/copy.sgml rc/bac ...

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> COPY to STDOUT/STDIN will be controlled by the client end-of-line
> because those files are opened in text mode by the client, I think.

Actually, I was going to question you on that before.  AFAICT, the
just-committed code will *only* send LF newlines during COPY TO STDOUT,
independent of the server's OS, the client's OS, or anything else.

This is perhaps justifiable on the grounds that "the FE/BE protocol
spec says LF and not anything else", and I didn't complain because
I assumed that was your thinking.  But your response to Neil doesn't
suggest that you're thinking that way.  What exactly do you have in
mind here?  Certainly the client is not going to determine the
newline format for COPY TO STDOUT unless it does translation.

            regards, tom lane
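
The rule described so far in this thread can be summarised in a few lines of Python. This is only a sketch of the behaviour as stated in the messages and the doc patch, not the code in copy.c; copy_to_terminator and both of its parameters are hypothetical names.

```python
def copy_to_terminator(to_pipe: bool, server_is_windows: bool) -> str:
    """Return the row terminator COPY TO would emit under the rule in this thread."""
    if to_pipe:
        # COPY TO STDOUT: always LF on the wire, whatever the server or client OS.
        return "\n"
    # COPY TO a server-side file: follow the server's platform convention.
    return "\r\n" if server_is_windows else "\n"
```

Under this rule, only a server-side file on a Windows server ever receives \r\n.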


Re: pgsql-server/ oc/src/sgml/ref/copy.sgml rc/bac ...

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > COPY to STDOUT/STDIN will be controlled by the client end-of-line
> > because those files are opened in text mode by the client, I think.
>
> Actually, I was going to question you on that before.  AFAICT, the
> just-committed code will *only* send LF newlines during COPY TO STDOUT,
> independent of the server's OS, the client's OS, or anything else.

Right.  In my initial testing, I noticed that when I was sending \r\n to
the client for STDOUT, the regression tests hung, so I added code to
test/pass pipe and force \n for STDIN/STDOUT.

> This is perhaps justifiable on the grounds that "the FE/BE protocol
> spec says LF and not anything else", and I didn't complain because
> I assumed that was your thinking.  But your response to Neil doesn't

It is my thinking.  The server could be Win32 and the client could be
unix, or the opposite.  I see no reason to allow handling of any line
terminator at that level.

> suggest that you're thinking that way.  What exactly do you have in
> mind here?  Certainly the client is not going to determine the
> newline format for COPY TO STDOUT unless it does translation.

My idea was that if the client opens a file to dump the STDOUT data, it
will be opened in text mode, and that will have \r\n for Win32 and \n for
Unix.
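
A small Python sketch can show the text-mode translation being described. It uses the newline argument of open() to stand in for each platform's text-mode convention, so it behaves the same on any OS; write_rows and demo are hypothetical names and the rows are made-up sample data.

```python
import os
import tempfile


def write_rows(path: str, rows: list[str], newline: str) -> bytes:
    """Write rows through a text-mode stream and return the bytes stored on disk."""
    with open(path, "w", newline=newline) as f:
        for row in rows:
            f.write(row + "\n")  # the application itself only ever writes LF
    with open(path, "rb") as f:
        return f.read()


def demo() -> tuple[bytes, bytes]:
    rows = ["1\tone", "2\ttwo"]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "dump.txt")
        win32_style = write_rows(path, rows, "\r\n")  # Win32 text-mode convention
        unix_style = write_rows(path, rows, "\n")     # Unix convention: no change
    return win32_style, unix_style
```

The same LF-terminated rows end up with \r\n or \n on disk depending only on how the client opened the file.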



Re: pgsql-server/ oc/src/sgml/ref/copy.sgml rc/bac ...

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Tom Lane wrote:
>> suggest that you're thinking that way.  What exactly do you have in
>> mind here?  Certainly the client is not going to determine the
>> newline format for COPY TO STDOUT unless it does translation.

> My idea was that if the client opens a file to dump the STDOUT data, it
> will be opened in text mode, and that will have \r\n for Win32 and \n for
> Unix.

But it would probably be a bad idea for the client to open such a file
in text mode.  We are going to have COPY BINARY TO STDOUT/FROM STDIN
real soon now (like probably today or tomorrow ;-)).  Unless the client
takes the trouble to determine whether the copy is text or binary,
opening the file in text mode will be the Wrong Thing.  So I think that
a decision to always send LF on-the-wire will result in Windows users
seeing LF-newline dump files.  Not sure how unhappy that will make them.

I personally don't have a problem with the approach; I was just
wondering if it really does what you intend.

            regards, tom lane
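
To make the text-versus-binary concern concrete, here is a hedged Python sketch of what a Win32-style text-mode stream would do to binary data. The payload is a made-up 4-byte integer, not the actual COPY BINARY format, and text_mode_write is a hypothetical stand-in for the platform's LF to CR/LF translation.

```python
import struct


def text_mode_write(data: bytes) -> bytes:
    """Emulate a Win32 text-mode write: every LF byte is stored as CR LF."""
    return data.replace(b"\n", b"\r\n")


# Hypothetical binary field: the integer 10 as a 4-byte big-endian value.
# Its last byte is 0x0A, which is also the LF character.
payload = struct.pack(">i", 10)
damaged = text_mode_write(payload)
```

The translated value is one byte longer than the original, so a reader expecting fixed-width binary fields would misparse everything after it.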


Re: pgsql-server/ oc/src/sgml/ref/copy.sgml rc/bac ...

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Tom Lane wrote:
> >> suggest that you're thinking that way.  What exactly do you have in
> >> mind here?  Certainly the client is not going to determine the
> >> newline format for COPY TO STDOUT unless it does translation.
>
> > My idea was that if the client opens a file to dump the STDOUT data, it
> > will be opened in text mode, and that will have \r\n for Win32 and \n for
> > Unix.
>
> But it would probably be a bad idea for the client to open such a file
> in text mode.  We are going to have COPY BINARY TO STDOUT/FROM STDIN
> real soon now (like probably today or tomorrow ;-)).  Unless the client
> takes the trouble to determine whether the copy is text or binary,
> opening the file in text mode will be the Wrong Thing.  So I think that
> a decision to always send LF on-the-wire will result in Windows users
> seeing LF-newline dump files.  Not sure how unhappy that will make them.
>
> I personally don't have a problem with the approach; I was just
> wondering if it really does what you intend.

I think that is fine.  pg_dump is the one that uses STDIN/STDOUT the
most, and that will open as text/binary as appropriate.  I see now that
pg_dump seems to only open in binary so I may have to look at that, but
clearly it is only applications that should control this stuff --- the
wire protocol should not.
