Thread: newline character handling

newline character handling

From

"Sampath, Krishna"

Date:

07 April 2000, 15:50:29

It is all PostgreSQL, all day today :-)
Sorry if this should have gone to the novice forum.

As I tried, using COPY, to import a few flat files created under Windows
into postgresql running on a Linux machine, I discovered that:
* If the last field in your record is a string, postgresql imports it, but
keeps the ^M as part of the text string.
* If the last field is numeric, postgresql refuses to import that line
(because of the ^M, the field is not recognized as a number)

Once I stripped the ^M, the data bulkloaded without a problem. Perhaps COPY
should be smarter and recognize the DOS-style line endings?
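
For illustration only, the stripping step described above could look something like the sketch below. It is not part of COPY or of PostgreSQL; the function name strip_dos_line_endings and the sample rows are made up for the example. It works on bytes so that nothing but a CR immediately before an LF is touched.

```python
import io


def strip_dos_line_endings(src, dst):
    """Copy src to dst, rewriting each CRLF line ending as a bare LF.

    src and dst are binary file-like objects. A CR that is not directly
    followed by LF is left alone.
    """
    for line in src:  # binary streams split lines on b"\n" only
        if line.endswith(b"\r\n"):
            line = line[:-2] + b"\n"
        dst.write(line)


# Hypothetical tab-delimited rows as a Windows tool might have written them.
windows_file = io.BytesIO(b"alice\t42\r\nbob\t7\r\n")
cleaned = io.BytesIO()
strip_dos_line_endings(windows_file, cleaned)
```

With real files, the same function would be called on two files opened in binary mode ("rb" and "wb"), and the cleaned file handed to COPY.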

Curiously,
krishna

Re: newline character handling

From

Michael Blakeley

Date:

08 April 2000, 13:03:53

>  From: "Sampath, Krishna" <KSampath@ekmail.com>
>  To: pgsql-general@postgresql.org
>  Subject: newline character handling
>  Date: Fri, 7 Apr 2000 15:49:58 -0400
>
>  As I tried, using COPY, to import a few flat files created under Windows
>  into postgresql running on a Linux machine, I discovered that:
>  * If the last field in your record is a string, postgresql imports it, but
>  keeps the ^M as part of the text string.
>  * If the last field is numeric, postgresql refuses to import that line
>  (because of the ^M, the field is not recognized as a number)
>
>  Once I stripped the ^M, the data bulkloaded without a problem. Perhaps COPY
>  should be smarter and recognize the DOS-style line endings?

I'm ok with this for numerics, but against it for text. Why? Because
I work with some binary data, and I wouldn't want the mysterious
problem of not being able to COPY a line containing a record that's
_supposed_ to end in ^M.

-- Mike

RE: Re: newline character handling

From

"Sampath, Krishna"

Date:

10 April 2000, 11:49:29

maybe we need a keyword DOS|UNIX or perhaps TEXT|BINARY to tell postgresql
to pick DOS style or UNIX style line endings...
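
A rough sketch of what such a switch might mean, written as a standalone helper rather than as anything COPY actually offers: the name split_records and the "unix"/"dos" values are assumptions for the example. The caller states the convention explicitly, so under "unix" a trailing ^M is kept as data, which covers the case Mike raised.

```python
import io


def split_records(stream, line_ending="unix"):
    """Split a binary stream into records using an explicit convention.

    line_ending is "unix" (records end in LF) or "dos" (records end in
    CRLF). Nothing is guessed from the data itself.
    """
    terminators = {"unix": b"\n", "dos": b"\r\n"}
    if line_ending not in terminators:
        raise ValueError("line_ending must be 'unix' or 'dos'")
    records = stream.read().split(terminators[line_ending])
    if records and records[-1] == b"":
        records.pop()  # drop the empty piece after the final terminator
    return records


sample = b"alice\t42\r\nbob\t7\r\n"
as_dos = split_records(io.BytesIO(sample), "dos")
as_unix = split_records(io.BytesIO(sample), "unix")
```

Read as "dos", the sample yields clean fields; read as "unix", each record keeps its trailing CR, which is the behaviour reported at the top of the thread.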

krishna

-----Original Message-----
From: Michael Blakeley [mailto:mike@blakeley.com]
Sent: Saturday, April 08, 2000 12:57 PM
To: pgsql-general@hub.org
Subject: [GENERAL] Re: newline character handling


>  From: "Sampath, Krishna" <KSampath@ekmail.com>
>  To: pgsql-general@postgresql.org
>  Subject: newline character handling
>  Date: Fri, 7 Apr 2000 15:49:58 -0400
>
>  As I tried, using COPY, to import a few flat files created under Windows
>  into postgresql running on a Linux machine, I discovered that:
>  * If the last field in your record is a string, postgresql imports it, but
>  keeps the ^M as part of the text string.
>  * If the last field is numeric, postgresql refuses to import that line
>  (because of the ^M, the field is not recognized as a number)
>
>  Once I stripped the ^M, the data bulkloaded without a problem. Perhaps COPY
>  should be smarter and recognize the DOS-style line endings?

I'm ok with this for numerics, but against it for text. Why? Because
I work with some binary data, and I wouldn't want the mysterious
problem of not being able to COPY a line containing a record that's
_supposed_ to end in ^M.

-- Mike

Re: Re: newline character handling

From

"Steve Wolfe"

Date:

10 April 2000, 12:39:38

> maybe we need a keyword DOS|UNIX or perhaps TEXT|BINARY to tell postgresql
> to pick DOS style or UNIX style line endings...

  Maybe we just need to make sure that the files we are using are in the
correct format for the platform they're being processed on. ; )

steve