Thread: A few questions about carriage returns (\r)

A few questions about carriage returns (\r)

From
Jon Lapham
Date:
I have a database that needs to work with TEXT input generated from a
heterogeneous environment (UNIX, Win, Mac).  As such, I have noticed
that equality comparisons of supposedly identical TEXT is failing due to
the different line terminators embedded in some TEXT fields.

So, as I understand it (please correct me if I'm wrong), UNIX uses a
"newline" (or \n), Mac uses "carriage return" (or \r) and Win/DOS uses
\r\n.

Looking into this issue has led me to a number of questions below...

Thanks for any help!
-Jon

=======================
1) Does anyone know why the "id" column is not visible for the final
select statement?  I guess a lone \r literally means to go to the
farthest position to the left... but it seems like a bug that it moves
past its column position.  Bug in psql?

test=> create table foo ( id serial, bar TEXT );
NOTICE:  CREATE TABLE will create implicit sequence "foo_id_seq" for
serial column "foo.id"
CREATE TABLE
test=> insert into foo (bar) VALUES ('Line 1\nLine 2');
INSERT 0 1
test=> insert into foo (bar) VALUES ('Line 1\r\nLine 2');
INSERT 0 1
test=> insert into foo (bar) VALUES ('Line 1\rLine 2');
INSERT 0 1
test=> select id,textcol from foo;
  id | textcol
----+---------
   1 | Line 1
Line 2
   2 | Line 1
Line 2
Line 2Line 1
(3 rows)

=======================
2) Is there a way to *view* the \n and \rs embedded in a TEXT field
using psql?

=======================
3) Is there a string function that is capable of replacing \r\n with \n?
   More generally, is there a string function capable of regular
expression replace?  (eg: perl and other languages have "=~
s/\r\n/\n/").  I imagine an SQL function that would work like this
fictional function:

SELECT re_replace('\r\n' IN bar USING '\n') FROM foo;

I see that "substring" works with REs, but I cannot for the life of me
figure out how to use substring equivalently to my fictional
re_replace() function above.





Extra information that may be useful:

test=> select version();
                                                          version

-------------------------------------------------------------------------------------------------------------------------
  PostgreSQL 8.1.4 on i386-redhat-linux-gnu, compiled by GCC
i386-redhat-linux-gcc (GCC) 4.1.0 20060304 (Red Hat 4.1.0-3)
(1 row)


--
-**-*-*---*-*---*-*---*-----*-*-----*---*-*---*-----*-----*-*-----*---
  Jon Lapham  <lapham@jandr.org>                Rio de Janeiro, Brasil
  Personal: http://www.jandr.org/
***-*--*----*-------*------------*--------------------*---------------


Re: A few questions about carriage returns (\r)

From
Martijn van Oosterhout
Date:
On Thu, Jun 15, 2006 at 11:51:21AM -0300, Jon Lapham wrote:
> So, as I understand it (please correct me if I'm wrong), UNIX uses a
> "newline" (or \n), Mac uses "carriage return" (or \r) and Win/DOS uses
> \r\n.

Correct.

> 1) Does anyone know why the "id" column is not visible for the final
> select statement?  I guess a lone \r literally means to go to the
> farthest position to the left... but it seems like a bug that it moves
> past its column position.  Bug in psql?

Well, your terminal moving the cursor left when it sees a \r, psql
isn't doing anything (which is the problem).

CVS HEAD contains patches that display the output more clearly.

> =======================
> 2) Is there a way to *view* the \n and \rs embedded in a TEXT field
> using psql?

You could use replace to make them visible.

> =======================
> 3) Is there a string function that is capable of replacing \r\n with \n?

Yes, replace.
# select replace('aac','a','b');
 replace
---------
 bbc
(1 row)

So replace(str, '\r', '\\r') should work

>   More generally, is there a string function capable of regular
> expression replace?  (eg: perl and other languages have "=~
> s/\r\n/\n/").  I imagine an SQL function that would work like this
> fictional function:

I beleive there is a regexp_replace. In psql, if you type \df you get a
list of all defined functions. The docs have info too.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Attachment

Re: A few questions about carriage returns (\r)

From
Nis Jorgensen
Date:
Martijn van Oosterhout wrote:

> I beleive there is a regexp_replace. In psql, if you type \df you get a
> list of all defined functions. The docs have info too.

The function is confusingly not mentioned in the documentation under
"String Functions and Operators", but only under "Pattern Matching". I
would suggest at least adding a reference from the former to the latter.

If someone with access edits that page, perhaps they can at the same
time move the note "a. The to_ascii function supports conversion from
LATIN1, LATIN2, LATIN9, and WIN1250 encodings only" into the description
field for the function, or at least add an indication that there is a note.

/Nis


Re: A few questions about carriage returns (\r)

From
Jon Lapham
Date:
Martijn van Oosterhout wrote:
> # select replace('aac','a','b');
>  replace
> ---------
>  bbc
> (1 row)
>
> So replace(str, '\r', '\\r') should work

Perfect, thanks.

I guess I'm blind, I didn't see this in the docs.  :)

-Jon

--
-**-*-*---*-*---*-*---*-----*-*-----*---*-*---*-----*-----*-*-----*---
  Jon Lapham  <lapham@jandr.org>                Rio de Janeiro, Brasil
  Personal: http://www.jandr.org/
***-*--*----*-------*------------*--------------------*---------------


Re: A few questions about carriage returns (\r)

From
Bruce Momjian
Date:
Nis Jorgensen wrote:
> Martijn van Oosterhout wrote:
>
> > I beleive there is a regexp_replace. In psql, if you type \df you get a
> > list of all defined functions. The docs have info too.
>
> The function is confusingly not mentioned in the documentation under
> "String Functions and Operators", but only under "Pattern Matching". I
> would suggest at least adding a reference from the former to the latter.

Docs are updated in CVS HEAD and 8.1.X branches.

> If someone with access edits that page, perhaps they can at the same
> time move the note "a. The to_ascii function supports conversion from
> LATIN1, LATIN2, LATIN9, and WIN1250 encodings only" into the description
> field for the function, or at least add an indication that there is a note.

OK, footnote removed and text added to main description:

    Convert string to ASCII from another encoding (only supports conversion
    from LATIN1, LATIN2, LATIN9, and WIN1250 encodings)

--
  Bruce Momjian   http://candle.pha.pa.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +