Re: Apparent Problem With NULL in Restoring pg_dump - Mailing list pgsql-general

From Andy Colson
Subject Re: Apparent Problem With NULL in Restoring pg_dump
Msg-id 4E7411A3.7050606@squeakycode.net
In response to Re: Apparent Problem With NULL in Restoring pg_dump  (Rich Shepard <rshepard@appl-ecosys.com>)
Responses Re: Apparent Problem With NULL in Restoring pg_dump  (Rich Shepard <rshepard@appl-ecosys.com>)
List pgsql-general
On 09/16/2011 04:42 PM, Rich Shepard wrote:
> On Thu, 15 Sep 2011, Andy Colson wrote:
>
>> First you need to trim the \n and spaces:
>>
>> andy=# insert into junk values (E'GW-22');
>> INSERT 0 1
>> andy=# insert into junk values (E'GW-22 \n');
>> INSERT 0 1
>> andy=# insert into junk values (E'GW-22 \n');
>
> Andy,
>
> Here's what worked for me:
>
> nevada=# \i junk.sql
> CREATE TABLE
> nevada=# insert into junk select * from chemistry where site_id = (E'GW-22');
> INSERT 0 803
> nevada=# insert into junk select * from chemistry where site_id = (E'GW-22 \n');
> INSERT 0 0
> nevada=# insert into junk select * from chemistry where site_id = (E'GW-22 \n');
> INSERT 0 0
> nevada=# insert into junk select * from chemistry where site_id = (E'GW-22\n');
> INSERT 0 1409
> nevada=# select '['|| rtrim(trim(trailing E'\n' from site_id)) || ']' from junk;
>
> ?column?
> ----------
> [GW-22]
> [GW-22]
>
> and so on for 2212 rows.
>
>> Trim it up:
>>
>> andy=# select '['|| rtrim(trim(trailing E'\n' from a)) || ']' from junk;
>
>> If you have a unique index you'll wanna drop it first. Once you get that done, we can remove the dups.
>
> No index on junk; I can remove it from chemistry prior to reinserting the
> cleaned rows.
>
> Also, where can I read about the select syntax you use? I find nothing
> about it in Rick van der Lans' 4th edition, the most comprehensive language
> reference I've read.
>
> Thanks,
>
> Rich
>

The fine online manual:

http://www.postgresql.org/docs/current/interactive/index.html

Especially the string ops:

http://www.postgresql.org/docs/current/interactive/functions-string.html

>> Trim it up:
>> andy=# select '['|| rtrim(trim(trailing E'\n' from a)) || ']' from junk;
>
> Andy,
>
> Scrolling through the table with rows ordered by date and chemical I find
> no duplicates ... so far. However, what I do find is that the above did not
> work:


No, it wasn't supposed to.  A select statement builds a new result set and returns it to you; it won't update a table.
That select statement was meant as an example for writing an update statement.

Like:

update chemistry set site_id = rtrim(trim(trailing E'\n' from site_id));
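
A minimal sketch of running that update a bit more cautiously, assuming site_id is a text or varchar column (with char(n), trailing spaces are ignored in comparisons, so the WHERE clause would behave differently). The transaction wrapper and the preview count are additions for illustration, not something from the earlier messages:

```sql
BEGIN;

-- Preview: how many rows still carry a trailing newline or trailing spaces?
SELECT count(*)
  FROM chemistry
 WHERE site_id <> rtrim(trim(trailing E'\n' from site_id));

-- Rewrite only the rows that would actually change.
UPDATE chemistry
   SET site_id = rtrim(trim(trailing E'\n' from site_id))
 WHERE site_id <> rtrim(trim(trailing E'\n' from site_id));

-- Use ROLLBACK instead if the reported row count looks wrong.
COMMIT;
```

Restricting the update to rows that differ avoids rewriting every row in the table, and the transaction lets you back out before anything is committed.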

If there was a unique index on chemistry(site_id), the above would throw an error, so I was warning you to drop it.

Once the site_id was trimmed, you could then delete the dups, with:

delete from chemistry where site_id = 'GW-22' and ctid <> (select min(ctid) from chemistry where site_id = 'GW-22');
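
Note that a delete keyed on site_id alone keeps a single row per site. If a duplicate is really defined by more columns (you mentioned ordering by date and chemical), something along these lines may be closer to what you want. This is only a sketch: sample_date and param are hypothetical column names, so substitute whatever columns identify a duplicate in your table.

```sql
-- Assumption: two rows are duplicates when site_id, sample_date and param all match.
-- sample_date and param are placeholder names, not taken from the real schema.
-- For each group of duplicates, keep the row with the lowest ctid and delete the rest.
DELETE FROM chemistry a
 USING chemistry b
 WHERE a.site_id     = b.site_id
   AND a.sample_date = b.sample_date
   AND a.param       = b.param
   AND a.ctid > b.ctid;
```

Running it inside BEGIN ... ROLLBACK first shows the number of rows it would remove without changing anything.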

Those 11 steps you had... I was thinking two steps.  The update and the delete above.

Sorry, I should have been a little more clear, but at least you got things cleaned up.  PG has a huge number of data
manipulation functions.  If you have to export data out of a database in order to massage it, then that's a failure of the
database. PG (and SQL) were meant for just this kind of job.


-Andy
