Re: Replacing a table with constraints - Mailing list pgsql-sql

From Ragnar Hafstað
Subject Re: Replacing a table with constraints
Date
Msg-id 1116024199.11747.37.camel@localhost.localdomain
Whole thread Raw
In response to Replacing a table with constraints  ("Mark Fenbers" <Mark.Fenbers@noaa.gov>)
List pgsql-sql
On Fri, 2005-05-13 at 14:38 -0400, Mark Fenbers wrote:
> I have a table called Counties which partially contains a lot bad
> data.  By" bad data", I mean some records are missing; some exist and
> shouldn't; and some records have fields with erroneous information.
> However, the majority of the data in the table is accurate.  I have
> built/loaded a new table called newCounties with the same structure as
> Counties, but contains no bad data.  My was  to completely replace the
> contents of Counties with the contents of newCounties.  The problem
> is: several other tables have Foreign Key constraints placed on
> Counties.  Therefore, Pg will not let me 'DELETE FROM Counties;", nor
> will it let me "DROP TABLE Counties;"  
> 
> I'm perplexed.  Can someone suggest how I can best get data from
> Counties to look just like newCounties?

assuming same primary key on these 2 tables, you have to
consider these cases:

a) records in Counties not in newCounties
b) records in newCounties existing in Counties, but different
c) records in newCounties also correct in Counties
d) records in newCounties missing from Counties

a) get rid of extra records:      delete from Counties              where primarykey not in                 (select
primarykeyfrom newCounties);  if you hit foreign key constrains, you need to look  at your data a bit more, anyways 
 
b) update erroneous records:      update Counties from newCounties set            col1= newCounties.col1,
col2=newCounties.col2,           col3= newCounties.col3,           ...      from newCounties      where
primarykey=newCounties.primarykey          and           (    Counties.col1 <> newCounties.col1             OR
Counties.col2<> newCounties.col2             OR Counties.col3 <> newCounties.col3             ... );
 

c) nothing to do here

d) insert into Counties       select * from newCounties          where primarykey not in             (select primarykey
fromCounties);
 



test case:

test=# create table c (p int, col1 int, col2 int);
CREATE TABLE
test=# insert into c values (1,1,1); -- correct values
INSERT 7693959 1
test=# insert into c values (2,1,2); -- incorrect
INSERT 7693960 1
test=# insert into c values (9,9,9); -- extra value
INSERT 7693961 1
test=# create table newc (p int, col1 int, col2 int);
CREATE TABLE
test=# insert into newc values (1,1,1); -- correct values
INSERT 7693964 1
test=# insert into newc values (2,2,2); -- incorrect in c
INSERT 7693965 1
test=# insert into newc values (3,3,3); -- missing in c
INSERT 7693966 1
test=# delete from c where p not in (select p from newc);
DELETE 1
test=# update c set col1=n.col1, col2=n.col2
test-#   from newc as n
test-#   where c.p=n.p and (c.col1<>n.col1 or c.col2<>n.col2);
UPDATE 1
test=# insert into c select * from newc where p not in (select p from
c);
INSERT 7693967 1
test=# select * from c;p | col1 | col2
---+------+------1 |    1 |    12 |    2 |    23 |    3 |    3
(3 rows)


note: 
b) and c) can be merged and simplified if you don't mind updating
unchanged records:      update Counties from newCounties set            col1= newCounties.col1,           col2=
newCounties.col2,          col3= newCounties.col3,           ...      from newCounties      where
primarykey=newCounties.primarykey;



gnari




pgsql-sql by date:

Previous
From: Scott Marlowe
Date:
Subject: Re: Replacing a table with constraints
Next
From: Rajesh Mallah
Date:
Subject: Significance of Database Encoding