Re: Replacing a table with constraints - Mailing list pgsql-sql
From | Ragnar Hafstað |
---|---|
Subject | Re: Replacing a table with constraints |
Date | |
Msg-id | 1116024199.11747.37.camel@localhost.localdomain Whole thread Raw |
In response to | Replacing a table with constraints ("Mark Fenbers" <Mark.Fenbers@noaa.gov>) |
List | pgsql-sql |
On Fri, 2005-05-13 at 14:38 -0400, Mark Fenbers wrote: > I have a table called Counties which partially contains a lot bad > data. By" bad data", I mean some records are missing; some exist and > shouldn't; and some records have fields with erroneous information. > However, the majority of the data in the table is accurate. I have > built/loaded a new table called newCounties with the same structure as > Counties, but contains no bad data. My was to completely replace the > contents of Counties with the contents of newCounties. The problem > is: several other tables have Foreign Key constraints placed on > Counties. Therefore, Pg will not let me 'DELETE FROM Counties;", nor > will it let me "DROP TABLE Counties;" > > I'm perplexed. Can someone suggest how I can best get data from > Counties to look just like newCounties? assuming same primary key on these 2 tables, you have to consider these cases: a) records in Counties not in newCounties b) records in newCounties existing in Counties, but different c) records in newCounties also correct in Counties d) records in newCounties missing from Counties a) get rid of extra records: delete from Counties where primarykey not in (select primarykeyfrom newCounties); if you hit foreign key constrains, you need to look at your data a bit more, anyways b) update erroneous records: update Counties from newCounties set col1= newCounties.col1, col2=newCounties.col2, col3= newCounties.col3, ... from newCounties where primarykey=newCounties.primarykey and ( Counties.col1 <> newCounties.col1 OR Counties.col2<> newCounties.col2 OR Counties.col3 <> newCounties.col3 ... ); c) nothing to do here d) insert into Counties select * from newCounties where primarykey not in (select primarykey fromCounties); test case: test=# create table c (p int, col1 int, col2 int); CREATE TABLE test=# insert into c values (1,1,1); -- correct values INSERT 7693959 1 test=# insert into c values (2,1,2); -- incorrect INSERT 7693960 1 test=# insert into c values (9,9,9); -- extra value INSERT 7693961 1 test=# create table newc (p int, col1 int, col2 int); CREATE TABLE test=# insert into newc values (1,1,1); -- correct values INSERT 7693964 1 test=# insert into newc values (2,2,2); -- incorrect in c INSERT 7693965 1 test=# insert into newc values (3,3,3); -- missing in c INSERT 7693966 1 test=# delete from c where p not in (select p from newc); DELETE 1 test=# update c set col1=n.col1, col2=n.col2 test-# from newc as n test-# where c.p=n.p and (c.col1<>n.col1 or c.col2<>n.col2); UPDATE 1 test=# insert into c select * from newc where p not in (select p from c); INSERT 7693967 1 test=# select * from c;p | col1 | col2 ---+------+------1 | 1 | 12 | 2 | 23 | 3 | 3 (3 rows) note: b) and c) can be merged and simplified if you don't mind updating unchanged records: update Counties from newCounties set col1= newCounties.col1, col2= newCounties.col2, col3= newCounties.col3, ... from newCounties where primarykey=newCounties.primarykey; gnari