Thread: INSERT OR UPDATE

INSERT OR UPDATE

From
"Julian Scarfe"
Date:
I'm keeping config information for an application in a series of related
tables.  I'd like a command that INSERTs data if it's new, or UPDATEs it if
the key is duplicated.

Copying the config info from one database to another virgin installation is
easy, of course. I can just use pg_dump on the original db1 to create the
DDL and run it into the new database db2.

The problem comes when I've made changes in db1 that I want to copy to db2,
but db2 may itself have additional entries not present in db1.  I'd like to
update the entries in db2 that have equivalent entries in db1 with their db1
values, without destroying the rest of the data in db2.

Here's an example:

db1 is created as follows:

create table a (
a_id text PRIMARY KEY,
param_b text,
param_c text
);

create table d (
d_id text PRIMARY KEY,
a_id text references a(a_id) ON UPDATE CASCADE,
param_e text,
param_f text
);

insert into a values ('a1', 'b1', 'c1');
insert into a values ('a2', 'b2', 'c2');

insert into d values ('d1', 'a1', 'e1', 'f1');
insert into d values ('d2', 'a2', 'e2', 'f2');
insert into d values ('d3', 'a1', 'e3', 'f3');

The databases have identical schema:
$ pg_dump -s db1 | psql db2

and provided db2 is empty, I can just copy the contents across:
$ pg_dump -ad db1 > db1.config
$ psql db2 < db1.config

So far so good.

But now db2 gets some additional entries:
db2=# insert into a values ('a3', 'b3', 'c3');
db2=# insert into d values ('d4', 'a1', 'e4', 'f4');
db2=# insert into d values ('d5', 'a2', 'e5', 'f5');

Meanwhile, I make some config changes in db1:
db1=# update a set param_b = 'b1 new' where a_id = 'a1';
db1=# update d set param_e = 'e1 new' where d_id = 'd1';
db1=# update d set param_f = 'f2 new' where d_id = 'd2';

So:
$ pg_dump -ad db1 > db1.config
and db1.config now contains
INSERT INTO a VALUES ('a2', 'b2', 'c2');
INSERT INTO a VALUES ('a1', 'b1 new', 'c1');
INSERT INTO d VALUES ('d3', 'a1', 'e3', 'f3');
INSERT INTO d VALUES ('d1', 'a1', 'e1 new', 'f1');
INSERT INTO d VALUES ('d2', 'a2', 'e2', 'f2 new');

I want to update the data in db2 to reflect the values in db1. I can't
truncate the tables in db2 because I'd lose the additional entries (a3, d4,
d5).

But of course when I do...
$ psql db2 < db1.config
... I get ...
ERROR:  duplicate key violates unique constraint "a_pkey"
ERROR:  duplicate key violates unique constraint "d_pkey"
...and the a1, d1, d2 rows are not updated to match db1.

What I'd really like is to be able to do:

INSERT OR UPDATE INTO a VALUES ('a2', 'b2', 'c2');
INSERT OR UPDATE INTO a VALUES ('a1', 'b1 new', 'c1');
INSERT OR UPDATE INTO d VALUES ('d3', 'a1', 'e3', 'f3');
INSERT OR UPDATE INTO d VALUES ('d1', 'a1', 'e1 new', 'f1');
INSERT OR UPDATE INTO d VALUES ('d2', 'a2', 'e2', 'f2 new');

so that the rest of the row is treated as an UPDATE if the primary key is a
duplicate.

Of course I can write something at the application level to examine each row
and take appropriate action. But it feels like this may be a commonly
encountered problem for which there may be a database-level solution.  Am I
missing something obvious?

Thanks

Julian Scarfe






Re: INSERT OR UPDATE

From
"Guy Rouillier"
Date:
Julian Scarfe wrote:
> I'm keeping config information for an application in a series of
> related tables.  I'd like a command that INSERTs data if it's new,
< or UPDATEs it if the key is duplicated.

Write a stored procedure called something like InsertUpdateConfigData.
Pick the operation that you think is most likely to occur more often
(the insert or update).  Code that as the initial statement.  Then
create an exception block, and in that exception block catch the error
that would result from the initial statement being executed in the case
where the other one should have been; then execute that other statement.

There is no magic database-provided SQL statement that says "try an
insert and if that fails then try an update."  You have to do that
yourself.

--
Guy Rouillier


Re: INSERT OR UPDATE

From
Tony Wasson
Date:
On 1/2/06, Julian Scarfe <julian@avbrief.com> wrote:
> I'm keeping config information for an application in a series of related
> tables.  I'd like a command that INSERTs data if it's new, or UPDATEs it if
> the key is duplicated.

A MERGE trigger will do exactly what you are asking for.

http://archives.postgresql.org/pgsql-sql/2004-05/msg00135.php

Send all your data as INSERTS.
The trigger will run a function to see if the row exists.
If the row exists -> rewrite it as it as an UPDATE
If the row does not exist -> leave the INSERT alone

Tony

Re: INSERT OR UPDATE

From
Andrew - Supernews
Date:
On 2006-01-03, Tony Wasson <ajwasson@gmail.com> wrote:
> On 1/2/06, Julian Scarfe <julian@avbrief.com> wrote:
>> I'm keeping config information for an application in a series of related
>> tables.  I'd like a command that INSERTs data if it's new, or UPDATEs it if
>> the key is duplicated.
>
> A MERGE trigger will do exactly what you are asking for.
>
> http://archives.postgresql.org/pgsql-sql/2004-05/msg00135.php
>
> Send all your data as INSERTS.
> The trigger will run a function to see if the row exists.
> If the row exists -> rewrite it as it as an UPDATE
> If the row does not exist -> leave the INSERT alone

There are some rather serious race conditions in that (it's not taking
anything like a strong enough lock to avoid that - it would need to use
SHARE ROW EXCLUSIVE, not ROW EXCLUSIVE).

In 8.0 on, use the method described here:
http://www.postgresql.org/docs/current/static/plpgsql-control-structures.html#PLPGSQL-ERROR-TRAPPING

--
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services