Thread: Insert into ... Select ... From ... too intelligent transaction
Hi all, I'm still working on my updateable views, which work fine now even I have a trigger to use because my delete rules wont work,but thats not my current, more important problem. I use some updateable views to handle my data (which are amazingly slow), which gives me ultimate flexibility to handle mydata. there are some insert rules which use currval() to get the last sequence id for my data which I have to insert. The problem now is, it works fine if I do the statement via normal insert into satatements, even within a transaction block.So far so good. But If I migrate my old data via **Insert into my_new_view Select ... From my_old_table**, Only thelast retrieved value of the sequences is used which blows my whole internal logic, because obviously I want to use thecurrent (for that row) and not the last id. For me it seems that the optimizer optimizes a bit to much! Is the an easy workaround? Many thanks for any help Jürgen
# Juergen.Rose@sag-el.com / 2005-07-21 19:11:04 +0200: > I use some updateable views to handle my data (which are amazingly > slow), which gives me ultimate flexibility to handle my data. > > there are some insert rules which use currval() to get the last > sequence id for my data which I have to insert. > > The problem now is, it works fine if I do the statement via normal > insert into satatements, even within a transaction block. So far so > good. But If I migrate my old data via **Insert into my_new_view > Select ... From my_old_table**, Only the last retrieved value of the > sequences is used which blows my whole internal logic, because > obviously I want to use the current (for that row) and not the last > id. What does the code look like? -- How many Vietnam vets does it take to screw in a light bulb? You don't know, man. You don't KNOW. Cause you weren't THERE. http://bash.org/?255991
> # Juergen.Rose@sag-el.com / 2005-07-21 19:11:04 +0200: > > I use some updateable views to handle my data (which are amazingly > > slow), which gives me ultimate flexibility to handle my data. > > > > there are some insert rules which use currval() to get the last > > sequence id for my data which I have to insert. > > > > The problem now is, it works fine if I do the statement via normal > > insert into satatements, even within a transaction block. So far so > > good. But If I migrate my old data via **Insert into my_new_view > > Select ... From my_old_table**, Only the last retrieved value of the > > sequences is used which blows my whole internal logic, because > > obviously I want to use the current (for that row) and not the last > > id. > > What does the code look like? > This is a quite accurate example of what I trying to do, just stripped down to the bare minimum. Please look at the statements at the end, and execute them just one after another (test 1-3). create table olddata( id int, lastname varchar(50), firstname varchar(50) ); insert into olddata values (1, 'Picard', 'Jean Luc'); insert into olddata values (3, 'Spock', 'Harold'); insert into olddata values (6, 'Zimmerman', 'Doc'); insert into olddata values (9, 'Lefler', 'Robin'); insert into olddata values (10, 'Kirk', 'James T.'); create table neworg( orgid serial, legacyid int, orgname varchar(100) ); create table newpersons( persid serial, orgid int, lastname varchar(50), firstname varchar(50) ); create view v_persons as select P.orgid, O.legacyid, P.persid, P.lastname, P.firstname from neworg O, newpersons P where O.orgid = P.orgid; create or replace rule r_insert_a_organisation as on insert to v_persons do instead insert into neworg( orgid, legacyid, orgname) values ( coalesce(new.orgid, nextval('neworg_orgid_seq')), new.legacyid, coalesce(new.lastname, '') || ', ' || coalesce(new.firstname, '') ); create or replace rule r_insert_b_persons as on insert to v_persons do insert into newpersons( orgid, lastname, firstname) values ( coalesce(new.orgid, currval('neworg_orgid_seq')), new.lastname, new.firstname ); /* test 1 */ insert into v_persons(legacyid, lastname, firstname) values (11, 'Dax', 'Jadzia'); select * from v_persons; /* my result: orgid | legacyid | persid | lastname | firstname -------+----------+--------+----------+----------- 1 | 11 | 1 | Dax | Jadzia (1 row) */ /* test 2 */ begin; insert into v_persons(legacyid, lastname, firstname) values (12, 'Bashir', 'Dr.'); insert into v_persons(legacyid, lastname, firstname) values (13, '', 'Odo'); insert into v_persons(legacyid, lastname, firstname) values (14, '', 'Worf'); end; select * from v_persons; /* my result: orgid | legacyid | persid | lastname | firstname -------+----------+--------+----------+----------- 1 | 11 | 1 | Dax | Jadzia 2 | 12 | 2 | Bashir | Dr. 3 | 13 | 3 | | Odo 4 | 14 | 4 | | Worf (4 rows) */ /* test 3 */ insert into v_persons(legacyid, lastname, firstname) select * from olddata; select * from v_persons; /* my result: orgid | legacyid | persid | lastname | firstname -------+----------+--------+-----------+----------- 1 | 11 | 1 | Dax | Jadzia 2 | 12 | 2 | Bashir | Dr. 3 | 13 | 3 | | Odo 4 | 14 | 4 | | Worf 9 | 10 | 5 | Picard | Jean Luc 9 | 10 | 6 | Spock | Harold 9 | 10 | 7 | Zimmerman | Doc 9 | 10 | 8 | Lefler | Robin 9 | 10 | 9 | Kirk | James T. (9 rows) */ And exactly in test 3 you see my problem, it should actually look like orgid | legacyid | persid | lastname | firstname -------+----------+--------+-----------+----------- 1 | 11 | 1 | Dax | Jadzia 2 | 12 | 2 | Bashir | Dr. 3 | 13 | 3 | | Odo 4 | 14 | 4 | | Worf 5 | 1 | 5 | Picard | Jean Luc 6 | 3 | 6 | Spock | Harold 7 | 6 | 7 | Zimmerman | Doc 8 | 9 | 8 | Lefler | Robin 9 | 10 | 9 | Kirk | James T. Why the heck gets the wrong data inserted if it is an int!!!??? I hope somebody will help me out on this, for me this looks very much like a bug. Juergen PS: By the way it is a postgres 7.4.7. on Debian stable
# Juergen.Rose@sag-el.com / 2005-07-22 09:10:01 +0200: > > # Juergen.Rose@sag-el.com / 2005-07-21 19:11:04 +0200: > > > I use some updateable views to handle my data (which are amazingly > > > slow), which gives me ultimate flexibility to handle my data. > > > > > > there are some insert rules which use currval() to get the last > > > sequence id for my data which I have to insert. > > > > > > The problem now is, it works fine if I do the statement via normal > > > insert into satatements, even within a transaction block. So far so > > > good. But If I migrate my old data via **Insert into my_new_view > > > Select ... From my_old_table**, Only the last retrieved value of the > > > sequences is used which blows my whole internal logic, because > > > obviously I want to use the current (for that row) and not the last > > > id. > create table olddata( > id int, > lastname varchar(50), > firstname varchar(50) > ); > > insert into olddata values (1, 'Picard', 'Jean Luc'); > insert into olddata values (3, 'Spock', 'Harold'); > insert into olddata values (6, 'Zimmerman', 'Doc'); > insert into olddata values (9, 'Lefler', 'Robin'); > insert into olddata values (10, 'Kirk', 'James T.'); > > create table neworg( > orgid serial, > legacyid int, > orgname varchar(100) > ); > > create table newpersons( > persid serial, > orgid int, > lastname varchar(50), > firstname varchar(50) > ); > > create view v_persons as > select > P.orgid, > O.legacyid, > P.persid, > P.lastname, > P.firstname > from > neworg O, > newpersons P > where > O.orgid = P.orgid; > > create or replace rule r_insert_a_organisation as on insert to v_persons > do instead > insert into neworg( > orgid, > legacyid, > orgname) > values ( > coalesce(new.orgid, nextval('neworg_orgid_seq')), > new.legacyid, > coalesce(new.lastname, '') || ', ' || coalesce(new.firstname, > '') > ); > > create or replace rule r_insert_b_persons as on insert to v_persons > do > insert into newpersons( > orgid, > lastname, > firstname) > values ( > coalesce(new.orgid, currval('neworg_orgid_seq')), > new.lastname, > new.firstname > ); > /* test 3 */ > insert into v_persons(legacyid, lastname, firstname) select * from > olddata; > select * from v_persons; > > /* my result: > > orgid | legacyid | persid | lastname | firstname > -------+----------+--------+-----------+----------- > 1 | 11 | 1 | Dax | Jadzia > 2 | 12 | 2 | Bashir | Dr. > 3 | 13 | 3 | | Odo > 4 | 14 | 4 | | Worf > 9 | 10 | 5 | Picard | Jean Luc > 9 | 10 | 6 | Spock | Harold > 9 | 10 | 7 | Zimmerman | Doc > 9 | 10 | 8 | Lefler | Robin > 9 | 10 | 9 | Kirk | James T. > (9 rows) > > */ > > And exactly in test 3 you see my problem, it should actually look like > > orgid | legacyid | persid | lastname | firstname > -------+----------+--------+-----------+----------- > 1 | 11 | 1 | Dax | Jadzia > 2 | 12 | 2 | Bashir | Dr. > 3 | 13 | 3 | | Odo > 4 | 14 | 4 | | Worf > 5 | 1 | 5 | Picard | Jean Luc > 6 | 3 | 6 | Spock | Harold > 7 | 6 | 7 | Zimmerman | Doc > 8 | 9 | 8 | Lefler | Robin > 9 | 10 | 9 | Kirk | James T. > > Why the heck gets the wrong data inserted if it is an int!!!??? > > I hope somebody will help me out on this, for me this looks very much > like a bug. PostgreSQL did exactly what you told it to do. RULEs *rewrite queries*, which means the INSERT INTO ... SELECT gets transformed to something like insert into neworg( orgid, legacyid, orgname) select coalesce(new.orgid, nextval('neworg_orgid_seq')), new.id as legacyid, coalesce(new.lastname, '') || ', ' || coalesce(new.firstname, '') from olddata new; insert into newpersons ( orgid, lastname, firstname) select coalesce(new.orgid, currval('neworg_orgid_seq')), new.lastname, new.firstname from olddata new; and this is run once, not for every row. IOW, you'll have this problem with any multi-row inserts. -- How many Vietnam vets does it take to screw in a light bulb? You don't know, man. You don't KNOW. Cause you weren't THERE. http://bash.org/?255991
> # Juergen.Rose@sag-el.com / 2005-07-22 09:10:01 +0200: > > > # Juergen.Rose@sag-el.com / 2005-07-21 19:11:04 +0200: > > > > I use some updateable views to handle my data (which > are amazingly > > > > slow), which gives me ultimate flexibility to handle my data. > > > > > > > > there are some insert rules which use currval() to get the last > > > > sequence id for my data which I have to insert. > > > > > > > > The problem now is, it works fine if I do the statement > via normal > > > > insert into satatements, even within a transaction > block. So far so > > > > good. But If I migrate my old data via **Insert into my_new_view > > > > Select ... From my_old_table**, Only the last retrieved > value of the > > > > sequences is used which blows my whole internal logic, because > > > > obviously I want to use the current (for that row) and > not the last > > > > id. > > > create table olddata( > > id int, > > lastname varchar(50), > > firstname varchar(50) > > ); > > > > insert into olddata values (1, 'Picard', 'Jean Luc'); > > insert into olddata values (3, 'Spock', 'Harold'); > > insert into olddata values (6, 'Zimmerman', 'Doc'); > > insert into olddata values (9, 'Lefler', 'Robin'); > > insert into olddata values (10, 'Kirk', 'James T.'); > > > > create table neworg( > > orgid serial, > > legacyid int, > > orgname varchar(100) > > ); > > > > create table newpersons( > > persid serial, > > orgid int, > > lastname varchar(50), > > firstname varchar(50) > > ); > > > > create view v_persons as > > select > > P.orgid, > > O.legacyid, > > P.persid, > > P.lastname, > > P.firstname > > from > > neworg O, > > newpersons P > > where > > O.orgid = P.orgid; > > > > create or replace rule r_insert_a_organisation as on insert > to v_persons > > do instead > > insert into neworg( > > orgid, > > legacyid, > > orgname) > > values ( > > coalesce(new.orgid, nextval('neworg_orgid_seq')), > > new.legacyid, > > coalesce(new.lastname, '') || ', ' || > coalesce(new.firstname, > > '') > > ); > > > > create or replace rule r_insert_b_persons as on insert to v_persons > > do > > insert into newpersons( > > orgid, > > lastname, > > firstname) > > values ( > > coalesce(new.orgid, currval('neworg_orgid_seq')), > > new.lastname, > > new.firstname > > ); > > > /* test 3 */ > > insert into v_persons(legacyid, lastname, firstname) select * from > > olddata; > > select * from v_persons; > > > > /* my result: > > > > orgid | legacyid | persid | lastname | firstname > > -------+----------+--------+-----------+----------- > > 1 | 11 | 1 | Dax | Jadzia > > 2 | 12 | 2 | Bashir | Dr. > > 3 | 13 | 3 | | Odo > > 4 | 14 | 4 | | Worf > > 9 | 10 | 5 | Picard | Jean Luc > > 9 | 10 | 6 | Spock | Harold > > 9 | 10 | 7 | Zimmerman | Doc > > 9 | 10 | 8 | Lefler | Robin > > 9 | 10 | 9 | Kirk | James T. > > (9 rows) > > > > */ > > > > And exactly in test 3 you see my problem, it should > actually look like > > > > orgid | legacyid | persid | lastname | firstname > > -------+----------+--------+-----------+----------- > > 1 | 11 | 1 | Dax | Jadzia > > 2 | 12 | 2 | Bashir | Dr. > > 3 | 13 | 3 | | Odo > > 4 | 14 | 4 | | Worf > > 5 | 1 | 5 | Picard | Jean Luc > > 6 | 3 | 6 | Spock | Harold > > 7 | 6 | 7 | Zimmerman | Doc > > 8 | 9 | 8 | Lefler | Robin > > 9 | 10 | 9 | Kirk | James T. > > > > Why the heck gets the wrong data inserted if it is an int!!!??? > > > > I hope somebody will help me out on this, for me this looks > very much > > like a bug. > > PostgreSQL did exactly what you told it to do. RULEs *rewrite > queries*, which means the INSERT INTO ... SELECT gets > transformed to > something like > > insert into neworg( orgid, legacyid, orgname) > select > coalesce(new.orgid, nextval('neworg_orgid_seq')), > new.id as legacyid, > coalesce(new.lastname, '') || ', ' || > coalesce(new.firstname, '') > from olddata new; > > insert into newpersons ( orgid, lastname, firstname) > select > coalesce(new.orgid, currval('neworg_orgid_seq')), > new.lastname, > new.firstname > from olddata new; > > and this is run once, not for every row. IOW, you'll have this > problem with any multi-row inserts. So I can't actually solve this problem, but what I could do would be to not create views, but tables with rules, and put some trigger on the tables? Further if I understand you right, the rules are transformed actually to two different queries which are executed one after another and not row by row? Thanks for the enlightment so far Juergen
# Juergen.Rose@sag-el.com / 2005-07-22 13:04:27 +0200: > > > > # Juergen.Rose@sag-el.com / 2005-07-22 09:10:01 +0200: > > > > # Juergen.Rose@sag-el.com / 2005-07-21 19:11:04 +0200: > > > > > I use some updateable views to handle my data (which are > > > > > amazingly slow), which gives me ultimate flexibility to handle > > > > > my data. > > > > > > > > > > there are some insert rules which use currval() to get the last > > > > > sequence id for my data which I have to insert. > > > > > > > > > > The problem now is, it works fine if I do the statement via > > > > > normal insert into satatements, even within a transaction > > > > > block. So far so good. But If I migrate my old data via > > > > > **Insert into my_new_view Select ... From my_old_table**, Only > > > > > the last retrieved value of the sequences is used which blows > > > > > my whole internal logic, because obviously I want to use the > > > > > current (for that row) and not the last id. > > > Why the heck gets the wrong data inserted if it is an int!!!??? > > > > > > I hope somebody will help me out on this, for me this looks > > > very much like a bug. > > > > PostgreSQL did exactly what you told it to do. RULEs *rewrite > > queries*, which means the INSERT INTO ... SELECT gets > > transformed to something like > > > > insert into neworg( orgid, legacyid, orgname) > > select > > coalesce(new.orgid, nextval('neworg_orgid_seq')), > > new.id as legacyid, > > coalesce(new.lastname, '') || ', ' || > > coalesce(new.firstname, '') > > from olddata new; > > > > insert into newpersons ( orgid, lastname, firstname) > > select > > coalesce(new.orgid, currval('neworg_orgid_seq')), > > new.lastname, > > new.firstname > > from olddata new; > > > > and this is run once, not for every row. IOW, you'll have this > > problem with any multi-row inserts. > > So I can't actually solve this problem, but what I could do would be to > not create views, but tables with rules, and put some trigger on the > tables? Yup, a row-level trigger. > Further if I understand you right, the rules are transformed actually to > two different queries which are executed one after another and not row > by row? Right. Have you read the manual? http://www.postgresql.org/docs/current/static/sql-createrule.html "It is important to realize that a rule is really a command transformation mechanism, or command macro. The transformation happens before the execution of the commands starts. If you actually want an operation that fires independently for each physical row, you probably want to use a trigger, not a rule. More information about the rules system is in Chapter 33." Chapter 33 is http://www.postgresql.org/docs/current/static/rules.html -- How many Vietnam vets does it take to screw in a light bulb? You don't know, man. You don't KNOW. Cause you weren't THERE. http://bash.org/?255991