Thread: Differential (transactional) REFRESH
<div style="color:#000; background-color:#fff; font-family:times new roman, new york, times, serif;font-size:12pt">In thefirst CF for 9.4 I plan to submit a patch to allow<br />transactional REFRESH of a materialized view using differential<br/>update. Essentially I expect this to be the equivalent of running<br />the query specified for the viewand saving the results into a<br />temporary table, and then doing DELETE and INSERT passes to make<br />the matviewmatch the new data. If mv is the matview and mv_temp<br />is the temporary storage for the new value for its data,the logic<br />would be roughly the equivalent of:<br /><br />BEGIN;<br />LOCK mv IN SHARE ROW EXCLUSIVE MODE;<br />CREATETEMP TABLE mv_temp AS [mv query];<br />-- Create indexes here??? Capture statistics on temp table???<br />DELETEFROM mv WHERE NOT EXISTS (SELECT * FROM mv_temp<br /> WHERE (mv_temp.*) IS NOT DISTINCT FROM (mv.*));<br />INSERTINTO mv SELECT * FROM mv_temp WHERE NOT EXISTS<br /> (SELECT * FROM mv WHERE (mv.*) IS NOT DISTINCT FROM (mv_temp.*));<br/>COMMIT;<br /><br />I can see more than one way to code this, but would appreciate<br />input on the bestway sooner rather than later, if anyone is going<br />to have an opinion.<br /><br />Thoughts?<br /><br />-- <br />KevinGrittner<br />EnterpriseDB: http://www.enterprisedb.com<br />The Enterprise PostgreSQL Company</div>
On 14 May 2013 19:51, Kevin Grittner <kgrittn@ymail.com> wrote: > In the first CF for 9.4 I plan to submit a patch to allow > transactional REFRESH of a materialized view using differential > update. Essentially I expect this to be the equivalent of running > the query specified for the view and saving the results into a > temporary table, and then doing DELETE and INSERT passes to make > the matview match the new data. If mv is the matview and mv_temp > is the temporary storage for the new value for its data, the logic > would be roughly the equivalent of: > > BEGIN; > LOCK mv IN SHARE ROW EXCLUSIVE MODE; > CREATE TEMP TABLE mv_temp AS [mv query]; > -- Create indexes here??? Capture statistics on temp table??? > DELETE FROM mv WHERE NOT EXISTS (SELECT * FROM mv_temp > WHERE (mv_temp.*) IS NOT DISTINCT FROM (mv.*)); > INSERT INTO mv SELECT * FROM mv_temp WHERE NOT EXISTS > (SELECT * FROM mv WHERE (mv.*) IS NOT DISTINCT FROM (mv_temp.*)); > COMMIT; > > I can see more than one way to code this, but would appreciate > input on the best way sooner rather than later, if anyone is going > to have an opinion. > > Thoughts? Wouldn't this either delete everything or nothing, followed by inserting everything or nothing? WHERE NOT EXISTS wouldn't perform any matching, just check to see whether there were matches or no matches. -- Thom
2013/5/14 Thom Brown <thom@linux.com>: > On 14 May 2013 19:51, Kevin Grittner <kgrittn@ymail.com> wrote: >> In the first CF for 9.4 I plan to submit a patch to allow >> transactional REFRESH of a materialized view using differential >> update. Essentially I expect this to be the equivalent of running >> the query specified for the view and saving the results into a >> temporary table, and then doing DELETE and INSERT passes to make >> the matview match the new data. If mv is the matview and mv_temp >> is the temporary storage for the new value for its data, the logic >> would be roughly the equivalent of: >> >> BEGIN; >> LOCK mv IN SHARE ROW EXCLUSIVE MODE; >> CREATE TEMP TABLE mv_temp AS [mv query]; >> -- Create indexes here??? Capture statistics on temp table??? >> DELETE FROM mv WHERE NOT EXISTS (SELECT * FROM mv_temp >> WHERE (mv_temp.*) IS NOT DISTINCT FROM (mv.*)); >> INSERT INTO mv SELECT * FROM mv_temp WHERE NOT EXISTS >> (SELECT * FROM mv WHERE (mv.*) IS NOT DISTINCT FROM (mv_temp.*)); >> COMMIT; >> >> I can see more than one way to code this, but would appreciate >> input on the best way sooner rather than later, if anyone is going >> to have an opinion. >> >> Thoughts? > > Wouldn't this either delete everything or nothing, followed by > inserting everything or nothing? WHERE NOT EXISTS wouldn't perform > any matching, just check to see whether there were matches or no > matches. depends how much rows is changed. When view is almost stable, then delete everything can be very slow. But refresh of view can be implemented with some simplified version of MERGE statement, and it can be more effective than two independent statements. > > -- > Thom > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers
On 14 May 2013 20:55, Pavel Stehule <pavel.stehule@gmail.com> wrote: > 2013/5/14 Thom Brown <thom@linux.com>: >> On 14 May 2013 19:51, Kevin Grittner <kgrittn@ymail.com> wrote: >>> In the first CF for 9.4 I plan to submit a patch to allow >>> transactional REFRESH of a materialized view using differential >>> update. Essentially I expect this to be the equivalent of running >>> the query specified for the view and saving the results into a >>> temporary table, and then doing DELETE and INSERT passes to make >>> the matview match the new data. If mv is the matview and mv_temp >>> is the temporary storage for the new value for its data, the logic >>> would be roughly the equivalent of: >>> >>> BEGIN; >>> LOCK mv IN SHARE ROW EXCLUSIVE MODE; >>> CREATE TEMP TABLE mv_temp AS [mv query]; >>> -- Create indexes here??? Capture statistics on temp table??? >>> DELETE FROM mv WHERE NOT EXISTS (SELECT * FROM mv_temp >>> WHERE (mv_temp.*) IS NOT DISTINCT FROM (mv.*)); >>> INSERT INTO mv SELECT * FROM mv_temp WHERE NOT EXISTS >>> (SELECT * FROM mv WHERE (mv.*) IS NOT DISTINCT FROM (mv_temp.*)); >>> COMMIT; >>> >>> I can see more than one way to code this, but would appreciate >>> input on the best way sooner rather than later, if anyone is going >>> to have an opinion. >>> >>> Thoughts? >> >> Wouldn't this either delete everything or nothing, followed by >> inserting everything or nothing? WHERE NOT EXISTS wouldn't perform >> any matching, just check to see whether there were matches or no >> matches. > > depends how much rows is changed. When view is almost stable, then > delete everything can be very slow. But refresh of view can be > implemented with some simplified version of MERGE statement, and it > can be more effective than two independent statements. Well that's my point, if 1 row, or 100 million rows were different, that query would appear to perform the same action, namely to delete everything from the materialised view, regardless of whether there were rows that matched. Only if there were no differences would there be no action. I suspect this isn't the intended query. -- Thom
<div style="color:#000; background-color:#fff; font-family:times new roman, new york, times, serif;font-size:12pt">Thom Brown<thom@linux.com> wrote:<br />> On 14 May 2013 19:51, Kevin Grittner <kgrittn@ymail.com> wrote:<br />>>In the first CF for 9.4 I plan to submit a patch to allow<br />>> transactional REFRESH of a materializedview using differential<br />>> update. Essentially I expect this to be the equivalent of running<br />>>the query specified for the view and saving the results into a<br />>> temporary table, and then doing DELETEand INSERT passes to make<br />>> the matview match the new data. If mv is the matview and mv_temp<br />>>is the temporary storage for the new value for its data, the logic<br />>> would be roughly the equivalentof:<br />>><br />>> BEGIN;<br />>> LOCK mv IN SHARE ROW EXCLUSIVE MODE;<br />>> CREATETEMP TABLE mv_temp AS [mv query];<br />>> -- Create indexes here??? Capture statistics on temp table???<br />>>DELETE FROM mv WHERE NOT EXISTS (SELECT * FROM mv_temp<br />>> WHERE (mv_temp.*) IS NOT DISTINCT FROM (mv.*));<br/>>> INSERT INTO mv SELECT * FROM mv_temp WHERE NOT EXISTS<br />>> (SELECT * FROM mv WHERE (mv.*)IS NOT DISTINCT FROM (mv_temp.*));<br />>> COMMIT;<br />> <br />> Wouldn't this either delete everythingor nothing, followed by<br />> inserting everything or nothing? WHERE NOT EXISTS wouldn't perform<br />>any matching, just check to see whether there were matches or no<br />> matches.<br /><br />No.<br /><br />test=#-- Mock up the matview and the generated temp replacement in regular tables<br />test=# -- for purposes of demonstration.<br/>test=# create table mv (id int not null primary key, val text);<br />CREATE TABLE<br />test=# insert intomv values (1, 'one'), (2, 'two'), (3, null), (4, 'four');<br />INSERT 0 4<br />test=# create temp table mv_temp as select* from mv;<br />SELECT 4<br />test=# update mv_temp set val = null where id = 4;<br />UPDATE 1<br />test=# update mv_tempset val = 'zwei' where id = 2;<br />UPDATE 1<br />test=# delete from mv_temp where id = 1;<br />DELETE 1<br />test=#insert into mv_temp values (5, 'five');<br />INSERT 0 1<br />test=# -- Show both.<br />test=# select * from mv orderby id;<br /> id | val <br />----+------<br /> 1 | one<br /> 2 | two<br /> 3 |<br /> 4 | four<br />(4 rows)<br /><br/>test=# select * from mv_temp order by id;<br /> id | val <br />----+------<br /> 2 | zwei<br /> 3 |<br /> 4 |<br/> 5 | five<br />(4 rows)<br /><br />test=# -- Perform the differential update's delete.<br />test=# delete from mvwhere not exists (select * from mv_temp<br />test(# where (mv_temp.*) is not distinct from (mv.*));<br />DELETE 3<br/>test=# -- Show both.<br />test=# select * from mv order by id;<br /> id | val<br />----+-----<br /> 3 |<br />(1 row)<br/><br />test=# select * from mv_temp order by id;<br /> id | val <br />----+------<br /> 2 | zwei<br /> 3 |<br/> 4 |<br /> 5 | five<br />(4 rows)<br /><br />test=# -- Perform the differential update's insert.<br />test=# insertinto mv select * from mv_temp where not exists<br />test-# (select * from mv where (mv.*) is not distinct from (mv_temp.*));<br/>INSERT 0 3<br />test=# -- Show both.<br />test=# select * from mv order by id;<br /> id | val <br />----+------<br/> 2 | zwei<br /> 3 |<br /> 4 |<br /> 5 | five<br />(4 rows)<br /><br />test=# select * from mv_temporder by id;<br /> id | val <br />----+------<br /> 2 | zwei<br /> 3 |<br /> 4 |<br /> 5 | five<br />(4 rows)<br/><br />--<br />Kevin Grittner<br />EnterpriseDB: http://www.enterprisedb.com<br />The Enterprise PostgreSQL Company<br/><br /></div>
On 14 May 2013 21:04, Kevin Grittner <kgrittn@ymail.com> wrote: > Thom Brown <thom@linux.com> wrote: >> On 14 May 2013 19:51, Kevin Grittner <kgrittn@ymail.com> wrote: >>> In the first CF for 9.4 I plan to submit a patch to allow >>> transactional REFRESH of a materialized view using differential >>> update. Essentially I expect this to be the equivalent of running >>> the query specified for the view and saving the results into a >>> temporary table, and then doing DELETE and INSERT passes to make >>> the matview match the new data. If mv is the matview and mv_temp >>> is the temporary storage for the new value for its data, the logic >>> would be roughly the equivalent of: >>> >>> BEGIN; >>> LOCK mv IN SHARE ROW EXCLUSIVE MODE; >>> CREATE TEMP TABLE mv_temp AS [mv query]; >>> -- Create indexes here??? Capture statistics on temp table??? >>> DELETE FROM mv WHERE NOT EXISTS (SELECT * FROM mv_temp >>> WHERE (mv_temp.*) IS NOT DISTINCT FROM (mv.*)); >>> INSERT INTO mv SELECT * FROM mv_temp WHERE NOT EXISTS >>> (SELECT * FROM mv WHERE (mv.*) IS NOT DISTINCT FROM (mv_temp.*)); >>> COMMIT; >> >> Wouldn't this either delete everything or nothing, followed by >> inserting everything or nothing? WHERE NOT EXISTS wouldn't perform >> any matching, just check to see whether there were matches or no >> matches. > > No. > > test=# -- Mock up the matview and the generated temp replacement in regular > tables > test=# -- for purposes of demonstration. > test=# create table mv (id int not null primary key, val text); > CREATE TABLE > test=# insert into mv values (1, 'one'), (2, 'two'), (3, null), (4, 'four'); > INSERT 0 4 > test=# create temp table mv_temp as select * from mv; > SELECT 4 > test=# update mv_temp set val = null where id = 4; > UPDATE 1 > test=# update mv_temp set val = 'zwei' where id = 2; > UPDATE 1 > test=# delete from mv_temp where id = 1; > DELETE 1 > test=# insert into mv_temp values (5, 'five'); > INSERT 0 1 > test=# -- Show both. > test=# select * from mv order by id; > id | val > ----+------ > 1 | one > 2 | two > 3 | > 4 | four > (4 rows) > > test=# select * from mv_temp order by id; > id | val > ----+------ > 2 | zwei > 3 | > 4 | > 5 | five > (4 rows) > > test=# -- Perform the differential update's delete. > test=# delete from mv where not exists (select * from mv_temp > test(# where (mv_temp.*) is not distinct from (mv.*)); > DELETE 3 > test=# -- Show both. > test=# select * from mv order by id; > id | val > ----+----- > 3 | > (1 row) > > test=# select * from mv_temp order by id; > id | val > ----+------ > 2 | zwei > 3 | > 4 | > 5 | five > (4 rows) > > test=# -- Perform the differential update's insert. > test=# insert into mv select * from mv_temp where not exists > test-# (select * from mv where (mv.*) is not distinct from (mv_temp.*)); > INSERT 0 3 > test=# -- Show both. > test=# select * from mv order by id; > id | val > ----+------ > 2 | zwei > 3 | > 4 | > 5 | five > (4 rows) > > test=# select * from mv_temp order by id; > id | val > ----+------ > 2 | zwei > 3 | > 4 | > 5 | five > (4 rows) You're right, I think I'm having a brain-fail! I'm too used to seeing the behaviour from WHERE NOT EXISTS (SELECT 1...). Apologies for the noise. -- Thom