Thread: Differential (transactional) REFRESH

Differential (transactional) REFRESH

From
Kevin Grittner
Date:
<div style="color:#000; background-color:#fff; font-family:times new roman, new york, times, serif;font-size:12pt">In
thefirst CF for 9.4 I plan to submit a patch to allow<br />transactional REFRESH of a materialized view using
differential<br/>update.  Essentially I expect this to be the equivalent of running<br />the query specified for the
viewand saving the results into a<br />temporary table, and then doing DELETE and INSERT passes to make<br />the
matviewmatch the new data.  If mv is the matview and mv_temp<br />is the temporary storage for the new value for its
data,the logic<br />would be roughly the equivalent of:<br /><br />BEGIN;<br />LOCK mv IN SHARE ROW EXCLUSIVE MODE;<br
/>CREATETEMP TABLE mv_temp AS [mv query];<br />-- Create indexes here???  Capture statistics on temp table???<br
/>DELETEFROM mv WHERE NOT EXISTS (SELECT * FROM mv_temp<br />  WHERE (mv_temp.*) IS NOT DISTINCT FROM (mv.*));<br
/>INSERTINTO mv SELECT * FROM mv_temp WHERE NOT EXISTS<br />  (SELECT * FROM mv WHERE (mv.*) IS NOT DISTINCT FROM
(mv_temp.*));<br/>COMMIT;<br /><br />I can see more than one way to code this, but would appreciate<br />input on the
bestway sooner rather than later, if anyone is going<br />to have an opinion.<br /><br />Thoughts?<br /><br />-- <br
/>KevinGrittner<br />EnterpriseDB: http://www.enterprisedb.com<br />The Enterprise PostgreSQL Company</div> 

Re: Differential (transactional) REFRESH

From
Thom Brown
Date:
On 14 May 2013 19:51, Kevin Grittner <kgrittn@ymail.com> wrote:
> In the first CF for 9.4 I plan to submit a patch to allow
> transactional REFRESH of a materialized view using differential
> update.  Essentially I expect this to be the equivalent of running
> the query specified for the view and saving the results into a
> temporary table, and then doing DELETE and INSERT passes to make
> the matview match the new data.  If mv is the matview and mv_temp
> is the temporary storage for the new value for its data, the logic
> would be roughly the equivalent of:
>
> BEGIN;
> LOCK mv IN SHARE ROW EXCLUSIVE MODE;
> CREATE TEMP TABLE mv_temp AS [mv query];
> -- Create indexes here???  Capture statistics on temp table???
> DELETE FROM mv WHERE NOT EXISTS (SELECT * FROM mv_temp
>   WHERE (mv_temp.*) IS NOT DISTINCT FROM (mv.*));
> INSERT INTO mv SELECT * FROM mv_temp WHERE NOT EXISTS
>   (SELECT * FROM mv WHERE (mv.*) IS NOT DISTINCT FROM (mv_temp.*));
> COMMIT;
>
> I can see more than one way to code this, but would appreciate
> input on the best way sooner rather than later, if anyone is going
> to have an opinion.
>
> Thoughts?

Wouldn't this either delete everything or nothing, followed by
inserting everything or nothing?  WHERE NOT EXISTS wouldn't perform
any matching, just check to see whether there were matches or no
matches.

--
Thom



Re: Differential (transactional) REFRESH

From
Pavel Stehule
Date:
2013/5/14 Thom Brown <thom@linux.com>:
> On 14 May 2013 19:51, Kevin Grittner <kgrittn@ymail.com> wrote:
>> In the first CF for 9.4 I plan to submit a patch to allow
>> transactional REFRESH of a materialized view using differential
>> update.  Essentially I expect this to be the equivalent of running
>> the query specified for the view and saving the results into a
>> temporary table, and then doing DELETE and INSERT passes to make
>> the matview match the new data.  If mv is the matview and mv_temp
>> is the temporary storage for the new value for its data, the logic
>> would be roughly the equivalent of:
>>
>> BEGIN;
>> LOCK mv IN SHARE ROW EXCLUSIVE MODE;
>> CREATE TEMP TABLE mv_temp AS [mv query];
>> -- Create indexes here???  Capture statistics on temp table???
>> DELETE FROM mv WHERE NOT EXISTS (SELECT * FROM mv_temp
>>   WHERE (mv_temp.*) IS NOT DISTINCT FROM (mv.*));
>> INSERT INTO mv SELECT * FROM mv_temp WHERE NOT EXISTS
>>   (SELECT * FROM mv WHERE (mv.*) IS NOT DISTINCT FROM (mv_temp.*));
>> COMMIT;
>>
>> I can see more than one way to code this, but would appreciate
>> input on the best way sooner rather than later, if anyone is going
>> to have an opinion.
>>
>> Thoughts?
>
> Wouldn't this either delete everything or nothing, followed by
> inserting everything or nothing?  WHERE NOT EXISTS wouldn't perform
> any matching, just check to see whether there were matches or no
> matches.

depends how much rows is changed. When view is almost stable, then
delete everything can be very slow. But refresh of view can be
implemented with some simplified version of MERGE statement, and it
can be more effective than two independent statements.

>
> --
> Thom
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers



Re: Differential (transactional) REFRESH

From
Thom Brown
Date:
On 14 May 2013 20:55, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> 2013/5/14 Thom Brown <thom@linux.com>:
>> On 14 May 2013 19:51, Kevin Grittner <kgrittn@ymail.com> wrote:
>>> In the first CF for 9.4 I plan to submit a patch to allow
>>> transactional REFRESH of a materialized view using differential
>>> update.  Essentially I expect this to be the equivalent of running
>>> the query specified for the view and saving the results into a
>>> temporary table, and then doing DELETE and INSERT passes to make
>>> the matview match the new data.  If mv is the matview and mv_temp
>>> is the temporary storage for the new value for its data, the logic
>>> would be roughly the equivalent of:
>>>
>>> BEGIN;
>>> LOCK mv IN SHARE ROW EXCLUSIVE MODE;
>>> CREATE TEMP TABLE mv_temp AS [mv query];
>>> -- Create indexes here???  Capture statistics on temp table???
>>> DELETE FROM mv WHERE NOT EXISTS (SELECT * FROM mv_temp
>>>   WHERE (mv_temp.*) IS NOT DISTINCT FROM (mv.*));
>>> INSERT INTO mv SELECT * FROM mv_temp WHERE NOT EXISTS
>>>   (SELECT * FROM mv WHERE (mv.*) IS NOT DISTINCT FROM (mv_temp.*));
>>> COMMIT;
>>>
>>> I can see more than one way to code this, but would appreciate
>>> input on the best way sooner rather than later, if anyone is going
>>> to have an opinion.
>>>
>>> Thoughts?
>>
>> Wouldn't this either delete everything or nothing, followed by
>> inserting everything or nothing?  WHERE NOT EXISTS wouldn't perform
>> any matching, just check to see whether there were matches or no
>> matches.
>
> depends how much rows is changed. When view is almost stable, then
> delete everything can be very slow. But refresh of view can be
> implemented with some simplified version of MERGE statement, and it
> can be more effective than two independent statements.

Well that's my point, if 1 row, or 100 million rows were different,
that query would appear to perform the same action, namely to delete
everything from the materialised view, regardless of whether there
were rows that matched.  Only if there were no differences would there
be no action.  I suspect this isn't the intended query.

--
Thom



Re: Differential (transactional) REFRESH

From
Kevin Grittner
Date:
<div style="color:#000; background-color:#fff; font-family:times new roman, new york, times, serif;font-size:12pt">Thom
Brown<thom@linux.com> wrote:<br />> On 14 May 2013 19:51, Kevin Grittner <kgrittn@ymail.com> wrote:<br
/>>>In the first CF for 9.4 I plan to submit a patch to allow<br />>> transactional REFRESH of a
materializedview using differential<br />>> update.  Essentially I expect this to be the equivalent of running<br
/>>>the query specified for the view and saving the results into a<br />>> temporary table, and then doing
DELETEand INSERT passes to make<br />>> the matview match the new data.  If mv is the matview and mv_temp<br
/>>>is the temporary storage for the new value for its data, the logic<br />>> would be roughly the
equivalentof:<br />>><br />>> BEGIN;<br />>> LOCK mv IN SHARE ROW EXCLUSIVE MODE;<br />>>
CREATETEMP TABLE mv_temp AS [mv query];<br />>> -- Create indexes here???  Capture statistics on temp table???<br
/>>>DELETE FROM mv WHERE NOT EXISTS (SELECT * FROM mv_temp<br />>>  WHERE (mv_temp.*) IS NOT DISTINCT FROM
(mv.*));<br/>>> INSERT INTO mv SELECT * FROM mv_temp WHERE NOT EXISTS<br />>>  (SELECT * FROM mv WHERE
(mv.*)IS NOT DISTINCT FROM (mv_temp.*));<br />>> COMMIT;<br />> <br />> Wouldn't this either delete
everythingor nothing, followed by<br />> inserting everything or nothing?  WHERE NOT EXISTS wouldn't perform<br
/>>any matching, just check to see whether there were matches or no<br />> matches.<br /><br />No.<br /><br
/>test=#-- Mock up the matview and the generated temp replacement in regular tables<br />test=# -- for purposes of
demonstration.<br/>test=# create table mv (id int not null primary key, val text);<br />CREATE TABLE<br />test=# insert
intomv values (1, 'one'), (2, 'two'), (3, null), (4, 'four');<br />INSERT 0 4<br />test=# create temp table mv_temp as
select* from mv;<br />SELECT 4<br />test=# update mv_temp set val = null where id = 4;<br />UPDATE 1<br />test=# update
mv_tempset val = 'zwei' where id = 2;<br />UPDATE 1<br />test=# delete from mv_temp where id = 1;<br />DELETE 1<br
/>test=#insert into mv_temp values (5, 'five');<br />INSERT 0 1<br />test=# -- Show both.<br />test=# select * from mv
orderby id;<br /> id | val  <br />----+------<br />  1 | one<br />  2 | two<br />  3 |<br />  4 | four<br />(4 rows)<br
/><br/>test=# select * from mv_temp order by id;<br /> id | val  <br />----+------<br />  2 | zwei<br />  3 |<br />  4
|<br/>  5 | five<br />(4 rows)<br /><br />test=# -- Perform the differential update's delete.<br />test=# delete from
mvwhere not exists (select * from mv_temp<br />test(#    where (mv_temp.*) is not distinct from (mv.*));<br />DELETE
3<br/>test=# -- Show both.<br />test=# select * from mv order by id;<br /> id | val<br />----+-----<br />  3 |<br />(1
row)<br/><br />test=# select * from mv_temp order by id;<br /> id | val  <br />----+------<br />  2 | zwei<br />  3
|<br/>  4 |<br />  5 | five<br />(4 rows)<br /><br />test=# -- Perform the differential update's insert.<br />test=#
insertinto mv select * from mv_temp where not exists<br />test-#   (select * from mv where (mv.*) is not distinct from
(mv_temp.*));<br/>INSERT 0 3<br />test=# -- Show both.<br />test=# select * from mv order by id;<br /> id | val  <br
/>----+------<br/>  2 | zwei<br />  3 |<br />  4 |<br />  5 | five<br />(4 rows)<br /><br />test=# select * from
mv_temporder by id;<br /> id | val  <br />----+------<br />  2 | zwei<br />  3 |<br />  4 |<br />  5 | five<br />(4
rows)<br/><br />--<br />Kevin Grittner<br />EnterpriseDB: http://www.enterprisedb.com<br />The Enterprise PostgreSQL
Company<br/><br /></div> 

Re: Differential (transactional) REFRESH

From
Thom Brown
Date:
On 14 May 2013 21:04, Kevin Grittner <kgrittn@ymail.com> wrote:
> Thom Brown <thom@linux.com> wrote:
>> On 14 May 2013 19:51, Kevin Grittner <kgrittn@ymail.com> wrote:
>>> In the first CF for 9.4 I plan to submit a patch to allow
>>> transactional REFRESH of a materialized view using differential
>>> update.  Essentially I expect this to be the equivalent of running
>>> the query specified for the view and saving the results into a
>>> temporary table, and then doing DELETE and INSERT passes to make
>>> the matview match the new data.  If mv is the matview and mv_temp
>>> is the temporary storage for the new value for its data, the logic
>>> would be roughly the equivalent of:
>>>
>>> BEGIN;
>>> LOCK mv IN SHARE ROW EXCLUSIVE MODE;
>>> CREATE TEMP TABLE mv_temp AS [mv query];
>>> -- Create indexes here???  Capture statistics on temp table???
>>> DELETE FROM mv WHERE NOT EXISTS (SELECT * FROM mv_temp
>>>  WHERE (mv_temp.*) IS NOT DISTINCT FROM (mv.*));
>>> INSERT INTO mv SELECT * FROM mv_temp WHERE NOT EXISTS
>>>  (SELECT * FROM mv WHERE (mv.*) IS NOT DISTINCT FROM (mv_temp.*));
>>> COMMIT;
>>
>> Wouldn't this either delete everything or nothing, followed by
>> inserting everything or nothing?  WHERE NOT EXISTS wouldn't perform
>> any matching, just check to see whether there were matches or no
>> matches.
>
> No.
>
> test=# -- Mock up the matview and the generated temp replacement in regular
> tables
> test=# -- for purposes of demonstration.
> test=# create table mv (id int not null primary key, val text);
> CREATE TABLE
> test=# insert into mv values (1, 'one'), (2, 'two'), (3, null), (4, 'four');
> INSERT 0 4
> test=# create temp table mv_temp as select * from mv;
> SELECT 4
> test=# update mv_temp set val = null where id = 4;
> UPDATE 1
> test=# update mv_temp set val = 'zwei' where id = 2;
> UPDATE 1
> test=# delete from mv_temp where id = 1;
> DELETE 1
> test=# insert into mv_temp values (5, 'five');
> INSERT 0 1
> test=# -- Show both.
> test=# select * from mv order by id;
>  id | val
> ----+------
>   1 | one
>   2 | two
>   3 |
>   4 | four
> (4 rows)
>
> test=# select * from mv_temp order by id;
>  id | val
> ----+------
>   2 | zwei
>   3 |
>   4 |
>   5 | five
> (4 rows)
>
> test=# -- Perform the differential update's delete.
> test=# delete from mv where not exists (select * from mv_temp
> test(#    where (mv_temp.*) is not distinct from (mv.*));
> DELETE 3
> test=# -- Show both.
> test=# select * from mv order by id;
>  id | val
> ----+-----
>   3 |
> (1 row)
>
> test=# select * from mv_temp order by id;
>  id | val
> ----+------
>   2 | zwei
>   3 |
>   4 |
>   5 | five
> (4 rows)
>
> test=# -- Perform the differential update's insert.
> test=# insert into mv select * from mv_temp where not exists
> test-#   (select * from mv where (mv.*) is not distinct from (mv_temp.*));
> INSERT 0 3
> test=# -- Show both.
> test=# select * from mv order by id;
>  id | val
> ----+------
>   2 | zwei
>   3 |
>   4 |
>   5 | five
> (4 rows)
>
> test=# select * from mv_temp order by id;
>  id | val
> ----+------
>   2 | zwei
>   3 |
>   4 |
>   5 | five
> (4 rows)

You're right, I think I'm having a brain-fail!  I'm too used to seeing
the behaviour from WHERE NOT EXISTS (SELECT 1...).  Apologies for the
noise.

--
Thom