Re: Implementing Incremental View Maintenance - Mailing list pgsql-hackers

From Yugo Nagata
Subject Re: Implementing Incremental View Maintenance
Date
Msg-id 20190628195620.c306e3003a83bb85a12f54c5@sraoss.co.jp
Whole thread Raw
In response to Re: Implementing Incremental View Maintenance  (Yugo Nagata <nagata@sraoss.co.jp>)
Responses Re: Implementing Incremental View Maintenance
List pgsql-hackers
Hi,

Attached is a WIP patch of IVM which supports some aggregate functions.

Currently, only count and sum are supported. Avg, min, or max is not supported 
although I think supporting this would not be so hard. 

As a restriction, expressions specified in GROUP BY must appear in the target 
list of views because tuples to be updated in MV are identified by using this 
group keys.


In the case of views without aggregate functions, only the number of tuple 
duplicates (__ivm_count__) are updated at incremental maintenance. On the other 
hand, in the case of vies with aggregations, the aggregated values are also 
updated. The way of update depends the kind of aggregate function.

In the case of sum (or agg functions except to count), NULL in input values is 
ignored, and this returns a null value when no rows are selected. To support 
this specification, the number of not-NULL input values is counted and stored 
in MV as a hidden column whose name is like __ivm_count_sum__, for example.

In the case of count, this returns zero when no rows are selected, and count(*) 
doesn't ignore NULL input. These specification are also supported.

Tuples to be updated in MV are identified by using keys specified by GROUP BY 
clause. However, in the case of aggregation without GROUP BY, there is only one 
tuple in the view, so keys are not uses to identify tuples.


In addition, a race condition which occurred in the previous version is 
prevented in this patch. In the previous version, when two translocations 
change a base tables concurrently, an anormal update of MV was possible because 
a change in one transaction was not visible for another transaction even in 
READ COMMITTED level. 

To prevent this, I fix this to take a lock in early stage of view maintenance 
to wait for concurrent transactions which are updating the same MV end. Also, 
we have to get the latest snapshot before computting delta tables because any 
changes which occurs in other transaction during lock waiting is not visible 
even in READ COMMITTED level.

In REPEATABLE READ or SERIALIZABLE level, don't wait a lock, and raise an error 
immediately to prevent anormal update. These solutions might be ugly, but 
something to prevent anormal update is anyway necessary. There may be better 
way.


Moreover, some regression test are added for aggregate functions support.
This is Hoshiai-san's work.

Although the code is not refined yet and I will need a deal of refactoring
and  reorganizing, I submitted this to share the current status.


* Exapmle (from regression test)

=======================================================================
(1) creating tables

CREATE TABLE mv_base_a (i int, j int);
INSERT INTO mv_base_a VALUES
  (1,10),
  (2,20),
  (3,30),
  (4,40),
  (5,50);


(2) Views sith SUM() and COUNT() aggregation function

BEGIN;
CREATE INCREMENTAL MATERIALIZED VIEW mv_ivm_agg AS SELECT i, SUM(j), COUNT(i)  FROM mv_base_a GROUP BY i;
SELECT * FROM mv_ivm_agg ORDER BY 1,2,3;
 i | sum | count
---+-----+-------
 1 |  10 |     1
 2 |  20 |     1
 3 |  30 |     1
 4 |  40 |     1
 5 |  50 |     1
(5 rows)

INSERT INTO mv_base_a VALUES(2,100);
SELECT * FROM mv_ivm_agg ORDER BY 1,2,3;
 i | sum | count
---+-----+-------
 1 |  10 |     1
 2 | 120 |     2
 3 |  30 |     1
 4 |  40 |     1
 5 |  50 |     1
(5 rows)

UPDATE mv_base_a SET j = 200 WHERE (i,j) = (2,100);
SELECT * FROM mv_ivm_agg ORDER BY 1,2,3;
 i | sum | count
---+-----+-------
 1 |  10 |     1
 2 | 220 |     2
 3 |  30 |     1
 4 |  40 |     1
 5 |  50 |     1
(5 rows)

DELETE FROM mv_base_a WHERE (i,j) = (2,200);
SELECT * FROM mv_ivm_agg ORDER BY 1,2,3;
 i | sum | count
---+-----+-------
 1 |  10 |     1
 2 |  20 |     1
 3 |  30 |     1
 4 |  40 |     1
 5 |  50 |     1
(5 rows)

ROLLBACK;


(3) Views with COUNT(*) aggregation function

BEGIN;
CREATE INCREMENTAL MATERIALIZED VIEW mv_ivm_agg AS SELECT i, SUM(j),COUNT(*)  FROM mv_base_a GROUP BY i;
SELECT * FROM mv_ivm_agg ORDER BY 1,2,3;
 i | sum | count
---+-----+-------
 1 |  10 |     1
 2 |  20 |     1
 3 |  30 |     1
 4 |  40 |     1
 5 |  50 |     1
(5 rows)

INSERT INTO mv_base_a VALUES(2,100);
SELECT * FROM mv_ivm_agg ORDER BY 1,2,3;
 i | sum | count
---+-----+-------
 1 |  10 |     1
 2 | 120 |     2
 3 |  30 |     1
 4 |  40 |     1
 5 |  50 |     1
(5 rows)

ROLLBACK;

(4) Views with aggregation function without GROUP clause

BEGIN;
CREATE INCREMENTAL MATERIALIZED VIEW mv_ivm_group AS SELECT SUM(j)FROM mv_base_a;
SELECT * FROM mv_ivm_group ORDER BY 1;
 sum
-----
 150
(1 row)

INSERT INTO mv_base_a VALUES(6,20);
SELECT * FROM mv_ivm_group ORDER BY 1;
 sum
-----
 170
(1 row)
=======================================================================



On Thu, 20 Jun 2019 16:44:10 +0900
Yugo Nagata <nagata@sraoss.co.jp> wrote:

> Hi hackers,
> 
> Thank you for your many questions and feedbacks at PGCon 2019.
> Attached is the patch rebased for the current master branch.
> 
> Regards,
> Yugo Nagata
> 
> On Tue, 14 May 2019 15:46:48 +0900
> Yugo Nagata <nagata@sraoss.co.jp> wrote:
> 
> > On Mon, 1 Apr 2019 12:11:22 +0900
> > Yugo Nagata <nagata@sraoss.co.jp> wrote:
> > 
> > > On Thu, 27 Dec 2018 21:57:26 +0900
> > > Yugo Nagata <nagata@sraoss.co.jp> wrote:
> > > 
> > > > Hi,
> > > > 
> > > > I would like to implement Incremental View Maintenance (IVM) on PostgreSQL.  
> > > 
> > > I am now working on an initial patch for implementing IVM on PostgreSQL.
> > > This enables materialized views to be updated incrementally after one
> > > of their base tables is modified.
> > 
> > Attached is a WIP patch of Incremental View Maintenance (IVM).
> > Major part is written by me, and changes in syntax and pg_class
> > are Hoshiai-san's work.
> > 
> > Although this is sill a draft patch in work-in-progress, any
> > suggestions or thoughts would be appreciated.
> >  
> > * What it is
> > 
> > This allows a kind of Immediate Maintenance of materialized views. if a 
> > materialized view is created by CRATE INCREMENTAL MATERIALIZED VIEW command,
> > the contents of the mateview is updated automatically and incrementally
> > after base tables are updated. Noted this syntax is just tentative, so it
> > may be changed.
> > 
> > ====== Example 1 ======
> > postgres=# CREATE INCREMENTAL MATERIALIZED VIEW m AS SELECT * FROM t0;
> > SELECT 3
> > postgres=# SELECT * FROM m;
> >  i 
> > ---
> >  3
> >  2
> >  1
> > (3 rows)
> > 
> > postgres=# INSERT INTO t0 VALUES (4);
> > INSERT 0 1
> > postgres=# SELECt * FROM m; -- automatically updated
> >  i 
> > ---
> >  3
> >  2
> >  1
> >  4
> > (4 rows)
> > =============================
> > 
> > This implementation also supports matviews including duplicate tuples or
> > DISTINCT clause in its view definition query.  For example, even if a matview
> > is defined with DISTINCT to remove duplication of tuples in a base table, this
> > can perform incremental update of the matview properly. That is, the contents
> > of the matview doesn't change when exiting tuples are inserted into the base
> > tables, and a tuple in the matview is deleted only when duplicity of the
> > corresponding tuple in the base table becomes zero.
> > 
> > This is due to "colunting alogorithm" in which the number of each tuple is
> > stored in matviews as a special column value.
> > 
> > ====== Example 2 ======
> > postgres=# SELECT * FROM t1;
> >  id | t 
> > ----+---
> >   1 | A
> >   2 | B
> >   3 | C
> >   4 | A
> > (4 rows)
> > 
> > postgres=# CREATE INCREMENTAL MATERIALIZED VIEW m1 AS SELECT t FROM t1;
> > SELECT 3
> > postgres=# CREATE INCREMENTAL MATERIALIZED VIEW m2 AS SELECT DISTINCT t FROM t1;
> > SELECT 3
> > postgres=# SELECT * FROM m1; -- with duplicity
> >  t 
> > ---
> >  A
> >  A
> >  C
> >  B
> > (4 rows)
> > 
> > postgres=# SELECT * FROM m2;
> >  t 
> > ---
> >  A
> >  B
> >  C
> > (3 rows)
> > 
> > postgres=# INSERT INTO t1 VALUES (5, 'B');
> > INSERT 0 1
> > postgres=# DELETE FROM t1 WHERE id IN (1,3); -- delete (1,A),(3,C)
> > DELETE 2
> > postgres=# SELECT * FROM m1; -- one A left and one more B
> >  t 
> > ---
> >  B
> >  B
> >  A
> > (3 rows)
> > 
> > postgres=# SELECT * FROM m2; -- only C is removed
> >  t 
> > ---
> >  B
> >  A
> > (2 rows)
> > =============================
> > 
> > * How it works
> > 
> > 1. Creating matview
> > 
> > When a matview is created, AFTER triggers are internally created
> > on its base tables. When the base tables is modified (INSERT, DELETE,
> > UPDATE), the matview is updated incrementally in the trigger function.
> > 
> > When populating the matview, GROUP BY and count(*) are added to the
> > view definition query before this is executed for counting duplicity
> > of tuples in the matview. The result of count is stored in the matview
> > as a special column named "__ivm_count__". 
> > 
> > 2. Maintenance of matview
> > 
> > When base tables are modified, the change set of the table can be
> > referred as Ephemeral Named Relations (ENRs) thanks to Transition Table
> > (a feature of trigger implemented since PG10). We can calculate the diff
> > set of the matview by replacing the base table in the view definition
> > query with the ENR (at least if it is Selection-Projection -Join view). 
> > As well as view definition time, GROUP BY and count(*) is added in order
> > to count the duplicity of tuples in the diff set. As a result, two diff
> > sets (to be deleted from and to be inserted into the matview) are
> > calculated, and the results are stored into temporary tables respectively.
> > 
> > The matiview is updated by merging these change sets. Instead of executing
> > DELETE or INSERT simply, the values of __ivm_count__ column in the matview
> > is decreased or increased. When the values becomes zero, the corresponding
> > tuple is deleted from the matview.
> > 
> > 3. Access to matview
> > 
> > When SELECT is issued for IVM matviews defined with DISTINCT, all columns
> > except to __ivm_count__ of each tuple in the matview are returned.  This is 
> > correct because duplicity of tuples are eliminated by GROUP BY.
> > 
> > When DISTINCT is not used, SELECT for the IVM matviews returns each tuple
> > __ivm_count__ times. Currently, this is implemented by rewriting the SELECT
> > query to replace the matview RTE with a subquery which joins the matview
> > and generate_series function as bellow. 
> > 
> >  SELECT mv.* FROM mv, generate_series(1, mv.__ivm_count__);
> > 
> > __ivm_count__ column is invisible for users when "SELECT * FROM ..." is
> > issued, but users can see the value by specifying in target list explicitly.
> > 
> > ====== Example 3 ======
> > postgres=# \d+ m1
> >                                   Materialized view "public.m1"
> >     Column     |  Type  | Collation | Nullable | Default | Storage  | Stats target | Description 
> > ---------------+--------+-----------+----------+---------+----------+--------------+-------------
> >  t             | text   |           |          |         | extended |              | 
> >  __ivm_count__ | bigint |           |          |         | plain    |              | 
> > View definition:
> >  SELECT t1.t
> >    FROM t1;
> > Access method: heap
> > 
> > postgres=# \d+ m2
> >                                   Materialized view "public.m2"
> >     Column     |  Type  | Collation | Nullable | Default | Storage  | Stats target | Description 
> > ---------------+--------+-----------+----------+---------+----------+--------------+-------------
> >  t             | text   |           |          |         | extended |              | 
> >  __ivm_count__ | bigint |           |          |         | plain    |              | 
> > View definition:
> >  SELECT DISTINCT t1.t
> >    FROM t1;
> > Access method: heap
> > 
> > postgres=# SELECT *, __ivm_count__ FROM m1;
> >  t | __ivm_count__ 
> > ---+---------------
> >  B |             2
> >  B |             2
> >  A |             1
> > (3 rows)
> > 
> > postgres=# SELECT *, __ivm_count__ FROM m2;
> >  t | __ivm_count__ 
> > ---+---------------
> >  B |             2
> >  A |             1
> > (2 rows)
> > 
> > postgres=# EXPLAIN SELECT * FROM m1;
> >                                   QUERY PLAN                                  
> > ------------------------------------------------------------------------------
> >  Nested Loop  (cost=0.00..61.03 rows=3000 width=2)
> >    ->  Seq Scan on m1 mv  (cost=0.00..1.03 rows=3 width=10)
> >    ->  Function Scan on generate_series  (cost=0.00..10.00 rows=1000 width=0)
> > (3 rows)
> > =============================
> > 
> > * Simple Performance Evaluation
> > 
> > I confirmed that "incremental" update of matviews is more effective
> > than the standard REFRESH by using simple exapmle.  I used tables
> > of pgbench (SF=100) here.
> > 
> > Create two matviews, that is, without and with IVM.
> > 
> > test=# CREATE MATERIALIZED VIEW bench1 AS
> >         SELECT aid, bid, abalance, bbalance
> >         FROM pgbench_accounts JOIN pgbench_branches USING (bid)
> >         WHERE abalance > 0 OR bbalance > 0;
> > SELECT 5001054
> > test=# CREATE INCREMENTAL MATERIALIZED VIEW bench2 AS
> >         SELECT aid, bid, abalance, bbalance
> >         FROM pgbench_accounts JOIN pgbench_branches USING (bid)
> >         WHERE abalance > 0 OR bbalance > 0;
> > SELECT 5001054
> > 
> > The standard REFRESH of bench1 took more than 10 seconds.
> > 
> > test=# \timing 
> > Timing is on.
> > test=# REFRESH MATERIALIZED VIEW bench1 ;
> > REFRESH MATERIALIZED VIEW
> > Time: 11210.563 ms (00:11.211)
> > 
> > Create an index on the IVM matview (bench2).
> > 
> > test=# CREATE INDEX on bench2(aid,bid);
> > CREATE INDEX
> > 
> > Updating a tuple in pgbench_accounts took 18ms. After this, bench2
> > was updated automatically and correctly.
> > 
> > test=# SELECT * FROM bench2 WHERE aid = 1;
> >  aid | bid | abalance | bbalance 
> > -----+-----+----------+----------
> >    1 |   1 |       10 |       10
> > (1 row)
> > 
> > Time: 2.498 ms
> > test=# UPDATE pgbench_accounts SET abalance = 1000 WHERE aid = 1;
> > UPDATE 1
> > Time: 18.634 ms
> > test=# SELECT * FROM bench2 WHERE aid = 1;
> >  aid | bid | abalance | bbalance 
> > -----+-----+----------+----------
> >    1 |   1 |     1000 |       10
> > (1 row)
> > 
> > However, if there is not the index on bench2, it took 4 sec, so
> > appropriate indexes are needed on IVM matviews.
> > 
> > test=# DROP INDEX bench2_aid_bid_idx ;
> > DROP INDEX
> > Time: 10.613 ms
> > test=# UPDATE pgbench_accounts SET abalance = 2000 WHERE aid = 1;
> > UPDATE 1
> > Time: 3931.274 ms (00:03.931)
> > 
> > * Restrictions on view definition
> > 
> > This patch is still in Work-in-Progress and there are many restrictions
> > on the view definition query of matviews.
> > 
> > The current implementation supports views including selection, projection,
> > and inner join with or without DISTINCT. Aggregation and GROUP BY are not
> > supported yet, but I plan to deal with these by the first release. 
> > Self-join, subqueries, OUTER JOIN, CTE, window functions are not
> > considered well, either. I need more investigation on these type of views
> > although I found some papers explaining how to handle sub-queries and
> > outer-joins. 
> > 
> > These unsupported views should be checked when a matview is created, but
> > this is not implemented yet.  Hoshiai-san are working on this.
> > 
> > * Timing of view maintenance
> > 
> > This patch implements a kind of Immediate Maintenance, that is, a matview
> > is updated immediately when a base table is modified.  On other hand, in
> > "Deferred Maintenance", matviews are updated after the transaction, for
> > example, by the user command like REFRESH. 
> > 
> > For implementing "deferred", it is need to implement a mechanism to maintain
> > logs for recording changes of base tables and an algorithm to compute the
> > delta to be applied to matviews. 
> >  
> > In addition, there could be another implementation of Immediate Maintenance
> > in which matview is updated at the end of a transaction that modified base
> > table, rather than in AFTER trigger. Oracle supports this type of IVM. To
> > implement this, we will need a mechanism to maintain change logs on base
> > tables as well as Deferred maintenance.
> > 
> > * Counting algorithm implementation
> > 
> > There will be also discussions on counting-algorithm implementation.
> > Firstly, the current patch treats "__ivm_count__" as a special column name
> > in a somewhat ad hoc way. This is used when maintaining and accessing matviews,
> > and when "SELECT * FROM ..." is issued,  __ivm_count__ column is invisible for
> > users.  Maybe this name has to be inhibited in user tables. Is it acceptable
> > to use such columns for IVM, and is there better way, if not?
> > 
> > Secondly, a matview with duplicate tuples is replaces with a subquery which
> > uses generate_series function. It does not have to be generate_series, and we
> > can make a new set returning function for this. Anyway, this internal behaviour
> > is visible in EXPLAIN results as shown in Example 3. Also, there is a
> > performance impact because   estimated rows number is wrong, and what is worse,
> > the cost of join is not small when the size of matview is large. Therefore, we
> > might have to add a new plan node for selecting from matviews rather than using
> > such a special set returning function.
> > 
> > 
> > Ragards,
> > -- 
> > Yugo Nagata <nagata@sraoss.co.jp>
> 
> 
> -- 
> Yugo Nagata <nagata@sraoss.co.jp>


-- 
Yugo Nagata <nagata@sraoss.co.jp>

Attachment

pgsql-hackers by date:

Previous
From: Etsuro Fujita
Date:
Subject: Re: postgres_fdw: Minor improvement to postgresAcquireSampleRowsFunc
Next
From: Surafel Temesgen
Date:
Subject: Re: Conflict handling for COPY FROM