Re: Implementing Incremental View Maintenance - Mailing list pgsql-hackers

From Yugo Nagata
Subject Re: Implementing Incremental View Maintenance
Date
Msg-id 20191126160225.a0e95389a0f83b9257080745@sraoss.co.jp
Whole thread Raw
In response to Re: Implementing Incremental View Maintenance  (Tatsuo Ishii <ishii@sraoss.co.jp>)
Responses Re: Implementing Incremental View Maintenance  (Tatsuo Ishii <ishii@sraoss.co.jp>)
Re: Implementing Incremental View Maintenance  (Tatsuo Ishii <ishii@sraoss.co.jp>)
Re: Implementing Incremental View Maintenance  (Yugo Nagata <nagata@sraoss.co.jp>)
List pgsql-hackers
Hi,

Attached is the latest patch (v8) to add support for Incremental View
Maintenance (IVM).  This patch adds OUTER join support in addition
to the patch (v7) submitted last week in the following post.

On Fri, 22 Nov 2019 15:29:45 +0900 (JST)
Tatsuo Ishii <ishii@sraoss.co.jp> wrote:
> Up to now, IVM supports materialized views using:
> 
> - Inner joins
> - Some aggregate functions (count, sum, min, max, avg)
> - GROUP BY
> - Self joins
> 
> With the latest patch now IVM supports subqueries in addition to
> above.
> 
> Known limitations are listed here:
> 
> https://github.com/sraoss/pgsql-ivm/issues
> 
> See more details at:
> https://wiki.postgresql.org/wiki/Incremental_View_Maintenance

* About outer join support:

In case of outer-join, when a table is modified, in addition to deltas
which occur in inner-join case, we also need to deletion or insertion of
dangling tuples, that is, null-extended tuples generated when a join
condition isn't met.

[Example]
---------------------------------------------
-- Create base tables and an outer join view
CREATE TABLE r(i int);
CREATE TABLE s(j int);
INSERT INTO r VALUES (1);
CREATE INCREMENTAL MATERIALIZED VIEW mv 
 AS SELECT * FROM r LEFT JOIN s ON r.i=s.j;
SELECT * FROM mv;
 i | j 
---+---
(1 row)

-- After an insertion to a base table ...
INSERT INTO s VALUES (1);
 -- (1,1) is inserted and (1,null) is deleted from the view.
SELECT * FROM mv;
 i | j 
---+---
 1 | 1
(1 row)
---------------------------------------------

Our implementation is basically based on the algorithm of Larson & Zhou
(2007) [1]. Before view maintenances, the view definition query's jointree
is analysed to make "view maintenance graph". This graph represents
which tuples in the views are affected when a base table is modified.
Specifically, tuples which are not null-extended on the modified table
(that is, tuples generated by joins with the modiifed table) are directly
affected. The delta of such effects are calculated similarly to inner-joins.

On the other hand, dangling tuples generated by anti-joins with directly
affected tuples can be indirectly affected. This means that we may need to
delete dangling tuples when any tuples are inserted to a table, as well as
to insert dangling tuples when tuples are deleted from a table.

[1] Efficient Maintenance of Materialized Outer-Join Views (Larson & Zhou, 2007)
https://ieeexplore.ieee.org/document/4221654

Although the original paper assumes that every base table and view have a
unique key and tuple duplicates is disallowed, we allow this. If a view has
tuple duplicates, we have to determine the number of each dangling tuple to
be inserted into the view when tuples in a table are deleted. For this purpose,
we count the number of each tuples which constitute a deleted tuple. These
counts are stored as JSONB object in the delta table, and we use this
information to maintain outer-join views. Also, we support outer self-joins
that is not assumed in the original paper.

* Restrictions

Currently, we have following restrictions:

- outer join view's targetlist must contain attributes used in join conditions
- outer join view's targetlist cannot contain non-strict functions
- outer join supports only simple equijoin
- outer join view's WHERE clause cannot contain non null-rejecting predicates
- aggregate is not supported with outer join
- subquery (including EXSITS) is not supported with outer join


Regression tests for all patterns of 3-way outer join and are added. 

Moreover, I reordered IVM related functions in matview.c so that ones
which have relationship will be located closely. Moreover, I added more
comments.

Regards,
Yugo Nagata


-- 
Yugo Nagata <nagata@sraoss.co.jp>

Attachment

pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: logical decoding : exceeded maxAllocatedDescs for .spill files
Next
From: Tatsuo Ishii
Date:
Subject: Re: Implementing Incremental View Maintenance