Re: Implementing Incremental View Maintenance - Mailing list pgsql-hackers

From Yugo NAGATA
Subject Re: Implementing Incremental View Maintenance
Date
Msg-id 20200413141835.04fcdf6424985369554b6a4d@sraoss.co.jp
Whole thread Raw
In response to Re: Implementing Incremental View Maintenance  (Yugo NAGATA <nagata@sraoss.co.jp>)
Responses Re: Implementing Incremental View Maintenance  (Tatsuo Ishii <ishii@sraoss.co.jp>)
Re: Implementing Incremental View Maintenance  (Yugo NAGATA <nagata@sraoss.co.jp>)
List pgsql-hackers
On Fri, 10 Apr 2020 23:26:58 +0900
Yugo NAGATA <nagata@sraoss.co.jp> wrote:

> Hi, 
> 
> Attached is the latest patch (v15) to add support for Incremental Materialized
> View Maintenance (IVM). It is possible to apply to current latest master branch.

I found a mistake of splitting patch, so I attached the fixed patch (v15a).
 
> Differences from the previous patch (v14) include:
> 
> * Fix to not use generate_series when views are queried
>     
> In the previous implementation, multiplicity of each tuple was stored
> in ivm_count column in views. When SELECT was issued for views with
> duplicate, the view was replaced with a subquery in which each tuple
> was joined with generate_series function in order to output tuples
> of the number of ivm_count.
> 
> This was problematic for following reasons:
>   
> - The overhead was huge. When almost of tuples in a view were selected,
>   it took much longer time than the original query. This lost the meaning
>   of materialized views.
>     
> - Optimizer could not estimate row numbers correctly because this had to
>   know ivm_count values stored in tuples.
>     
> - System columns of materialized views like cmin, xmin, xmax could not
>   be used because a view was replaced with a subquery.
>     
> To resolve this, the new implementation doen't store multiplicities
> for views with tuple duplicates, and doesn't use generate_series
> when SELECT query is issued for such views.
>     
> Note that we still have to use ivm_count for supporting DISTINCT and
> aggregates.

I also explain the way of updating views with tuple duplicates.

Although a view itself doesn't have ivm_count column, multiplicities
for old delta and new delta are calculated and the count value is
contained in a column named __ivm_count__ in each delta table.
    
The old delta table is applied using ctid and row_number function. 
row_number is used to numbering tuples in the view, and tuples whose
number is equal or is less than __ivm_count__ are deleted from the
view using a query like:
    
 DELETE FROM matviewname WHERE ctid IN (
    SELECT tid FROM (
       SELECT row_number() over (partition by c1, c2, ...) AS __ivm_row_number__,
              mv.ctid AS tid,
              diff.__ivm_count__
       FROM matviewname AS mv, old_delta AS diff "
       WHERE mv.c1 = diff.c1 AND mv.c2 = diff.c2 AND ... ) v
 WHERE v.__ivm_row_number__ <=  v.__ivm_count__
    
The new delta is applied using generate_seriese to insert mutiple same
tuples, using a query like:

 INSERT INTO matviewname (c1, c2, ...)
   SELECT c1,c2,... FROM (
     SELECT diff.*, generate_series(

> 
> *  Add query checks for IVM restrictions
>     
> Query checks for following restrictions are added:
>     
> - DISTINCT ON
> - TABLESAMPLE parameter
> - inheritance parent table
> - window function
> - some aggregate options(such as FILTER, DISTINCT, ORDER and GROUPING SETS)
> - targetlist containing IVM column
> - simple subquery is only supported
> - FOR UPDATE/SHARE
> - empty target list
> - UNION/INTERSECT/EXCEPT
> - GROUPING SETS clauses
> 
> * Improve error messages
>     
> Add error code ERRCODE_FEATURE_NOT_SUPPORTED to each IVM error message.
> Also, the message format was unified.
> 
> * Support subqueries containig joins in FROM clause
>     
> Previously, when multi tables are updated simultaneously, incremental
> view maintenance with subqueries including JOIN didn't work correctly
> due to a bug. 
> 
> Best Regards,
> Takuma Hoshiai
> 
> -- 
> Yugo NAGATA <nagata@sraoss.co.jp>


-- 
Yugo NAGATA <nagata@sraoss.co.jp>

Attachment

pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: doc review for parallel vacuum
Next
From: Amit Kapila
Date:
Subject: Re: WAL usage calculation patch