Home > mailing lists
Re: Incremental View Maintenance, take 2 - Mailing list pgsql-hackers

From	Yugo Nagata
Subject	Re: Incremental View Maintenance, take 2
Date	August 29 23:04:03
Msg-id	20250830050403.73fd450cd2712acca11494a8@sraoss.co.jp Whole thread Raw
In response to	Re: Incremental View Maintenance, take 2 (Kirill Reshke <reshkekirill@gmail.com>)
List	pgsql-hackers
Tree view
Hello,

I apologize for the long silence.
I would like to resume working on this now.

Thank you, Kirill Reshke, for your many comments and suggestions.
Here are my (admittedly) late responses:

> 1) `Add-relisivm-column-to-pg_class-system-catalog` commit message
> should be fixed, there is "isimmv" in the last line.

I'll fix it.

> 2) I dont get why `Add-Incremental-View-Maintenance-support.patch`
> goes after 0005 & 0004. Shoulndt we first implement feature server
> side, only when client (psql & pg_dump) side?

Makes sense. Perhaps they don’t even need to be separated.
I’ll plan to reorganize the patch set anyway.

> 3) Can we provide regression tests for each function separately? Test
> for main feature in main patch, test for DISTINCT support in
> v34-0007-Add-DISTINCT-support-for-IVM.patch etc? This way the patchset
> will be easier to review, and can be committed separelety.

Makes sense. I'll do that while reorganizing the patch set.

> 4) v34-0006-Add-Incremental-View-Maintenance-support.patch no longer
> applies due to 4b74ebf726d444ba820830cad986a1f92f724649. After
> resolving issues manually, it does not compile, because
> 4b74ebf726d444ba820830cad986a1f92f724649 also removes
> save_userid/save_sec_context fields from ExecCreateTableAs.
>
> >   if (RelationIsIVM(matviewRel) && stmt->skipData)
> Now this function accepts skipData param.

Yeah, a rebase is definitely required for the patch set.

> 5) For DISTINCT support patch uses hidden __ivm* columns. Is this
> design discussed anywhere? I wonder if this is a necessity (only
> solution) or if there are alternatives.

I believe we need to store the counting information somehow.
More discussion on how hidden columns should be handled would still be
necessary, though.

> 6)
> What are the caveats of supporting some simple cases for aggregation
> funcs like in example?
> ```
> regress=# CREATE INCREMENTAL MATERIALIZED VIEW mv_ivm_2 AS SELECT
> sum(j) + sum(i) from mv_base_a;
> ERROR:  expression containing an aggregate in it is not supported on
> incrementally maintainable materialized view
> ```
> I can see some difficulties with division CREATE IMMV .... AS SELECT
> 1/sum(i) from mv_base_a;  (sum(i) == 0 case), but adding &
> multiplication should be ok, aren't they?

This could theoretically be supported. For example, for sum(j) + sum(i),
if the values of sum(j) and sum(i) are stored in the view respectively,
their new values could be incrementally calculated, and the new column
value could be derived from them. However, this logic is not yet implemented.

The current aggregate support, as you’ve seen, is implemented individually
for each supported aggregate using SQL via SPI, in a somewhat ad hoc manner.
Therefore, we may need more sophisticated infrastructure, as well as further
design discussion on this topic.

> Another suggestion: support for \d and \d+ commands in psql. With v34
> patchset applied, psql does not show anything IMMV-related in \d mode.

I'll fix it.

> ```
> db2=# create incremental materialized view v2 as select * from v1;
> ERROR:  VIEW or MATERIALIZED VIEW is not supported on incrementally
> maintainable materialized view
> ```
> Error messaging is not true, create view v2 as select * from v1; works fine.

The current implementation does not handle situations in which a base table
of v1 is modified.

> ```
> db2=# create incremental materialized view vv2 as select i,j2, i / j2
> from t1 join t2 on true;
> db2=# insert into t2 values(1,0);
> ERROR:  division by zero
> ```
> It is very strange to receive `division by zero` while inserting into
> relation, isn't it? Can we add some hints/CONTEXT here?

Hmm, the current behavior is similar to stored generated columns.
However, it is not as intuitive that a table is involved in incremental
maintenance as it is when the table has generated columns.
Therefore, it would be better to add CONTEXT to the error message, 
although we may need to add some infrastructure for this in the logging system.

> 1) Provided patches do not set process title correctly:
> ```
> reshke   2602433 18.7  0.1 203012 39760 ?        Rs   20:41   1:58
> postgres: reshke ivm [local] CREATE MATERIALIZED VIEW
> ```

Ok, it would be better if the process title reflected the actual syntax.

> 2) We allow to REFRESH IMMV. Why? IMMV should be always up to date.
> Well, I can see that this utility command may be useful in case of
> corruption of some base relation/view itself, so there will be a need
> to rebuild the whole from scratch.
> But we already have VACUUM FULL for this, aren't we?

Since REFRESH is already allowed for normal materialized views, I see no
reason to prohibit it for IMMVs. Moreover, incremental updates of numeric
aggregates may cause drift between the view and the actual data due
to accumulated error. REFRESH can be used to correct the discrepancy.

> 3) Triggers created for IMMV are not listed via \dS [tablename]

The triggers are not listed because they are internal. Likewise, triggers
created on tables with foreign key constraints are not displayed by \dS.
Still, it may be helpful to indicate that the table is involved in an IMMV.

There is also an alternative approach that implements IVM without relying
on triggers.

> 4) apply_old_delta_with_count executes non-trivial SQL statements for
> IMMV. It would be really helpful to see this in EXPLAIN ANALYZE.

It would indeed be helpful if EXPLAIN ANALYZE provided some useful information
about incremental maintenance. That said, I don’t yet have a clear idea of the
best way to achieve this.

> 5)
> + "DELETE FROM %s WHERE ctid IN ("
> + "SELECT tid FROM (SELECT pg_catalog.row_number() over (partition by %s) AS \"__ivm_row_number__\","
> +  "mv.ctid AS tid,"
> +  "diff.\"__ivm_count__\""
> + "FROM %s AS mv, %s AS diff "
> + "WHERE %s) v "
> + "WHERE v.\"__ivm_row_number__\" OPERATOR(pg_catalog.<=) v.\"__ivm_count__\")",
> + matviewname,
> + keysbuf.data,
> + matviewname, deltaname_old,
> + match_cond);
> 
> `SELECT pg_catalog.row_number()` is too generic to my taste. Maybe
> pg_catalog.immv_row_number() /  pg_catalog.get_immv_row_number() ?

I personally think the current query is not ideal, and unless the new window
function can simplify or improve the logic, it might not be worth creating a
new function.

> 6)
> +static void
> +apply_new_delta(const char *matviewname, const char *deltaname_new,
> + StringInfo target_list)
> +{
> + StringInfoData querybuf;
>+
> + /* Search for matching tuples from the view and update or delete if found. */
> 
> Is this comment correct? we only insert tuples here?

Seems incorrect. I'll fix it.

> 7)
>  During patch development, one should pick OIDs from range 8000-9999

I'll fix it.

> == Major suggestions.
>
> 1) At first glance, working with this IVM/IMMV infrastructure feels
> really unintuitive about what servers actually do for query execution.
> I do think It will be much better for user experience to add more
> EXPLAIN about IVM work done inside IVM triggers. This way it is much
> clearer which part is working slow, so which index should be created,
> etc.

Yes, as mentioned above, it would be helpful if EXPLAIN ANALYZE provided
useful information about incremental maintenance, although I don’t yet have
a clear idea of the best way to achieve this.

> 2) The kernel code for IVM lacks possibility to be extended for
> further IVM optimizations. The one example is foreign key optimization
> described here[1]. I'm not saying we should implement this within this
> patchset, but we surely should pave the way for this. I don't have any
> good suggestions for how to do this though.

I have read the paper you mentioned. If I understood and remember correctly,
it is based on a counting algorithm and assumes that the count is always
stored in the view or that views do not have duplicate rows. This is different
from our current approach, which allows duplicate rows, so I’m not sure whether
this optimization could be directly applied to our implementation.
(Perhaps we might need to restrict the view so that it does not contain
duplicate rows.)

In any case, I will try to reconsider the design to improve extensibility,
so that future optimizations can be more easily supported.


> 3) I don't really think SQL design is good. CREATE [INCREMENTAL] M.V.
> is too ad-hoc. I would prefer CREATE M.V. with (maintain_incr=true).
> (reloption name is just an example).
> This way we can change regular M.V. to IVM and vice versa via ALTER
> M.V. SET *reloptions* - a type of syntax that is already present in
> PostgreSQL core.

I like the idea of using a reloption rather than adding a new keyword
to the syntax. I’ll consider changing the interface accordingly.

> == Other thoughts

> In OLAP databases (see [2]), IVM opens the door for 'view
> exploitation' feature. That is, use IVM (which is always up-to-date)
> for query execution. But current IVM implementation is not compatible
> with Cloudberry Append-optimized Table Access Method. The problem is
> the 'table_tuple_fetch_row_version' call, which is used by
> ivm_visible_in_prestate to check tuple visibility within a snapshot. I
> am trying to solve this somehow. My current idea is the following:
> multiple base table modification via single statement along with tuple
> deletion from base tables are features. We can error-out these cases
> (at M.V. creation time) all for some TAMs, and support only insert &
> truncate. However, I don't know how to check if TAM supports
> 'tuple_fetch_row_version' other than calling it and receiving
> ERROR[3].

Thank you for sharing the information about Apache cloudberry AM.
Hmm, I am not sure whether PostgreSQL expects table_tuple_fetch_row_version
to raise an error, but if this were allowed as a table access method behavior,
some consideration would be required. 

> I reread this and I find this a little bit unclear. What I'm proposing
> here is specifying the type of operations IVM supports on creation
> time. So, one can run
> 
> CREATE IVM immv1 WITH (support_deletion = true/false,
> support_multiple_relation_change = true/false). Then, in the query
> execution time, we just ERROR if the query leads to deletion from IVM
> and support_deletion if false.

Such options indeed might enable some optimizations for certain kinds
of table access methods.

> == Minor nitpicks and typos.
> 
> reshke=# insert into tt select * from generate_series(1, 1000090);
> ^CCancel request sent
> ERROR:  canceling statement due to user request
> CONTEXT:  SQL statement "INSERT INTO public.mv1 (i, j) SELECT i, j
> FROM (SELECT diff.*, pg_catalog.generate_series(1,
> diff."__ivm_count__") AS __ivm_generate_series__ FROM new_delta AS
> diff) AS v"
> Time: 18883.883 ms (00:18.884)
> 
> This is very surprising, isn't it? We can set HINT here, to indicate
> where this query comes from.

I understand this concern. It would definitely be better to properly modify
the CONTEXT of the error message. As mentioned above, doing so would require
adding some infrastructure in the logging system.

> 2)
> deleted/deleted -> updated/deleted
> + /*
> + * XXX: When using DELETE or UPDATE, we must use exclusive lock for now
> + * because apply_old_delta(_with_count) uses ctid to identify the tuple
> + * to be deleted/deleted, but doesn't work in concurrent situations.

I'll fix it.

> 3) Typo in rewrite_query_for_postupdate_state:
> /* Retore the original RTE */

I'll fix it.

> 4) in apply_delta function has exactly one usage, so the 'use_count'
> param is redundant, because we already pass the 'query' param, and
> 'use_count' is calculated from the 'query'.

Right. I'll fix it.

> 5) in  calc_delta:
> > ListCell *lc = list_nth_cell(query->rtable, rte_index - 1);
> > RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc);
> Should we add Assert(list_lenght(lc) == 1) here? Can there be multiple
> items in this list?

I’m sorry, I didn’t quite catch your point. I believe this is meaningless
since 'lc' is not a list.

> 6) In get_prestate_rte:
> > appendStringInfo(&str,
> > "SELECT t.* FROM %s t"
> > " WHERE pg_catalog.ivm_visible_in_prestate(t.tableoid, t.ctid ,%d::pg_catalog.oid)",
> > relname, matviewid);
> 
> Identitation issue. This will not be fixed via pg_ident run, because
> this is str contant, so better so fix it by-hand.

I'll fix it.

> 7) in apply_new_delta_with_count:
> 
> > appendStringInfo(&querybuf,
> > "WITH updt AS (" /* update a tuple if this exists in the view */
> > "UPDATE %s AS mv SET %s = mv.%s OPERATOR(pg_catalog.+) diff.%s "
> 
> SET % OPERATOR(pg_catalog.=) mv.%s ?
> 
> same for append_set_clause_for_count, append_set_clause_for_sum,
> append_set_clause_for_minmax

> > /* avg = (mv.sum - t.sum)::aggtype / (mv.count - t.count) */
> > appendStringInfo(buf_old,
> > ", %s = %s OPERATOR(pg_catalog./) %s",
> 
> should be
> /* avg OPERATOR(pg_catalog.=) (mv.sum - t.sum)::aggtype / (mv.count -
> t.count) */
> appendStringInfo(buf_old,
> ", %s = %s OPERATOR(pg_catalog./) %s",

Here, “=” denotes assignment in the UPDATE ... SET ... command, rather
than the equality operator in pg_catalog.
 
> One little follow-up here. Why do we do prepstate visibility the way
> it is done? Can we instead export the snapshot in BEFORE trigger, save
> it somewhere and use it after?

We need the state of a table before it is updated. By importing the
snapshot exported in a BEFORE trigger, we can obtain the query results
from a past state, but this applies to the entire query, not to a specific
table within the query. What we really need is the result of the query
where only a specific table is seen in its past state.

Perhaps this could be done by modifying the executor (scan nodes?) to allow
using a specified snapshot for a specified table scan, which would be similar
to Oracle’s AS OF. However, such a change would not be trivial.


Thanks to your comments and suggestions, I made a poster about this patch [1]
for PGConf.dev 2025. Below is a list of open questions highlighted there:

1. Trigger-based Design: Should we avoid relying on triggers like declarative
partitioning?

2. Features Scope: Exclude aggregate, DISTINCT, and tuple duplicate supports
in the first release to simplify the patch and improve its reviewability?

3. Hidden Columns: How should they be handled?

4. Pre-update State of Table: Need infrastructure to scan a table using a
specified snapshot, instead of using the crafted sub-query?

5. Syntax: “CREATE INCREMENTAL MATERIALIZED VIEW” is tentative. Is
"CREATE MATERIALIZED VIEW … WITH (reloptions)" preferable?.

6. Other issues: EXPLAIN outputs, CONTEXT in an error message, etc.

In particular, I wonder whether it would be better to limit the scope of
the patch so that the proposal and discussion can move forward (#2).
This could help avoid some of the debates around aggregates support, hidden
columns, etc., and allow us to focus more on the core design.

Anyway, I will start rebasing the patches, reorganizing the patch set,
and applying fixes made in pg_ivm [2].


Regards,
Yugo Nagata

[1] https://2025.pgconf.dev/static/posters/poster_incremental_view_maintenance.pdf
[2] https://github.com/sraoss/pg_ivm
-- 
Yugo Nagata <nagata@sraoss.co.jp>
pgsql-hackers by date:
From: "Joel Jacobson"
Date: 29 August, 23:01:18
Subject: Re: Assert single row returning SQL-standard functions
From: Tom Lane
Date: 29 August, 23:09:40
Subject: Re: Making jsonb_agg() faster
Re: Incremental View Maintenance, take 2 - Mailing list pgsql-hackers

Previous

Next