Re: Implementing Incremental View Maintenance - Mailing list pgsql-hackers

From Yugo Nagata
Subject Re: Implementing Incremental View Maintenance
Date
Msg-id 20191129173328.e5a0e9f81e369a3769c4fd0c@sraoss.co.jp
Whole thread Raw
In response to Re: Implementing Incremental View Maintenance  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: Implementing Incremental View Maintenance  (Tatsuo Ishii <ishii@sraoss.co.jp>)
List pgsql-hackers
On Thu, 28 Nov 2019 11:03:33 -0300
Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

> One thing pending in this development line is how to catalogue aggregate
> functions that can be used in incrementally-maintainable views.
> I saw a brief mention somewhere that the devels knew it needed to be
> done, but I don't see in the thread that they got around to doing it.
> Did you guys have any thoughts on how it can be represented in catalogs?
> It seems sine-qua-non ...

Yes, this is a pending issue. Currently, supported aggregate functions are
identified their name, that is, we support aggregate functions named "count",
"sum", "avg", "min", or "max". As mentioned before, this is not robust
because there might be user-defined aggregates with these names although all
built-in aggregates can be used in IVM.

In our implementation, the new aggregate values are calculated using "+" and
"-" operations for sum and count, "/" for agv, and ">=" / "<=" for min/max.
Therefore, if there is a user-defined aggregate on a user-defined type which
doesn't support these operators, errors will raise. Obviously, this is a
problem.  Even if these operators are defined, the semantics of user-defined
aggregate functions might not match with the way of maintaining views, and
resultant might be incorrect.

I think there are at least three options to prevent these problems.

In the first option, we support only built-in aggregates which we know able
to handle correctly. Supported aggregates can be identified using their OIDs.
User-defined aggregates are not supported. I think this is the simplest and
easiest way.

Second,  supported aggregates can be identified using name, like the current
implementation, but also it is checked if required operators are defined. In
this case, user-defined aggregates are allowed to some extent and we can
prevent errors during IVM although aggregates value in view might be
incorrect if the semantics doesn't match.

Third, we can add a new attribute to pg_aggregate which shows if each
aggregate can be used in IVM. We don't need to use names or OIDs list of
supported aggregates although we need modification of the system catalogue.

Regarding pg_aggregate, now we have aggcombinefn attribute for supporting
partial aggregation. Maybe we could use combine functions to calculate new
aggregate values in IVM when tuples are inserted into a table. However, in
the context of IVM, we also need other function used when tuples are deleted
from a table, so we can not use partial aggregation for IVM in the current
implementation. This might be another option to implement "inverse combine
function"(?) for IVM, but I am not sure it worth.

Regards,
Yugo Nagata

>
> --
> Álvaro Herrera                https://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Yugo Nagata <nagata@sraoss.co.jp>



pgsql-hackers by date:

Previous
From: Kyotaro Horiguchi
Date:
Subject: Re: Remove page-read callback from XLogReaderState.
Next
From: Julien Rouhaud
Date:
Subject: Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activityview?