Re: Query Rewrite for Materialized Views (Postgres Extension) - Mailing list pgsql-hackers
From | Nico Williams |
---|---|
Subject | Re: Query Rewrite for Materialized Views (Postgres Extension) |
Date | |
Msg-id | 20180618191355.GC4200@localhost Whole thread Raw |
In response to | Re: Query Rewrite for Materialized Views (Postgres Extension) (Dent John <denty@QQdd.eu>) |
Responses |
Re: Query Rewrite for Materialized Views (Postgres Extension)
|
List | pgsql-hackers |
On Mon, Jun 18, 2018 at 07:38:13PM +0100, Dent John wrote: > I commented to Corey (privately) that, while my rewrite extension has > gotten me a server that responds quickly to aggregate queries, the > constant need to refresh the supporting MVs means the system’s load > average is constant and much higher than before. I’m happy with the > tradeoff for now, but it’s a huge waste of energy, and I’m sure it > must thrash my disk. > > I’m very interested in what other people think of Corey’s idea. I've written an alternative materialization extension (entirely as PlPgSQL) based on PG's internals, but my version has a few big wins that might help here. I'm thinking of properly integrating it with PG. Some of the features include: - you can write triggers that update the materialization This is because the materialization is just a regular table in my implementation. - you can mark a view as needing a refresh (e.g., in a trigger) - you can declare a PK, other constraints, and indexes on a materialization The DMLs used to refresh a view concurrently can take advantage of the PK and/or other indexes to go fast. - you get a history table which records updates to the materialization This is useful for generating incremental updates to external systems. Keeping track of refresh times should help decide whether to use or not use a materialization in some query, or whether to refresh it first, or not use it at all. One of the things I'd eventually like to do is analyze the view query AST to automatically generate triggers to update materializations or mark them as needing refreshes. A first, very very rough sketch of such an analysis looks like this: - if the view query has CTEs -> create triggers on all its table sources to mark the materialization as needing a refresh - else if a table appears more than once as a table source in the view query -> create triggers on that table that mark the materialization as needing a refresh - else if a table appears anywhere other than the top-level -> create triggers .. mark as needing refresh - else if a table is a right-side of a left join -> create triggers .. mark as needing refresh - else if a table has no PK -> create triggers .. mark as needing refresh - else if the query has no GROUP BY, or only does a GROUP BY on this table and a list of columns prefixed by the table's PK -> rewrite the query to have WHERE eq conditions on values for the table's PK columns analyze this query if the result shows this table source as the first table in the plan -> create triggers on this table to update the materialization directly from querying the source view - else -> create triggers .. mark as needing refresh Nico --
pgsql-hackers by date: