Re: Query Rewrite for Materialized Views (Postgres Extension) - Mailing list pgsql-hackers
From | Nico Williams |
---|---|
Subject | Re: Query Rewrite for Materialized Views (Postgres Extension) |
Date | |
Msg-id | 20180619154723.GD4200@localhost Whole thread Raw |
In response to | Re: Query Rewrite for Materialized Views (Postgres Extension) (Dent John <denty@qqdd.eu>) |
Responses |
Re: Query Rewrite for Materialized Views (Postgres Extension)
|
List | pgsql-hackers |
On Tue, Jun 19, 2018 at 08:46:06AM +0100, Dent John wrote: > I’m pretty impressed anything in this space can be written entirely in > PlPGQSL! https://github.com/twosigma/postgresql-contrib PG is quite powerful! I have even implemented a COMMIT TRIGGER in pure PlPgSQL. You'll notice I make extensive use of record/table types. > If you did integrate your implementation, it would be easy for my > Extension to read from a table other than the one which it gets the MV > definition from... Although having said that, if you went down the > route you suggest, would not you make that “regular table” into a > first class scheme object very much like Corey’s CONTINUOUS > MATERIALIZED VIEW object concept? I know nothing about the CONTINUOUS MATERIALIZED VIEW concept. What that would imply to me seems... difficult to achieve. There will be view queries that are difficult or impossible to automatically write triggers for that update an MV synchronously. > It is interesting that you can put triggers onto the table though, as > that leads well in to use cases where it is desirable to “stack” MVs > upon each other. (I’m not immediately sure whether such a use case is > still needed in face of an always-up-to-date MV feature such as is > described, but I’ve seen it elsewhere.) I have done exactly this sort of MV chaining. In my use case I had an MV of a nesting group transitive closure and then another of a join between that and user group memberships to get a complete user group transitive closure. The transitive closure of nesting groups being computed via a RECURSIVE CTE... In principle one can understand such a query and automatically write DMLs to update the MV on the fly (I've done this _manually_), but the moment you do any bulk updates out of sequence you can't, and then you have to refresh the view, so you see, I don't quite believe we can have a true continuously materialized view :( For me the key requirement is the ability to generate incremental updates to an external system, but also the whole thing has to be fast. > You say you’d like to base it off a VIEW’s AST (rather than, I > presume, you must parse the reconstructed VIEW source text as SQL?), PG already stores the AST. There's no need to write a new parser when PG already has one. At the end of the day you need to analyze an AST for the MV's source query in order to automatically write the triggers to keep it updated (or mark it as needing a refresh). > and I do agree — that’s probably the right direction... it does seem > to me there is scope to leverage the “bottom half” of the ASSERTION > stuff from Dave Fetter that Corey linked to — i.e., the part of it > that manages triggers. Still leaves the AST crawling deciding what to > actually do once a change is caught. I'll search for this. > Really good to hear about progress in this area. Eh, I've not actually implemented any automatic generation of triggers to update MVs. I've written enough such triggers manually to believe that *some* of them could be written by software. If you look at my sketch for how to do it, you'll notice that many of the sorts of queries that one would choose to materialize... are not really amenable to this treatment -- that's precisely because those make for the sorts of slow queries that make you reach for materialization in the first place :( But even so, automatically-generated triggers that mark an MV as needing a refresh are always possible, and that is a huge improvement anyways, especially if concurrent view refreshes can be made to go faster (by having PKs on the MVs). The idea is to have some sort of adaptive automatic background, concurrent MV refresh running on a frequency based in part of the amount of time it takes to refresh the VIEW. BTW, MERGE would be a significant optimization for concurrent MV refresh. Think of MERGE as a statement that can scan a source, FULL OUTER JOIN it to a target table, and for each row do an INSERT, UPDATE, or DELETE -- this is 3x faster than the three INSERT/UPDATE/DELETE statements you need to do the same work without a MERGE! Nico --
pgsql-hackers by date: