Re: Materialized views WIP patch - Mailing list pgsql-hackers
From | Greg Stark |
---|---|
Subject | Re: Materialized views WIP patch |
Date | |
Msg-id | CAM-w4HMDEx1FSckQicTr+oRDKMKaivE-dqQDwGmndBd8i7BstA@mail.gmail.com Whole thread Raw |
In response to | Re: Materialized views WIP patch (Robert Haas <robertmhaas@gmail.com>) |
List | pgsql-hackers |
On Tue, Mar 5, 2013 at 9:08 PM, Robert Haas <robertmhaas@gmail.com> wrote: > All that having been said, it's hard for me to imagine that anyone > really cares about any of this until we have an incremental update > feature, which right now we don't. Actually, I'm betting that's going > to be significantly harder than automatic-query-rewrite, when all is > said and done. Automatic-query-rewrite, if and when we get it, will > not be easy and will require a bunch of work from someone with a good > understanding of the planner, but it strikes me as the sort of thing > that might work out to one large project and then it's done. Whereas, > incremental update sounds to me like a series of projects over a > series of releases targeting various special cases, where we can > always point to some improvements vs. release N-1 but we're never > actually done and able to move on to the next thing. As a roadmap > goes, I think that's OK. Even a reasonably simplistic and partial > implementation of incremental update will benefit a lot of users. But > in terms of relative difficulty, it's not at all obvious to me that > that's the easier part of the project. While true that's true for a lot of Postgres features. The only ones that are one-shot projects are buried deep in the internals. Anything with UI implications inevitably has limitations and then other people come along and and work on removing or extending those features. I do agree with Tom though -- the most frequently asked for materialized view in the past has always been "select count(*) from tab". People assume we already do this and are surprised when we don't. The cookie cutter solution for it is basically exactly what a incrementally updated materialized view solution would look like (with the queue of updates with transacion information that are periodically flattened into the aggregate). Rewriting this might be a bit tricky and require heuristics to determine just how much work to expend trying to match materialized views, this type of view would be where most of the win would be. I also can't see implementing query rewriting for non-transactionally-accurate materialized views. If people want a snapshot of the data that may be out of date that's great. I can tons of use cases for that. But then surely they won't be surprised to have to query the snapshot explicitly. If can't see going to all this trouble to implement transactions and snapshots and wal logging and so on and then silently rewriting queries to produce data that is not up to date. I think users would be surprised to find bog-standard SQL occasionally producing "incorrect" results. That said, there are cases where snapshots might be up to date even though we don't implement any incremental updates. If the underlying data is read-only or hasn't received any update commits since the snapshot was taken then it might still be useful. There are tons of ETL applications where you load the data once and then build MVs for it and never touch the underlying data again. -- greg
pgsql-hackers by date: