Re: Materialized views WIP patch - Mailing list pgsql-hackers

From Greg Stark
Subject Re: Materialized views WIP patch
Date
Msg-id CAM-w4HMDEx1FSckQicTr+oRDKMKaivE-dqQDwGmndBd8i7BstA@mail.gmail.com
Whole thread Raw
In response to Re: Materialized views WIP patch  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Tue, Mar 5, 2013 at 9:08 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> All that having been said, it's hard for me to imagine that anyone
> really cares about any of this until we have an incremental update
> feature, which right now we don't.  Actually, I'm betting that's going
> to be significantly harder than automatic-query-rewrite, when all is
> said and done.  Automatic-query-rewrite, if and when we get it, will
> not be easy and will require a bunch of work from someone with a good
> understanding of the planner, but it strikes me as the sort of thing
> that might work out to one large project and then it's done.  Whereas,
> incremental update sounds to me like a series of projects over a
> series of releases targeting various special cases, where we can
> always point to some improvements vs. release N-1 but we're never
> actually done and able to move on to the next thing.  As a roadmap
> goes, I think that's OK.  Even a reasonably simplistic and partial
> implementation of incremental update will benefit a lot of users.  But
> in terms of relative difficulty, it's not at all obvious to me that
> that's the easier part of the project.

While true that's true for a lot of Postgres features. The only ones
that are one-shot projects are buried deep in the internals. Anything
with UI implications inevitably has limitations and then other people
come along and and work on removing or extending those features.

I do agree with Tom though -- the most frequently asked for
materialized view in the past has always been "select count(*) from
tab". People assume we already do this and are surprised when we
don't. The cookie cutter solution for it is basically exactly what a
incrementally updated materialized view  solution would look like
(with the queue of updates with transacion information that are
periodically flattened into the aggregate). Rewriting this might be a
bit tricky and require heuristics to determine just how much work to
expend trying to match materialized views, this type of view would be
where most of the win would be.

I also can't see implementing query rewriting for
non-transactionally-accurate materialized views. If people want a
snapshot of the data that may be out of date that's great. I can tons
of use cases for that. But then surely they won't be surprised to have
to query the snapshot explicitly. If can't see going to all this
trouble to implement transactions and snapshots and wal logging and so
on and then silently rewriting queries to produce data that is not up
to date. I think users would be surprised to find bog-standard SQL
occasionally producing "incorrect" results.

That said, there are cases where snapshots might be up to date even
though we don't implement any incremental updates. If the underlying
data is read-only or hasn't received any update commits since the
snapshot was taken then it might still be useful. There are tons of
ETL applications where you load the data once and then build MVs for
it and  never touch the underlying data again.

-- 
greg



pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: Materialized views WIP patch
Next
From: Garick Hamlin
Date:
Subject: Re: Enabling Checksums