Re: Materialized views WIP patch - Mailing list pgsql-hackers

From Kevin Grittner
Subject Re: Materialized views WIP patch
Msg-id 1362520312.80777.YahooMailNeo@web162903.mail.bf1.yahoo.com
In response to Re: Materialized views WIP patch  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Materialized views WIP patch  (Nicolas Barbier <nicolas.barbier@gmail.com>)
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> wrote:

> All that having been said, it's hard for me to imagine that
> anyone really cares about any of this until we have an
> incremental update feature, which right now we don't.

These are actually independent of one another, as long as we nail
down how we're determining "freshness" -- which is probably needed
for either.  Someone who's immersed in tuning long-running DW
queries might be interested in this before incremental update.
(They might load the data once per month, so refreshing the MVs as
a step in that process might be cheaper than incrementally
maintaining them.)  Someone could base "freshness" on
pg_relation_is_scannable() and start working on automatic query
rewrite right now, if they wanted to.
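
To make the monthly-load scenario concrete, the user-visible part
today is roughly this (the matview name is invented for
illustration):

    -- last step of the monthly DW load: rebuild the matview
    REFRESH MATERIALIZED VIEW sales_monthly_summary;

    -- the "freshness" test a rewrite layer could use right now
    SELECT pg_relation_is_scannable('sales_monthly_summary'::regclass);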

> Actually, I'm betting that's going to be significantly harder
> than automatic-query-rewrite, when all is said and done.
> Automatic-query-rewrite, if and when we get it, will not be easy
> and will require a bunch of work from someone with a good
> understanding of the planner, but it strikes me as the sort of
> thing that might work out to one large project and then it's
> done.

I still think we're going to hit the wall on planning time under
certain circumstances and need to tweak that over the course of
several releases, but now is not the time to get into the details
of why I think that.  We've spent way too much time on it already
for the point we're at in the 9.3 cycle.  I've kept my concerns
hand-wavy on purpose, and am trying hard to resist the temptation
to spend a lot of time demonstrating the problems.

> Whereas, incremental update sounds to me like a series of
> projects over a series of releases targeting various special
> cases, where we can always point to some improvements vs. release
> N-1 but we're never actually done

Exactly.  I predict that we will eventually have some special sort
of trigger for maintaining MVs based on base table changes to
handle the ones that are just too expensive (in developer time or
run time) to fully automate.  But there is a lot of low-hanging
fruit for automation.
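
For the expensive cases, that "special sort of trigger" wouldn't be
conceptually far from what people already hand-roll today with a
summary table -- roughly something like this (all names invented,
insert-only, ignoring concurrency):

    CREATE TABLE order_totals
      (customer_id int PRIMARY KEY, total numeric NOT NULL);

    CREATE FUNCTION order_totals_maint() RETURNS trigger
      LANGUAGE plpgsql AS $$
    BEGIN
      -- fold the new row into the running aggregate
      UPDATE order_totals SET total = total + NEW.amount
        WHERE customer_id = NEW.customer_id;
      IF NOT FOUND THEN
        INSERT INTO order_totals (customer_id, total)
          VALUES (NEW.customer_id, NEW.amount);
      END IF;
      RETURN NULL;  -- AFTER trigger; return value is ignored
    END;
    $$;

    CREATE TRIGGER order_totals_maint AFTER INSERT ON orders
      FOR EACH ROW EXECUTE PROCEDURE order_totals_maint();

Simple aggregates like that are the low-hanging fruit; the hard
cases are the ones where hand-written maintenance along these lines
may remain the only practical answer.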

> Even a reasonably simplistic and partial implementation of
> incremental update will benefit a lot of users.

Agreed.

> But in terms of relative difficulty, it's not at all obvious to
> me that that's the easier part of the project.

I totally agree that getting something working to use MVs in place
of underlying tables is not all that different from, or more
difficult than, using partial indexes.  I just predict that we'll
get a lot of
complaints about cases where it results in worse performance and
we'll need to deal with those issues.  I don't see that as being
brain surgery; it's just a messy matter of getting this pretty
theory to work well in the real world -- probably using a bunch of
not-so-elegant heuristics.  And in the end, the best you can hope
for is performance not noticeably worse than you would get if you
modified your query to explicitly use the MV(s) -- you're just
saving yourself the rewrite.  Well, OK, there is the point that
(as with indexes), if you run the query against the base tables
with different parameters and a new plan is generated each time,
the planner might pick different MVs, or exclude them, whichever
is most efficient for the given parameters.  That's the Holy Grail
of all this.
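
In other words, today the user has to turn something like

    SELECT customer_id, sum(amount)
      FROM orders
      GROUP BY customer_id;

into an explicit reference to the matview (again, names invented):

    SELECT customer_id, total
      FROM order_totals;

Automatic rewrite just gets the planner to make that substitution
-- or skip it -- per plan, based on what's cheapest for the
parameters at hand.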

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


