Re: Materialized views WIP patch - Mailing list pgsql-hackers

From Kevin Grittner
Subject Re: Materialized views WIP patch
Date
Msg-id 1362344816.73277.YahooMailNeo@web162902.mail.bf1.yahoo.com
Whole thread Raw
In response to Re: Materialized views WIP patch  (Nicolas Barbier <nicolas.barbier@gmail.com>)
Responses Re: Materialized views WIP patch  (Nicolas Barbier <nicolas.barbier@gmail.com>)
List pgsql-hackers
Nicolas Barbier <nicolas.barbier@gmail.com> wrote:
> 2013/3/3 Kevin Grittner <kgrittn@ymail.com>:

>> Rewriting queries using expressions which match the MV's query
>> to pull from the MV instead of the underlying tables is the
>> exception.  While that is a "sexy" feature, and I'm sure one can
>> construct examples where it helps performance, it seems to me
>> unlikely to be very generally useful.  I suspect that it exists
>> mostly so that people who want to write an RFP to pick a
>> particular product can include that as a requirement.  In other
>> words, I think the main benefit of automatic rewrite using an MV
>> is marketing, not technical or performance.
>
> I think that automatically using materialized views even when the
> query doesn’t mention them directly, is akin to automatically
> using indexes without having to mention them in the query. That
> way, queries can be written the natural way, and “creating
> materialized views” is an optimization that can be applied by a
> DBA without having to update the application queries to use them.

Oh, I understand that concept perfectly well, I just wonder how
often it is useful in practice.  The cost of planning with indexes
tends to go up dramatically the planner needs to evaluate all
possible combinations of access paths.  We've devoted quite a bit
of energy keeping that from being something like the factorial of
the number of indexes.  If you now need to find all materialized
views which could substitute for parts of a query, and look at all
permutations of how those could be used, and which indexes can be
used for each of those combinations, you have planning time which
can explode to extreme levels.

Where the number of database objects are small and their sizes are
large (like some data warehouse situations), you could come out
ahead; and if I wanted to showcase the capability you describe
that's what I would use.  With a large number of database objects
with only a few tens of millions of rows per table, I doubt you
will come out ahead.

Granted, you could say the same thing about indexes, and they are
very often useful.  I'm saying that I expect the usefulness of the
technique you describe is generally very low, but not zero.  Except
for marketing, where it's a flashy feature.  I would be interested
in seeing information to show where it works well, though.  There
is probably something to be learned by looking at the details of
the environment and workload of such a site.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Greg Smith
Date:
Subject: Re: Enabling Checksums
Next
From: Josh Berkus
Date:
Subject: Re: plpgsql_check_function - rebase for 9.3