Re: Materialized views WIP patch - Mailing list pgsql-hackers

From Kevin Grittner
Subject Re: Materialized views WIP patch
Date
Msg-id 1361384903.52749.YahooMailNeo@web162906.mail.bf1.yahoo.com
Whole thread Raw
In response to Re: Materialized views WIP patch  (Greg Stark <stark@mit.edu>)
List pgsql-hackers
Greg Stark <stark@mit.edu> wrote:
> Kevin Grittner <kgrittn@ymail.com> wrote:
>> Tom Lane <tgl@sss.pgh.pa.us> wrote:

>>> The way to fix it is to not try to use the sub-production but
>>> spell it all out:
>>>
>>>       TRUNCATE TABLE relation_expr_list ...
>>>     | TRUNCATE MATERIALIZED VIEW relation_expr_list ...
>>>     | TRUNCATE relation_expr_list ...
>>>
>>> Now the parser doesn't have to make any shift-reduce decision
>>> until after it can "see past" the first identifier.  It's a bit
>>> tedious but beats making a word more reserved than it has to
>>> be.
>>
>> Thanks!  Will do.
>
> Fwiw I think worrying about stuff like this at this point is
> probably a waste of time. There'll be a period of bike-shedding
> where people debate what the command should be called so worrying
> about parser conflicts before there's a consensus is kind
> pointless.

That sort of bikeshedding already happened three months ago.  Too
late now.

> I would like to know what operations you plan to support
> independently of the command names. I may have missed much
> earlier in the discussion but then I suspect things have evolved
> since then.
>
> It sounds like you want to support:
>
> 1) Selecting from materialized viws
> 2) Manually refreshing materialized views
> 3) Manually truncating materialized views
>
> And explicitly not support
>
> 1) Automatically rewriting queries to select from matching views
> 2) Incrementally refreshing materialized views

Those are material for later releases, building on the base of what
goes into this release.

> 3) Manual DML against data in materialized views (except truncate
> which is kind of DDL)

There is quite a lot of DML allowed -- changing tablespace,
changing schema, changing name of the MV or of individual columns
in it, changing statistics targets, creating indexes, and other
operations are supported.

> 4) Keeping track of whether the data in the materialized view is
> up to date

Only keeping track of whether data has been populated or not, for
now.  There has been agreement that one or more timestamps relating
to freshness will make sense, but these are not in the initial
patch.

> I have to say I find this model a bit odd.

It's not a model, it's a starting point.  Several people have
already said that even this much is useful and they expect to take
advantage of it.  I'm doing what I can to not paint us into a
corner where it's hard to extend to all the features everyone
dreams of, but if we waited for that to commit something, it will
never happen.

> I would expect a materialized view to be up to date all the time.

I expect that this will eventually be an option, but I expect that
is will be a seldom-used one.  Most cases that I've seen, people
want summary data that is reasonably up-to-date without unduly
affecting the performance of incremental changes to the underlying
data.  I've sketched out the roadmap from this patch to all of
these options in a vauge, handwavy fashion, and don't have a lot of
interest in taking it farther until we're past 9.3 beta.

> If we don't support incremental updates (which seems like a fine
> thing not to support in a first cut) then I would expect any DML
> against the table to mark the view invalid and any queries
> against it to produce an error (or possibly go to the source
> tables using the view definition but that's probably a bad idea
> for most use cases). Ie. they should behave like a view at all
> times and have up to date information or fail entirely.

That would render them completely useless for the use-cases I've
seen.  If you want to offer a patch to do that as an option, feel
free, but I will strongly argue against that as unconditional
behavior.

> I would expect a command like TRUNCATE MATERIALIZED VIEW to exist
> but I would expect it to be called something like INVALIDATE
> rather than TRUNCATE and dropping the storage is a side effect of
> simply telling the database that it doesn't need to maintain this
> materialized view. Though I could be convinced "truncate" is a
> good name as long as it's documented well.

I'm trying to minimize the number of new keywords.  The initial
patch only added MATERIALIZED.  I added REFRESH due to
near-universal demand for something other than the LOAD
MATERIALIZED VIEW I initially used.  Have you seen the statistics
Tom gave out on how much the size of the executable bloats with
every new keyword?  Until now nobody has expressed concern about
TRUNCATE MATERIALIZED VIEW, so it would take quite a groundswell of
concern at this point to even consider a new keyword for this
functionality this late in the game.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Greg Stark
Date:
Subject: Re: Materialized views WIP patch
Next
From: Josh Berkus
Date:
Subject: Re: Materialized views WIP patch