Re: Materialized views WIP patch - Mailing list pgsql-hackers

From Kevin Grittner
Subject Re: Materialized views WIP patch
Date
Msg-id 1361037194.53579.YahooMailNeo@web162902.mail.bf1.yahoo.com
Whole thread Raw
In response to Re: Materialized views WIP patch  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Materialized views WIP patch  (Noah Misch <noah@leadboat.com>)
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Feb 15, 2013 at 8:01 PM, Kevin Grittner <kgrittn@ymail.com> wrote:

>> There is one odd aspect to pg_dump, but I think the way it is
>> behaving is the best way to handle it, although I invite other
>> opinions.  If you load from pg_dump output, it will try to
>> populated materialized views which were populated on dump, and
>> leave the ones which were not scannable because the contents had
>> not been generated in an empty and unscannable state on restore.
>> That much seems pretty obvious.  Where it gets  a little tricky is
>> if mva is generated with data, and mvb is generated based on mva.
>> Then mva is truncated.  Then you dump.  mvb was populated at the
>> time of the dump, but its contents can't be regenerated on restore
>> because mva is not scannable.  As the patch currently stands, you
>> get an error on the attempt to REFRESH mvb.  I think that's a good
>> thing, but I'm open to arguments to the contrary.
>
> Hmm, anything that means a dump-and-restore can fail seems like a bad
> thing to me.  There's nothing outrageous about that scenario.  It's
> arguable what state dump-and-restore should leave the new database in,
> but I don't see why it shouldn't work.  I predict we'll end up with
> unhappy users if we leave it like this.

Keeping in mind that mva may take hours to refresh, and mvb may
take only minutes once you have the data from mva, what behavior do
you think is preferable?

The alternatives I can think of are:

(1)  Force mva to refresh on restore, even though it was empty and
not scannable on dump.  This may delay completion of the restore
for an extended time.  It would leave both mva and mvb populated.

(2)  Populate mvb by using mva's query as a regular view.  This
would leave things in the same state as they were on dump, and
might possibly optimized to something faster than generating mva
and then mvb; but probably would not be much faster in most cases.

(3)  Change the failure to generate data for mvb in this case as a
WARNING rather than an ERROR.

(4)  Actually dump and restore data with COPY statements for
materialized views, rather than having the dump create REFRESH
statements.  The main down side of this, it seems to me, is that it
opens up materialized views to direct tinkering of contents via SQL
statements, which I was hoping to avoid.  Perhaps this can be
mitigated in some way.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: [RFC] indirect toast tuple support
Next
From: Noah Misch
Date:
Subject: Re: is it bug? - printing boolean domains in sql/xml function