Re: Materialized views WIP patch - Mailing list pgsql-hackers
From | Noah Misch |
---|---|
Subject | Re: Materialized views WIP patch |
Date | |
Msg-id | 20130216194645.GA31264@tornado.leadboat.com Whole thread Raw |
In response to | Re: Materialized views WIP patch (Kevin Grittner <kgrittn@ymail.com>) |
Responses |
Re: Materialized views WIP patch
|
List | pgsql-hackers |
On Sat, Feb 16, 2013 at 09:53:14AM -0800, Kevin Grittner wrote: > Robert Haas <robertmhaas@gmail.com> wrote: > > On Fri, Feb 15, 2013 at 8:01 PM, Kevin Grittner <kgrittn@ymail.com> wrote: > >> There is one odd aspect to pg_dump, but I think the way it is > >> behaving is the best way to handle it, although I invite other > >> opinions. If you load from pg_dump output, it will try to > >> populated materialized views which were populated on dump, and > >> leave the ones which were not scannable because the contents had > >> not been generated in an empty and unscannable state on restore. > >> That much seems pretty obvious. Where it gets a little tricky is > >> if mva is generated with data, and mvb is generated based on mva. > >> Then mva is truncated. Then you dump. mvb was populated at the > >> time of the dump, but its contents can't be regenerated on restore > >> because mva is not scannable. As the patch currently stands, you > >> get an error on the attempt to REFRESH mvb. I think that's a good > >> thing, but I'm open to arguments to the contrary. > > > > Hmm, anything that means a dump-and-restore can fail seems like a bad > > thing to me. There's nothing outrageous about that scenario. It's > > arguable what state dump-and-restore should leave the new database in, > > but I don't see why it shouldn't work. I predict we'll end up with > > unhappy users if we leave it like this. I agree that making the dump fail on this account is bad. > Keeping in mind that mva may take hours to refresh, and mvb may > take only minutes once you have the data from mva, what behavior do > you think is preferable? > > The alternatives I can think of are: > > (1) Force mva to refresh on restore, even though it was empty and > not scannable on dump. This may delay completion of the restore > for an extended time. It would leave both mva and mvb populated. This is reasonable. If the user doesn't like it, he can presumably use an edited dump list to remove the REFRESHes. > (2) Populate mvb by using mva's query as a regular view. This > would leave things in the same state as they were on dump, and > might possibly optimized to something faster than generating mva > and then mvb; but probably would not be much faster in most cases. Interesting idea, but I don't think adding novel server behavior is justified to achieve this. > (3) Change the failure to generate data for mvb in this case as a > WARNING rather than an ERROR. This is also fair. However, I think it's better to restore more valid MVs (option 1) than fewer. > (4) Actually dump and restore data with COPY statements for > materialized views, rather than having the dump create REFRESH > statements. The main down side of this, it seems to me, is that it > opens up materialized views to direct tinkering of contents via SQL > statements, which I was hoping to avoid. Perhaps this can be > mitigated in some way. This is a door better left closed. Overall, I recommend option 1. None of the options will furnish the desire of every database, but the DBA can always tailor the outcome by replacing the dumped REFRESH statements with his own. I'm not envisioning that MVs left invalid for the long term will be a typical thing, anyway. -- Noah Misch EnterpriseDB http://www.enterprisedb.com
pgsql-hackers by date: