Re: Remaining beta blockers - Mailing list pgsql-hackers

From Kevin Grittner
Subject Re: Remaining beta blockers
Date
Msg-id 1367332819.38842.YahooMailNeo@web162905.mail.bf1.yahoo.com
Whole thread Raw
In response to Re: Remaining beta blockers  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Remaining beta blockers
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> wrote:
> Kevin Grittner <kgrittn@ymail.com> wrote:

>> Let's look at the "corner" this supposedly paints us into.  If a
>> later major release creates a better mechanism, current pg_dump and
>> load will already use it, based on the way matviews are created
>> empty and REFRESHed by pg_dump.  Worst case, we need to modify the
>> behavior of pg_dump running with the switch used by pg_upgrade to
>> use a new ALTER MATERIALIZED VIEW SET (populated); (or whatever
>> syntax is chosen) -- a command we would probably want at that point
>> anyway.  I'm not seeing cause for panic here.
>>
>> What is a real problem or risk with using this mechanism until we
>> engineer something better?  What problems with converting to a
>> later major release does anyone see?
>
> Well, it's a pg_upgrade hazard, if nothing else, isn't it?

I don't think so.  What do you see as a problem?

> Sure, I wouldn't allow that either.  My point is that I feel that
> could be engineered around in user space.  If you have a materialized
> view which could legitimately be empty (there are many for which that
> won't be an issue), then you can either arrange the view definition so
> that it isn't (e.g. by including a dummy record that clients can look
> for), or you can include a sentinel unlogged table that will contain a
> row if and only if materialized views have been refreshed since the
> last crash.  In the examples I can think of,
> indefinitely-stale-but-valid-at-some-point wouldn't be very good
> either, so I would anticipate needing to do some engineering around
> relative freshness anyway - e.g. keeping a side table that lists the
> last-refreshed-time for each matview.  Or, maybe I'd wouldn't care
> about tracking elapsed time per se, but instead want to track
> freshness relative to updates - e.g. set things up so that anyone who
> performs an action that would invalidate a row in the materialized
> view would also update a row someplace that would allow me to identify
> the row as stale.  In either case, handling the case where the view is
> altogether invalid doesn't seem to add a whole lot of additional
> complexity.
>
> Now, I can imagine cases where it does.  For example, suppose you have
> a cron job (which you trust to work) to refresh your materialized
> views every night.  Well, that means that you'll never be more than 24
> hours stale - but if any of those materialized views are unlogged,
> that also means that you could have no data at all for up to 24 hours
> following a crash.  Not great, because now you need some logic to
> handle just that one case that wouldn't be necessary if the DB did it
> for you.  But I just think it's a judgement call how serious one
> thinks that scenario is, vs. any other scenario where a boolean isn't
> adequate either.

"Staleness" is a completely different issue, in my view, from
quietly returning results that are not, and never were, accurate.
Sure we need to implement more refined "scannability" tests than
whether valid data from *some* point in time is present.  But that
should always be *part* of the scannability testing, and without it
I don't feel we have a feature of a quality suitable for delivery.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Kevin Grittner
Date:
Subject: Re: Remaining beta blockers
Next
From: Tom Lane
Date:
Subject: Re: The missing pg_get_*def functions