Re: GSoC - Materialized Views - is stale or fresh? - Mailing list pgsql-hackers

From Robert Haas
Subject Re: GSoC - Materialized Views - is stale or fresh?
Date
Msg-id AANLkTikLr_rFSPi3SUVVDicutgasuZaeQda4aRPetKEx@mail.gmail.com
In response to Re: GSoC - Materialized Views - is stale or fresh?  (Magnus Hagander <magnus@hagander.net>)
Responses Re: GSoC - Materialized Views - is stale or fresh?  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
List pgsql-hackers
On Mon, Jun 14, 2010 at 5:00 AM, Magnus Hagander <magnus@hagander.net> wrote:
> 2010/6/14 Greg Smith <greg@2ndquadrant.com>:
>> Pavel Baros wrote:
>>>
>>> After each INSERT, UPDATE, DELETE statement (transaction)
>>> pg_class.rellastxid would be updated. That should not be time- or memory-
>>> consuming (not so much) since pg_class is cached, I guess.
>>
>> An update in PostgreSQL is essentially an INSERT followed by a later DELETE
>> when VACUUM gets to the dead row that is no longer visible.  The problem with this
>> approach is that it will leave behind so many dead rows in pg_class due to
>> the heavy updates that the whole database could grind to a halt, as so many
>> operations will have to sort through all that garbage.  It could potentially
>> double the total write volume on the system, and you'll completely kill
>> people who don't have autovacuum running during some periods of the day.
>>
>> The basic idea of saving the last update time for each relation is not
>> unreasonable, but you can't store the results by updating pg_class.  My
>> first thought would be to send this information as a message to the
>> statistics collector.  It's already being sent updates at the point you're
>> interested in for the counters of how many INSERT/UPDATE/DELETE statements
>> are executing against the table.  You might bundle your last update
>> information into that existing message with minimal overhead.
>
> Right. Do remember that the stats collector is designed to be lossy,
> though, so you're not guaranteed that the information reaches the
> other end. In reality it tends to do that, but there needs to be some
> sort of recovery path for the case when it doesn't.
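
To make the concern in the quoted text concrete, here is a minimal sketch
of tracking "last modifying xid" per relation through a lossy channel. It
is not PostgreSQL code and not anything proposed in this thread: the names
LossyStatsCollector, report, and is_stale are hypothetical, the "delivered"
flag merely simulates a dropped message, and xid wraparound is ignored for
simplicity.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class LossyStatsCollector:
    """Toy stand-in for a lossy statistics channel (hypothetical)."""
    last_xid: Dict[str, int] = field(default_factory=dict)

    def report(self, relation: str, xid: int, delivered: bool = True) -> None:
        # A dropped message never arrives, and the sender is not told.
        if delivered:
            previous = self.last_xid.get(relation, 0)
            self.last_xid[relation] = max(previous, xid)

    def last_modified_xid(self, relation: str) -> Optional[int]:
        return self.last_xid.get(relation)


def is_stale(collector: LossyStatsCollector, base_relation: str,
             refreshed_at_xid: int) -> bool:
    """Guess whether an MV refreshed at refreshed_at_xid is out of date."""
    last = collector.last_modified_xid(base_relation)
    if last is None:
        # No information at all: the only safe answer is "stale".
        return True
    # Simplified comparison; real xids wrap around.
    return last > refreshed_at_xid


collector = LossyStatsCollector()
collector.report("orders", 90)                     # arrives
collector.report("orders", 120, delivered=False)   # lost in transit
# The base table changed at xid 120, but the collector only knows about 90,
# so an MV refreshed at xid 100 is wrongly reported as fresh.
print(is_stale(collector, "orders", 100))
```

The last call shows why a lossy channel alone is not enough: a single
dropped message yields a "fresh" answer for a stale MV, so any design
along these lines would need some separate recovery path.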

What Pavel's trying to do here is be smart about figuring out when an
MV needs to be refreshed.  I'm pretty sure this is the wrong way to go
about it, but it seems entirely premature considering that we don't
have a working implementation of a *manually* refreshed MV.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

