Progress Report on Materialized Views - Mailing list pgsql-hackers

From Jonathan M. Gardner
Subject Progress Report on Materialized Views
Date
Msg-id 200402201322.43898.jgardner@jonathangardner.net
Whole thread Raw
Responses Re: Progress Report on Materialized Views  ("Simon Riggs" <simon@2ndquadrant.com>)
List pgsql-hackers
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I've implemented a pretty simple Materialized Views scheme. It's not
terribly complicated, and it works quite well.

This is what I do.

0) Initialize the materialized view environment. This will allow the
system to track which tables and views are part of a materialized view,
as well as when the last time it was refreshed was.

CREATE TABLE matviews (   mv_name NAME NOT NULL   , mv_view NAME NOT NULL   , last_refresh TIMESTAMP
);

1) Create a view. This will be what the materialized view should keep in
sync with. One column must be unique and non-null. This will be the
primary key of the materialized view.

2) Create a table -- the materialized view -- from the view. I've wrapped
this up into a pl/PgSQL function. Pseudo-code is basically:- Create a table, the materialized view.- Select everything
fromthe corresponding view into the materialized  
view.- Insert a row into matviews, last_refresh = now().

3) Create a function called "<mv name>_refresh_row(<primary key type>)".
This will:- Delete the row from the materialized view with that primary key- Select the row with that primary key from
theview and insert it into  
the materialized view.

4) If there is any sort of time-dependence, create a function called "<mv
name>_refresh()". This will find all the rows that have changed due to
the time-dependence. It uses "last_refresh" from the matviews table, and
"now()" to determine the timespan.

This function needs to be called periodically.

5) Create triggers on all tables that contribute to the view.- An insert trigger, that will discover the primary key(s)
thatthe  
inserted row will affect, and refreshes those rows (using the
*_refresh_row function)- An update trigger, that will discover all the primary key(s) that the
updated row will affect, and refreshes those rows. Note that the primary
keys may be different if the column that determines the primary key is
changing.- A delete trigger, that will dicover all the primary key(s) that the
updated row will affect, and refreshes those rows.

The system has been running in a production environment for over a week,
with only one problem: deadlock when we were inserting vast amounts of
new data.

Adding appropriate indexes to the materialized views has reduced the query
times for some of our most important queries to 1/300 of the original
time.

There were some issues with the time-sensitivity of the queries. For
instance, our list of members will decrease as time goes along due to the
expiration date of accounts. Because we were running the refresh once a
day, there were a few hours of the day where the materialized view would
say that the person is a member, but the actual data says he is not. We
rewrote our code to pull everything from the materialized view, greatly
simplifying the code, and also increasing performance.

My next step is to write a generic algorithm for handling the tasks
mentioned in step s 4 and 5. I've written these by hand so far, because I
can tell which columns of the updated/inserted/deleted row determine
which rows in the materialized view will be affected. Any help in this
area would be greatly appreciated.

After that, I would like to investigate whether or not it is possible to
register a function to be called when the transaction is committed. That
way, the materialized view update can be deferred until the transaction
is complete. This would enhance performance.

- --
Jonathan Gardner
jgardner@jonathangardner.net
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)

iD8DBQFANnqdqp6r/MVGlwwRAveMAJ0TsLyG2w3wlOv+LvtbePvzmkueFwCeJxlX
arJKaqFBxGOuXS0L4DJeIwQ=
=FGig
-----END PGP SIGNATURE-----


pgsql-hackers by date:

Previous
From: Robert Treat
Date:
Subject: Re: [PATCHES] NO WAIT ...
Next
From: Tom Lane
Date:
Subject: Re: 7.4.1 release status - Turkish Locale