Alternative MATERIALIZED VIEW design and implementation with history table and other features - Mailing list pgsql-hackers

From Nico Williams
Subject Alternative MATERIALIZED VIEW design and implementation with history table and other features
Date
Msg-id 20161123031126.GQ32683@localhost
Whole thread Raw
Responses Re: Alternative MATERIALIZED VIEW design and implementation with history table and other features  (Jim Nasby <Jim.Nasby@BlueTreble.com>)
List pgsql-hackers
I love MATERIALIZED VIEWs.

But we needed a method for recording deltas from REFRESHes, and that's
not supported.  So I coded up my own version of materialized views, in
PlPgSQL, that does provide a history feature.

Besides a history feature, this includes the ability to record changes
made to a materialized view's materialization table, which means I can
have triggers that update the materialized view.

We use this for updating a view whose query is a bit slow.  Some
triggers are also slow (well, they're fast, but used in transactions
that might potentially run fire these triggers many times), in which
case I mark a "view" as needing a refresh.  Other triggers are fast and
directly update the "view".

https://github.com/twosigma/postgresql-contrib
https://github.com/twosigma/postgresql-contrib/blob/master/pseudo_mat_views.sql
https://raw.githubusercontent.com/twosigma/postgresql-contrib/master/pseudo_mat_views.sql

I'd be willing to do some of the work of integrating this more closely
with PG, but I may need some pointers (but hopefully not much hand-
holding).  Ideally we could have CREATE MATERIALIZED VIEW syntax like
this:
   CREATE MATERIALIZED VIEW schema_name.view_name      [ ( <column-name> [, ...] ) ]      [ WITH ( storage_parameter [=
value][, ... ] ) ]      [ TABLESPACE tablespace_name ]   AS <query>   WITH [ [ UNLOGGED ] HISTORY TABLE [
schema_name.view_name_history], ]        [ PRIMARY KEY ( <column-name> [, ...] ), ]        [ [ NO ] DATA ];
 

Of particular interest may be the fact that the FULL OUTER JOIN that PG
does for REFRESH CONCURRENTLY, and which I copied here, doesn't deal
well with views that have NULLs in any columns used in the join.  It
would be nice to have an equijoin that uses IS NOT DISTINCT FROM rather
than just =, and then refreshing could use such a join in order to deal
properly with NULLs.

Any help with integration, or comments, even flames, are welcomed, but
keep in mind that this is my first foray into making a contribution to
PG, so please do be kind.  Pointers to C and SQL style guides and
standards for in-tree code would be particularly helpful.  Thanks!

Nico
-- 



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: UNDO and in-place update
Next
From: Peter Geoghegan
Date:
Subject: Re: [WIP] [B-Tree] Keep indexes sorted by heap physical location