Thread: [GENERAL] Divert triggers on materialized views

[GENERAL] Divert triggers on materialized views

From
Ewen McNeill
Date:
PostgreSQL VIEWs have a useful feature where INSTEAD OF triggers can be 
defined to divert INSERT/DELETE/UPDATE actions into an underlying table 
(or other location), creating the effect of a "writeable view" (and I 
believe in more recent PostgreSQL versions this is pretty much automatic).

However MATERIALIZED VIEWs are implemented differently (effectively as 
TABLE storage with a saved SQL query used to REFRESH them), and it 
appears (at least in 9.5, but AFAICT from documentation through 10) no 
TRIGGERs at all can be created on MATERIALIZED VIEWs.  This makes 
MATERIALIZED VIEWs difficult to substitute for VIEWs in some instances.

In the situation I'm trying to help with:
(a) there's a "history" table which has tens of millions of rows of 
accumulated daily updates;

(b) there's a "today view" on the history table which shows the current 
data via some date range checks on the history table, working out to 
about 200k rows in the view

(c) there's a daily batch task that ingests the latest data, which has a 
couple of hundred thousand rows of "daily updates", which for legacy 
reasons is done via (diverted by triggers) actions on the "today view"

and they've reported that over time their "dashboards" of reports 
against the "today view" have become slower, presumably as the 
underlying history table has grown.

Given that almost all of the changes happen in a daily batch update 
through an easily wrappable process (SFTP, process-on-upload), and the 
majority of the queries (but not all) are against the current data, 
turning that "today view" into a MATERIALIZED VIEW seems like an obvious 
win... except that it breaks the ability to use divert (INSTEAD OF) 
TRIGGERs which then means a larger application rewrite.

Unfortunately the dashboard/reporting front end now has the name of the 
VIEW hardcoded all over the place (changeable only by one-at-a-time GUI 
point and click :-( ) so pointing the reporting tool at a different 
MATERIALIZED VIEW does not seem like a realistic option.

Is it likely that TRIGGERs, especially INSTEAD OF triggers, would be 
supported on MATERIALIZED VIEWs in some later version of PostgreSQL in a 
similar manner to (non-MATERIALIZED) VIEWs?  Ideally 9.6/10, but even 
being on a todo list would be useful.

Alternatively does anyone have any idea of any minimal change 
rearrangement of TABLEs/VIEWs/TRIGGERs that might achieve the same 
effect without requiring much of the application to be rewritten?

The only one that comes to my mind is to make the "today view" an actual 
table, with AFTER TRIGGERs on it that mirrors the same action into the 
"history table" -- which would require recreating all the TRIGGERs, and 
the VIEWs that hook into that "daily" view, but otherwise superficially 
seems plausible.  It stores the data twice, but so does a MATERIALIZED 
VIEW, and the daily view data is relatively small.  (It would also 
require a audit for anything that manipulates the history table 
directly, but from a quick look I think most things rely on the existing 
INSTEAD OF triggers on the view....)

Ewen


PS: It also appears that INSTEAD OF triggers are not supported on 
TABLEs, although this has been discussed a few times before, eg

https://www.postgresql.org/message-id/1305138588.8811.3.camel@vanquo.pezone.net
https://www.postgresql.org/message-id/20150401160440.GD583%40awork2.anarazel.de

and this might be part of why they're not supported on MATERIALIZED VIEWs.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Divert triggers on materialized views

From
Brent Wood
Date:

Hi Ewen,

My advice would be to use table partitions... split your history table up into (maybe annual)? partitions, have these inherited into a parent table, which becomes your new "history table" (perhaps instead of a view?)

If times are a common component of a where clause, given the partitions (except for the current one) are static (no new data), you can also apply a clustered index on the static partitions, which reduces disk activity on sequential data reads.

We are running a reasonably effective database with approaching a biiliion records & twenty years of data using this approach.

There is also a fork of Postgres which is optimised for timeseries data, see: https://www.timescale.com/
Some of their optimisation is to build in automatic partitioning, much as described above.

You can use your existing normal view as well, which will only be on the "current" partition, so the overhead of a single large index & table is no longer an issue.


Cheers

Brent Wood

Programme leader: Environmental Information Delivery
NIWA
DDI:  +64 (4) 3860529


Brent Wood
Principal Technician - GIS and Spatial Data Management
Programme Leader - Environmental Information Delivery
+64-4-386-0529 | 301 Evans Bay Parade, Greta Point, Wellington | www.niwa.co.nz
NIWA
To ensure compliance with legal requirements and to maintain cyber security standards, NIWA's IT systems are subject to ongoing monitoring, activity logging and auditing. This monitoring and auditing service may be provided by third parties. Such third parties can access information transmitted to, processed by and stored on NIWA's IT systems.
________________________________________
From: pgsql-general-owner@postgresql.org [pgsql-general-owner@postgresql.org] on behalf of Ewen McNeill [pgsql@ewen.mcneill.gen.nz]
Sent: Wednesday, October 18, 2017 13:34
To: pgsql-general@postgresql.org
Subject: [GENERAL] Divert triggers on materialized views

PostgreSQL VIEWs have a useful feature where INSTEAD OF triggers can be
defined to divert INSERT/DELETE/UPDATE actions into an underlying table
(or other location), creating the effect of a "writeable view" (and I
believe in more recent PostgreSQL versions this is pretty much automatic).

However MATERIALIZED VIEWs are implemented differently (effectively as
TABLE storage with a saved SQL query used to REFRESH them), and it
appears (at least in 9.5, but AFAICT from documentation through 10) no
TRIGGERs at all can be created on MATERIALIZED VIEWs.  This makes
MATERIALIZED VIEWs difficult to substitute for VIEWs in some instances.

In the situation I'm trying to help with:
(a) there's a "history" table which has tens of millions of rows of
accumulated daily updates;

(b) there's a "today view" on the history table which shows the current
data via some date range checks on the history table, working out to
about 200k rows in the view

(c) there's a daily batch task that ingests the latest data, which has a
couple of hundred thousand rows of "daily updates", which for legacy
reasons is done via (diverted by triggers) actions on the "today view"

and they've reported that over time their "dashboards" of reports
against the "today view" have become slower, presumably as the
underlying history table has grown.

Given that almost all of the changes happen in a daily batch update
through an easily wrappable process (SFTP, process-on-upload), and the
majority of the queries (but not all) are against the current data,
turning that "today view" into a MATERIALIZED VIEW seems like an obvious
win... except that it breaks the ability to use divert (INSTEAD OF)
TRIGGERs which then means a larger application rewrite.

Unfortunately the dashboard/reporting front end now has the name of the
VIEW hardcoded all over the place (changeable only by one-at-a-time GUI
point and click :-( ) so pointing the reporting tool at a different
MATERIALIZED VIEW does not seem like a realistic option.

Is it likely that TRIGGERs, especially INSTEAD OF triggers, would be
supported on MATERIALIZED VIEWs in some later version of PostgreSQL in a
similar manner to (non-MATERIALIZED) VIEWs?  Ideally 9.6/10, but even
being on a todo list would be useful.

Alternatively does anyone have any idea of any minimal change
rearrangement of TABLEs/VIEWs/TRIGGERs that might achieve the same
effect without requiring much of the application to be rewritten?

The only one that comes to my mind is to make the "today view" an actual
table, with AFTER TRIGGERs on it that mirrors the same action into the
"history table" -- which would require recreating all the TRIGGERs, and
the VIEWs that hook into that "daily" view, but otherwise superficially
seems plausible.  It stores the data twice, but so does a MATERIALIZED
VIEW, and the daily view data is relatively small.  (It would also
require a audit for anything that manipulates the history table
directly, but from a quick look I think most things rely on the existing
INSTEAD OF triggers on the view....)

Ewen


PS: It also appears that INSTEAD OF triggers are not supported on
TABLEs, although this has been discussed a few times before, eg

https://www.postgresql.org/message-id/1305138588.8811.3.camel@vanquo.pezone.net
https://www.postgresql.org/message-id/20150401160440.GD583%40awork2.anarazel.de

and this might be part of why they're not supported on MATERIALIZED VIEWs.


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general