Thread: [GENERAL] Divert triggers on materialized views
PostgreSQL VIEWs have a useful feature where INSTEAD OF triggers can be defined to divert INSERT/DELETE/UPDATE actions into an underlying table (or other location), creating the effect of a "writeable view" (and I believe in more recent PostgreSQL versions this is pretty much automatic). However MATERIALIZED VIEWs are implemented differently (effectively as TABLE storage with a saved SQL query used to REFRESH them), and it appears (at least in 9.5, but AFAICT from documentation through 10) no TRIGGERs at all can be created on MATERIALIZED VIEWs. This makes MATERIALIZED VIEWs difficult to substitute for VIEWs in some instances. In the situation I'm trying to help with: (a) there's a "history" table which has tens of millions of rows of accumulated daily updates; (b) there's a "today view" on the history table which shows the current data via some date range checks on the history table, working out to about 200k rows in the view (c) there's a daily batch task that ingests the latest data, which has a couple of hundred thousand rows of "daily updates", which for legacy reasons is done via (diverted by triggers) actions on the "today view" and they've reported that over time their "dashboards" of reports against the "today view" have become slower, presumably as the underlying history table has grown. Given that almost all of the changes happen in a daily batch update through an easily wrappable process (SFTP, process-on-upload), and the majority of the queries (but not all) are against the current data, turning that "today view" into a MATERIALIZED VIEW seems like an obvious win... except that it breaks the ability to use divert (INSTEAD OF) TRIGGERs which then means a larger application rewrite. Unfortunately the dashboard/reporting front end now has the name of the VIEW hardcoded all over the place (changeable only by one-at-a-time GUI point and click :-( ) so pointing the reporting tool at a different MATERIALIZED VIEW does not seem like a realistic option. Is it likely that TRIGGERs, especially INSTEAD OF triggers, would be supported on MATERIALIZED VIEWs in some later version of PostgreSQL in a similar manner to (non-MATERIALIZED) VIEWs? Ideally 9.6/10, but even being on a todo list would be useful. Alternatively does anyone have any idea of any minimal change rearrangement of TABLEs/VIEWs/TRIGGERs that might achieve the same effect without requiring much of the application to be rewritten? The only one that comes to my mind is to make the "today view" an actual table, with AFTER TRIGGERs on it that mirrors the same action into the "history table" -- which would require recreating all the TRIGGERs, and the VIEWs that hook into that "daily" view, but otherwise superficially seems plausible. It stores the data twice, but so does a MATERIALIZED VIEW, and the daily view data is relatively small. (It would also require a audit for anything that manipulates the history table directly, but from a quick look I think most things rely on the existing INSTEAD OF triggers on the view....) Ewen PS: It also appears that INSTEAD OF triggers are not supported on TABLEs, although this has been discussed a few times before, eg https://www.postgresql.org/message-id/1305138588.8811.3.camel@vanquo.pezone.net https://www.postgresql.org/message-id/20150401160440.GD583%40awork2.anarazel.de and this might be part of why they're not supported on MATERIALIZED VIEWs. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Hi Ewen,
My advice would be to use table partitions... split your history table up into (maybe annual)? partitions, have these inherited into a parent table, which becomes your new "history table" (perhaps instead of a view?)
If times are a common component of a where clause, given the partitions (except for the current one) are static (no new data), you can also apply a clustered index on the static partitions, which reduces disk activity on sequential data reads.
We are running a reasonably effective database with approaching a biiliion records & twenty years of data using this approach.
There is also a fork of Postgres which is optimised for timeseries data, see: https://www.timescale.com/
Some of their optimisation is to build in automatic partitioning, much as described above.
You can use your existing normal view as well, which will only be on the "current" partition, so the overhead of a single large index & table is no longer an issue.
Cheers
Brent Wood
Programme leader: Environmental Information Delivery
NIWA
DDI: +64 (4) 3860529
My advice would be to use table partitions... split your history table up into (maybe annual)? partitions, have these inherited into a parent table, which becomes your new "history table" (perhaps instead of a view?)
If times are a common component of a where clause, given the partitions (except for the current one) are static (no new data), you can also apply a clustered index on the static partitions, which reduces disk activity on sequential data reads.
We are running a reasonably effective database with approaching a biiliion records & twenty years of data using this approach.
There is also a fork of Postgres which is optimised for timeseries data, see: https://www.timescale.com/
Some of their optimisation is to build in automatic partitioning, much as described above.
You can use your existing normal view as well, which will only be on the "current" partition, so the overhead of a single large index & table is no longer an issue.
Cheers
Brent Wood
Programme leader: Environmental Information Delivery
NIWA
DDI: +64 (4) 3860529
Brent Wood |
Principal Technician - GIS and Spatial Data Management Programme Leader - Environmental Information Delivery |
+64-4-386-0529 | 301 Evans Bay Parade, Greta Point, Wellington | www.niwa.co.nz |
To ensure compliance with legal requirements and to maintain cyber security standards, NIWA's IT systems are subject to ongoing monitoring, activity logging and auditing. This monitoring and auditing service may be provided by third parties. Such third parties can access information transmitted to, processed by and stored on NIWA's IT systems. |
________________________________________
From: pgsql-general-owner@postgresql.org [pgsql-general-owner@postgresql.org] on behalf of Ewen McNeill [pgsql@ewen.mcneill.gen.nz]
Sent: Wednesday, October 18, 2017 13:34
To: pgsql-general@postgresql.org
Subject: [GENERAL] Divert triggers on materialized views
PostgreSQL VIEWs have a useful feature where INSTEAD OF triggers can be
defined to divert INSERT/DELETE/UPDATE actions into an underlying table
(or other location), creating the effect of a "writeable view" (and I
believe in more recent PostgreSQL versions this is pretty much automatic).
However MATERIALIZED VIEWs are implemented differently (effectively as
TABLE storage with a saved SQL query used to REFRESH them), and it
appears (at least in 9.5, but AFAICT from documentation through 10) no
TRIGGERs at all can be created on MATERIALIZED VIEWs. This makes
MATERIALIZED VIEWs difficult to substitute for VIEWs in some instances.
In the situation I'm trying to help with:
(a) there's a "history" table which has tens of millions of rows of
accumulated daily updates;
(b) there's a "today view" on the history table which shows the current
data via some date range checks on the history table, working out to
about 200k rows in the view
(c) there's a daily batch task that ingests the latest data, which has a
couple of hundred thousand rows of "daily updates", which for legacy
reasons is done via (diverted by triggers) actions on the "today view"
and they've reported that over time their "dashboards" of reports
against the "today view" have become slower, presumably as the
underlying history table has grown.
Given that almost all of the changes happen in a daily batch update
through an easily wrappable process (SFTP, process-on-upload), and the
majority of the queries (but not all) are against the current data,
turning that "today view" into a MATERIALIZED VIEW seems like an obvious
win... except that it breaks the ability to use divert (INSTEAD OF)
TRIGGERs which then means a larger application rewrite.
Unfortunately the dashboard/reporting front end now has the name of the
VIEW hardcoded all over the place (changeable only by one-at-a-time GUI
point and click :-( ) so pointing the reporting tool at a different
MATERIALIZED VIEW does not seem like a realistic option.
Is it likely that TRIGGERs, especially INSTEAD OF triggers, would be
supported on MATERIALIZED VIEWs in some later version of PostgreSQL in a
similar manner to (non-MATERIALIZED) VIEWs? Ideally 9.6/10, but even
being on a todo list would be useful.
Alternatively does anyone have any idea of any minimal change
rearrangement of TABLEs/VIEWs/TRIGGERs that might achieve the same
effect without requiring much of the application to be rewritten?
The only one that comes to my mind is to make the "today view" an actual
table, with AFTER TRIGGERs on it that mirrors the same action into the
"history table" -- which would require recreating all the TRIGGERs, and
the VIEWs that hook into that "daily" view, but otherwise superficially
seems plausible. It stores the data twice, but so does a MATERIALIZED
VIEW, and the daily view data is relatively small. (It would also
require a audit for anything that manipulates the history table
directly, but from a quick look I think most things rely on the existing
INSTEAD OF triggers on the view....)
Ewen
PS: It also appears that INSTEAD OF triggers are not supported on
TABLEs, although this has been discussed a few times before, eg
https://www.postgresql.org/message-id/1305138588.8811.3.camel@vanquo.pezone.net
https://www.postgresql.org/message-id/20150401160440.GD583%40awork2.anarazel.de
and this might be part of why they're not supported on MATERIALIZED VIEWs.
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
From: pgsql-general-owner@postgresql.org [pgsql-general-owner@postgresql.org] on behalf of Ewen McNeill [pgsql@ewen.mcneill.gen.nz]
Sent: Wednesday, October 18, 2017 13:34
To: pgsql-general@postgresql.org
Subject: [GENERAL] Divert triggers on materialized views
PostgreSQL VIEWs have a useful feature where INSTEAD OF triggers can be
defined to divert INSERT/DELETE/UPDATE actions into an underlying table
(or other location), creating the effect of a "writeable view" (and I
believe in more recent PostgreSQL versions this is pretty much automatic).
However MATERIALIZED VIEWs are implemented differently (effectively as
TABLE storage with a saved SQL query used to REFRESH them), and it
appears (at least in 9.5, but AFAICT from documentation through 10) no
TRIGGERs at all can be created on MATERIALIZED VIEWs. This makes
MATERIALIZED VIEWs difficult to substitute for VIEWs in some instances.
In the situation I'm trying to help with:
(a) there's a "history" table which has tens of millions of rows of
accumulated daily updates;
(b) there's a "today view" on the history table which shows the current
data via some date range checks on the history table, working out to
about 200k rows in the view
(c) there's a daily batch task that ingests the latest data, which has a
couple of hundred thousand rows of "daily updates", which for legacy
reasons is done via (diverted by triggers) actions on the "today view"
and they've reported that over time their "dashboards" of reports
against the "today view" have become slower, presumably as the
underlying history table has grown.
Given that almost all of the changes happen in a daily batch update
through an easily wrappable process (SFTP, process-on-upload), and the
majority of the queries (but not all) are against the current data,
turning that "today view" into a MATERIALIZED VIEW seems like an obvious
win... except that it breaks the ability to use divert (INSTEAD OF)
TRIGGERs which then means a larger application rewrite.
Unfortunately the dashboard/reporting front end now has the name of the
VIEW hardcoded all over the place (changeable only by one-at-a-time GUI
point and click :-( ) so pointing the reporting tool at a different
MATERIALIZED VIEW does not seem like a realistic option.
Is it likely that TRIGGERs, especially INSTEAD OF triggers, would be
supported on MATERIALIZED VIEWs in some later version of PostgreSQL in a
similar manner to (non-MATERIALIZED) VIEWs? Ideally 9.6/10, but even
being on a todo list would be useful.
Alternatively does anyone have any idea of any minimal change
rearrangement of TABLEs/VIEWs/TRIGGERs that might achieve the same
effect without requiring much of the application to be rewritten?
The only one that comes to my mind is to make the "today view" an actual
table, with AFTER TRIGGERs on it that mirrors the same action into the
"history table" -- which would require recreating all the TRIGGERs, and
the VIEWs that hook into that "daily" view, but otherwise superficially
seems plausible. It stores the data twice, but so does a MATERIALIZED
VIEW, and the daily view data is relatively small. (It would also
require a audit for anything that manipulates the history table
directly, but from a quick look I think most things rely on the existing
INSTEAD OF triggers on the view....)
Ewen
PS: It also appears that INSTEAD OF triggers are not supported on
TABLEs, although this has been discussed a few times before, eg
https://www.postgresql.org/message-id/1305138588.8811.3.camel@vanquo.pezone.net
https://www.postgresql.org/message-id/20150401160440.GD583%40awork2.anarazel.de
and this might be part of why they're not supported on MATERIALIZED VIEWs.
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general