Re: Feature: triggers on materialized views - Mailing list pgsql-hackers
From | Mitar |
---|---|
Subject | Re: Feature: triggers on materialized views |
Date | |
Msg-id | CAKLmikNX84jxB3=4BMUtHP91XBSTiW8GY6mT-iP3P-m_LpYFWQ@mail.gmail.com Whole thread Raw |
In response to | Feature: triggers on materialized views (Mitar <mmitar@gmail.com>) |
Responses |
Re: Feature: triggers on materialized views
Re: Feature: triggers on materialized views |
List | pgsql-hackers |
Hi! I have made an updated version of the patch, added tests and documentation changes. This is my view now a complete patch. Please provide any feedback or comments you might have for me to improve the patch. I will also add it to commitfest. A summary of the patch: This patch enables adding AFTER triggers (both ROW and STATEMENT) on materialized views. They are fired when doing REFRESH MATERIALIZED VIEW CONCURRENTLY for rows which have changed. Triggers are not fired if you call REFRESH without CONCURRENTLY. This is based on some discussion on the mailing list because implementing it for without CONCURRENTLY would require us to add logic for firing triggers where there was none before (and is just an efficient heap swap). To be able to create a materialized view without data, specify triggers, and REFRESH CONCURRENTLY so that those triggers would be called also for initial data, I have tested and determined that there is no reason why REFRESH CONCURRENTLY could not be run on uninitialized materialized view. So I removed that check and things seem to just work. Including triggers being called for initial data. I think this makes REFRESH CONCURRENTLY have one less special case which is in general nicer. I have run tests and all old tests still succeed. I have added more tests for the new feature. I have run benchmark to evaluate the impact of me changing refresh_by_match_merge to do UPDATE instead of DELETE and INSERT for rows which were just updated. In fact it seems this improves performance slightly (4% in my benchmark, mean over 10 runs). I guess that this is because it is cheaper to just change one column's values (what benchmark is doing when changing rows) instead of removing and inserting the whole row. Because REFRESH MATERIALIZED VIEW CONCURRENTLY is meant to be used when not a lot of data has been changed anyway, I find this a pleasantly surprising additional improvement in this patch. I am attaching the benchmark script I have used. I compared the time of the final refresh query in the script. (I would love if pgbench could take a custom init script to run before the timed part of the script.) Mitar On Mon, Dec 24, 2018 at 12:59 PM Mitar <mmitar@gmail.com> wrote: > > Hi! > > Based on discussion about observing changes on an open query in a > reactive manner (to support reactive web applications) [1], I > identified that one critical feature is missing to fully implement > discussed design of having reactive queries be represented as > materialized views, and changes to these materialized views would then > be observed and pushed to the client through LISTEN/NOTIFY. > > This is my first time contributing to PostgreSQL, so I hope I am > starting this process well. > > I would like to propose that support for AFTER triggers are added to > materialized views. I experimented a bit and it seems this is mostly > just a question of enabling/exposing them. See attached patch. This > enabled me to add trigger to a material view which mostly worked. Here > are my findings. > > Running REFRESH MATERIALIZED VIEW CONCURRENTLY calls triggers. Both > per statement and per row. There are few improvements which could be > done: > > - Currently only insert and remove operations are done on the > materialized view. This is because the current logic just removes > changed rows and inserts new rows. > - In current concurrently refresh logic those insert and remove > operations are made even if there are no changes to be done. Which > triggers a statement trigger unnecessary. A small improvement could be > to skip the statement in that case, but looking at the code this seems > maybe tricky because both each of inserts and deletions are done > inside one query each. > - Current concurrently refresh logic does never do updates on existing > rows. It would be nicer to have that so that triggers are more aligned > with real changes to the data. So current two queries could be changed > to three, each doing one of the insert, update, and delete. > > Non-concurrent refresh does not trigger any trigger. But it seems all > data to do so is there (previous table, new table), at least for the > statement-level trigger. Row-level triggers could also be simulated > probably (with TRUNCATE and INSERT triggers). > > [1] https://www.postgresql.org/message-id/flat/CAKLmikP%2BPPB49z8rEEvRjFOD0D2DV72KdqYN7s9fjh9sM_32ZA%40mail.gmail.com > > > Mitar > > -- > http://mitar.tnode.com/ > https://twitter.com/mitar_m -- http://mitar.tnode.com/ https://twitter.com/mitar_m
Attachment
pgsql-hackers by date: