Re: Feature: triggers on materialized views - Mailing list pgsql-hackers
From | Mitar |
---|---|
Subject | Re: Feature: triggers on materialized views |
Date | |
Msg-id | CAKLmikOk9dxMh4krFH40-cO8pfmXcQmLyRAxVHKa3GJ2j3fFZA@mail.gmail.com Whole thread Raw |
In response to | Re: Feature: triggers on materialized views (Mitar <mmitar@gmail.com>) |
Responses |
Re: Feature: triggers on materialized views
|
List | pgsql-hackers |
Hi! One more version of the patch with slightly more deterministic tests. Mitar On Thu, Dec 27, 2018 at 11:43 PM Mitar <mmitar@gmail.com> wrote: > > Hi! > > I have made an updated version of the patch, added tests and > documentation changes. This is my view now a complete patch. Please > provide any feedback or comments you might have for me to improve the > patch. I will also add it to commitfest. > > A summary of the patch: This patch enables adding AFTER triggers (both > ROW and STATEMENT) on materialized views. They are fired when doing > REFRESH MATERIALIZED VIEW CONCURRENTLY for rows which have changed. > Triggers are not fired if you call REFRESH without CONCURRENTLY. This > is based on some discussion on the mailing list because implementing > it for without CONCURRENTLY would require us to add logic for firing > triggers where there was none before (and is just an efficient heap > swap). > > To be able to create a materialized view without data, specify > triggers, and REFRESH CONCURRENTLY so that those triggers would be > called also for initial data, I have tested and determined that there > is no reason why REFRESH CONCURRENTLY could not be run on > uninitialized materialized view. So I removed that check and things > seem to just work. Including triggers being called for initial data. I > think this makes REFRESH CONCURRENTLY have one less special case which > is in general nicer. > > I have run tests and all old tests still succeed. I have added more > tests for the new feature. > > I have run benchmark to evaluate the impact of me changing > refresh_by_match_merge to do UPDATE instead of DELETE and INSERT for > rows which were just updated. In fact it seems this improves > performance slightly (4% in my benchmark, mean over 10 runs). I guess > that this is because it is cheaper to just change one column's values > (what benchmark is doing when changing rows) instead of removing and > inserting the whole row. Because REFRESH MATERIALIZED VIEW > CONCURRENTLY is meant to be used when not a lot of data has been > changed anyway, I find this a pleasantly surprising additional > improvement in this patch. I am attaching the benchmark script I have > used. I compared the time of the final refresh query in the script. (I > would love if pgbench could take a custom init script to run before > the timed part of the script.) > > > Mitar > > On Mon, Dec 24, 2018 at 12:59 PM Mitar <mmitar@gmail.com> wrote: > > > > Hi! > > > > Based on discussion about observing changes on an open query in a > > reactive manner (to support reactive web applications) [1], I > > identified that one critical feature is missing to fully implement > > discussed design of having reactive queries be represented as > > materialized views, and changes to these materialized views would then > > be observed and pushed to the client through LISTEN/NOTIFY. > > > > This is my first time contributing to PostgreSQL, so I hope I am > > starting this process well. > > > > I would like to propose that support for AFTER triggers are added to > > materialized views. I experimented a bit and it seems this is mostly > > just a question of enabling/exposing them. See attached patch. This > > enabled me to add trigger to a material view which mostly worked. Here > > are my findings. > > > > Running REFRESH MATERIALIZED VIEW CONCURRENTLY calls triggers. Both > > per statement and per row. There are few improvements which could be > > done: > > > > - Currently only insert and remove operations are done on the > > materialized view. This is because the current logic just removes > > changed rows and inserts new rows. > > - In current concurrently refresh logic those insert and remove > > operations are made even if there are no changes to be done. Which > > triggers a statement trigger unnecessary. A small improvement could be > > to skip the statement in that case, but looking at the code this seems > > maybe tricky because both each of inserts and deletions are done > > inside one query each. > > - Current concurrently refresh logic does never do updates on existing > > rows. It would be nicer to have that so that triggers are more aligned > > with real changes to the data. So current two queries could be changed > > to three, each doing one of the insert, update, and delete. > > > > Non-concurrent refresh does not trigger any trigger. But it seems all > > data to do so is there (previous table, new table), at least for the > > statement-level trigger. Row-level triggers could also be simulated > > probably (with TRUNCATE and INSERT triggers). > > > > [1] https://www.postgresql.org/message-id/flat/CAKLmikP%2BPPB49z8rEEvRjFOD0D2DV72KdqYN7s9fjh9sM_32ZA%40mail.gmail.com > > > > > > Mitar > > > > -- > > http://mitar.tnode.com/ > > https://twitter.com/mitar_m > > > > -- > http://mitar.tnode.com/ > https://twitter.com/mitar_m -- http://mitar.tnode.com/ https://twitter.com/mitar_m
Attachment
pgsql-hackers by date: