Re: Feature: triggers on materialized views - Mailing list pgsql-hackers

From Mitar
Subject Re: Feature: triggers on materialized views
Date
Msg-id CAKLmikOk9dxMh4krFH40-cO8pfmXcQmLyRAxVHKa3GJ2j3fFZA@mail.gmail.com
Whole thread Raw
In response to Re: Feature: triggers on materialized views  (Mitar <mmitar@gmail.com>)
Responses Re: Feature: triggers on materialized views
List pgsql-hackers
Hi!

One more version of the patch with slightly more deterministic tests.


Mitar

On Thu, Dec 27, 2018 at 11:43 PM Mitar <mmitar@gmail.com> wrote:
>
> Hi!
>
> I have made an updated version of the patch, added tests and
> documentation changes. This is my view now a complete patch. Please
> provide any feedback or comments you might have for me to improve the
> patch. I will also add it to commitfest.
>
> A summary of the patch: This patch enables adding AFTER triggers (both
> ROW and STATEMENT) on materialized views. They are fired when doing
> REFRESH MATERIALIZED VIEW CONCURRENTLY for rows which have changed.
> Triggers are not fired if you call REFRESH without CONCURRENTLY. This
> is based on some discussion on the mailing list because implementing
> it for without CONCURRENTLY would require us to add logic for firing
> triggers where there was none before (and is just an efficient heap
> swap).
>
> To be able to create a materialized view without data, specify
> triggers, and REFRESH CONCURRENTLY so that those triggers would be
> called also for initial data, I have tested and determined that there
> is no reason why REFRESH CONCURRENTLY could not be run on
> uninitialized materialized view. So I removed that check and things
> seem to just work. Including triggers being called for initial data. I
> think this makes REFRESH CONCURRENTLY have one less special case which
> is in general nicer.
>
> I have run tests and all old tests still succeed. I have added more
> tests for the new feature.
>
> I have run benchmark to evaluate the impact of me changing
> refresh_by_match_merge to do UPDATE instead of DELETE and INSERT for
> rows which were just updated. In fact it seems this improves
> performance slightly (4% in my benchmark, mean over 10 runs). I guess
> that this is because it is cheaper to just change one column's values
> (what benchmark is doing when changing rows) instead of removing and
> inserting the whole row. Because REFRESH MATERIALIZED VIEW
> CONCURRENTLY is meant to be used when not a lot of data has been
> changed anyway, I find this a pleasantly surprising additional
> improvement in this patch. I am attaching the benchmark script I have
> used. I compared the time of the final refresh query in the script. (I
> would love if pgbench could take a custom init script to run before
> the timed part of the script.)
>
>
> Mitar
>
> On Mon, Dec 24, 2018 at 12:59 PM Mitar <mmitar@gmail.com> wrote:
> >
> > Hi!
> >
> > Based on discussion about observing changes on an open query in a
> > reactive manner (to support reactive web applications) [1], I
> > identified that one critical feature is missing to fully implement
> > discussed design of having reactive queries be represented as
> > materialized views, and changes to these materialized views would then
> > be observed and pushed to the client through LISTEN/NOTIFY.
> >
> > This is my first time contributing to PostgreSQL, so I hope I am
> > starting this process well.
> >
> > I would like to propose that support for AFTER triggers are added to
> > materialized views. I experimented a bit and it seems this is mostly
> > just a question of enabling/exposing them. See attached patch. This
> > enabled me to add trigger to a material view which mostly worked. Here
> > are my findings.
> >
> > Running REFRESH MATERIALIZED VIEW CONCURRENTLY calls triggers. Both
> > per statement and per row. There are few improvements which could be
> > done:
> >
> > - Currently only insert and remove operations are done on the
> > materialized view. This is because the current logic just removes
> > changed rows and inserts new rows.
> > - In current concurrently refresh logic those insert and remove
> > operations are made even if there are no changes to be done. Which
> > triggers a statement trigger unnecessary. A small improvement could be
> > to skip the statement in that case, but looking at the code this seems
> > maybe tricky because both each of inserts and deletions are done
> > inside one query each.
> > - Current concurrently refresh logic does never do updates on existing
> > rows. It would be nicer to have that so that triggers are more aligned
> > with real changes to the data. So current two queries could be changed
> > to three, each doing one of the insert, update, and delete.
> >
> > Non-concurrent refresh does not trigger any trigger. But it seems all
> > data to do so is there (previous table, new table), at least for the
> > statement-level trigger. Row-level triggers could also be simulated
> > probably (with TRUNCATE and INSERT triggers).
> >
> > [1]
https://www.postgresql.org/message-id/flat/CAKLmikP%2BPPB49z8rEEvRjFOD0D2DV72KdqYN7s9fjh9sM_32ZA%40mail.gmail.com
> >
> >
> > Mitar
> >
> > --
> > http://mitar.tnode.com/
> > https://twitter.com/mitar_m
>
>
>
> --
> http://mitar.tnode.com/
> https://twitter.com/mitar_m



-- 
http://mitar.tnode.com/
https://twitter.com/mitar_m

Attachment

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: plpgsql plugin - stmt_beg/end is not called for top level blockof statements
Next
From: Mitar
Date:
Subject: Re: Feature: triggers on materialized views