Thread: Using a TRIGGER with window functions.
Hi all, I have a rather contrived example, but I just can't get this to work. A fiddle with all my code is available here: https://dbfiddle.uk/?rdbms=postgres_12&fiddle=9a89ea79ffc617a11c36d63123d2f987 CREATE TABLE t1 ( user_id SMALLINT NOT NULL, activity_id SMALLINT NOT NULL, sort_order SMALLINT NULL, CONSTRAINT t1_uid_aid_uq UNIQUE (user_id, activity_id) ); Populate: INSERT INTO t1 VALUES (101, 123), (101, 124), (101, 125), (102, 123), (102, 124); Now, I want to update the sort_order column as follows: -- -- DESIRED RESULT -- -- 101, 123, 0 -- 101, 124, 1 -- 101, 125, 2 -- 102, 123, 0 -- 102, 124, 1 -- So, as you can see, I want to update the sort_order column, starting from 0 So, I tried this, UPDATE t1 SET sort_order = activity_id - FIRST_VALUE(activity_id) OVER (PARTITION BY user_id ORDER BY user_id, activity_id); and this gives ERROR: window functions are not allowed in UPDATE LINE 2: SET sort_order = activity_id - FIRST_VALUE(activity_id) Why are window functions now allowed in UPDATEs - would be very handy to have? I also tried putting it into a computed/generated column, but got this error (see fiddle) ERROR: window functions are not allowed in column generation expressions LINE 8: activity_id - FIRST_VALUE(activity_id) Why aren't window functions allowed in GENERATED columns? Seems that it would be pretty useful functionality to me and would be a trivial solution for this issue? Got it to work using a CTE as follows. WITH cte AS ( SELECT user_id, activity_id, activity_id - FIRST_VALUE(activity_id) OVER (PARTITION BY user_id ORDER BY user_id, activity_id) AS so FROM t1 ) UPDATE t1 SET sort_order = cte.so FROM cte WHERE t1.user_id = cte.user_id AND t1.activity_id = cte.activity_id; So, that's all well and good! But what happens when I want to add records and keep the sort_order in sync with my user_id and activity_id? So, I'm trying triggers. I run INSERT INTO t1 VALUES (102, 125); and am desperately trying to get a trigger to suitably update my sort_order field. I've tried various solutions - using INSERT/UPDATE and BEFORE/AFTER INSERT - can't seem to get anything to work. I think that my underlying understanding of triggers is flawed, so I wonder if anybody could be so kind as to explain to me where I'm going wrong. Sample (failing) trigger (see fiddle). CREATE OR REPLACE FUNCTION update_t1_sort_order() RETURNS TRIGGER LANGUAGE PLPGSQL AS $$ BEGIN WITH cte AS ( SELECT NEW.user_id, NEW.activity_id, NEW.activity_id - FIRST_VALUE(NEW.activity_id) OVER (PARTITION BY NEW.user_id ORDER BY NEW.user_id, NEW.activity_id) AS so ) INSERT INTO t1 (user_id, activity_id, sort_order) VALUES (NEW.user_id, NEW.activity_id, (SELECT so FROM cte)); RETURN NEW; END; $$ -- see fiddle... CREATE TRIGGER sort_order_tr BEFORE INSERT -- tried AFTER INSERT and various combinations of UPDATE also... nada! ON t1 FOR EACH ROW EXECUTE PROCEDURE update_t1_sort_order(); Either I go into an infinite loop or I make all my sort_orders 0 or NULL or (best result) ther others untouched and my insert sort_order = 0, I leave the others alone but the newly inserted value has a sort order of 0. As I said, the example is contrived - basically, I'd like to learn more about triggers and the logic that makes them "tick". If triggers aren't the ideal solution, I'm open to other avenues, but would like to understand where I'm going wrong! TIA and rgs, Pól...
If you want to update many rows after updating N rows, you want an after STATEMENT trigger which is executed after insert, update or delete. You also want to ensure that only that function maintains sort_order field and that you don't update recursively, perhaps by executing that function when NEW.sort_order IS NOT DISTINCT FROM old.sort_order to prevent an infinite loop, and executing a different function when NEW.sort_order IS DISTINCT FROM OLD.sort_order such that you ensure all other fields have not changed.
By the way, your window function could be row_number() - 1 instead of activity_id - FIRST_VALUE(activity_id).
On Sun, Aug 15, 2021 at 1:24 AM Pól Ua Laoínecháin <linehanp@tcd.ie> wrote:
ERROR: window functions are not allowed in UPDATE LINE 2: SET
sort_order = activity_id - FIRST_VALUE(activity_id)
Why are window functions now allowed in UPDATEs
You can get it to work via a subquery/FROM clause computation. That it doesn't work directly in the SET clause I don't know off-hand, but most likely the development and runtime cost of making it work isn't worth the benefit.
Why aren't window functions allowed in GENERATED columns?
Because the expressions allowed in GENERATED can only immutably reference other columns in the same row. The underlying rationale is probably quite similar to the UPDATE comment above.
INSERT INTO t1 (user_id, activity_id, sort_order)
VALUES (NEW.user_id, NEW.activity_id, (SELECT so FROM cte));
Yes, an insert trigger that performs a literal insert into the same table is fundamentally broken due to exhibiting infinite loop behavior. Same goes for update/delete - during trigger execution you are already in the middle of doing the required thing.
If triggers aren't the ideal solution, I'm open to other avenues
Off the top of my head - I'd have a second table that is dedicated to dealing with ordering. It would have a one-to-one relationship with the main table. Upon executing DML against the main table, ideally for a single user_id at a time, remove all of the records from the ordering table for that user_id and then insert them anew. I would probably do this within functions and disallow direct access to the main and ordering tables generally - but triggers can probably be made to work.
Or just discard the idea of pre-computing this data and wrap the ordering logic in a view.
David J.
"David G. Johnston" <david.g.johnston@gmail.com> writes: > On Sun, Aug 15, 2021 at 1:24 AM Pól Ua Laoínecháin <linehanp@tcd.ie> wrote: >> Why are window functions now allowed in UPDATEs > You can get it to work via a subquery/FROM clause computation. That it > doesn't work directly in the SET clause I don't know off-hand, but most > likely the development and runtime cost of making it work isn't worth the > benefit. I suspect the error check was just copied from the aggregate-function case. It's clear why we can't put aggregates in UPDATE: there'd no longer be a one-to-one correspondence with original rows. But that argument doesn't hold for window functions, so at least in principle it seems like we could allow it. The utility doesn't seem very high though, so if it takes more work than "delete the error check" I'm not sure anyone will care to bother. >> Why aren't window functions allowed in GENERATED columns? > Because the expressions allowed in GENERATED can only immutably reference > other columns in the same row. The underlying rationale is probably quite > similar to the UPDATE comment above. Yeah --- GENERATED is supposed to be immutable, and there's about 0% chance that anything involving a subquery or window function would really be immutable. I think there are implementation issues too, but they're not worth getting into given that point. regards, tom lane
I wrote: > "David G. Johnston" <david.g.johnston@gmail.com> writes: >> On Sun, Aug 15, 2021 at 1:24 AM Pól Ua Laoínecháin <linehanp@tcd.ie> wrote: >>> Why are window functions now allowed in UPDATEs > I suspect the error check was just copied from the aggregate-function > case. It's clear why we can't put aggregates in UPDATE: there'd no > longer be a one-to-one correspondence with original rows. But that > argument doesn't hold for window functions, so at least in principle > it seems like we could allow it. The utility doesn't seem very high > though, so if it takes more work than "delete the error check" I'm > not sure anyone will care to bother. Out of curiosity, I did spend a few minutes poking at this, and as I feared it's not all that trivial. I think that the planner and executor might mostly Just Work, but there are various gaps in the parser. One interesting point is that the UPDATE syntax has no provision for a WINDOW clause, so there'd be no way to share window definitions across different window functions. While that's not exactly a deal-breaker, it'd be weird compared to how things work in SELECT. Would we be willing to go to the trouble of adding such a clause to the syntax? I dunno; it'd certainly enlarge the footprint of a patch for this by a lot. regards, tom lane
Hi all, > If you want to update many rows after updating N rows, Thanks to those who helped me with this - I think that Michael Lewis has given me some good ideas on how I should go about this. Also, thanks to those who took the time to explain why some of my ideas were not well grounded (wouldn't be the first time! :-) ). Rgs, Pól...