Transition relations: correlating OLD TABLE and NEW TABLE - Mailing list pgsql-hackers

From Brent Kerby
Subject Transition relations: correlating OLD TABLE and NEW TABLE
Date
Msg-id CAH8WVsjQ104O3-CNSeFo48ku-qWh0vFCzcvr+K93bax30voJFA@mail.gmail.com
Whole thread Raw
Responses Re: Transition relations: correlating OLD TABLE and NEW TABLE  ("David G. Johnston" <david.g.johnston@gmail.com>)
List pgsql-hackers
In a situation where we're using transition relations to capture changes after UPDATE statements, i.e., using a trigger of the form

CREATE TRIGGER trigger_name AFTER UPDATE ON table_name
REFERENCING OLD TABLE AS old_table NEW TABLE AS new_table
FOR EACH STATEMENT EXECUTE PROCEDURE procedure();

there seems to be a challenge in how to join `old_table` and `new_table` so that the old and new version of each row can be matched up with each other. Of course if the table has a primary key, then this can be used, but I'm wondering how to handle this in the general case where a primary key might not exist.

According to this blog (http://databasedoings.blogspot.com/2017/07/cool-stuff-in-postgresql-10-transition.html) it is possible to use ROW_NUMBER() OVER () to create a key to join the old and new tables, but this depends on an implementation detail (or at least, isn't documented?) that the rows will be returned in the same order for the two tables. Is it correct that this will work under the existing implementation? 

If there's not a clean way of matching up the old and new versions, could the transition relation mechanism be extended in order to make this possible? Here's a couple ideas:

1) A special system column could be added to the two transition relations, OLD TABLE, and NEW TABLE, providing a common value that could be used to join corresponding rows; it could be a sequential value (like what ROW_NUMBER() would generate), or it could be some other unique identifier for the row that is convenient for implementation. But there's some awkwardness in the fact that this special column name could clash with the columns in the table (unless an existing reserved name is used). Also, exposing a unique row identifier might restrict potential future implementations. 

2) Maybe a cleaner way would be to add a third kind of transition table, say, a "CHANGE TABLE", which could be used as an alternative to "OLD TABLE" and "NEW TABLE". A change table could contain just two columns, say 'old_row' and 'new_row', each of which have the appropriate record type. In this way, the old table and new table are essentially "pre-joined" in the transition table.

Would this be workable? Or is there some other way of achieving this? 

- Brent Kerby

pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Explain buffers wrong counter with parallel plans
Next
From: "David G. Johnston"
Date:
Subject: Re: Transition relations: correlating OLD TABLE and NEW TABLE