Thread: How to check whether the row was modified by this transaction before?
In a BEFORE UPDATE trigger I need to know whether the row was previously modified by this transaction. Is it safe to use xmin and txid_current() for this purpose (xmin is 32-bit txid type but txid_current() returns 64-bit bigint). CREATE FUNCTION test_trigger() RETURNS TRIGGER AS $$ BEGIN IF OLD.xmin = txid_current() THEN -- Do something. ELSE -- Do something else. END IF; END; $$ LANGUAGE plpgsql;
On Thu, Dec 6, 2012 at 3:58 AM, Vlad Arkhipov <arhipov@dc.baikal.ru> wrote: > In a BEFORE UPDATE trigger I need to know whether the row was previously > modified by this transaction. Is it safe to use xmin and txid_current() for > this purpose (xmin is 32-bit txid type but txid_current() returns 64-bit > bigint). > > CREATE FUNCTION test_trigger() > RETURNS TRIGGER AS $$ > BEGIN > IF OLD.xmin = txid_current() THEN > -- Do something. > ELSE > -- Do something else. > END IF; > END; > $$ LANGUAGE plpgsql; txid_current() will return a different value from xmin after the XID space has wrapped around at least once; also, you might need to consider subtransactions. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Vlad Arkhipov <arhipov@dc.baikal.ru> writes: > In a BEFORE UPDATE trigger I need to know whether the row was previously > modified by this transaction. Is it safe to use xmin and txid_current() > for this purpose (xmin is 32-bit txid type but txid_current() returns > 64-bit bigint). > IF OLD.xmin = txid_current() THEN Comparing to txid_current() mod 2^32 would probably work, but note this will not think that subtransactions or parent transactions are "this transaction", so any use of savepoints or plpgsql exception blocks is likely to cause headaches. Why do you think you need to know this? regards, tom lane
Tom Lane wrote: > Vlad Arkhipov <arhipov@dc.baikal.ru> writes: > > In a BEFORE UPDATE trigger I need to know whether the row was previously > > modified by this transaction. Is it safe to use xmin and txid_current() > > for this purpose (xmin is 32-bit txid type but txid_current() returns > > 64-bit bigint). > > > IF OLD.xmin = txid_current() THEN > > Comparing to txid_current() mod 2^32 would probably work, I think we should be setting the initial epoch to something other than zero. That way, some quick testing would have revealed this problem immediately. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > I think we should be setting the initial epoch to something other than > zero. That way, some quick testing would have revealed this problem > immediately. Yeah, having initdb start the epoch at 1 doesn't seem unreasonable. regards, tom lane
On 2012-12-06 15:08:51 -0300, Alvaro Herrera wrote: > Tom Lane wrote: > > Vlad Arkhipov <arhipov@dc.baikal.ru> writes: > > > In a BEFORE UPDATE trigger I need to know whether the row was previously > > > modified by this transaction. Is it safe to use xmin and txid_current() > > > for this purpose (xmin is 32-bit txid type but txid_current() returns > > > 64-bit bigint). I wonder if we shouldn't have a function txid_is_current(xid); I could have used that previously to avoid queuing multiple external cache-invalidations if something gets repeatedly updated in the same transaction. And I seem to remember some people here asking this question before on the lists. > > > > > IF OLD.xmin = txid_current() THEN > > > > Comparing to txid_current() mod 2^32 would probably work, > > I think we should be setting the initial epoch to something other than > zero. That way, some quick testing would have revealed this problem > immediately. +1, currently the difference of xid vs bigint is hard to spot. Greetings, Andres Freund --Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > On 2012-12-06 15:08:51 -0300, Alvaro Herrera wrote: >> Vlad Arkhipov <arhipov@dc.baikal.ru> writes: >>> In a BEFORE UPDATE trigger I need to know whether the row was previously >>> modified by this transaction. Is it safe to use xmin and txid_current() >>> for this purpose (xmin is 32-bit txid type but txid_current() returns >>> 64-bit bigint). > I wonder if we shouldn't have a function txid_is_current(xid); Yeah, I was wondering that too, and wanted to know if the OP had a use-case that was mainstream enough to justify adding such a function. regards, tom lane
On 2012-12-06 13:59:32 -0500, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > On 2012-12-06 15:08:51 -0300, Alvaro Herrera wrote: > >> Vlad Arkhipov <arhipov@dc.baikal.ru> writes: > >>> In a BEFORE UPDATE trigger I need to know whether the row was previously > >>> modified by this transaction. Is it safe to use xmin and txid_current() > >>> for this purpose (xmin is 32-bit txid type but txid_current() returns > >>> 64-bit bigint). > > > I wonder if we shouldn't have a function txid_is_current(xid); > > Yeah, I was wondering that too, and wanted to know if the OP had a > use-case that was mainstream enough to justify adding such a function. I think architectures with an invalidation-queue to external caches (be it web-proxies or something lower-level) are quite popular. And with the new NOTIFY or pgq relatively simple. Ad to those its sensible not to post a single primary key more than once. Magnus had talks about specifically that on various conferences if that counts as anything ;) Mainstreamy enough? Andres --Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > On 2012-12-06 13:59:32 -0500, Tom Lane wrote: >> Andres Freund <andres@2ndquadrant.com> writes: >>> On 2012-12-06 15:08:51 -0300, Alvaro Herrera wrote: >>> Vlad Arkhipov <arhipov@dc.baikal.ru> writes: >>>> In a BEFORE UPDATE trigger I need to know whether the row was previously >>>> modified by this transaction. Is it safe to use xmin and txid_current() >>>> for this purpose (xmin is 32-bit txid type but txid_current() returns >>>> 64-bit bigint). >>> I wonder if we shouldn't have a function txid_is_current(xid); > I think architectures with an invalidation-queue to external caches (be > it web-proxies or something lower-level) are quite popular. And with the > new NOTIFY or pgq relatively simple. Ad to those its sensible not to > post a single primary key more than once. It strikes me that the notion of "this row was previously modified by the current transaction" is squishier than it might look, and we'd do well to clarify it before we consider exporting anything. I think there are three ways you might define such a function: 1. xmin is exactly equal to current (sub)transaction's XID. 2. xmin is this (sub)transaction's XID, or the XID of any subcommitted subtransaction of it. 3. xmin is this (sub)transaction's XID, or the XID of any subcommitted subtransaction, or the XID of any open parent transaction or subcommitted subtransaction thereof. If I've got my head screwed on straight, test #2 gives you the semantics that "the previous row update cannot commit unless the action you are about to take (with the current XID) commits". Test #3 gives you the semantics "the action you are about to take cannot commit unless the previous row update does". And test #1 doesn't have much to recommend it except simplicity; while it might appear to have the semantics "the previous row update will commit if and only if the action you are about to take commits", it's actually narrower than that, because the same could be said for already-subcommitted subtransactions. In a cache invalidation context you probably want test #2, but TransactionIdIsCurrentTransactionId() presently performs test #3. (We discussed this point in connection with commit 7b90469b, and I've been intending since then to take a closer look at all the callers of TransactionIdIsCurrentTransactionId to see if these semantics are in fact what they need. We might have some bugs associated with confusion on this.) I'm not sure which of these semantics we might wish to expose to users. regards, tom lane
Re: Re: How to check whether the row was modified by this transaction before?
From
Andres Freund
Date:
On 2012-12-06 16:26:50 -0500, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > On 2012-12-06 13:59:32 -0500, Tom Lane wrote: > >> Andres Freund <andres@2ndquadrant.com> writes: > >>> On 2012-12-06 15:08:51 -0300, Alvaro Herrera wrote: > >>> Vlad Arkhipov <arhipov@dc.baikal.ru> writes: > >>>> In a BEFORE UPDATE trigger I need to know whether the row was previously > >>>> modified by this transaction. Is it safe to use xmin and txid_current() > >>>> for this purpose (xmin is 32-bit txid type but txid_current() returns > >>>> 64-bit bigint). > > >>> I wonder if we shouldn't have a function txid_is_current(xid); > > > I think architectures with an invalidation-queue to external caches (be > > it web-proxies or something lower-level) are quite popular. And with the > > new NOTIFY or pgq relatively simple. Ad to those its sensible not to > > post a single primary key more than once. > > It strikes me that the notion of "this row was previously modified by > the current transaction" is squishier than it might look, and we'd do > well to clarify it before we consider exporting anything. You're right. I am also wondering if we can assume that for all interesting purposes enough context is available to determine if the row was just inserted or updated or whether we need to make a test for HEAP_UPDATED available in some form. > I think there > are three ways you might define such a function: > > 1. xmin is exactly equal to current (sub)transaction's XID. > > 2. xmin is this (sub)transaction's XID, or the XID of any subcommitted > subtransaction of it. > > 3. xmin is this (sub)transaction's XID, or the XID of any subcommitted > subtransaction, or the XID of any open parent transaction or > subcommitted subtransaction thereof. > > If I've got my head screwed on straight, test #2 gives you the semantics > that "the previous row update cannot commit unless the action you are > about to take (with the current XID) commits". Test #3 gives you the > semantics "the action you are about to take cannot commit unless the > previous row update does". And test #1 doesn't have much to recommend > it except simplicity; while it might appear to have the semantics "the > previous row update will commit if and only if the action you are about > to take commits", it's actually narrower than that, because the same > could be said for already-subcommitted subtransactions. > > In a cache invalidation context you probably want test #2, but > TransactionIdIsCurrentTransactionId() presently performs test #3. I agree that 1) isn't all that interesting but after that it really gets a bit harder... 2) would be useful to do stuff like avoiding to queue any invalidations if the table is newly truncated, but that seems like a fragile thing to do from userspace. A table rewrite seems to be something fundamentally different from a truncation here because that would need more detail that txid_is_current(xid) would give you. Maybe something like 'pg_relation_is_new()' would be interesting, but that seems like a different thing, and I am not sure the use-case for that is clear enough. From a userlevel perspective 3) seems to be enough if all you want to test whether the relation is new, but again, you couldn't test for that sensibly because there's no access to HEAP_UPDATED. For which cases do you think 2) is interesting wrt. cache invalidations? I think 3) might be more interesting for the (for me) common case where you just want to avoid queuing duplicate invalidations. Because all youre interested in that case is: "Can I be sure that if I commit another invalidation has already been queued for this tuple.". And 3) seems to gives you that. > (We discussed this point in connection with commit 7b90469b, and I've > been intending since then to take a closer look at all the callers of > TransactionIdIsCurrentTransactionId to see if these semantics are in > fact what they need. We might have some bugs associated with confusion > on this.) Its an easy error to make I think, so I wouldn't be too surprised somebody made it before me. On the other hand, it seems to me that to be dangerous you need a part of the system thats not obeying transactional semantics like the indexes which still contain enum oids that aren't in the catalogs anymore. So I hope there aren't too many places where people could have made that mistake. Greetings, Andres Freund --Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 12/07/2012 02:53 AM, Tom Lane wrote: > Vlad Arkhipov <arhipov@dc.baikal.ru> writes: >> In a BEFORE UPDATE trigger I need to know whether the row was previously >> modified by this transaction. Is it safe to use xmin and txid_current() >> for this purpose (xmin is 32-bit txid type but txid_current() returns >> 64-bit bigint). >> IF OLD.xmin = txid_current() THEN > Comparing to txid_current() mod 2^32 would probably work, but note this > will not think that subtransactions or parent transactions are "this > transaction", so any use of savepoints or plpgsql exception blocks is > likely to cause headaches. Why do you think you need to know this? > > regards, tom lane > The use case is quite simple. I'm trying to rewrite our internal system versioning extension (SQL feature T180) in more abstract way. Any temporal versioned table uses its associated history table to store updated and deleted data rows. For this purpose the extension adds AFTER UPDATE/DELETE triggers to the table that insert OLD row in the history table for updated and deleted rows. But if there are multiple changes to a row in the same transaction the trigger should generate a history row only for the first change. On 12/07/2012 06:26 AM, Tom Lane wrote: > It strikes me that the notion of "this row was previously modified by > the current transaction" is squishier than it might look, and we'd do > well to clarify it before we consider exporting anything. I think there > are three ways you might define such a function: > > 1. xmin is exactly equal to current (sub)transaction's XID. > > 2. xmin is this (sub)transaction's XID, or the XID of any subcommitted > subtransaction of it. > > 3. xmin is this (sub)transaction's XID, or the XID of any subcommitted > subtransaction, or the XID of any open parent transaction or > subcommitted subtransaction thereof. If I understand you correctly, what I'm looking for is described by the 3rd case and I may use TransactionIdIsCurrentTransactionId() for this purpose?