Re: Can Postgres Not Do This Safely ?!? - Mailing list pgsql-general
From | Adrian Klaver |
---|---|
Subject | Re: Can Postgres Not Do This Safely ?!? |
Date | |
Msg-id | 4CCB4D7C.7000506@gmail.com Whole thread Raw |
In response to | Re: Can Postgres Not Do This Safely ?!? (Karl Pickett <karl.pickett@gmail.com>) |
List | pgsql-general |
On 10/29/2010 07:32 AM, Karl Pickett wrote: > On Fri, Oct 29, 2010 at 8:58 AM, Adrian Klaver<adrian.klaver@gmail.com> wrote: >> On Thursday 28 October 2010 7:04:48 pm Karl Pickett wrote: >>> Hello Postgres Hackers, >>> >>> We have a simple 'event log' table that is insert only (by multiple >>> concurrent clients). It has an integer primary key. We want to do >>> incremental queries of this table every 5 minutes or so, i.e. "select >>> * from events where id> LAST_ID_I_GOT" to insert into a separate >>> reporting database. The problem is, this simple approach has a race >>> that will forever skip uncommitted events. I.e., if 5000 was >>> committed sooner than 4999, and we get 5000, we will never go back and >>> get 4999 when it finally commits. How can we solve this? Basically >>> it's a phantom row problem but it spans transactions. >>> >>> I looked at checking the internal 'xmin' column but the docs say that >>> is 32 bit, and something like 'txid_current_snapshot' returns a 64 bit >>> value. I don't get it. >> >> http://www.postgresql.org/docs/8.4/interactive/functions-info.html#FUNCTIONS-TXID-SNAPSHOT-PARTS >> "The internal transaction ID type (xid) is 32 bits wide and wraps around every 4 >> billion transactions. However, these functions export a 64-bit format that is >> extended with an "epoch" counter so it will not wrap around during the life of >> an installation. The data type used by these functions, txid_snapshot, stores >> information about transaction ID visibility at a particular moment in time. Its >> components are described in Table 9-53. " >> >> So: >> Current snapshot: >> >> test=> SELECT txid_current_snapshot(); >> txid_current_snapshot >> ----------------------- >> 5098:5098: >> >> xmin of snapshot: >> test=> SELECT txid_snapshot_xmin(txid_current_snapshot()); >> txid_snapshot_xmin >> -------------------- >> 5098 >> (1 row) > > So what happens when txid_snapshot_xmin() goes over 4 billion, and the > table's xmin doesn't? You can't compare a 32 bit value that rolls > over to a 64 bit that doesn't. The long explanation is here: http://www.postgresql.org/docs/9.0/interactive/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND The short version as I understand it is that if everything is working correctly the XID(hence xmin) values exist in a continuous loop where 2 billion are in the past and 2 billion are in the future(assuming default settings). At some point the old values are frozen i.e. replaced with a special FrozenXID. This would mean that the *snapshot functions should only return currently valid xmins. Since I have never rolled over a database I can only speak to theory as I understand it. > >> >> >>> All I want to is make sure I skip over any >>> rows that are newer than the oldest currently running transaction. >>> Has nobody else run into this before? >>> >>> Thank you very much. >>> >>> -- >>> Karl Pickett >> >> >> >> -- >> Adrian Klaver >> adrian.klaver@gmail.com >> > > > -- Adrian Klaver adrian.klaver@gmail.com
pgsql-general by date: