Re: Can Postgres Not Do This Safely ?!? - Mailing list pgsql-general

From Adrian Klaver
Subject Re: Can Postgres Not Do This Safely ?!?
Date
Msg-id 4CCB4D7C.7000506@gmail.com
Whole thread Raw
In response to Re: Can Postgres Not Do This Safely ?!?  (Karl Pickett <karl.pickett@gmail.com>)
List pgsql-general
On 10/29/2010 07:32 AM, Karl Pickett wrote:
> On Fri, Oct 29, 2010 at 8:58 AM, Adrian Klaver<adrian.klaver@gmail.com>  wrote:
>> On Thursday 28 October 2010 7:04:48 pm Karl Pickett wrote:
>>> Hello Postgres Hackers,
>>>
>>> We have a simple 'event log' table that is insert only (by multiple
>>> concurrent clients).  It has an integer primary key.  We want to do
>>> incremental queries of this table every 5 minutes or so, i.e. "select
>>> * from events where id>  LAST_ID_I_GOT" to insert into a separate
>>> reporting database.  The problem is, this simple approach has a race
>>> that will forever skip uncommitted events.  I.e., if 5000 was
>>> committed sooner than 4999, and we get 5000, we will never go back and
>>> get 4999 when it finally commits.  How can we solve this?  Basically
>>> it's a phantom row problem but it spans transactions.
>>>
>>> I looked at checking the internal 'xmin' column but the docs say that
>>> is 32 bit, and something like 'txid_current_snapshot' returns a 64 bit
>>> value.  I don't get it.
>>
>> http://www.postgresql.org/docs/8.4/interactive/functions-info.html#FUNCTIONS-TXID-SNAPSHOT-PARTS
>> "The internal transaction ID type (xid) is 32 bits wide and wraps around every 4
>> billion transactions. However, these functions export a 64-bit format that is
>> extended with an "epoch" counter so it will not wrap around during the life of
>> an installation. The data type used by these functions, txid_snapshot, stores
>> information about transaction ID visibility at a particular moment in time. Its
>> components are described in Table 9-53. "
>>
>> So:
>> Current snapshot:
>>
>> test=>  SELECT txid_current_snapshot();
>>   txid_current_snapshot
>> -----------------------
>>   5098:5098:
>>
>> xmin of snapshot:
>> test=>  SELECT txid_snapshot_xmin(txid_current_snapshot());
>>   txid_snapshot_xmin
>> --------------------
>>                5098
>> (1 row)
>
> So what happens when txid_snapshot_xmin() goes over 4 billion, and the
> table's xmin doesn't?  You can't compare a 32 bit value that rolls
> over to a 64 bit that doesn't.

The long explanation is here:
http://www.postgresql.org/docs/9.0/interactive/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND

The short version as I understand it is that if everything is working
correctly the XID(hence xmin) values exist in a continuous loop where 2
billion are in the past and 2 billion are in the future(assuming default
settings). At some point the old values are frozen i.e. replaced with a
special FrozenXID. This would mean that the *snapshot functions should
only return currently valid xmins. Since I have never rolled over a
database I can only speak to theory as I understand it.


>
>>
>>
>>> All I want to is make sure I skip over any
>>> rows that are newer than the oldest currently running transaction.
>>> Has nobody else run into this before?
>>>
>>> Thank you very much.
>>>
>>> --
>>> Karl Pickett
>>
>>
>>
>> --
>> Adrian Klaver
>> adrian.klaver@gmail.com
>>
>
>
>


--
Adrian Klaver
adrian.klaver@gmail.com

pgsql-general by date:

Previous
From: Mike Christensen
Date:
Subject: pgAdmin on Mac connecting to Postgres 9 on Linux - SSL/timeout issue
Next
From: Richard Broersma
Date:
Subject: Call For Talks: PGDay LA @ SCALE 9X