64-bit XIDs again - Mailing list pgsql-hackers

From Alexander Korotkov
Subject 64-bit XIDs again
Msg-id CAPpHfduQ7KuCHvg3dHx+9Pwp_rNf705bjdRCrR_Cqiv_co4H9A@mail.gmail.com
Responses Re: 64-bit XIDs again  (Simon Riggs <simon@2ndQuadrant.com>)
Re: 64-bit XIDs again  (Heikki Linnakangas <hlinnaka@iki.fi>)
Re: 64-bit XIDs again  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
Hackers,

I know there have already been a couple of threads about 64-bit XIDs.
I read them carefully, but I didn't find all of the arguments for 64-bit XIDs mentioned there. That's why I'd like to raise this subject again.

Hardware capabilities are now much higher than when Postgres was designed. In modern PostgreSQL scalability tests it's typical to achieve 400,000 - 500,000 tps with pgbench. At such rates it takes only a few minutes to reach the default autovacuum_freeze_max_age of 200 million.
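
To make the arithmetic concrete, here is a back-of-the-envelope sketch. It assumes every transaction consumes one XID (true for read-write pgbench, not for read-only runs) and that the rate stays constant. The helper name and the 64-bit horizon are only illustrative assumptions, not anything that exists in PostgreSQL.

```python
# Rough estimate of how fast XIDs are consumed at a constant transaction rate.
# Assumption: one XID per transaction (read-write workload).

FREEZE_MAX_AGE = 200_000_000   # default autovacuum_freeze_max_age
XID_HORIZON_32 = 2**31         # usable distance in the circular 32-bit XID space
XID_HORIZON_64 = 2**63         # same distance if XIDs were 64 bits (hypothetical)


def seconds_to_consume(xids: int, tps: int) -> float:
    """Seconds needed to burn through `xids` transaction IDs at `tps`."""
    return xids / tps


if __name__ == "__main__":
    for tps in (400_000, 500_000):
        freeze_min = seconds_to_consume(FREEZE_MAX_AGE, tps) / 60
        wrap_min = seconds_to_consume(XID_HORIZON_32, tps) / 60
        wrap64_years = seconds_to_consume(XID_HORIZON_64, tps) / (365.25 * 86400)
        print(f"{tps} tps: freeze_max_age in {freeze_min:.1f} min, "
              f"32-bit horizon in {wrap_min:.1f} min, "
              f"64-bit horizon in {wrap64_years:.0f} years")
```

At 500,000 tps the default freeze age is reached after 400 seconds, and the whole 32-bit horizon is consumed in a bit more than an hour.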

The notion of wraparound has evolved over time. Initially it was something that almost never happened. Then it became something that could happen rarely, and that we should be ready for (by freezing tuples in advance). Now it has become a quite frequent periodic event for a high-load database, and DB admins have to take its performance impact into account.

A typical scenario that I've faced in real life goes like this. The database is divided into operative and archive parts. The operative part is small (dozens of gigabytes) and serves most of the transactions. The archive part is relatively large (some terabytes) and serves rare selects and bulk inserts. Autovacuum works very actively on the operative part and very lazily on the archive part (as expected). The system works well until one day the age of the archive tables exceeds autovacuum_freeze_max_age. Then all autovacuum workers start to do "autovacuum to prevent wraparound" on the archive tables. Even if the system's IO survives this, the operative tables get bloated because all autovacuum workers are busy with the archive tables. In such a situation I typically advise increasing autovacuum_freeze_max_age and running VACUUM FREEZE manually when the system has enough free resources.

As I mentioned in the CSN thread, it would be nice to replace the XID with the CSN when setting hint bits for a tuple. In that case, once the hint bits are set, we don't need any additional lookups to check visibility.
http://www.postgresql.org/message-id/CAPpHfdv7BMwGv=OfUg3S-jGVFKqHi79pR_ZK1Wsk-13oZ+cy5g@mail.gmail.com
Introducing a 32-bit CSN doesn't seem reasonable to me, because it would double our troubles with wraparound.

Also, I think it's possible to migrate to 64-bit XIDs without breaking pg_upgrade. Old tuples can be left with 32-bit XIDs, while new tuples would be created with 64-bit XIDs. We can use free bits in t_infomask2 to distinguish the old and new formats.
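
A toy sketch of the idea, only to illustrate dispatching on a flag bit. HEAP_NATTS_MASK (0x07FF) is the existing attribute-count mask in t_infomask2, and as far as I know bits 0x1800 are currently unused. The flag name HEAP_XID_64BIT, the choice of bit 0x0800, and the byte layout of the XID field are all assumptions made up for this example, not a proposed on-disk format.

```python
import struct

HEAP_NATTS_MASK = 0x07FF   # existing: low 11 bits hold the number of attributes
HEAP_XID_64BIT = 0x0800    # HYPOTHETICAL flag taken from the unused 0x1800 bits


def xid_width(infomask2: int) -> int:
    """Return the XID width in bits implied by the (hypothetical) format flag."""
    return 64 if infomask2 & HEAP_XID_64BIT else 32


def read_xmin(infomask2: int, raw: bytes) -> int:
    """Decode xmin from `raw`, assuming it sits at offset 0, little-endian.

    Old-format tuples carry a 4-byte XID, new-format tuples an 8-byte one.
    """
    fmt = "<Q" if xid_width(infomask2) == 64 else "<I"
    return struct.unpack_from(fmt, raw)[0]


if __name__ == "__main__":
    old_tuple = struct.pack("<I", 1234)       # tuple written before the upgrade
    new_tuple = struct.pack("<Q", 2**40 + 7)  # tuple written after the upgrade
    print(read_xmin(0x0003, old_tuple))
    print(read_xmin(0x0003 | HEAP_XID_64BIT, new_tuple))
```

The point is that a reader can decide per tuple which format it is looking at, so old pages don't have to be rewritten at upgrade time.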

Any thoughts? Do you think 64-bit XIDs are worth it?

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
