'TID index' - Mailing list pgsql-hackers

From Jim C. Nasby
Subject 'TID index'
Msg-id 20040915200221.GQ56059@decibel.org
Responses Re: 'TID index'  ("Simon Riggs" <simon@2ndquadrant.com>)
List pgsql-hackers
I just had a thought that could potentially greatly improve vacuum
performance. What about some kind of TID (or does vacuum use CID?)
index? This would allow vacuum to visit only the pages it needs to
visit. Actually, I guess TID/CID wouldn't even be involved; the only
information needed would be if any tuples on a page have been marked
deleted. Something as simple as a bitmap could work. Storing TID info
might provide added vacuum efficiency, but my guess is it's probably not
worth the extra effort.
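
To make the bitmap idea concrete, here is a minimal sketch, not PostgreSQL code. It assumes one bit per heap page, set when the page may hold tuples marked deleted; the names DeadPageMap, mark, clear, marked_pages and vacuum are all made up for illustration. Vacuum then walks only the set bits instead of every page.

```python
class DeadPageMap:
    """One bit per heap page: set when the page may contain deleted tuples.

    Hypothetical structure for illustration only; not part of PostgreSQL.
    """

    def __init__(self, n_pages):
        self.n_pages = n_pages
        self.bits = bytearray((n_pages + 7) // 8)

    def _check(self, page_no):
        if not 0 <= page_no < self.n_pages:
            raise IndexError("page number out of range")

    def mark(self, page_no):
        """Record that page_no has at least one tuple marked deleted."""
        self._check(page_no)
        self.bits[page_no >> 3] |= 1 << (page_no & 7)

    def clear(self, page_no):
        """Record that page_no has been cleaned."""
        self._check(page_no)
        self.bits[page_no >> 3] &= ~(1 << (page_no & 7)) & 0xFF

    def marked_pages(self):
        """Yield marked page numbers in ascending order."""
        for byte_no, byte in enumerate(self.bits):
            if byte == 0:
                continue  # eight clean pages skipped with one comparison
            for bit in range(8):
                if byte & (1 << bit):
                    yield byte_no * 8 + bit


def vacuum(dead_map, clean_page):
    """Visit only the marked pages; return the page numbers visited."""
    visited = []
    # Snapshot the marked pages first, since cleaning clears bits.
    for page_no in list(dead_map.marked_pages()):
        clean_page(page_no)
        dead_map.clear(page_no)
        visited.append(page_no)
    return visited
```

With three pages marked out of 1000, vacuum() in this sketch calls clean_page three times rather than 1000. The map itself costs one bit per page, or 125 bytes for the 1000-page table.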

This might not help much for tables that just see a lot of random update
activity, but I think it would be very useful for large tables where
pages with dead tuples are likely to be a small percentage of the total
number of pages.

Maintaining this information on a per-transaction basis might prove
difficult to do without causing concurrency issues. Luckily, I think
this could probably be done in the background without much difficulty.
One possibility is to check for dead tuples as pages are written to disk
(actually, I would think that by definition there would have to be dead
tuples at that point). If memory serves, writing these pages is now a
background process, so this shouldn't cause contention issues.
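
The write-time check could be sketched roughly as follows. This is a toy model under stated assumptions, not how the PostgreSQL background writer is implemented. HeapTuple, Page, has_deleted_tuples and background_write are hypothetical names, and a tuple counts as "marked deleted" here simply when its xmax field is set.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Optional, Set


@dataclass
class HeapTuple:
    data: str
    # ID of the deleting transaction, or None if nothing has deleted the tuple.
    xmax: Optional[int] = None


@dataclass
class Page:
    page_no: int
    tuples: List[HeapTuple] = field(default_factory=list)
    dirty: bool = False


def has_deleted_tuples(page: Page) -> bool:
    """True if any tuple on the page has been marked deleted."""
    return any(t.xmax is not None for t in page.tuples)


def background_write(
    buffer_pool: Iterable[Page],
    pages_needing_vacuum: Set[int],
    write_page: Callable[[Page], None],
) -> None:
    """Flush dirty pages, noting which of them vacuum should visit later.

    The check runs in the writer, off the path of the transactions that
    dirtied the pages, so they never touch the shared set themselves.
    """
    for page in buffer_pool:
        if not page.dirty:
            continue
        if has_deleted_tuples(page):
            pages_needing_vacuum.add(page.page_no)
        write_page(page)
        page.dirty = False
```

The sketch checks each page rather than marking every page it writes, because a page dirtied only by inserts would carry no deleted tuples.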
-- 
Jim C. Nasby, Database Consultant               decibel@decibel.org 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

