Re: Automatic free space map filling - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: Automatic free space map filling
Msg-id 200603021607.k22G7PW03723@candle.pha.pa.us
In response to Re: Automatic free space map filling  (Christopher Browne <cbbrowne@acm.org>)
Responses Re: Automatic free space map filling  (Csaba Nagy <nagy@ecircle-ag.com>)
List pgsql-hackers
Christopher Browne wrote:
> What is unclear to me in the discussion is whether or not this is
> invalidating the item on the TODO list...
> 
> -------------------
> Create a bitmap of pages that need vacuuming
> 
> Instead of sequentially scanning the entire table, have the background
> writer or some other process record pages that have expired rows, then
> VACUUM can look at just those pages rather than the entire table. In
> the event of a system crash, the bitmap would probably be
> invalidated. One complexity is that index entries still have to be
> vacuumed, and doing this without an index scan (by using the heap
> values to find the index entry) might be slow and unreliable,
> especially for user-defined index functions.
> -------------------
> 
> It strikes me as a non-starter to draw vacuum work directly into the
> foreground; there is a *clear* loss in that the death of the tuple
> can't actually take place at that point, due to MVCC and the fact that
> it is likely that other transactions will be present, keeping the
> tuple from being destroyed.
> 
> But it would *seem* attractive to do what is in the TODO, above.
> Alas, the user defined index functions make cleanout of indexes much
> more troublesome :-(.  But what's in the TODO is still "wholesale,"
> albeit involving more targetted selling than the usual Kirby VACUUM
> :-).

What bothers me about the TODO item is that if we have to sequentially
scan indexes, are we really gaining much by not having to sequentially
scan the heap?  If the heap is large enough to gain from a bitmap, the
index is going to be large too.  Is disabling per-index cleanout for
expression indexes the answer?

The entire expression index problem is outlined in this thread:
http://archives.postgresql.org/pgsql-hackers/2006-02/msg01127.php

I don't think it is a show-stopper because if we fail to find the index
entry that matches the heap tuple, we know we have a problem and can
report it and fall back to an index scan.
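
To make the fallback idea concrete, here is a minimal sketch in Python, not PostgreSQL code. The index is modeled as a dict from key to a set of tuple ids, and every name in it (remove_index_entries, key_func, dead_tuples) is hypothetical. It recomputes each key from the heap value, removes the entry if it finds it, and otherwise collects the misses so the caller can report them before a full index scan cleans them up.

```python
def remove_index_entries(index, key_func, dead_tuples):
    """Remove index entries for dead heap tuples without a full index scan.

    index:       dict mapping index key -> set of tuple ids (toy model)
    key_func:    the index expression, applied to the heap value
    dead_tuples: dict mapping tuple id -> heap value
    Returns the tuple ids whose entries could not be found by key lookup.
    """
    missed = []
    for tid, heap_value in dead_tuples.items():
        key = key_func(heap_value)           # recompute the key from the heap
        tids = index.get(key)
        if tids is not None and tid in tids:
            tids.discard(tid)                # targeted removal
            if not tids:
                del index[key]
        else:
            missed.append(tid)               # expression gave a different key

    if missed:
        # Fallback: scan the whole index for the entries we failed to find.
        wanted = set(missed)
        for key in list(index):
            index[key] -= wanted
            if not index[key]:
                del index[key]
    return missed
```

With a well-behaved expression such as str.lower the miss list comes back empty and no scan happens. With an expression that returns a different key than it did at insert time, every dead tuple lands in the miss list and the full scan still removes its entry.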

Anyway, as I remember, if you have a 20-gig table, a vacuum / sequential
scan is painful, but if we have to sequentially scan all the indexes, that
is probably just as painful.  If we can't make headway there and we
can't clean out indexes without a sequential index scan, I think we
should just remove the TODO item and give up on improving vacuum
performance.

For the bitmaps, index-only scans require a bit that says "all page
tuples are visible" while vacuum wants "some tuples are expired".
DELETE would clear both bits, while INSERT would clear just the first,
and UPDATE is a mix of INSERT and DELETE, though perhaps on different
pages.
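
The two per-page bits can be sketched as follows. This is a toy model in Python, not PostgreSQL code, and all names (PageFlags, PageBitmap, on_vacuum) are hypothetical. One assumption to flag loudly: the sketch reads DELETE as touching both bits, dropping "all visible" and marking the page as having expired tuples so vacuum will visit it. It also assumes vacuum leaves a page fully visible, which ignores concurrent transactions.

```python
from dataclasses import dataclass


@dataclass
class PageFlags:
    all_visible: bool = False   # index-only scans could skip the heap visit
    has_expired: bool = False   # vacuum needs to visit this page


class PageBitmap:
    def __init__(self, n_pages):
        self.pages = [PageFlags() for _ in range(n_pages)]

    def on_insert(self, page):
        # A new tuple is not yet visible to every transaction.
        self.pages[page].all_visible = False

    def on_delete(self, page):
        # Assumption: DELETE affects both bits (see the note above).
        self.pages[page].all_visible = False
        self.pages[page].has_expired = True

    def on_update(self, old_page, new_page):
        # UPDATE = DELETE of the old version + INSERT of the new one,
        # possibly on different pages.
        self.on_delete(old_page)
        self.on_insert(new_page)

    def on_vacuum(self, page):
        # Assumption: no concurrent transactions, so the page ends up clean.
        self.pages[page].has_expired = False
        self.pages[page].all_visible = True

    def pages_to_vacuum(self):
        return [i for i, p in enumerate(self.pages) if p.has_expired]
```

Vacuum would then visit only the pages returned by pages_to_vacuum() instead of scanning the whole heap.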

--  Bruce Momjian   http://candle.pha.pa.us SRA OSS, Inc.   http://www.sraoss.com
 + If your life is a hard drive, Christ can be your backup. +

