Re: Thoughts on maintaining 7.3 - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: Thoughts on maintaining 7.3 |
Date | |
Msg-id | 27612.1065326029@sss.pgh.pa.us Whole thread Raw |
In response to | Re: Thoughts on maintaining 7.3 (Alvaro Herrera <alvherre@dcc.uchile.cl>) |
Responses |
Re: Thoughts on maintaining 7.3
|
List | pgsql-hackers |
Alvaro Herrera <alvherre@dcc.uchile.cl> writes: > On Sat, Oct 04, 2003 at 11:41:17AM -0400, Tom Lane wrote: >> No. You'd be better off using REINDEX for that, I think. IIRC we have >> speculated about making VAC FULL fix the indexes via REINDEX rather than >> indexbulkdelete. > I can't agree with that idea. Why not? There is plenty of anecdotal evidence in the archives saying that it's faster to drop indexes, VACUUM FULL, recreate indexes than to VACUUM FULL with indexes in place. Most of those reports date from before we had the lazy-vacuum alternative, but I don't think that renders them less relevant. > Imagine having to VACUUM FULL a huge > table. Not only it will take the lot required to do the VACUUM in the > heap itself, it will also have to rebuild all indexes from scratch. A very large chunk of VACUUM FULL's runtime is spent fooling with the indexes. Have you looked at the code in any detail? It goes like this: 1. Scan heap looking for dead tuples and free space. 2. Make a pass over the indexes to delete index entries for dead tuples. 3. Copy remaining live tuples to lower-numbered pages to compact heap. 3a. Every time we copy a tuple, make new index entries pointing to its new location. (The old index entries still remain,though.) 4. Commit transaction so that new copies of moved tuples are good and old ones are not. 5. Make a pass over the indexes to delete index entries for old copies of moved tuples. When there are only a few tuples being moved, this isn't too bad of a strategy. But when there are lots, steps 2, 3a, and 5 represent a huge amount of work. What's worse, step 3a swells the index well beyond its final size. This used to mean permanent index bloat. Nowadays step 5 will be able to recover some of that space --- but not at zero cost. I think it's entirely plausible that dropping steps 2, 3a, and 5 in favor of an index rebuild at the end could be a winner. > I think there are scenarios where the REINDEX will be much worse, say when > there are not too many deleted tuples (but in that case, why is the user > doing VACUUM FULL in the first place?). Yeah, I think that's exactly the important point. These days there's not a lot of reason to do VACUUM FULL unless you have a major amount of restructuring to do. I would once have favored maintaining two code paths with two strategies, but now I doubt it's worth the trouble. (Or I should say, we have two code paths, the other being lazy VACUUM --- do we need three?) regards, tom lane
pgsql-hackers by date: