Thread: mysterious nbtree.c comment
In nbtree.c there's a path that calls btvacuumscan to gather statistics if
there aren't already statistics. I'm not exactly clear how this code path is
reached but that's not my question.

There's a comment that "there's no need to go through all the
vacuum-cycle-ID pushups" in this case because no deletions are being
performed.

I don't see how the lack of deletions is relevant to needing vacuum-cycle-ID.
AFAICT there's still a risk that someone will come along and do a page split
underneath this scan and if the page is to the left of the scan it will be
missed.

Datum
btvacuumcleanup(PG_FUNCTION_ARGS)
{
    IndexVacuumInfo *info = (IndexVacuumInfo *) PG_GETARG_POINTER(0);
    IndexBulkDeleteResult *stats = (IndexBulkDeleteResult *) PG_GETARG_POINTER(1);

    /*
     * If btbulkdelete was called, we need not do anything, just return
     * the stats from the latest btbulkdelete call.  If it wasn't called,
     * we must still do a pass over the index, to recycle any newly-recyclable
     * pages and to obtain index statistics.
     *
     * Since we aren't going to actually delete any leaf items, there's no
     * need to go through all the vacuum-cycle-ID pushups.
     */
    if (stats == NULL)
    {
        stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
        btvacuumscan(info, stats, NULL, NULL, 0);
    }

--
greg
On Mon, 2006-07-03 at 16:34 -0400, Greg Stark wrote:
> In nbtree.c there's a path that calls btvacuumscan to gather statistics if
> there aren't already statistics. I'm not exactly clear how this code path is
> reached but that's not my question.

VACUUM calls access/index/index_vacuum_cleanup(), which is part of the
btree index access method API. In the case of a btree index this is a
function pointer to access/nbtree/btvacuumcleanup(). That is important to
your question.

> There's a comment that "there's no need to
> go through all the vacuum-cycle-ID pushups" in this case because no deletions
> are being performed.
>
> I don't see how the lack of deletions is relevant to needing vacuum-cycle-ID.
> AFAICT there's still a risk that someone will come along and do a page split
> underneath this scan and if the page is to the left of the scan it will be
> missed.

Read the comments in btvacuumscan. It is only important to scan all the
pages of an index when deleting leaf items. The btvacuumscan called from
btvacuumcleanup only gets called when stats are NULL. That only happens
when the VACUUM returns no rows for cleanup, so the scan need only perform
one of its three functions: remove pages already marked as deleted that can
now be recycled into the FSM.

You're right: it will be missed, but it's not crucial to the scan when
called in that way, since we'll pick it up next time around.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
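For readers following the call path described above, here is a minimal sketch of the dispatch in plain C. It is a toy model, not PostgreSQL source: ToyStats, ToyIndexAM, toy_cleanup_scan, toy_btvacuumcleanup and toy_index_vacuum_cleanup are all hypothetical stand-ins, and the page count is passed as a bare int purely for illustration. It only shows the shape of the logic: a generic entry point calls through a function pointer, and the btree routine scans only when it is handed NULL stats.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for IndexBulkDeleteResult; only the fields this sketch needs. */
typedef struct ToyStats
{
    int pages_scanned;
    int tuples_removed;
} ToyStats;

/* Toy stand-in for an access method's table of entry points. */
typedef struct ToyIndexAM
{
    ToyStats *(*vacuumcleanup)(int npages, ToyStats *stats);
} ToyIndexAM;

/* Cleanup-only scan: visits every page and deletes nothing. */
static void
toy_cleanup_scan(int npages, ToyStats *stats)
{
    stats->pages_scanned = npages;
}

/* Same shape as the quoted btvacuumcleanup: scan only if stats == NULL. */
static ToyStats *
toy_btvacuumcleanup(int npages, ToyStats *stats)
{
    if (stats == NULL)
    {
        stats = calloc(1, sizeof(ToyStats));
        assert(stats != NULL);
        toy_cleanup_scan(npages, stats);
    }
    return stats;
}

static const ToyIndexAM toy_btree_am = { toy_btvacuumcleanup };

/* Generic entry point: dispatches through the function pointer. */
static ToyStats *
toy_index_vacuum_cleanup(const ToyIndexAM *am, int npages, ToyStats *stats)
{
    return am->vacuumcleanup(npages, stats);
}
```

Calling toy_index_vacuum_cleanup(&toy_btree_am, 8, NULL) allocates fresh stats and runs the cleanup-only scan; passing stats left over from an earlier bulk delete returns them untouched.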
Greg Stark <gsstark@mit.edu> writes:
> I don't see how the lack of deletions is relevant to needing vacuum-cycle-ID.
> AFAICT there's still a risk that someone will come along and do a page split
> underneath this scan and if the page is to the left of the scan it will be
> missed.

Well, if there are active insertions or deletions happening in parallel with
the scan, the tuple count is going to be at best approximate anyway, no?
So there's no need to be tense about ensuring we visit every single index
tuple. We do want to hit all the pages so we can clean up any recyclable
pages, but that's not a problem.

			regards, tom lane
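The scenario under discussion can be illustrated with a small toy model in C. Everything here is an assumption made for illustration: ToyIndex, toy_init, toy_split and toy_scan are hypothetical names, the "index" is just an array of per-block tuple counts, and the cycle-ID handling is a heavy simplification of what btvacuumscan does. The sketch scans blocks in physical order while a split moves half of a not-yet-visited page onto an empty block the scan has already passed. With cycle ID 0 the moved tuples go uncounted; with a nonzero cycle ID the scan notices the stamp and chases the link backwards.

```c
#include <assert.h>

#define NPAGES 6

typedef struct ToyIndex
{
    int tuples[NPAGES];      /* tuples on each physical block */
    int split_cycle[NPAGES]; /* cycle ID stamped by a split; 0 = none */
    int right_link[NPAGES];  /* block that received moved tuples; -1 = none */
} ToyIndex;

/* Block 1 starts empty, standing in for a recycled page. */
static void
toy_init(ToyIndex *idx)
{
    for (int blk = 0; blk < NPAGES; blk++)
    {
        idx->tuples[blk] = (blk == 1) ? 0 : 10;
        idx->split_cycle[blk] = 0;
        idx->right_link[blk] = -1;
    }
}

/* Split block 'from', moving half its tuples onto the empty block 'to'. */
static void
toy_split(ToyIndex *idx, int from, int to, int cycle_id)
{
    int moved = idx->tuples[from] / 2;

    assert(idx->tuples[to] == 0); /* toy restriction: target must be empty */
    idx->tuples[from] -= moved;
    idx->tuples[to] += moved;
    idx->split_cycle[from] = cycle_id;
    idx->right_link[from] = to;
}

/*
 * Count tuples in physical block order.  Just before visiting block
 * 'split_when', a concurrent split of 'from' into 'to' takes place.
 */
static int
toy_scan(ToyIndex *idx, int cycle_id, int split_when, int from, int to)
{
    int counted = 0;

    for (int blk = 0; blk < NPAGES; blk++)
    {
        if (blk == split_when)
            toy_split(idx, from, to, cycle_id);

        counted += idx->tuples[blk];

        /* Only with a nonzero cycle ID do we revisit a block behind us. */
        if (cycle_id != 0 &&
            idx->split_cycle[blk] == cycle_id &&
            idx->right_link[blk] >= 0 &&
            idx->right_link[blk] < blk)
            counted += idx->tuples[idx->right_link[blk]];
    }
    return counted;
}
```

In this model the index holds 50 tuples throughout. A scan with cycle ID 0 and a split of block 4 into block 1 just before block 3 is visited counts 45, while the same scan with a nonzero cycle ID counts all 50. Both scans still touch every physical block, which is all the recycling pass needs; only the tuple count is approximate in the first case.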