pgsql: Fix assorted bugs in contrib/bloom. - Mailing list pgsql-committers

From Tom Lane
Subject pgsql: Fix assorted bugs in contrib/bloom.
Date
Msg-id E1bYl6r-0001Tg-DK@gemulon.postgresql.org
List pgsql-committers
Fix assorted bugs in contrib/bloom.

In blinsert(), cope with the possibility that a page we pull from the
notFullPage list is marked BLOOM_DELETED.  This could happen if VACUUM
recently marked it deleted but hasn't (yet) updated the metapage.
We can re-use such a page safely, but we *must* reinitialize it so that
it's no longer marked deleted.
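
As a rough illustration of the reuse rule just described, here is a minimal
self-contained sketch. It does not use the real bloom code: SketchPage,
SKETCH_DELETED, sketch_init_page and sketch_insert are hypothetical stand-ins
for the page structure, the BLOOM_DELETED flag, and the insert path, and the
page capacity is an arbitrary assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins; none of these names exist in contrib/bloom. */
#define SKETCH_DELETED       0x0001	/* models the BLOOM_DELETED flag */
#define SKETCH_PAGE_CAPACITY 4		/* arbitrary capacity for the sketch */

typedef struct SketchPage
{
	unsigned	flags;
	int			ntuples;
	int			tuples[SKETCH_PAGE_CAPACITY];
} SketchPage;

/* Reset a page to the empty, not-deleted state. */
static void
sketch_init_page(SketchPage *page)
{
	memset(page, 0, sizeof(*page));
}

/*
 * Try to add a tuple to a page taken from the not-full list.
 * Returns true on success, false if the page has no room.
 */
static bool
sketch_insert(SketchPage *page, int tuple)
{
	/* A page that was marked deleted may be reused, but only after reinit. */
	if (page->flags & SKETCH_DELETED)
		sketch_init_page(page);

	if (page->ntuples >= SKETCH_PAGE_CAPACITY)
		return false;

	/* From here on the page must not look deleted to other code. */
	assert((page->flags & SKETCH_DELETED) == 0);
	page->tuples[page->ntuples++] = tuple;
	return true;
}
```

Without the reinitialization step, the tuple would land on a page that scans
and later vacuums still regard as deleted.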

Fix blvacuum() so that it updates the notFullPage list even if it's
going to update it to empty.  The previous "optimization" of skipping
the update seems pretty dubious, since it means that the next blinsert()
will uselessly visit whatever pages we left in the list.
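
The effect of skipping the update can be sketched as follows. This is a toy
model only: SketchMeta and sketch_update_meta are hypothetical names, the list
size is arbitrary, and the skip_if_empty switch exists purely to contrast the
old and new behavior.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_NOTFULL 8	/* arbitrary list size for the sketch */

/* Hypothetical model of the metapage's not-full-page list. */
typedef struct SketchMeta
{
	int			nEntries;
	unsigned	notFullPage[SKETCH_MAX_NOTFULL];
} SketchMeta;

/*
 * Store the not-full pages found by a vacuum pass in the metapage.
 * With skip_if_empty set (modeling the old "optimization"), an empty
 * result leaves the previous, now stale, list in place.
 */
static void
sketch_update_meta(SketchMeta *meta, const unsigned *found, int nfound,
				   bool skip_if_empty)
{
	assert(nfound >= 0 && nfound <= SKETCH_MAX_NOTFULL);

	if (skip_if_empty && nfound == 0)
		return;

	for (int i = 0; i < nfound; i++)
		meta->notFullPage[i] = found[i];
	meta->nEntries = nfound;
}
```

In the skipping variant, the next insert still walks the stale entries; in the
other, it sees an empty list right away.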

Uniformly treat PageIsNew pages the same as deleted pages.  This should
allow proper recovery if a crash occurs just after relation extension.

Properly use vacuum_delay_point, not assorted ad-hoc CHECK_FOR_INTERRUPTS
calls, in the blvacuum() main loop.

Fix broken tuple-counting logic: blvacuum.c counted the number of live
index tuples over again in each scan, leading to VACUUM VERBOSE reporting
some multiple of the actual number of surviving index tuples after any
vacuum that removed any tuples (since they'd be counted in blvacuum, maybe
more than once, and then again in blvacuumcleanup, without ever zeroing the
counter).  It's sufficient to count them in blvacuumcleanup.
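
The arithmetic of the over-count can be shown with a small model. The names
below (SketchStats, sketch_bulkdelete_buggy, sketch_cleanup_buggy,
sketch_cleanup_fixed) are hypothetical and only mimic the shape of the
problem; they are not the functions in blvacuum.c.

```c
#include <assert.h>

#define SKETCH_NPAGES 3			/* arbitrary index size for the sketch */

/* Hypothetical model of the statistics struct handed back to VACUUM. */
typedef struct SketchStats
{
	double		num_index_tuples;
} SketchStats;

/* Sum the tuples currently live on every page. */
static double
sketch_count_live(const int live[SKETCH_NPAGES])
{
	double		total = 0;

	for (int i = 0; i < SKETCH_NPAGES; i++)
	{
		assert(live[i] >= 0);
		total += live[i];
	}
	return total;
}

/* Buggy: every bulk-delete scan adds the live count to the running total. */
static void
sketch_bulkdelete_buggy(SketchStats *stats, const int live[SKETCH_NPAGES])
{
	stats->num_index_tuples += sketch_count_live(live);
}

/* Buggy: cleanup adds the live count once more, never zeroing the total. */
static void
sketch_cleanup_buggy(SketchStats *stats, const int live[SKETCH_NPAGES])
{
	stats->num_index_tuples += sketch_count_live(live);
}

/* Fixed: count only here, starting from zero. */
static void
sketch_cleanup_fixed(SketchStats *stats, const int live[SKETCH_NPAGES])
{
	stats->num_index_tuples = 0;
	stats->num_index_tuples += sketch_count_live(live);
}
```

With 14 live tuples and two bulk-delete scans followed by cleanup, the buggy
path reports 42 while the fixed path reports 14.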

stats->estimated_count is a boolean, not a counter, and we don't want
to set it true, so don't add tuple counts to it.

Add a couple of Asserts that we don't overrun available space on a bloom
page.  I don't think there's any bug there today, but the way the
FreeBlockNumberArray size calculation is set up is scarily fragile, and
BloomPageGetFreeSpace isn't much better.  The Asserts should help catch
any future mistakes.
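
The kind of check meant here can be sketched like this. The sizes and the
names sketch_free_space and sketch_add_tuple are assumptions made up for the
example; the real macros and layout are in the bloom sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Arbitrary sizes for the sketch; not the real bloom page layout. */
#define SKETCH_PAGE_SIZE	64
#define SKETCH_HEADER_SIZE	16
#define SKETCH_TUPLE_SIZE	12

/* Free bytes left on a page holding ntuples fixed-size tuples. */
static size_t
sketch_free_space(size_t ntuples)
{
	size_t		used = SKETCH_HEADER_SIZE + ntuples * SKETCH_TUPLE_SIZE;

	/* Catch a bad size calculation before the subtraction wraps around. */
	assert(used <= SKETCH_PAGE_SIZE);
	return SKETCH_PAGE_SIZE - used;
}

/* Add one tuple if it fits; returns false when the page is full. */
static bool
sketch_add_tuple(size_t *ntuples)
{
	if (sketch_free_space(*ntuples) < SKETCH_TUPLE_SIZE)
		return false;

	(*ntuples)++;

	/* Re-check after the insert: we must still be within the page. */
	assert(SKETCH_HEADER_SIZE + *ntuples * SKETCH_TUPLE_SIZE
		   <= SKETCH_PAGE_SIZE);
	return true;
}
```

The assertions cost nothing in production builds but turn a silent overrun
into an immediate failure in assert-enabled builds.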

Per investigation of a report from Jeff Janes.  I think the first item
above may explain his report; the other changes were things I noticed
while casting about for an explanation.

Report: <CAMkU=1xEUuBphDwDmB1WjN4+td4kpnEniFaTBxnk1xzHCw8_OQ@mail.gmail.com>

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/d6c9e05cb7db64239887fac65b243229594f331d

Modified Files
--------------
contrib/bloom/blinsert.c | 11 +++++++
contrib/bloom/blscan.c   |  2 +-
contrib/bloom/blutils.c  | 13 +++++++--
contrib/bloom/blvacuum.c | 75 ++++++++++++++++++++++++------------------------
4 files changed, 60 insertions(+), 41 deletions(-)

