pgsql: Improve FSM management for BRIN indexes. - Mailing list pgsql-committers

From Tom Lane
Subject pgsql: Improve FSM management for BRIN indexes.
Date
Msg-id E1f3n6q-0007e6-PS@gemulon.postgresql.org
Whole thread Raw
List pgsql-committers
Improve FSM management for BRIN indexes.

BRIN indexes like to propagate additions of free space into the upper pages
of their free space maps as soon as the new space is known, even when it's
just on one individual index page.  Previously this required calling
FreeSpaceMapVacuum, which is quite an expensive thing if the map is large.
Use the FreeSpaceMapVacuumRange function recently added by commit c79f6df75
to reduce the amount of work done for this purpose.

Fix a couple of places that neglected to do the upper-page vacuuming at all
after recording new free space.  If the policy is to be that BRIN should do
that, it should do it everywhere.

Do RecordPageWithFreeSpace unconditionally in brin_page_cleanup, and do
FreeSpaceMapVacuum unconditionally in brin_vacuum_scan.  Because of the
FSM's imprecise storage of free space, the old complications here seldom
bought anything, they just slowed things down.  This approach also
provides a predictable path for FSM corruption to be repaired.

Remove premature RecordPageWithFreeSpace call in brin_getinsertbuffer
where it's about to return an extended page to the caller.  The caller
should do that, instead, after it's inserted its new tuple.  Fix the
one caller that forgot to do so.

Simplify logic in brin_doupdate's same-page-update case by postponing
brin_initialize_empty_new_buffer to after the critical section; I see
little point in doing it before.

Avoid repeat calls of RelationGetNumberOfBlocks in brin_vacuum_scan.
Avoid duplicate BufferGetBlockNumber and BufferGetPage calls in
a couple of places where we already had the right values.

Move a BRIN_elog debug logging call out of a critical section; that's
pretty unsafe and I don't think it buys us anything to not wait till
after the critical section.

Move the "*extended = false" step in brin_getinsertbuffer into the
routine's main loop.  There's no actual bug there, since the loop can't
iterate with *extended still true, but it doesn't seem very future-proof
as coded; and it's certainly not documented as a loop invariant.

This is all from follow-on investigation inspired by commit c79f6df75.

Discussion: https://postgr.es/m/5801.1522429460@sss.pgh.pa.us

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/1383e2a1a937116e1367c9584984f0730f9ef4d5

Modified Files
--------------
src/backend/access/brin/brin.c         |  29 +++---
src/backend/access/brin/brin_pageops.c | 155 ++++++++++++++++++---------------
src/include/access/brin_pageops.h      |   2 +-
3 files changed, 103 insertions(+), 83 deletions(-)


pgsql-committers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: pgsql: Optimize btree insertions for common case of increasing values
Next
From: Bruce Momjian
Date:
Subject: pgsql: docs: update ltree URL for the DMOZ catalog