pgsql: Use GlobalVisState in vacuum to determine page level visibility - Mailing list pgsql-committers

From Melanie Plageman
Subject pgsql: Use GlobalVisState in vacuum to determine page level visibility
Msg-id E1w56v6-001H5i-1I@gemulon.postgresql.org
List pgsql-committers
Use GlobalVisState in vacuum to determine page level visibility

During vacuum's first and third phases, we examine tuples' visibility to
determine if we can set the page all-visible in the visibility map.

Previously, this check compared tuple xmins against a single XID chosen
at the start of vacuum (OldestXmin). We now use GlobalVisState, which
enables future work to set the VM during on-access pruning, since
ordinary queries have access to GlobalVisState but not OldestXmin.

This also benefits vacuum: in some cases, GlobalVisState may advance
during a vacuum, allowing more pages to be considered all-visible.
And, in the future, we could easily add a heuristic to update
GlobalVisState more frequently during vacuums of large tables.
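
The benefit of an advancing horizon can be illustrated with a toy model.
This is only a sketch, not PostgreSQL code: Xid, xid_precedes() and
count_all_visible() are hypothetical names invented for the example, and
the way the real GlobalVisState is refreshed is not modeled at all. Each
page is reduced to its newest live xmin, and each page is judged against
whatever horizon applies at the moment it is scanned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a transaction ID (hypothetical, not TransactionId). */
typedef uint32_t Xid;

/* Wraparound-aware "a is older than b", using modulo-2^32 arithmetic. */
static bool
xid_precedes(Xid a, Xid b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Count the pages that could be considered all-visible.
 *
 * page_newest_xmin[i] is the newest live xmin on page i, and
 * horizon_at_scan[i] is the visibility horizon in effect when page i is
 * scanned. A fixed cutoff is modeled by repeating the same value; an
 * advancing horizon is modeled by increasing values.
 */
static size_t
count_all_visible(const Xid *page_newest_xmin,
                  const Xid *horizon_at_scan,
                  size_t npages)
{
    size_t      count = 0;

    assert(npages == 0 || (page_newest_xmin != NULL && horizon_at_scan != NULL));

    for (size_t i = 0; i < npages; i++)
    {
        if (xid_precedes(page_newest_xmin[i], horizon_at_scan[i]))
            count++;
    }
    return count;
}
```

With newest xmins {90, 120, 150}, a cutoff fixed at 100 qualifies only
the first page, while a horizon that reaches {100, 130, 160} by the time
each page is scanned qualifies all three.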

OldestXmin is still used for freezing and as a backstop to ensure we
don't freeze a dead tuple that wasn't yet prunable according to
GlobalVisState in the rare occurrences where GlobalVisState moves
backwards.

Because comparing a transaction ID against GlobalVisState is more
expensive than comparing against a single XID, we defer this check until
after scanning all tuples on the page. Therefore, we perform the
GlobalVisState check only once per page. This is safe because
visibility_cutoff_xid records the newest live xmin on the page; if it is
globally visible, then the entire page is all-visible.
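
The deferred check can be sketched roughly as follows. This is a
simplified, self-contained model and not the code in pruneheap.c or
vacuumlazy.c: ToyTuple, ToyVisState, toy_xid_visible_to_all() and
toy_page_all_visible() are hypothetical names, the horizon is reduced to
a single XID, and any non-live tuple is assumed to rule the page out.
The checks counter exists only to show that the expensive test runs at
most once per page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;           /* toy stand-in for a transaction ID */
#define INVALID_XID ((Xid) 0)

/* Toy tuple: only the fields this sketch needs. */
typedef struct ToyTuple
{
    Xid         xmin;           /* inserting transaction */
    bool        live;           /* assumed: inserter committed, not deleted */
} ToyTuple;

/* Toy stand-in for GlobalVisState (hypothetical layout). */
typedef struct ToyVisState
{
    Xid         horizon;        /* XIDs older than this are visible to all */
    unsigned    checks;         /* how many "expensive" checks were made */
} ToyVisState;

/* Wraparound-aware "a is older than b". */
static bool
xid_precedes(Xid a, Xid b)
{
    return (int32_t) (a - b) < 0;
}

/* Models the comparatively expensive horizon test. */
static bool
toy_xid_visible_to_all(ToyVisState *vs, Xid xid)
{
    vs->checks++;
    return xid_precedes(xid, vs->horizon);
}

/*
 * Scan every tuple first, tracking only the newest live xmin (the role
 * visibility_cutoff_xid plays in the commit message), then test that one
 * XID against the horizon.
 */
static bool
toy_page_all_visible(const ToyTuple *tuples, size_t ntuples, ToyVisState *vs)
{
    Xid         cutoff = INVALID_XID;

    assert(ntuples == 0 || tuples != NULL);

    for (size_t i = 0; i < ntuples; i++)
    {
        if (!tuples[i].live)
            return false;       /* simplification: page cannot qualify */
        if (cutoff == INVALID_XID || xid_precedes(cutoff, tuples[i].xmin))
            cutoff = tuples[i].xmin;
    }

    if (cutoff == INVALID_XID)
        return true;            /* no tuples: nothing to check */

    /* Single deferred check: newest xmin visible implies all are. */
    return toy_xid_visible_to_all(vs, cutoff);
}
```

For a page with live xmins 100, 105 and 103, the sketch tests only 105:
with a horizon of 110 the page qualifies, with a horizon of 104 it does
not, and in both cases exactly one horizon check is made.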

Using GlobalVisState means on-access pruning can also maintain
visibility_cutoff_xid, which is required to set the visibility map
on-access in the future.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://postgr.es/m/flat/bqc4kh5midfn44gnjiqez3bjqv4zogydguvdn446riw45jcf3y%404ez66il7ebvk#c755ef151507aba58471ffaca607e493

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/dd5716f3c74df6ebc97f5886b755ba79a3f5b559

Modified Files
--------------
src/backend/access/heap/heapam_visibility.c |  5 ++-
src/backend/access/heap/pruneheap.c         | 54 +++++++++++++-------------
src/backend/access/heap/vacuumlazy.c        | 60 ++++++++++++++++++++---------
src/backend/access/spgist/spgvacuum.c       |  2 +-
src/backend/storage/ipc/procarray.c         | 45 +++++++++++++++++++---
src/include/utils/snapmgr.h                 | 11 +++++-
6 files changed, 122 insertions(+), 55 deletions(-)

