pgsql: Improve VACUUM and ANALYZE by avoiding early lock queue - Mailing list pgsql-committers

From Michael Paquier
Subject pgsql: Improve VACUUM and ANALYZE by avoiding early lock queue
Date
Msg-id E1fu5B5-0007Pm-5s@gemulon.postgresql.org
Whole thread Raw
List pgsql-committers
Improve VACUUM and ANALYZE by avoiding early lock queue

A caller of VACUUM can perform early lookup obtention which can cause
other sessions to block on the request done, causing potentially DOS
attacks as even a non-privileged user can attempt a vacuum fill of a
critical catalog table to block even all incoming connection attempts.

Contrary to TRUNCATE, a client could attempt a system-wide VACUUM after
building the list of relations to VACUUM, which can cause vacuum_rel()
or analyze_rel() to try to lock the relation but the operation would
just block.  When the client specifies a list of relations and the
relation needs to be skipped, ownership checks are done when building
the list of relations to work on, preventing a later lock attempt.

vacuum_rel() already had the sanity checks needed, except that those
were applied too late.  This commit refactors the code so as relation
skips are checked beforehand, making it safer to avoid too early locks,
for both manual VACUUM with and without a list of relations specified.

An isolation test is added emulating the fact that early locks do not
happen anymore, issuing a WARNING message earlier if the user calling
VACUUM is not a relation owner.

When a partitioned table is listed in a manual VACUUM or ANALYZE
command, its full list of partitions is fetched, all partitions get
added to the list to work on, and then each one of them is processed one
by one, with ownership checks happening at the later phase of
vacuum_rel() or analyze_rel().  Trying to do early ownership checks for
each partition is proving to be tedious as this would result in deadlock
risks with lock upgrades, and skipping all partitions if the listed
partitioned table is not owned would result in a behavior change
compared to how Postgres 10 has implemented vacuum for partitioned
tables.  The original problem reported related to early lock queue for
critical relations is fixed anyway, so priority is given to avoiding a
backward-incompatible behavior.

Reported-by: Lloyd Albin, Jeremy Schneider
Author: Michael Paquier
Reviewed by: Nathan Bossart, Kyotaro Horiguchi
Discussion: https://postgr.es/m/152512087100.19803.12733865831237526317@wrigleys.postgresql.org
Discussion: https://postgr.es/m/20180812222142.GA6097@paquier.xyz

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/a556549d7e6dce15fe216bd4130ea64239f4d83f

Modified Files
--------------
src/backend/commands/analyze.c                  |  28 ++---
src/backend/commands/vacuum.c                   | 156 +++++++++++++++++-------
src/include/commands/vacuum.h                   |   3 +
src/test/isolation/expected/vacuum-conflict.out | 149 ++++++++++++++++++++++
src/test/isolation/isolation_schedule           |   1 +
src/test/isolation/specs/vacuum-conflict.spec   |  51 ++++++++
6 files changed, 328 insertions(+), 60 deletions(-)


pgsql-committers by date:

Previous
From: Thomas Munro
Date:
Subject: pgsql: Fix typos.
Next
From: Peter Eisentraut
Date:
Subject: pgsql: Add some not null constraints to catalogs