pgsql: Code review for early drop of orphaned temp relations in autovac - Mailing list pgsql-committers

From Tom Lane
Subject pgsql: Code review for early drop of orphaned temp relations in autovac
Date
Msg-id E1cBBbd-0000UF-Sd@gemulon.postgresql.org
Whole thread Raw
List pgsql-committers
Code review for early drop of orphaned temp relations in autovacuum.

Commit a734fd5d1 exposed some race conditions that existed previously
in the autovac code, but were basically harmless because autovac would
not try to delete orphaned relations immediately.  Specifically, the test
for orphaned-ness was made on a pg_class tuple that might be dead by now,
allowing autovac to try to remove a table that the owning backend had just
finished deleting.  This resulted in a hard crash due to inadequate caution
about accessing the table's catalog entries without any lock.  We must take
a relation lock and then recheck whether the table is still present and
still looks deletable before we do anything.

Also, it seemed to me that deleting multiple tables per transaction, and
trying to continue after errors, represented unjustifiable complexity.
We do not expect this code path to be taken often in the field, nor even
during testing, which means that prioritizing performance over correctness
is a bad tradeoff.  Rip all that out in favor of just starting a new
transaction after each successful temp table deletion.  If we're unlucky
enough to get an error, which shouldn't happen anyway now that we're being
more cautious, let the autovacuum worker fail as it normally would.

In passing, improve the order of operations in the initial scan loop.
Now that we don't care about whether a temp table is a wraparound hazard,
there's no need to perform extract_autovac_opts, get_pgstat_tabentry_relid,
or relation_needs_vacanalyze for temp tables.

Also, if GetTempNamespaceBackendId returns InvalidBackendId (indicating
it doesn't recognize the schema as temp), treat that as meaning it's NOT
an orphaned temp table, not that it IS one, which is what happened before
because BackendIdGetProc necessarily failed.  The case really shouldn't
come up for a table that has RELPERSISTENCE_TEMP, but the consequences
if it did seem undesirable.  (This might represent a back-patchable bug
fix; not sure if it's worth the trouble.)

Discussion: https://postgr.es/m/21299.1480272347@sss.pgh.pa.us

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/dafa0848da11260e9510c699e7060171336cb550

Modified Files
--------------
src/backend/postmaster/autovacuum.c | 255 ++++++++++++++++--------------------
1 file changed, 112 insertions(+), 143 deletions(-)


pgsql-committers by date:

Previous
From: Magnus Hagander
Date:
Subject: pgsql: Mention server start requirement for ssl parameters
Next
From: Tom Lane
Date:
Subject: pgsql: Fix busted tab-completion pattern for ALTER TABLE t ALTER c DROP