Flaky vacuum truncate test in reloptions.sql - Mailing list pgsql-hackers

From Arseny Sher
Subject Flaky vacuum truncate test in reloptions.sql
Date
Msg-id 87tuotr2hh.fsf@ars-thinkpad
Responses Re: Flaky vacuum truncate test in reloptions.sql  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
Hi,

I occasionally see the vacuum truncation test in reloptions.sql fail,
i.e. the truncation doesn't happen:

--- ../../src/test/regress/expected/reloptions.out      2020-04-16 12:37:17.749547401 +0300
+++ ../../src/test/regress/results/reloptions.out       2020-04-17 00:14:58.999211750 +0300
@@ -131,7 +131,7 @@
 SELECT pg_relation_size('reloptions_test') = 0;
  ?column?
 ----------
- t
+ f
 (1 row)

A close reading of lazy_scan_heap says that the failure can indeed
happen: if ConditionalLockBufferForCleanup can't lock the buffer and
either the buffer doesn't need freezing or the vacuum is not aggressive,
we don't insist on a close inspection of the page contents and count the
page as nonempty based on what lazy_check_needs_freeze reports. This
means the page is regarded as nonempty even if it contains only garbage
(but occupied) ItemIds, which is the case in the test. And of course this
allegedly nonempty page prevents the truncation. The obvious competitors
for the page are bgwriter/checkpointer; the chances of a simultaneous
access are small, but they exist.
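
The condition described above can be sketched as a small C model. This is
only a simplification of the description in this message, not the
PostgreSQL source: PageState, its fields, and page_counted_nonempty are
hypothetical names introduced for illustration.

```c
#include <stdbool.h>

/*
 * Hypothetical, simplified model of the per-page decision described above.
 * It is NOT the PostgreSQL source; all names here are illustrative.
 */
typedef struct
{
    bool cleanup_lock_ok;   /* did the conditional cleanup lock succeed? */
    bool needs_freeze;      /* does any tuple on the page need freezing? */
    bool has_used_itemids;  /* any occupied ItemIds, garbage included */
    bool has_live_data;     /* anything that would survive cleanup */
} PageState;

/* Returns true if the page ends up counted as nonempty (blocks truncation). */
static bool
page_counted_nonempty(PageState page, bool aggressive)
{
    if (!page.cleanup_lock_ok && (!page.needs_freeze || !aggressive))
    {
        /*
         * No close inspection: occupied ItemIds alone, even pure garbage,
         * are enough to make the page look nonempty.
         */
        return page.has_used_itemids;
    }

    /*
     * Otherwise the page is cleaned under a cleanup lock (waiting for it
     * if necessary), so only surviving data counts.
     */
    return page.has_live_data;
}
```

Under this model, a garbage-only page whose cleanup lock was not obtained
counts as nonempty for a plain VACUUM, while an aggressive vacuum of a page
that needs freezing waits for the lock and sees the page as empty.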

A simple fix is to perform an aggressive VACUUM FREEZE, as attached.

I'm a bit puzzled that I've only ever seen this when running regression
tests under our multimaster. While multimaster contains a fair amount of
C code, I don't see how any of it can interfere with the vacuuming
business here. I can't say I tried hard to create a reproduction,
though; the explanation above seems to be enough.


--
Arseny Sher
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


diff --git a/src/test/regress/expected/reloptions.out b/src/test/regress/expected/reloptions.out
index 44c130409ff..fa1c223c87a 100644
--- a/src/test/regress/expected/reloptions.out
+++ b/src/test/regress/expected/reloptions.out
@@ -127,7 +127,8 @@ SELECT reloptions FROM pg_class WHERE oid = 'reloptions_test'::regclass;
 INSERT INTO reloptions_test VALUES (1, NULL), (NULL, NULL);
 ERROR:  null value in column "i" of relation "reloptions_test" violates not-null constraint
 DETAIL:  Failing row contains (null, null).
-VACUUM reloptions_test;
+-- do aggressive vacuum to be sure we won't skip the page
+VACUUM FREEZE reloptions_test;
 SELECT pg_relation_size('reloptions_test') = 0;
  ?column? 
 ----------
diff --git a/src/test/regress/sql/reloptions.sql b/src/test/regress/sql/reloptions.sql
index cac5b0bcb0d..a84aae5093b 100644
--- a/src/test/regress/sql/reloptions.sql
+++ b/src/test/regress/sql/reloptions.sql
@@ -71,7 +71,8 @@ SELECT reloptions FROM pg_class WHERE oid =
 ALTER TABLE reloptions_test RESET (vacuum_truncate);
 SELECT reloptions FROM pg_class WHERE oid = 'reloptions_test'::regclass;
 INSERT INTO reloptions_test VALUES (1, NULL), (NULL, NULL);
-VACUUM reloptions_test;
+-- do aggressive vacuum to be sure we won't skip the page
+VACUUM FREEZE reloptions_test;
 SELECT pg_relation_size('reloptions_test') = 0;
 
 -- Test toast.* options
