Re: Bug in amcheck? - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Bug in amcheck?
Date
Msg-id 88c727b2-1c65-4ee9-8bed-48a4813818dd@iki.fi
In response to Re: Bug in amcheck?  (Alexander Lakhin <exclusion@gmail.com>)
List pgsql-hackers
On 16/01/2026 08:00, Alexander Lakhin wrote:
> 03.01.2026 04:40, Tom Lane wrote:
>> In the past couple of days, scorpion and skink have failed
>> the nbtree_half_dead_pages test with identical symptoms [1][2]:
>> ...
>> [1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=scorpion&dt=2026-01-02%2004%3A54%3A38
>> [2] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2025-12-31%2003%3A34%3A51
> 
> I reproduced such failures locally (when running multiple test
> instances under Valgrind concurrently) and discovered that the test might
> fail due to autovacuum activity. (Apparently
> heap_prune_satisfies_vacuum() returns HEAPTUPLE_RECENTLY_DEAD rather than
> HEAPTUPLE_DEAD for the tuples in question, so prune_freeze_plan()/
> heap_page_prune_and_freeze() finds 0 lpdead_items.)
> 
> pgsql.build/testrun/nbtree/regress/log/postmaster.log in [2] contains:
> 2025-12-31 06:00:41.778 CET autovacuum worker[2250984] LOG: automatic analyze of table "template1.information_schema.sql_features"
> 
> (The postmaster log is missing in [1] for some reason...)
> 
> I've also managed to reproduce this just with the attached patch and:
> echo "autovacuum_naptime = 1" > /tmp/temp.config
> TEMP_CONFIG=/tmp/temp.config make -s check -C src/test/modules/nbtree
> 
> ok 86        - nbtree_half_dead_pages                    319 ms
> not ok 87    - nbtree_half_dead_pages                    324 ms
> ok 88        - nbtree_half_dead_pages                    326 ms
> ...
> # 1 of 101 tests failed.

Great, thanks! I was able to readily reproduce it by adding a delay to 
auto-analyze (you still need to run the test around 5 times in a row for 
auto-analyze to kick in):

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index aa4fbec143f..4f91ce84786 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -645,6 +645,8 @@ vacuum(List *relations, const VacuumParams params, BufferAccessStrategy bstrateg
                      StartTransactionCommand();
                      /* functions in indexes may want a snapshot set */
                      PushActiveSnapshot(GetTransactionSnapshot());
+                    if (AmAutoVacuumWorkerProcess())
+                        pg_usleep(1000000);
                  }

                  analyze_rel(vrel->oid, vrel->relation, params,

Pushed a fix using a little helper procedure to wait for snapshots 
holding back the vacuum horizon to finish. It's the same approach as in 
the syscache-update-pruned test.
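
To illustrate why waiting helps, here is a toy model in C of the horizon 
logic. It is only a sketch, not the pushed fix and not PostgreSQL code: 
every name in it (ToyXid, toy_classify, toy_horizon, toy_wait_prunable) is 
made up for illustration, and "retiring a snapshot" on each poll stands in 
for the real helper sleeping until a concurrent snapshot goes away.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-ins only; none of these are PostgreSQL's real types or functions. */
typedef uint64_t ToyXid;

typedef enum { TOY_RECENTLY_DEAD, TOY_DEAD } ToyTupleState;

/*
 * A deleted tuple can be pruned only once the deleting transaction is older
 * than the horizon; until then it is merely "recently dead".
 */
ToyTupleState toy_classify(ToyXid deleter_xid, ToyXid horizon)
{
    return deleter_xid < horizon ? TOY_DEAD : TOY_RECENTLY_DEAD;
}

/*
 * The horizon is the oldest xmin held by any open snapshot, or next_xid
 * when no snapshot is open.
 */
ToyXid toy_horizon(const ToyXid *snapshot_xmins, int nsnapshots, ToyXid next_xid)
{
    ToyXid horizon = next_xid;

    for (int i = 0; i < nsnapshots; i++)
    {
        if (snapshot_xmins[i] < horizon)
            horizon = snapshot_xmins[i];
    }
    return horizon;
}

/*
 * Poll until the horizon has moved past 'barrier'. Returns the number of
 * polls that found the horizon still held back, or -1 after max_polls
 * unsuccessful polls.
 */
int toy_wait_prunable(const ToyXid *snapshot_xmins, int *nsnapshots,
                      ToyXid next_xid, ToyXid barrier, int max_polls)
{
    assert(max_polls >= 0);

    for (int polls = 0; polls <= max_polls; polls++)
    {
        if (toy_horizon(snapshot_xmins, *nsnapshots, next_xid) > barrier)
            return polls;

        /* A real helper would sleep here; the toy retires one snapshot. */
        if (*nsnapshots > 0)
            (*nsnapshots)--;
    }
    return -1;
}
```

With a deleter XID of 100, one lingering snapshot with xmin 95 (think of 
the delayed auto-analyze) and a next XID of 101, toy_classify() reports 
TOY_RECENTLY_DEAD before the wait and TOY_DEAD after it, which is the 
difference between vacuum finding 0 lpdead_items and finding the ones the 
test expects.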

- Heikki