Thread: Autovacuum endless loop in heap_page_prune()?

Autovacuum endless loop in heap_page_prune()?

From

Peter

Date:

25 May 2024, 10:51:50

Good morning,

 I just found Autovacuum run for 6 hours on a 8 GB table, VACUUM query
doesnt cancel, cluster doesn't stop, autovacuum worker is not
killable, truss shows no activity, after kill -6 this backtrace:

* thread #1, name = 'postgres', stop reason = signal SIGABRT
  * frame #0: 0x0000000000548063 postgres`HeapTupleSatisfiesVacuumHorizon + 531
    frame #1: 0x000000000054aed9 postgres`heap_page_prune + 537
    frame #2: 0x000000000054e38a postgres`heap_vacuum_rel + 3626
    frame #3: 0x00000000006af382 postgres`vacuum_rel + 626
    frame #4: 0x00000000006aeeeb postgres`vacuum + 1611
    frame #5: 0x00000000007b4664 postgres`do_autovacuum + 4292
    frame #6: 0x00000000007b2342 postgres`AutoVacWorkerMain + 866
    frame #7: 0x00000000007b1f97 postgres`StartAutoVacWorker + 39
    frame #8: 0x00000000007ba0df postgres`sigusr1_handler + 783
    frame #9: 0x00000008220da627 libthr.so.3`___lldb_unnamed_symbol683 + 215
    frame #10: 0x00000008220d9b1a libthr.so.3`___lldb_unnamed_symbol664 + 314
    frame #11: 0x00007ffffffff913
    frame #12: 0x00000000007bba25 postgres`ServerLoop + 1541
    frame #13: 0x00000000007b9467 postgres`PostmasterMain + 3207
    frame #14: 0x000000000071a566 postgres`main + 758
    frame #15: 0x00000000004f9995 postgres`_start + 261

After restart, no problems reported yet.

Storyline:
this is the file-list table of my backup/archive system, contains ~50
mio records. Recently I found a flaw in the backup system, so that some
old records weren't removed. I wrote a script to do this, that script
did run first at 04:15 and reported it had now removed a lot of old
data. I looked into pgadmin4 and it reported 9 mio dead tuples.

I wondered why autovacuum wouldn't run, and the stats showed that
regular daily vacuum had last run at 03:15 and autovacuum at 03:18. I
wondered why it wouldn't start again, and went to sleep. Having
consumed 300 minutes cpu now at 11:00, it did start at around 05:00.

No messages of hardware errors, good old Haswell Xeon/EP with all ECC.
Could there be a compute error somewhere, and does autovacuum lock
out ordinary kill signals during some time?


We're at Rel. 15.6 on FreeBSD 13.3

 *** Changed values: 
shared_buffers = 40MB
temp_buffers = 20MB
work_mem = 50MB
maintenance_work_mem = 50MB
max_stack_depth = 40MB
dynamic_shared_memory_type = posix
max_files_per_process = 200
effective_io_concurrency = 5
synchronous_commit = off
wal_sync_method = fsync
full_page_writes = off
wal_compression = on
wal_init_zero = off
wal_writer_delay = 2000ms
checkpoint_timeout = 180min
checkpoint_completion_target = 0.0
max_wal_size = 2GB
archive_mode = on
archive_timeout = 86400
seq_page_cost = 0.5
random_page_cost = 0.7
effective_cache_size = 1GB
default_statistics_target = 1000
autovacuum = on
autovacuum_naptime = 5min
autovacuum_vacuum_scale_factor = 0.01
autovacuum_analyze_scale_factor = 0.05

Re: Autovacuum endless loop in heap_page_prune()?

From

Laurenz Albe

Date:

27 May 2024, 09:25:47

On Sat, 2024-05-25 at 12:51 +0200, Peter wrote:
>  I just found Autovacuum run for 6 hours on a 8 GB table, VACUUM query
> doesnt cancel, cluster doesn't stop, autovacuum worker is not
> killable, truss shows no activity, after kill -6 this backtrace:
>
> * thread #1, name = 'postgres', stop reason = signal SIGABRT
>   * frame #0: 0x0000000000548063 postgres`HeapTupleSatisfiesVacuumHorizon + 531
>     frame #1: 0x000000000054aed9 postgres`heap_page_prune + 537
>     frame #2: 0x000000000054e38a postgres`heap_vacuum_rel + 3626
>     frame #3: 0x00000000006af382 postgres`vacuum_rel + 626
>     frame #4: 0x00000000006aeeeb postgres`vacuum + 1611
>     frame #5: 0x00000000007b4664 postgres`do_autovacuum + 4292
>     frame #6: 0x00000000007b2342 postgres`AutoVacWorkerMain + 866
>     frame #7: 0x00000000007b1f97 postgres`StartAutoVacWorker + 39
>     frame #8: 0x00000000007ba0df postgres`sigusr1_handler + 783
>     frame #9: 0x00000008220da627 libthr.so.3`___lldb_unnamed_symbol683 + 215
>     frame #10: 0x00000008220d9b1a libthr.so.3`___lldb_unnamed_symbol664 + 314
>     frame #11: 0x00007ffffffff913
>     frame #12: 0x00000000007bba25 postgres`ServerLoop + 1541
>     frame #13: 0x00000000007b9467 postgres`PostmasterMain + 3207
>     frame #14: 0x000000000071a566 postgres`main + 758
>     frame #15: 0x00000000004f9995 postgres`_start + 261
>
> After restart, no problems reported yet.
>
> Storyline:
> this is the file-list table of my backup/archive system, contains ~50
> mio records. Recently I found a flaw in the backup system, so that some
> old records weren't removed. I wrote a script to do this, that script
> did run first at 04:15 and reported it had now removed a lot of old
> data. I looked into pgadmin4 and it reported 9 mio dead tuples.

This smells of index corruption.

I have seen cases where a corrupted index sends VACUUM into an endless loop
so that it does not react to query cancellation.

Check the index with the "bt_index_check()" function from the "amcheck"
extension.  If that reports a problem, rebuild the index.

Of course, as always, try to figure out how that could happen.
Apart from hardware problems, one frequent cause is upgrading glibc
(if the index on a string column or expression).

Yours,
Laurenz Albe

Re: Autovacuum endless loop in heap_page_prune()?

From

Peter

Date:

27 May 2024, 10:48:14

On Mon, May 27, 2024 at 11:25:47AM +0200, Laurenz Albe wrote:
! On Sat, 2024-05-25 at 12:51 +0200, Peter wrote:
! >  I just found Autovacuum run for 6 hours on a 8 GB table, VACUUM query
! > doesnt cancel, cluster doesn't stop, autovacuum worker is not
! > killable, truss shows no activity, after kill -6 this backtrace:
! >
! > * thread #1, name = 'postgres', stop reason = signal SIGABRT
! >   * frame #0: 0x0000000000548063 postgres`HeapTupleSatisfiesVacuumHorizon + 531
! >     frame #1: 0x000000000054aed9 postgres`heap_page_prune + 537
! >     frame #2: 0x000000000054e38a postgres`heap_vacuum_rel + 3626
! >     frame #3: 0x00000000006af382 postgres`vacuum_rel + 626
! >     frame #4: 0x00000000006aeeeb postgres`vacuum + 1611
! >     frame #5: 0x00000000007b4664 postgres`do_autovacuum + 4292
! >     frame #6: 0x00000000007b2342 postgres`AutoVacWorkerMain + 866
! >     frame #7: 0x00000000007b1f97 postgres`StartAutoVacWorker + 39
! >     frame #8: 0x00000000007ba0df postgres`sigusr1_handler + 783
! >     frame #9: 0x00000008220da627 libthr.so.3`___lldb_unnamed_symbol683 + 215
! >     frame #10: 0x00000008220d9b1a libthr.so.3`___lldb_unnamed_symbol664 + 314
! >     frame #11: 0x00007ffffffff913
! >     frame #12: 0x00000000007bba25 postgres`ServerLoop + 1541
! >     frame #13: 0x00000000007b9467 postgres`PostmasterMain + 3207
! >     frame #14: 0x000000000071a566 postgres`main + 758
! >     frame #15: 0x00000000004f9995 postgres`_start + 261
! >
! > After restart, no problems reported yet.
! >
! > Storyline:
! > this is the file-list table of my backup/archive system, contains ~50
! > mio records. Recently I found a flaw in the backup system, so that some
! > old records weren't removed. I wrote a script to do this, that script
! > did run first at 04:15 and reported it had now removed a lot of old
! > data. I looked into pgadmin4 and it reported 9 mio dead tuples.
!
! This smells of index corruption.
!
! I have seen cases where a corrupted index sends VACUUM into an endless loop
! so that it does not react to query cancellation.

Thanks for Your reply. So it can happen...

! Check the index with the "bt_index_check()" function from the "amcheck"
! extension.  If that reports a problem, rebuild the index.

---------------------------------------------------------------------
-- Table: public.file

-- DROP TABLE IF EXISTS public.file;

CREATE TABLE IF NOT EXISTS public.file
(
    fileid bigint NOT NULL DEFAULT nextval('file_fileid_seq'::regclass),
    fileindex integer NOT NULL DEFAULT 0,
    jobid integer NOT NULL,
    pathid integer NOT NULL,
    deltaseq smallint NOT NULL DEFAULT 0,
    markid integer NOT NULL DEFAULT 0,
    fhinfo numeric(20,0) NOT NULL DEFAULT 0,
    fhnode numeric(20,0) NOT NULL DEFAULT 0,
    lstat text COLLATE pg_catalog."default" NOT NULL,
    md5 text COLLATE pg_catalog."default" NOT NULL,
    name text COLLATE pg_catalog."default" NOT NULL,
    CONSTRAINT file_pkey PRIMARY KEY (fileid)
)

TABLESPACE pg_default;

ALTER TABLE IF EXISTS public.file
    OWNER to bareos;
-- Index: file_jpfid_idx

-- DROP INDEX IF EXISTS public.file_jpfid_idx;

CREATE INDEX IF NOT EXISTS file_jpfid_idx
    ON public.file USING btree
    (jobid ASC NULLS LAST, pathid ASC NULLS LAST, name COLLATE pg_catalog."default" ASC NULLS LAST)
    TABLESPACE pg_default;
-- Index: file_pjidpart_idx

-- DROP INDEX IF EXISTS public.file_pjidpart_idx;

CREATE INDEX IF NOT EXISTS file_pjidpart_idx
    ON public.file USING btree
    (pathid ASC NULLS LAST, jobid ASC NULLS LAST)
    TABLESPACE pg_default
    WHERE fileindex = 0 AND name = ''::text;
---------------------------------------------------------------------

bareos=# SET client_min_messages = DEBUG1;
bareos=# SELECT bt_index_check(index => 16753, heapallindexed => true);
DEBUG:  verifying consistency of tree structure for index "file_pjidpart_idx"
DEBUG:  verifying level 2 (true root level)
DEBUG:  verifying level 1
DEBUG:  verifying level 0 (leaf level)
DEBUG:  verifying that tuples from index "file_pjidpart_idx" are present in "file"
DEBUG:  finished verifying presence of 490834 tuples from table "file" with bitset 33.61% set
LOG:  duration: 55912.563 ms  statement: SELECT bt_index_check(index => 16753, heapallindexed => true);
 bt_index_check
----------------

(1 row)

bareos=# SELECT bt_index_check(index => 16752, heapallindexed => true);
DEBUG:  verifying consistency of tree structure for index "file_jpfid_idx"
DEBUG:  verifying level 3 (true root level)
DEBUG:  verifying level 2
DEBUG:  verifying level 1
DEBUG:  verifying level 0 (leaf level)
DEBUG:  verifying that tuples from index "file_jpfid_idx" are present in "file"
DEBUG:  finished verifying presence of 45922943 tuples from table "file" with bitset 15.72% set
LOG:  duration: 119843.507 ms  statement: SELECT bt_index_check(index => 16752, heapallindexed => true);
 bt_index_check
----------------

(1 row)

bareos=# SELECT bt_index_check(index => 16712, heapallindexed => true);
DEBUG:  verifying consistency of tree structure for index "file_pkey"
DEBUG:  verifying level 3 (true root level)
DEBUG:  verifying level 2
DEBUG:  verifying level 1
DEBUG:  verifying level 0 (leaf level)
DEBUG:  verifying that tuples from index "file_pkey" are present in "file"
DEBUG:  finished verifying presence of 45922962 tuples from table "file" with bitset 40.14% set
LOG:  duration: 55077.349 ms  statement: SELECT bt_index_check(index => 16712, heapallindexed => true);
 bt_index_check
----------------

(1 row)

bareos=# SELECT bt_index_check(index => 16499, heapallindexed => true);
DEBUG:  verifying consistency of tree structure for index "pg_toast_16490_index"
DEBUG:  verifying that tuples from index "pg_toast_16490_index" are present in "pg_toast_16490"
DEBUG:  finished verifying presence of 0 tuples from table "pg_toast_16490" with bitset 0.00% set
 bt_index_check
----------------

(1 row)

------------------------------------------------------------

I don't see a problem there.

After the database did recover, I did manually run VACUUM on the
respecitve table, it did run successfully in the expected timeframe,
the 9 mio. dead tubles went away as expected, and behaviour is all
correct since then (i.e. new data is now writing the *first* file
of the table, so it did successfully vacuum).

! Of course, as always, try to figure out how that could happen.

Yeah, that's exactly the point. Otherwise I have no issue anymore.

What I could do now, is a point-in-time recovery and rerun the
situation. But that will not be 100% identical, as the backup system
does continuousely put some new data in.

! Apart from hardware problems, one frequent cause is upgrading glibc
! (if the index on a string column or expression).

No, this is FreeBSD, we don't normally do such things... ;)

Also, that shouldn't happen with locale 'C', 'SQL_ASCII':
CREATE DATABASE bareos
    WITH
    OWNER = postgres
    ENCODING = 'SQL_ASCII'
    LC_COLLATE = 'C'
    LC_CTYPE = 'C'
    TABLESPACE = pg_default
    CONNECTION LIMIT = -1
    IS_TEMPLATE = False;



regards,
PMc

Re: Autovacuum endless loop in heap_page_prune()?

From

Laurenz Albe

Date:

27 May 2024, 11:51:56

On Mon, 2024-05-27 at 12:48 +0200, Peter wrote:
> On Mon, May 27, 2024 at 11:25:47AM +0200, Laurenz Albe wrote:
> ! On Sat, 2024-05-25 at 12:51 +0200, Peter wrote:
> ! >  I just found Autovacuum run for 6 hours on a 8 GB table, VACUUM query
> ! > doesnt cancel, cluster doesn't stop, autovacuum worker is not
> ! > killable
> !
> ! This smells of index corruption.
> !
> ! I have seen cases where a corrupted index sends VACUUM into an endless loop
> ! so that it does not react to query cancellation.
>
> Thanks for Your reply. So it can happen...

Well, not normally.

> ! Check the index with the "bt_index_check()" function from the "amcheck"
> ! extension.  If that reports a problem, rebuild the index.
>
> [runs amcheck test, no problem is reported]
> I don't see a problem there.
>
> After the database did recover, I did manually run VACUUM on the
> respecitve table, it did run successfully in the expected timeframe

Ok...
What made me suspect corruption is that the process could not
be interrupted.  Otherwise I'd have assumed that autivacuum was just
running too slowly.

> ! Apart from hardware problems, one frequent cause is upgrading glibc
> ! (if the index on a string column or expression).
>
> No, this is FreeBSD, we don't normally do such things... ;)

You don't update the C library, or collations never change?

> Also, that shouldn't happen with locale 'C', 'SQL_ASCII':

Right.

The C locale is safe (but of course SQL_ASCII is a mistake).

Yours,
Laurenz Albe

Re: Autovacuum endless loop in heap_page_prune()?

From

Peter

Date:

27 May 2024, 13:49:38

On Mon, May 27, 2024 at 01:51:56PM +0200, Laurenz Albe wrote:

! > ! Apart from hardware problems, one frequent cause is upgrading glibc
! > ! (if the index on a string column or expression).
! > 
! > No, this is FreeBSD, we don't normally do such things... ;)
! 
! You don't update the C library, or collations never change?

I rarely update the C library.
Kernel + libraries + OS programs are a single unit here, updated
about once a year, and then by many people and with all the usual
testing.
I could lookup how often some locale was modified, but honestly, I'm
too lazy now.

| (but of course SQL_ASCII is a mistake).

Really? I re-read the chapter on Locale/CharSet Support, and I don't
see the mistake. Only one can not convert that to anything else, which
is fine in this usecase (storing arbitrary filenames from any OS in
any character-set within the same column). 


regards,
PMc

Re: Autovacuum endless loop in heap_page_prune()?

From

Laurenz Albe

Date:

27 May 2024, 14:19:25

On Mon, 2024-05-27 at 15:49 +0200, Peter wrote:
> > (but of course SQL_ASCII is a mistake).
>
> Really? I re-read the chapter on Locale/CharSet Support, and I don't
> see the mistake. Only one can not convert that to anything else, which
> is fine in this usecase (storing arbitrary filenames from any OS in
> any character-set within the same column).

That will only work fine as long as everybody uses the same encoding,
or everybody uses only ASCII characters, or everybody works only on
their own data.

Yours,
Laurenz Albe