Thread: Autovacuum endless loop in heap_page_prune()?
Good morning, I just found Autovacuum run for 6 hours on a 8 GB table, VACUUM query doesnt cancel, cluster doesn't stop, autovacuum worker is not killable, truss shows no activity, after kill -6 this backtrace: * thread #1, name = 'postgres', stop reason = signal SIGABRT * frame #0: 0x0000000000548063 postgres`HeapTupleSatisfiesVacuumHorizon + 531 frame #1: 0x000000000054aed9 postgres`heap_page_prune + 537 frame #2: 0x000000000054e38a postgres`heap_vacuum_rel + 3626 frame #3: 0x00000000006af382 postgres`vacuum_rel + 626 frame #4: 0x00000000006aeeeb postgres`vacuum + 1611 frame #5: 0x00000000007b4664 postgres`do_autovacuum + 4292 frame #6: 0x00000000007b2342 postgres`AutoVacWorkerMain + 866 frame #7: 0x00000000007b1f97 postgres`StartAutoVacWorker + 39 frame #8: 0x00000000007ba0df postgres`sigusr1_handler + 783 frame #9: 0x00000008220da627 libthr.so.3`___lldb_unnamed_symbol683 + 215 frame #10: 0x00000008220d9b1a libthr.so.3`___lldb_unnamed_symbol664 + 314 frame #11: 0x00007ffffffff913 frame #12: 0x00000000007bba25 postgres`ServerLoop + 1541 frame #13: 0x00000000007b9467 postgres`PostmasterMain + 3207 frame #14: 0x000000000071a566 postgres`main + 758 frame #15: 0x00000000004f9995 postgres`_start + 261 After restart, no problems reported yet. Storyline: this is the file-list table of my backup/archive system, contains ~50 mio records. Recently I found a flaw in the backup system, so that some old records weren't removed. I wrote a script to do this, that script did run first at 04:15 and reported it had now removed a lot of old data. I looked into pgadmin4 and it reported 9 mio dead tuples. I wondered why autovacuum wouldn't run, and the stats showed that regular daily vacuum had last run at 03:15 and autovacuum at 03:18. I wondered why it wouldn't start again, and went to sleep. Having consumed 300 minutes cpu now at 11:00, it did start at around 05:00. No messages of hardware errors, good old Haswell Xeon/EP with all ECC. Could there be a compute error somewhere, and does autovacuum lock out ordinary kill signals during some time? We're at Rel. 15.6 on FreeBSD 13.3 *** Changed values: shared_buffers = 40MB temp_buffers = 20MB work_mem = 50MB maintenance_work_mem = 50MB max_stack_depth = 40MB dynamic_shared_memory_type = posix max_files_per_process = 200 effective_io_concurrency = 5 synchronous_commit = off wal_sync_method = fsync full_page_writes = off wal_compression = on wal_init_zero = off wal_writer_delay = 2000ms checkpoint_timeout = 180min checkpoint_completion_target = 0.0 max_wal_size = 2GB archive_mode = on archive_timeout = 86400 seq_page_cost = 0.5 random_page_cost = 0.7 effective_cache_size = 1GB default_statistics_target = 1000 autovacuum = on autovacuum_naptime = 5min autovacuum_vacuum_scale_factor = 0.01 autovacuum_analyze_scale_factor = 0.05
On Sat, 2024-05-25 at 12:51 +0200, Peter wrote: > I just found Autovacuum run for 6 hours on a 8 GB table, VACUUM query > doesnt cancel, cluster doesn't stop, autovacuum worker is not > killable, truss shows no activity, after kill -6 this backtrace: > > * thread #1, name = 'postgres', stop reason = signal SIGABRT > * frame #0: 0x0000000000548063 postgres`HeapTupleSatisfiesVacuumHorizon + 531 > frame #1: 0x000000000054aed9 postgres`heap_page_prune + 537 > frame #2: 0x000000000054e38a postgres`heap_vacuum_rel + 3626 > frame #3: 0x00000000006af382 postgres`vacuum_rel + 626 > frame #4: 0x00000000006aeeeb postgres`vacuum + 1611 > frame #5: 0x00000000007b4664 postgres`do_autovacuum + 4292 > frame #6: 0x00000000007b2342 postgres`AutoVacWorkerMain + 866 > frame #7: 0x00000000007b1f97 postgres`StartAutoVacWorker + 39 > frame #8: 0x00000000007ba0df postgres`sigusr1_handler + 783 > frame #9: 0x00000008220da627 libthr.so.3`___lldb_unnamed_symbol683 + 215 > frame #10: 0x00000008220d9b1a libthr.so.3`___lldb_unnamed_symbol664 + 314 > frame #11: 0x00007ffffffff913 > frame #12: 0x00000000007bba25 postgres`ServerLoop + 1541 > frame #13: 0x00000000007b9467 postgres`PostmasterMain + 3207 > frame #14: 0x000000000071a566 postgres`main + 758 > frame #15: 0x00000000004f9995 postgres`_start + 261 > > After restart, no problems reported yet. > > Storyline: > this is the file-list table of my backup/archive system, contains ~50 > mio records. Recently I found a flaw in the backup system, so that some > old records weren't removed. I wrote a script to do this, that script > did run first at 04:15 and reported it had now removed a lot of old > data. I looked into pgadmin4 and it reported 9 mio dead tuples. This smells of index corruption. I have seen cases where a corrupted index sends VACUUM into an endless loop so that it does not react to query cancellation. Check the index with the "bt_index_check()" function from the "amcheck" extension. If that reports a problem, rebuild the index. Of course, as always, try to figure out how that could happen. Apart from hardware problems, one frequent cause is upgrading glibc (if the index on a string column or expression). Yours, Laurenz Albe
On Mon, May 27, 2024 at 11:25:47AM +0200, Laurenz Albe wrote: ! On Sat, 2024-05-25 at 12:51 +0200, Peter wrote: ! > I just found Autovacuum run for 6 hours on a 8 GB table, VACUUM query ! > doesnt cancel, cluster doesn't stop, autovacuum worker is not ! > killable, truss shows no activity, after kill -6 this backtrace: ! > ! > * thread #1, name = 'postgres', stop reason = signal SIGABRT ! > * frame #0: 0x0000000000548063 postgres`HeapTupleSatisfiesVacuumHorizon + 531 ! > frame #1: 0x000000000054aed9 postgres`heap_page_prune + 537 ! > frame #2: 0x000000000054e38a postgres`heap_vacuum_rel + 3626 ! > frame #3: 0x00000000006af382 postgres`vacuum_rel + 626 ! > frame #4: 0x00000000006aeeeb postgres`vacuum + 1611 ! > frame #5: 0x00000000007b4664 postgres`do_autovacuum + 4292 ! > frame #6: 0x00000000007b2342 postgres`AutoVacWorkerMain + 866 ! > frame #7: 0x00000000007b1f97 postgres`StartAutoVacWorker + 39 ! > frame #8: 0x00000000007ba0df postgres`sigusr1_handler + 783 ! > frame #9: 0x00000008220da627 libthr.so.3`___lldb_unnamed_symbol683 + 215 ! > frame #10: 0x00000008220d9b1a libthr.so.3`___lldb_unnamed_symbol664 + 314 ! > frame #11: 0x00007ffffffff913 ! > frame #12: 0x00000000007bba25 postgres`ServerLoop + 1541 ! > frame #13: 0x00000000007b9467 postgres`PostmasterMain + 3207 ! > frame #14: 0x000000000071a566 postgres`main + 758 ! > frame #15: 0x00000000004f9995 postgres`_start + 261 ! > ! > After restart, no problems reported yet. ! > ! > Storyline: ! > this is the file-list table of my backup/archive system, contains ~50 ! > mio records. Recently I found a flaw in the backup system, so that some ! > old records weren't removed. I wrote a script to do this, that script ! > did run first at 04:15 and reported it had now removed a lot of old ! > data. I looked into pgadmin4 and it reported 9 mio dead tuples. ! ! This smells of index corruption. ! ! I have seen cases where a corrupted index sends VACUUM into an endless loop ! so that it does not react to query cancellation. Thanks for Your reply. So it can happen... ! Check the index with the "bt_index_check()" function from the "amcheck" ! extension. If that reports a problem, rebuild the index. --------------------------------------------------------------------- -- Table: public.file -- DROP TABLE IF EXISTS public.file; CREATE TABLE IF NOT EXISTS public.file ( fileid bigint NOT NULL DEFAULT nextval('file_fileid_seq'::regclass), fileindex integer NOT NULL DEFAULT 0, jobid integer NOT NULL, pathid integer NOT NULL, deltaseq smallint NOT NULL DEFAULT 0, markid integer NOT NULL DEFAULT 0, fhinfo numeric(20,0) NOT NULL DEFAULT 0, fhnode numeric(20,0) NOT NULL DEFAULT 0, lstat text COLLATE pg_catalog."default" NOT NULL, md5 text COLLATE pg_catalog."default" NOT NULL, name text COLLATE pg_catalog."default" NOT NULL, CONSTRAINT file_pkey PRIMARY KEY (fileid) ) TABLESPACE pg_default; ALTER TABLE IF EXISTS public.file OWNER to bareos; -- Index: file_jpfid_idx -- DROP INDEX IF EXISTS public.file_jpfid_idx; CREATE INDEX IF NOT EXISTS file_jpfid_idx ON public.file USING btree (jobid ASC NULLS LAST, pathid ASC NULLS LAST, name COLLATE pg_catalog."default" ASC NULLS LAST) TABLESPACE pg_default; -- Index: file_pjidpart_idx -- DROP INDEX IF EXISTS public.file_pjidpart_idx; CREATE INDEX IF NOT EXISTS file_pjidpart_idx ON public.file USING btree (pathid ASC NULLS LAST, jobid ASC NULLS LAST) TABLESPACE pg_default WHERE fileindex = 0 AND name = ''::text; --------------------------------------------------------------------- bareos=# SET client_min_messages = DEBUG1; bareos=# SELECT bt_index_check(index => 16753, heapallindexed => true); DEBUG: verifying consistency of tree structure for index "file_pjidpart_idx" DEBUG: verifying level 2 (true root level) DEBUG: verifying level 1 DEBUG: verifying level 0 (leaf level) DEBUG: verifying that tuples from index "file_pjidpart_idx" are present in "file" DEBUG: finished verifying presence of 490834 tuples from table "file" with bitset 33.61% set LOG: duration: 55912.563 ms statement: SELECT bt_index_check(index => 16753, heapallindexed => true); bt_index_check ---------------- (1 row) bareos=# SELECT bt_index_check(index => 16752, heapallindexed => true); DEBUG: verifying consistency of tree structure for index "file_jpfid_idx" DEBUG: verifying level 3 (true root level) DEBUG: verifying level 2 DEBUG: verifying level 1 DEBUG: verifying level 0 (leaf level) DEBUG: verifying that tuples from index "file_jpfid_idx" are present in "file" DEBUG: finished verifying presence of 45922943 tuples from table "file" with bitset 15.72% set LOG: duration: 119843.507 ms statement: SELECT bt_index_check(index => 16752, heapallindexed => true); bt_index_check ---------------- (1 row) bareos=# SELECT bt_index_check(index => 16712, heapallindexed => true); DEBUG: verifying consistency of tree structure for index "file_pkey" DEBUG: verifying level 3 (true root level) DEBUG: verifying level 2 DEBUG: verifying level 1 DEBUG: verifying level 0 (leaf level) DEBUG: verifying that tuples from index "file_pkey" are present in "file" DEBUG: finished verifying presence of 45922962 tuples from table "file" with bitset 40.14% set LOG: duration: 55077.349 ms statement: SELECT bt_index_check(index => 16712, heapallindexed => true); bt_index_check ---------------- (1 row) bareos=# SELECT bt_index_check(index => 16499, heapallindexed => true); DEBUG: verifying consistency of tree structure for index "pg_toast_16490_index" DEBUG: verifying that tuples from index "pg_toast_16490_index" are present in "pg_toast_16490" DEBUG: finished verifying presence of 0 tuples from table "pg_toast_16490" with bitset 0.00% set bt_index_check ---------------- (1 row) ------------------------------------------------------------ I don't see a problem there. After the database did recover, I did manually run VACUUM on the respecitve table, it did run successfully in the expected timeframe, the 9 mio. dead tubles went away as expected, and behaviour is all correct since then (i.e. new data is now writing the *first* file of the table, so it did successfully vacuum). ! Of course, as always, try to figure out how that could happen. Yeah, that's exactly the point. Otherwise I have no issue anymore. What I could do now, is a point-in-time recovery and rerun the situation. But that will not be 100% identical, as the backup system does continuousely put some new data in. ! Apart from hardware problems, one frequent cause is upgrading glibc ! (if the index on a string column or expression). No, this is FreeBSD, we don't normally do such things... ;) Also, that shouldn't happen with locale 'C', 'SQL_ASCII': CREATE DATABASE bareos WITH OWNER = postgres ENCODING = 'SQL_ASCII' LC_COLLATE = 'C' LC_CTYPE = 'C' TABLESPACE = pg_default CONNECTION LIMIT = -1 IS_TEMPLATE = False; regards, PMc
On Mon, 2024-05-27 at 12:48 +0200, Peter wrote: > On Mon, May 27, 2024 at 11:25:47AM +0200, Laurenz Albe wrote: > ! On Sat, 2024-05-25 at 12:51 +0200, Peter wrote: > ! > I just found Autovacuum run for 6 hours on a 8 GB table, VACUUM query > ! > doesnt cancel, cluster doesn't stop, autovacuum worker is not > ! > killable > ! > ! This smells of index corruption. > ! > ! I have seen cases where a corrupted index sends VACUUM into an endless loop > ! so that it does not react to query cancellation. > > Thanks for Your reply. So it can happen... Well, not normally. > ! Check the index with the "bt_index_check()" function from the "amcheck" > ! extension. If that reports a problem, rebuild the index. > > [runs amcheck test, no problem is reported] > I don't see a problem there. > > After the database did recover, I did manually run VACUUM on the > respecitve table, it did run successfully in the expected timeframe Ok... What made me suspect corruption is that the process could not be interrupted. Otherwise I'd have assumed that autivacuum was just running too slowly. > ! Apart from hardware problems, one frequent cause is upgrading glibc > ! (if the index on a string column or expression). > > No, this is FreeBSD, we don't normally do such things... ;) You don't update the C library, or collations never change? > Also, that shouldn't happen with locale 'C', 'SQL_ASCII': Right. The C locale is safe (but of course SQL_ASCII is a mistake). Yours, Laurenz Albe
On Mon, May 27, 2024 at 01:51:56PM +0200, Laurenz Albe wrote: ! > ! Apart from hardware problems, one frequent cause is upgrading glibc ! > ! (if the index on a string column or expression). ! > ! > No, this is FreeBSD, we don't normally do such things... ;) ! ! You don't update the C library, or collations never change? I rarely update the C library. Kernel + libraries + OS programs are a single unit here, updated about once a year, and then by many people and with all the usual testing. I could lookup how often some locale was modified, but honestly, I'm too lazy now. | (but of course SQL_ASCII is a mistake). Really? I re-read the chapter on Locale/CharSet Support, and I don't see the mistake. Only one can not convert that to anything else, which is fine in this usecase (storing arbitrary filenames from any OS in any character-set within the same column). regards, PMc
On Mon, 2024-05-27 at 15:49 +0200, Peter wrote: > > (but of course SQL_ASCII is a mistake). > > Really? I re-read the chapter on Locale/CharSet Support, and I don't > see the mistake. Only one can not convert that to anything else, which > is fine in this usecase (storing arbitrary filenames from any OS in > any character-set within the same column). That will only work fine as long as everybody uses the same encoding, or everybody uses only ASCII characters, or everybody works only on their own data. Yours, Laurenz Albe