BUG #18893: Segfault during analyze pg_database - Mailing list pgsql-bugs

From PG Bug reporting form
Subject BUG #18893: Segfault during analyze pg_database
Msg-id 18893-da17531047e6447f@postgresql.org
Responses Re: BUG #18893: Segfault during analyze pg_database
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      18893
Logged by:          Robins Tharakan
Email address:      tharakan@gmail.com
PostgreSQL version: Unsupported/Unknown
Operating system:   Ubuntu
Description:

Creating a few databases and then running CHECKPOINT causes a segfault in an
autovacuum worker.

Tested on a recent commit: 847bbb21f8c4eb0e2b47417684ad2ba9255c9e80.

The backtrace is below. In addition, every time I hit this, postgres was
analyzing pg_database.


Repro (a few runs may be required)
=====
# seq 1 100 | xargs -i psql -Atq -c "DROP   DATABASE t{};" postgres
seq 1 100 | xargs -i psql -Atq -c "CREATE DATABASE t{};" postgres
psql -Atq -c "CHECKPOINT" postgres
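
Since a few runs may be required, the steps above could be wrapped in a loop. This is only a sketch under assumptions, not what was actually run: `repro_until_crash` is a hypothetical helper name, it uses DROP DATABASE IF EXISTS so the first run does not error, and it assumes psql exits non-zero when the CHECKPOINT session loses its connection. A crash that lands during the CREATE DATABASE phase would not be noticed until the next CHECKPOINT.

```shell
# Hypothetical helper: repeat the repro until CHECKPOINT fails.
# $1 is the maximum number of runs (defaults to 10, an arbitrary choice).
repro_until_crash() {
  max_runs=${1:-10}
  run=1
  while [ "$run" -le "$max_runs" ]; do
    # Clean up databases left over from a previous run.
    for i in $(seq 1 100); do
      psql -Atq -c "DROP DATABASE IF EXISTS t$i;" postgres
    done
    for i in $(seq 1 100); do
      psql -Atq -c "CREATE DATABASE t$i;" postgres
    done
    # Assumption: a lost connection makes psql exit non-zero.
    if ! psql -Atq -c "CHECKPOINT" postgres; then
      echo "crash suspected on run $run"
      return 0
    fi
    run=$((run + 1))
  done
  echo "no crash in $max_runs runs"
  return 1
}
```

Calling `repro_until_crash 20` would try up to 20 rounds and stop at the first failed CHECKPOINT, leaving the server log and any core file in place for inspection.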



Error Log (for multiple crashes)
=========
$ tail -10000 logfile | grep "Failed process was running"
2025-04-12 07:23:55.096 ACST [2833183] DETAIL:  Failed process was running: autovacuum: VACUUM ANALYZE pg_catalog.pg_database
2025-04-12 07:24:55.634 ACST [2833183] DETAIL:  Failed process was running: autovacuum: ANALYZE pg_catalog.pg_database
2025-04-12 07:31:02.634 ACST [2833183] DETAIL:  Failed process was running: autovacuum: VACUUM ANALYZE pg_catalog.pg_database
2025-04-12 11:59:31.411 ACST [2845956] DETAIL:  Failed process was running: autovacuum: ANALYZE pg_catalog.pg_database
2025-04-12 12:13:09.974 ACST [2846810] DETAIL:  Failed process was running: autovacuum: VACUUM ANALYZE pg_catalog.pg_database
2025-04-12 12:38:07.432 ACST [2846810] DETAIL:  Failed process was running: autovacuum: VACUUM ANALYZE pg_catalog.pg_database
2025-04-12 12:41:42.729 ACST [2846810] DETAIL:  Failed process was running: autovacuum: VACUUM ANALYZE pg_catalog.pg_database
2025-04-12 12:43:13.276 ACST [2846810] DETAIL:  Failed process was running: autovacuum: VACUUM ANALYZE pg_catalog.pg_database
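
To see at a glance which statements were running when the crashes happened, the grep above could be extended to tally them. The following is a rough sketch: `crash_summary` is a hypothetical helper name, and it assumes each DETAIL entry sits on a single line in the actual log file (the line breaks in this message come from email wrapping).

```shell
# Hypothetical helper: count crashes per failed statement.
# $1 is the path to the server log file.
crash_summary() {
  # Split each line at the DETAIL prefix; the second field is the statement.
  awk -F'Failed process was running: *' '
    NF > 1 { count[$2]++ }
    END    { for (s in count) print count[s] "\t" s }
  ' "$1" | sort -rn
}
```

Run as `crash_summary logfile`, this prints one line per distinct statement, most frequent first. For the eight entries listed above, that would be 6 for VACUUM ANALYZE and 2 for ANALYZE, all on pg_catalog.pg_database.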


Error Log (for 1 crash)
=========
2025-04-12 12:43:03.279 ACST [2849996] LOG:  checkpoint starting: immediate force wait
2025-04-12 12:43:13.276 ACST [2846810] LOG:  autovacuum worker (PID 2851288) was terminated by signal 11: Segmentation fault
2025-04-12 12:43:13.276 ACST [2846810] DETAIL:  Failed process was running: autovacuum: VACUUM ANALYZE pg_catalog.pg_database
2025-04-12 12:43:13.276 ACST [2846810] LOG:  terminating any other active server processes
2025-04-12 12:43:13.280 ACST [2846810] LOG:  all server processes terminated; reinitializing
2025-04-12 12:43:13.346 ACST [2851293] LOG:  database system was interrupted; last known up at 2025-04-12 12:42:59 ACST
2025-04-12 12:43:23.175 ACST [2851293] LOG:  database system was not properly shut down; automatic recovery in progress
2025-04-12 12:43:23.196 ACST [2851293] LOG:  redo starts at 0/BB5A2BE0
2025-04-12 12:43:23.197 ACST [2851293] WARNING:  could not open directory "base/49251": No such file or directory
2025-04-12 12:43:23.197 ACST [2851293] CONTEXT:  WAL redo at 0/BB5A2CB0 for Database/DROP: dir 1663/49251
2025-04-12 12:43:23.197 ACST [2851293] WARNING:  some useless files may be left behind in old database directory "base/49251"
2025-04-12 12:43:23.197 ACST [2851293] CONTEXT:  WAL redo at 0/BB5A2CB0 for Database/DROP: dir 1663/49251
2025-04-12 12:43:24.620 ACST [2851293] LOG:  unexpected pageaddr 0/A6D3A000 in WAL segment 0000000100000000000000D5, LSN 0/D5D3A000, offset 13869056
2025-04-12 12:43:24.620 ACST [2851293] LOG:  redo done at 0/D5D39198 system usage: CPU: user: 0.88 s, system: 0.07 s, elapsed: 1.42 s
2025-04-12 12:43:24.633 ACST [2851294] LOG:  checkpoint starting: end-of-recovery immediate wait
2025-04-12 12:43:44.451 ACST [2851294] LOG:  checkpoint complete: wrote 16284 buffers (99.4%), wrote 3 SLRU buffers; 0 WAL file(s) added, 0 removed, 26 recycled; write=0.173 s, sync=19.592 s, total=19.820 s; sync files=29806, longest=0.019 s, average=0.001 s; distance=433757 kB, estimate=433757 kB; lsn=0/D5D3A048, redo lsn=0/D5D3A048
2025-04-12 12:43:44.467 ACST [2846810] LOG:  database system is ready to accept connections


SQL Output
==========
postgres=# checkpoint;
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
The connection to the server was lost. Attempting reset: Failed.
Time: 3485.895 ms (00:03.486)
!?> 


Backtrace
=========
(gdb) bt
#0  PopActiveSnapshot () at snapmgr.c:766
#1  0x0000559978e4aff5 in vacuum (relations=0x55999bb4f510,
params=0x55999bb48120, bstrategy=0x55999bb42880, vac_context=0x55999bb4f3c0,
isTopLevel=true) at vacuum.c:611
#2  0x000055997905242c in autovacuum_do_vac_analyze (tab=0x55999bb48118,
bstrategy=0x55999bb42880) at autovacuum.c:3160
#3  0x0000559979051164 in do_autovacuum () at autovacuum.c:2439
#4  0x000055997904fd05 in AutoVacWorkerMain (startup_data=0x0,
startup_data_len=0) at autovacuum.c:1594
#5  0x0000559979056ab7 in postmaster_child_launch
(child_type=B_AUTOVAC_WORKER, child_slot=2022, startup_data=0x0,
startup_data_len=0, client_sock=0x0) at launch_backend.c:290
#6  0x000055997905da7e in StartChildProcess (type=B_AUTOVAC_WORKER) at
postmaster.c:3973
#7  0x000055997905dc0d in StartAutovacuumWorker () at postmaster.c:4037
#8  0x000055997905d6ce in process_pm_pmsignal () at postmaster.c:3794
#9  0x000055997905a803 in ServerLoop () at postmaster.c:1695
#10 0x000055997905a1d2 in PostmasterMain (argc=3, argv=0x55999ba24f80) at
postmaster.c:1400
#11 0x0000559978f021f3 in main (argc=3, argv=0x55999ba24f80) at main.c:227


Backtrace Full
==============
#0  PopActiveSnapshot () at snapmgr.c:766
        newstack = 0x55999bb4f3c0
#1  0x0000559978e4aff5 in vacuum (relations=0x55999bb4f510,
params=0x55999bb48120, bstrategy=0x55999bb42880, vac_context=0x55999bb4f3c0,
isTopLevel=true) at vacuum.c:611
        in_vacuum = false
        stmttype = 0x55997951e3d0 "VACUUM"
        in_outer_xact = false
        use_own_xacts = true
        __func__ = "vacuum"
#2  0x000055997905242c in autovacuum_do_vac_analyze (tab=0x55999bb48118,
bstrategy=0x55999bb42880) at autovacuum.c:3160
        rangevar = 0x55999bb4d4b0
        rel = 0x55999bb4d500
        rel_list = 0x55999bb4d530
        vac_context = 0x55999bb4f3c0
#3  0x0000559979051164 in do_autovacuum () at autovacuum.c:2439
        _save_exception_stack = 0x7ffef3180850
        _save_context_stack = 0x0
        _local_sigjmp_buf = {{__jmpbuf = {140732976861944,
2786496174943352778, 0, 140732976861976, 94117656164696, 139965642006560,
2786496174997878730, 8242866857011034058},
            __mask_was_saved = 0, __saved_mask = {__val = {5460319232,
94118230406032, 6656, 94117652561175, 94118230399168, 16, 94117649680895,
26, 6240, 94118230406064,
                94117652046186, 6656, 94118230399408, 140732976858672,
94117652048505, 0}}}}
        _do_rethrow = false
        tab = 0x55999bb48118
        skipit = false
        iter = {cur = 0x7f4c4571d828, end = 0x7f4c4571d828}
        relid = 1262
        classTup = 0x7f4c47393e18
        isshared = true
        cell__state = {l = 0x55999bb47b38, i = 0}
        classRel = 0x7f4c494eaa88
        tuple = 0x0
        relScan = 0x55999bb42470
        dbForm = 0x7f4c47392d80
        table_oids = 0x55999bb47b38
        orphan_oids = 0x0
        ctl = {num_partitions = 0, ssize = 0, dsize = 140732976858832,
max_dsize = 94117644661494, keysize = 4, entrysize = 104, hash =
0x5599797a6b00 <TopTransactionStateData>,
          match = 0x79361810, keycopy = 0x3f, alloc = 0x7ffef31812f8, hcxt =
0x7ffef3180710, hctl = 0x559978c6ff8e
<CommitTransactionCommandInternal+177>}
        table_toast_map = 0x55999bb43470
        cell = 0x55999bb47b50
        bstrategy = 0x55999bb42880
        key = {sk_flags = 0, sk_attno = 18, sk_strategy = 3, sk_subtype = 0,
sk_collation = 950, sk_func = {fn_addr = 0x5599791aecf4 <chareq>, fn_oid =
61, fn_nargs = 2,
            fn_strict = true, fn_retset = false, fn_stats = 2 '\002',
fn_extra = 0x0, fn_mcxt = 0x55999bb41360, fn_expr = 0x0}, sk_argument =
116}
        pg_class_desc = 0x55999bb41460
        effective_multixact_freeze_max_age = 400000000
        did_vacuum = false
        found_concurrent_worker = false
        i = 21913
        __func__ = "do_autovacuum"
#4  0x000055997904fd05 in AutoVacWorkerMain (startup_data=0x0,
startup_data_len=0) at autovacuum.c:1594
        dbname =
"template1\000\000\000\000\000\000\000p\030\000\000\000\000\000\000\0002os\276C\025C\200\000\000\000\000\000\000\000m\271<y\231U\000\000O\267<y\231U\000\000\0002os\036\000\000"
        local_sigjmp_buf = {{__jmpbuf = {140732976861944,
2786496174865758154, 0, 140732976861976, 94117656164696, 139965642006560,
2786496174828009418, 8242866840620218314},
            __mask_was_saved = 1, __saved_mask = {__val =
{18446744066192964099, 11214622847848677400, 139965631788948,
140732976859328, 4833844260311609856, 16, 140732976859424,
                140732976859360, 4833844260311609856, 0, 139965642011360, 1,
94117649519579, 140732976861944, 94118229463072, 140732976859424}}}}
        dbid = 1
        __func__ = "AutoVacWorkerMain"
#5  0x0000559979056ab7 in postmaster_child_launch
(child_type=B_AUTOVAC_WORKER, child_slot=2022, startup_data=0x0,
startup_data_len=0, client_sock=0x0) at launch_backend.c:290
        pid = 0
#6  0x000055997905da7e in StartChildProcess (type=B_AUTOVAC_WORKER) at
postmaster.c:3973
        pmchild = 0x55999bab4528
        pid = 32766
        __func__ = "StartChildProcess"
#7  0x000055997905dc0d in StartAutovacuumWorker () at postmaster.c:4037
        bn = 0x5000097e7
#8  0x000055997905d6ce in process_pm_pmsignal () at postmaster.c:3794
        request_state_update = false
        __func__ = "process_pm_pmsignal"


-
robins
https://robins.in

