v13: CLUSTER segv with wal_level=minimal and parallel index creation - Mailing list pgsql-hackers
From      | Justin Pryzby
Subject   | v13: CLUSTER segv with wal_level=minimal and parallel index creation
Date      |
Msg-id    | 20200907023737.GA7158@telsasoft.com
Responses | Re: v13: CLUSTER segv with wal_level=minimal and parallel index creation
List      | pgsql-hackers
Following a bulk load, a CLUSTER command run by a maintenance script crashed.
This is currently reproducible on that instance, so please let me know if I
can provide more info.

< 2020-09-06 15:44:16.369 MDT >LOG:  background worker "parallel worker" (PID 2576) was terminated by signal 6: Aborted
< 2020-09-06 15:44:16.369 MDT >DETAIL:  Failed process was running: CLUSTER pg_attribute USING pg_attribute_relid_attnam_index

The crash happens during:

ts=# REINDEX INDEX pg_attribute_relid_attnum_index;

..but not:

ts=# REINDEX INDEX pg_attribute_relid_attnam_index;

 pg_catalog | pg_attribute_relid_attnam_index | index | postgres | pg_attribute | permanent | 31 MB |
 pg_catalog | pg_attribute_relid_attnum_index | index | postgres | pg_attribute | permanent | 35 MB |

I suspect commit c6b92041d, "Skip WAL for new relfilenodes, under
wal_level=minimal."  In fact, I set wal_level=minimal for the bulk load.
Note also:

 override           | data_checksums       | on
 configuration file | checkpoint_timeout   | 60
 configuration file | maintenance_work_mem | 1048576
 configuration file | max_wal_senders      | 0
 configuration file | wal_compression      | on
 configuration file | wal_level            | minimal
 configuration file | fsync                | off
 configuration file | full_page_writes     | off
 default            | server_version       | 13beta3

(gdb) bt
#0  0x00007ff9999ad387 in raise () from /lib64/libc.so.6
#1  0x00007ff9999aea78 in abort () from /lib64/libc.so.6
#2  0x0000000000921da5 in ExceptionalCondition (conditionName=conditionName@entry=0xad4078 "relcache_verdict == RelFileNodeSkippingWAL(relation->rd_node)", errorType=errorType@entry=0x977f49 "FailedAssertion", fileName=fileName@entry=0xad3068 "relcache.c", lineNumber=lineNumber@entry=2976) at assert.c:67
#3  0x000000000091a08b in AssertPendingSyncConsistency (relation=0x7ff99c2a70b8) at relcache.c:2976
#4  AssertPendingSyncs_RelationCache () at relcache.c:3036
#5  0x000000000058e591 in smgrDoPendingSyncs (isCommit=isCommit@entry=true, isParallelWorker=isParallelWorker@entry=true) at storage.c:685
#6  0x000000000053b1a4 in CommitTransaction () at xact.c:2118
#7  0x000000000053b826 in EndParallelWorkerTransaction () at xact.c:5300
#8  0x000000000052fcf7 in ParallelWorkerMain (main_arg=<optimized out>) at parallel.c:1479
#9  0x000000000076047a in StartBackgroundWorker () at bgworker.c:813
#10 0x000000000076d88d in do_start_bgworker (rw=0x23ac110) at postmaster.c:5865
#11 maybe_start_bgworkers () at postmaster.c:6091
#12 0x000000000076e43e in sigusr1_handler (postgres_signal_arg=<optimized out>) at postmaster.c:5260
#13 <signal handler called>
#14 0x00007ff999a6c983 in __select_nocancel () from /lib64/libc.so.6
#15 0x00000000004887bc in ServerLoop () at postmaster.c:1691
#16 0x000000000076fb45 in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x237d280) at postmaster.c:1400
#17 0x000000000048a83d in main (argc=3, argv=0x237d280) at main.c:210

(gdb) bt f
...
#4  AssertPendingSyncs_RelationCache () at relcache.c:3036
        status = {hashp = 0x23cba50, curBucket = 449, curEntry = 0x0}
        locallock = <optimized out>
        rels = 0x23ff018
        maxrels = <optimized out>
        nrels = 0
        idhentry = <optimized out>
        i = <optimized out>
#5  0x000000000058e591 in smgrDoPendingSyncs (isCommit=isCommit@entry=true, isParallelWorker=isParallelWorker@entry=true) at storage.c:685
        pending = <optimized out>
        nrels = 0
        maxrels = 0
        srels = 0x0
        scan = {hashp = 0x23edf60, curBucket = 9633000, curEntry = 0xe01600 <TopTransactionStateData>}
        pendingsync = <optimized out>
#6  0x000000000053b1a4 in CommitTransaction () at xact.c:2118
        s = 0xe01600 <TopTransactionStateData>
        latestXid = <optimized out>
        is_parallel_worker = true
        __func__ = "CommitTransaction"
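In case it helps, this is the shape of what I would run to retrigger it on a
cassert-enabled 13beta3 server already configured with the settings above
(wal_level = minimal, max_wal_senders = 0, etc.). The GUC values used to force
a parallel build of the new index are a guess, not verified:

-- Sketch only: try to force the parallel path for the index rebuilt by CLUSTER
-- (these particular values are assumptions, not verified to be required):
SET max_parallel_maintenance_workers = 2;
SET min_parallel_table_scan_size = 0;
SET maintenance_work_mem = '1GB';
CLUSTER pg_attribute USING pg_attribute_relid_attnam_index;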
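And a quick check of whether the parallel worker is really the trigger
(untested sketch): if the abort goes away with parallel maintenance disabled,
that points at the worker's commit path.

-- Untested: disable parallel index builds for maintenance commands and retry
SET max_parallel_maintenance_workers = 0;
REINDEX INDEX pg_attribute_relid_attnum_index;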