Re: longfin and tamandua aren't too happy but I'm not sure why - Mailing list pgsql-hackers
From: Tom Lane
Subject: Re: longfin and tamandua aren't too happy but I'm not sure why
Msg-id: 3825454.1664310917@sss.pgh.pa.us
In response to: Re: longfin and tamandua aren't too happy but I'm not sure why (Justin Pryzby <pryzby@telsasoft.com>)
Responses: Re: longfin and tamandua aren't too happy but I'm not sure why
List: pgsql-hackers
Justin Pryzby <pryzby@telsasoft.com> writes:
> On Tue, Sep 27, 2022 at 02:55:18PM -0400, Robert Haas wrote:
>> Both animals are running with -fsanitize=alignment and it's not
>> difficult to believe that the commit mentioned above could have
>> introduced an alignment problem where we didn't have one before, but
>> without a stack backtrace I don't know how to track it down. I tried
>> running those tests locally with -fsanitize=alignment and they passed.

> There's one here:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=kestrel&dt=2022-09-27%2018%3A43%3A06

On longfin's host, the test_decoding run produces two core files.
One has a backtrace like this:

  * frame #0: 0x000000010a36af8c postgres`ParseCommitRecord(info='\x80', xlrec=0x00007fa0678a8090, parsed=0x00007ff7b5c50e78) at xactdesc.c:102:30
    frame #1: 0x000000010a765f9e postgres`xact_decode(ctx=0x00007fa0680d9118, buf=0x00007ff7b5c51000) at decode.c:201:5 [opt]
    frame #2: 0x000000010a765d17 postgres`LogicalDecodingProcessRecord(ctx=0x00007fa0680d9118, record=<unavailable>) at decode.c:119:3 [opt]
    frame #3: 0x000000010a76d890 postgres`pg_logical_slot_get_changes_guts(fcinfo=<unavailable>, confirm=true, binary=false) at logicalfuncs.c:271:5 [opt]
    frame #4: 0x000000010a76d320 postgres`pg_logical_slot_get_changes(fcinfo=<unavailable>) at logicalfuncs.c:338:9 [opt]
    frame #5: 0x000000010a5a521d postgres`ExecMakeTableFunctionResult(setexpr=<unavailable>, econtext=0x00007fa068098f50, argContext=<unavailable>, expectedDesc=0x00007fa06701ba38, randomAccess=<unavailable>) at execSRF.c:234:13 [opt]
    frame #6: 0x000000010a5c405b postgres`FunctionNext(node=0x00007fa068098d40) at nodeFunctionscan.c:95:5 [opt]
    frame #7: 0x000000010a5a61b9 postgres`ExecScan(node=0x00007fa068098d40, accessMtd=(postgres`FunctionNext at nodeFunctionscan.c:61), recheckMtd=(postgres`FunctionRecheck at nodeFunctionscan.c:251)) at execScan.c:199:10 [opt]
    frame #8: 0x000000010a596ee0 postgres`standard_ExecutorRun [inlined] ExecProcNode(node=0x00007fa068098d40) at executor.h:259:9 [opt]
    frame #9: 0x000000010a596eb8 postgres`standard_ExecutorRun [inlined] ExecutePlan(estate=<unavailable>, planstate=0x00007fa068098d40, use_parallel_mode=<unavailable>, operation=CMD_SELECT, sendTuples=<unavailable>, numberTuples=0, direction=1745456112, dest=0x00007fa067023848, execute_once=<unavailable>) at execMain.c:1636:10 [opt]
    frame #10: 0x000000010a596e2a postgres`standard_ExecutorRun(queryDesc=<unavailable>, direction=1745456112, count=0, execute_once=<unavailable>) at execMain.c:363:3 [opt]

and the other

  * frame #0: 0x000000010a36af8c postgres`ParseCommitRecord(info='\x80', xlrec=0x00007fa06783a090, parsed=0x00007ff7b5c50040) at xactdesc.c:102:30
    frame #1: 0x000000010a3cd24d postgres`xact_redo(record=0x00007fa0670096c8) at xact.c:6161:3
    frame #2: 0x000000010a41770d postgres`ApplyWalRecord(xlogreader=0x00007fa0670096c8, record=0x00007fa06783a060, replayTLI=0x00007ff7b5c507f0) at xlogrecovery.c:1897:2
    frame #3: 0x000000010a4154be postgres`PerformWalRecovery at xlogrecovery.c:1728:4
    frame #4: 0x000000010a3e0dc7 postgres`StartupXLOG at xlog.c:5473:3
    frame #5: 0x000000010a7498a0 postgres`StartupProcessMain at startup.c:267:2 [opt]
    frame #6: 0x000000010a73e2cb postgres`AuxiliaryProcessMain(auxtype=StartupProcess) at auxprocess.c:141:4 [opt]
    frame #7: 0x000000010a745b97 postgres`StartChildProcess(type=StartupProcess) at postmaster.c:5408:3 [opt]
    frame #8: 0x000000010a7487e2 postgres`PostmasterStateMachine at postmaster.c:4006:16 [opt]
    frame #9: 0x000000010a745804 postgres`reaper(postgres_signal_arg=<unavailable>) at postmaster.c:3256:2 [opt]
    frame #10: 0x00007ff815b16dfd libsystem_platform.dylib`_sigtramp + 29
    frame #11: 0x00007ff815accd5b libsystem_kernel.dylib`__select + 11
    frame #12: 0x000000010a74689c postgres`ServerLoop at postmaster.c:1768:13 [opt]
    frame #13: 0x000000010a743fbb postgres`PostmasterMain(argc=<unavailable>, argv=0x00006000006480a0) at postmaster.c:1476:11 [opt]
    frame #14: 0x000000010a61c775 postgres`main(argc=8, argv=<unavailable>) at main.c:197:3 [opt]

Both traces fail at the same spot, ParseCommitRecord at xactdesc.c:102, once during logical decoding and once during WAL replay. Looks like it might be the same bug, but perhaps not.

I recompiled access/transam and access/rmgrdesc at -O0 to get the accurate line numbers shown for those files. Let me know if you need any more info; I can add -O0 in more places, or poke around in the cores.

			regards, tom lane