Re: uninitialized values in revised prepared xact code - Mailing list pgsql-hackers
From | Heikki Linnakangas |
---|---|
Subject | Re: uninitialized values in revised prepared xact code |
Date | |
Msg-id | 544A5E9D.2090808@vmware.com Whole thread Raw |
In response to | Re: uninitialized values in revised prepared xact code (Bruce Momjian <bruce@momjian.us>) |
Responses |
Re: uninitialized values in revised prepared xact code
|
List | pgsql-hackers |
On 10/11/2014 10:20 PM, Bruce Momjian wrote: > > Uh, was this fixed. I see a cleanup commit for this C file, but this > report is from June: > > commit 07a4a93a0e35a778c77ffbbbc18de29e859e18f0 > Author: Heikki Linnakangas <heikki.linnakangas@iki.fi> > Date: Fri May 16 09:47:50 2014 +0300 > > Initialize tsId and dbId fields in WAL record of COMMIT PREPARED. > > Commit dd428c79 added dbId and tsId to the xl_xact_commit struct but missed > that prepared transaction commits reuse that struct. Fix that. > > Because those fields were left unitialized, replaying a commit prepared WAL > record in a hot standby node would fail to remove the relcache init file. > That can lead to "could not open file" errors on the standby. Relcache init > file only needs to be removed when a system table/index is rewritten in the > transaction using two phase commit, so that should be rare in practice. In > HEAD, the incorrect dbId/tsId values are also used for filtering in logical > replication code, causing the transaction to always be filtered out. > > Analysis and fix by Andres Freund. Backpatch to 9.0 where hot standby was > introduced. No, that was a different issue. (more below) > On Mon, Jun 30, 2014 at 11:58:59AM +0200, Andres Freund wrote: >> Hi, >> >> I've just rerun valgrind for the first time in a while and saw the >> following splat. My guess is it exists since bb38fb0d43c, but that's >> blindly guessing: >> >> ==2049== Use of uninitialised value of size 8 >> ==2049== at 0x4FE66D: EndPrepare (twophase.c:1063) >> ==2049== by 0x4F231B: PrepareTransaction (xact.c:2217) >> ==2049== by 0x4F2A38: CommitTransactionCommand (xact.c:2676) >> ==2049== by 0x79013E: finish_xact_command (postgres.c:2408) >> ==2049== by 0x78DE97: exec_simple_query (postgres.c:1062) >> ==2049== by 0x791FDD: PostgresMain (postgres.c:4010) >> ==2049== by 0x71B13B: BackendRun (postmaster.c:4113) >> ==2049== by 0x71A86D: BackendStartup (postmaster.c:3787) >> ==2049== by 0x71714C: ServerLoop (postmaster.c:1566) >> ==2049== by 0x716804: PostmasterMain (postmaster.c:1219) >> ==2049== by 0x679405: main (main.c:219) >> ==2049== Uninitialised value was created by a stack allocation >> ==2049== at 0x4FE16C: StartPrepare (twophase.c:942) >> ==2049== >> ==2049== Syscall param write(buf) points to uninitialised byte(s) >> ==2049== at 0x5C69640: __write_nocancel (syscall-template.S:81) >> ==2049== by 0x4FE6AE: EndPrepare (twophase.c:1064) >> ==2049== by 0x4F231B: PrepareTransaction (xact.c:2217) >> ==2049== by 0x4F2A38: CommitTransactionCommand (xact.c:2676) >> ==2049== by 0x79013E: finish_xact_command (postgres.c:2408) >> ==2049== by 0x78DE97: exec_simple_query (postgres.c:1062) >> ==2049== by 0x791FDD: PostgresMain (postgres.c:4010) >> ==2049== by 0x71B13B: BackendRun (postmaster.c:4113) >> ==2049== by 0x71A86D: BackendStartup (postmaster.c:3787) >> ==2049== by 0x71714C: ServerLoop (postmaster.c:1566) >> ==2049== by 0x716804: PostmasterMain (postmaster.c:1219) >> ==2049== by 0x679405: main (main.c:219) >> ==2049== Address 0x64694ed is 1,389 bytes inside a block of size 8,192 alloc'd >> ==2049== at 0x4C27B8F: malloc (vg_replace_malloc.c:298) >> ==2049== by 0x8E766E: AllocSetAlloc (aset.c:853) >> ==2049== by 0x8E8E04: MemoryContextAllocZero (mcxt.c:627) >> ==2049== by 0x8A54D3: AtStart_Inval (inval.c:704) >> ==2049== by 0x4F1DFC: StartTransaction (xact.c:1841) >> ==2049== by 0x4F28D1: StartTransactionCommand (xact.c:2529) >> ==2049== by 0x7900A7: start_xact_command (postgres.c:2383) >> ==2049== by 0x78DAF4: exec_simple_query (postgres.c:860) >> ==2049== by 0x791FDD: PostgresMain (postgres.c:4010) >> ==2049== by 0x71B13B: BackendRun (postmaster.c:4113) >> ==2049== by 0x71A86D: BackendStartup (postmaster.c:3787) >> ==2049== by 0x71714C: ServerLoop (postmaster.c:1566) >> ==2049== Uninitialised value was created by a stack allocation >> ==2049== at 0x4FE16C: StartPrepare (twophase.c:942) >> >> It's probably just padding - twophase.c:1063 is the CRC32 computation of >> the record data. Yeah. The padding bytes in TwoPhaseFileHeader were not initialized. That's simple enough to fix, but when I run valgrind, I get a lot whole bunch of similar messages. A few are from pgstat: the padding bytes in the pgstat messages are not initialized. One comes from write_relcache_init_file(); again I believe it's padding bytes being uninitialized (in FormData_pg_attribute). And one from the XLogInsert from heap_insert; there's an uninitialized padding byte in xl_heap_insert. And so forth.. Is it worthwhile to hunt down all of these? If there aren't many more than these, it probably is worth it, but I fear this might be an endless effort. Have we been clean of these warnings at any point in the past? - Heikki
pgsql-hackers by date: