Thread: Archive recovery crashes on win32 in HEAD - hot standby related?
I was going to test the walreceiver stuff, but it turns out that basic archive recovery appears to be broken in HEAD. From what I can tell, it's related to Hot Standby code.

I get this (this is all on win32 - I got the same on win64, but moved back to win32 to make sure it's not an issue with the win64 code):

LOG: restored log file "000000010000000000000001" from archive
LOG: automatic recovery in progress
LOG: initializing recovery connections
LOG: redo starts at 0/1000020
LOG: consistent recovery state reached at 0/1000050
LOG: startup process (PID 1348) was terminated by exception 0xC0000005
HINT: See C include file "ntstatus.h" for a description of the hexadecimal value.
LOG: terminating any other active server processes

Stacktrace:

postgres!hash_seq_init(struct HASH_SEQ_STATUS * status = 0x00002d66, struct HTAB * hashp = 0x00000001)+0x13
postgres!KnownAssignedXidsRemoveMany(unsigned int xid = 0, char keepPreparedXacts = 1 '')+0x73
postgres!ProcArrayApplyRecoveryInfo(struct RunningTransactionsData * running = 0x00002d66)+0x1a
postgres!standby_redo(struct XLogRecPtr lsn = struct XLogRecPtr, struct XLogRecord * record = 0x00000000)+0x80
postgres!StartupXLOG(void)+0xcda
postgres!StartupProcessMain(void)+0x91
postgres!AuxiliaryProcessMain(int argc = <Memory access error>, char ** argv = <Memory access error>)+0x435
postgres!SubPostmasterMain(int argc = 3614962, char ** argv = 0x003728fd)+0x2b2
postgres!main(int argc = <Memory access error>, char ** argv = <Memory access error>)+0x168
postgres!__tmainCRTStartup(void)+0x10f
WARNING: Stack unwind information not available. Following frames may be wrong.
KERNEL32!BaseProcessInitPostImport+0x8d

This is a very trivial install - one master with zero activity, archiving with plain "copy" commands...

Not knowing that code very well at this time, but is this perhaps a structure not being properly initialized in EXEC_BACKEND case?

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On Sat, Jan 16, 2010 at 8:19 AM, Magnus Hagander <magnus@hagander.net> wrote:
> Not knowing that code very well at this time, but is this perhaps a
> structure not being properly initialized in EXEC_BACKEND case?

It looks like KnownAssignedXidsHash is not initialized. That's supposed to happen when CreateSharedProcArray calls KnownAssignedXidsInit, but that only happens for the first process to call that function... but without EXEC_BACKEND it'll just work anyway.

...Robert
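The failure mode Robert describes can be sketched in a few lines of standalone C. This is only an illustration of the pattern, not the PostgreSQL source and not the fix that was later committed: SharedArea, shmem_init_struct, attach_broken, attach_fixed and simulate_exec are all hypothetical names, and a static struct stands in for real shared memory. The idea is that a plain fork() child inherits the parent's process-local pointer, so skipping setup in every process but the first goes unnoticed; under EXEC_BACKEND each child starts with fresh process-local state and has to attach again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a structure that lives in shared memory. */
typedef struct SharedArea
{
    bool initialized;
    int  slots[8];
} SharedArea;

/* Plays the role of the shared segment: it outlives any one "process". */
static SharedArea shared_segment;

/* Process-local pointer; a freshly exec'd process starts with NULL here. */
static SharedArea *local_table = NULL;

/* Find-or-create: returns the area and reports whether it already existed. */
static SharedArea *
shmem_init_struct(bool *found)
{
    *found = shared_segment.initialized;
    if (!*found)
    {
        memset(&shared_segment, 0, sizeof(shared_segment));
        shared_segment.initialized = true;
    }
    return &shared_segment;
}

/* Broken split: only the first caller ever sets the local pointer. */
static void
attach_broken(void)
{
    bool        found;
    SharedArea *area = shmem_init_struct(&found);

    if (!found)
        local_table = area;
}

/* Safer split: every caller attaches; only creation is first-caller work. */
static void
attach_fixed(void)
{
    bool found;

    local_table = shmem_init_struct(&found);
}

/* Simulates the effect of exec: process-local state is lost. */
static void
simulate_exec(void)
{
    local_table = NULL;
}
```

Calling attach_broken(), then simulate_exec(), then attach_broken() again leaves local_table NULL in the second "process", which is the kind of state that would lead to a crash on first use. attach_fixed() leaves it pointing at the shared area no matter which caller came first.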
Robert Haas <robertmhaas@gmail.com> writes:
> On Sat, Jan 16, 2010 at 8:19 AM, Magnus Hagander <magnus@hagander.net> wrote:
>> Not knowing that code very well at this time, but is this perhaps a
>> structure not being properly initialized in EXEC_BACKEND case?

> It looks like KnownAssignedXidsHash is not initialized. That's
> supposed to happen when CreateSharedProcArray calls
> KnownAssignedXidsInit, but that only happens for the first process to
> call that function... but without EXEC_BACKEND it'll just work anyway.

That code is completely broken as far as the division of labor between "first" and not "first" is concerned ...

regards, tom lane
Magnus Hagander <magnus@hagander.net> writes:
> I was going to test the walreceiver stuff, but it turns out that basic
> archive recovery appears to be broken in HEAD. From what I can tell,
> it's related to Hot Standby code.

I've committed a fix that makes it work in EXEC_BACKEND case on Unix. Can't tell if there are any Windows-specific issues.

regards, tom lane
2010/1/16 Tom Lane <tgl@sss.pgh.pa.us>:
> Magnus Hagander <magnus@hagander.net> writes:
>> I was going to test the walreceiver stuff, but it turns out that basic
>> archive recovery appears to be broken in HEAD. From what I can tell,
>> it's related to Hot Standby code.
>
> I've committed a fix that makes it work in EXEC_BACKEND case on Unix.
> Can't tell if there are any Windows-specific issues.

Seems to have worked - I can confirm I can now do archive recovery again.

Seems streaming replication is broken though, rebuilding a debug build to see if I can figure out why.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/