Thread: PANIC: unable to locate a valid checkpoint record
Hi,

I am using 7.3.2. postmaster prints this on starting up:

----
LOG: database system was interrupted at 2003-02-20 14:23:36 IST
LOG: ReadRecord: bad resource manager data checksum in record at 0/E42144
LOG: invalid primary checkpoint record
LOG: ReadRecord: bad resource manager data checksum in record at 0/E42104
LOG: invalid secondary checkpoint record
PANIC: unable to locate a valid checkpoint record
LOG: startup process (pid 7530) was terminated by signal 6
LOG: aborting startup due to startup process failure
----

pg_resetxlog is able to recover from the problem; but I am concerned because I can reproduce the scenario very easily. I originally encountered the problem in 7.2.1; tried upgrading to 7.2.4 and now 7.3.2 and this scenario happens for every version.

The scenario is like this; I have an application that is doing database updates using JDBC. I do a kill -9 on postmaster. The application detects that postmaster is down and restarts it; I do kill -9 on postmaster. After a couple of such forced crashes postmaster refuses to come up.

The application uses a PostgreSQL 7.2 JDBC2 driver. I wrote a python application and tried to recreate the problem but wasn't successful. However, I can consistently reproduce the problem with the Java application. Any suggestions on how I can proceed?

Please CC me on any replies; I am not (yet) subscribed to the lists.

Thanks.
Ganesan
Ganesan R <rganesan@myrealbox.com> writes:
> I am using 7.3.2. postmaster prints this on starting up:

> LOG: ReadRecord: bad resource manager data checksum in record at 0/E42144

> pg_resetxlog is able to recover from the problem; but I am concerned because
> I can reproduce the scenario very easily.

You should definitely be concerned :-(. It sounds like the CRC code
isn't working at all on your platform. What is your platform --- what
hardware, what OS, which C compiler? How did you configure and install
Postgres?

regards, tom lane
Ganesan R <rganesan@myrealbox.com> writes:
> We've been able to recreate the problem on another platform;

It seems pretty dang odd that you should be able to reproduce the
problem on two different platforms, when no one else has reported it
at all. Can you think of anything unusual that might be shared by
these two installations?

regards, tom lane
On Thu, Feb 20, 2003 at 11:07:55PM -0500, Tom Lane wrote:
> Ganesan R <rganesan@myrealbox.com> writes:
> > I am using 7.3.2. postmaster prints this on starting up:
>
> > LOG: ReadRecord: bad resource manager data checksum in record at 0/E42144
>
> > pg_resetxlog is able to recover from the problem; but I am concerned because
> > I can reproduce the scenario very easily.
>
> You should definitely be concerned :-(. It sounds like the CRC code
> isn't working at all on your platform. What is your platform --- what
> hardware, what OS, which C compiler? How did you configure and install
> Postgres?
>

Hi,

We've been able to recreate the problem on another platform; this time a DELL PowerEdge 1650. See

http://www.dell.com/us/en/esg/topics/esg_pedge_rackmain_servers_1_pedge_1650.htm

The configuration is pretty much identical (single Pentium III 1133MHz CPU, dual mirrored SCSI drives, Red Hat 7.3 running kernel 2.4.18).

Please let me know if you need additional information.

Thank you.
Ganesan
On Thu, Feb 20, 2003 at 11:07:55PM -0500, Tom Lane wrote:
> Ganesan R <rganesan@myrealbox.com> writes:
> > I am using 7.3.2. postmaster prints this on starting up:
>
> > LOG: ReadRecord: bad resource manager data checksum in record at 0/E42144
>
> > pg_resetxlog is able to recover from the problem; but I am concerned because
> > I can reproduce the scenario very easily.
>
> You should definitely be concerned :-(. It sounds like the CRC code
> isn't working at all on your platform. What is your platform --- what
> hardware, what OS, which C compiler? How did you configure and install
> Postgres?

The hardware is an IBM xSeries 340 with a single Xeon Pentium IV 2.40GHz processor and dual mirrored SCSI disks. The OS is Red Hat Linux 7.3 with a Linux 2.4.18 SMP kernel (the CPU supports hyperthreading).

The PostgreSQL binaries were precompiled. The PostgreSQL 7.2.1 version is the Red Hat build shipped with Red Hat Linux 7.3. The PostgreSQL 7.2.4 and 7.3.2 binaries were downloaded directly from the ftp mirrors.

Please let me know if you need additional information.

Ganesan
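For readers following the thread: the "bad resource manager data checksum" lines in the startup log mean that the checksum stored with a log record did not match the checksum recomputed over the record's data. The sketch below only illustrates that general idea. It assumes a made-up record layout (length, CRC-32, payload) and uses Python's zlib.crc32; the names pack_record and record_is_valid are hypothetical, and PostgreSQL's real WAL record format and CRC routine are different.

```python
import struct
import zlib

# Hypothetical record layout for illustration only (NOT PostgreSQL's WAL format):
#   4-byte little-endian payload length, 4-byte CRC-32 of the payload, payload.
HEADER = "<II"
HEADER_SIZE = struct.calcsize(HEADER)


def pack_record(payload: bytes) -> bytes:
    """Prefix the payload with its length and CRC-32."""
    crc = zlib.crc32(payload) & 0xFFFFFFFF
    return struct.pack(HEADER, len(payload), crc) + payload


def record_is_valid(record: bytes) -> bool:
    """Return True only if the stored CRC matches the payload actually present."""
    if len(record) < HEADER_SIZE:
        return False  # header itself is incomplete
    length, stored_crc = struct.unpack_from(HEADER, record)
    payload = record[HEADER_SIZE:HEADER_SIZE + length]
    if len(payload) != length:
        return False  # record was cut short, e.g. by an interrupted write
    return (zlib.crc32(payload) & 0xFFFFFFFF) == stored_crc


record = pack_record(b"checkpoint data")
# Simulate a torn write: the last four payload bytes never reached the disk.
torn = record[:-4] + b"\x00\x00\x00\x00"

print(record_is_valid(record))
print(record_is_valid(torn))
```

A check of this kind rejects a record when the bytes on disk do not match what was written, and it would equally reject good data if the checksum code itself were miscomputing, which is the possibility raised in the first reply.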