Thread: pg 7.3.4 and linux box crash
Hi,

I am running Red Hat Linux 7.3 (standard install) on a dual Athlon box with 1 GB RAM and pg 7.3.4.

If I try to access one of my tables (it contains ~1 M records) with pgAdmin, the Linux box crashes.

In my pg log I can find:

ERROR: Invalid page header in block 5604 of a_acc

Before this error I also see:

FATAL: Database "template0" is not currently accepting connections
ERROR: Relation "pg_relcheck" does not exist

and

ERROR: 'ksqo' is not a valid option name

I am using ReiserFS on this box and have not found any other problems with it.

Can I solve this problem, or do I need to recreate the database? (The table and the database were created 2 days ago.)

regards,
ivan.
pginfo <pginfo@t1.unisoftbg.com> writes:
> In my pg log I can find:
> ERROR: Invalid page header in block 5604 of a_acc

You have a corrupt table. Can you drop and recreate a_acc?

> before this error I see also :
> FATAL: Database "template0" is not currently accepting connections
> ERROR: Relation "pg_relcheck" does not exist
> and
> ERROR: 'ksqo' is not a valid option name

Seems you are using a rather out-of-date version of pgAdmin.

regards, tom lane
Hi,

Tom Lane wrote:
> pginfo <pginfo@t1.unisoftbg.com> writes:
> > In my pg log I can find:
> > ERROR: Invalid page header in block 5604 of a_acc
>
> You have a corrupt table. Can you drop and recreate a_acc?

Yes, I can. The problem is that it has happened 4 times this week. It has happened on 2 different servers (the second one is production, which is bad). I had a similar problem with pg 7.3.3, and someone on the list wrote that it was a bug and that I needed to upgrade to 7.3.4. I have upgraded, but I am not sure whether this bug still exists in pg 7.3.4. If it does, which version would be best to use?

> > before this error I see also :
> > FATAL: Database "template0" is not currently accepting connections
> > ERROR: Relation "pg_relcheck" does not exist
> > and
> > ERROR: 'ksqo' is not a valid option name
>
> Seems you are using a rather out-of-date version of pgAdmin.

Yes, that is true. I am waiting for version 3.

> regards, tom lane

regards,
ivan.
pginfo <pginfo@t1.unisoftbg.com> writes:
> Tom Lane wrote:
>> You have a corrupt table. Can you drop and recreate a_acc?

> Yes, I can. The problem is that it has happened 4 times this week.

Oh? Perhaps you should be looking at the contents of the complained-of pages to try to see what's going on. Is there any pattern to the failures?

> It has happened on 2 different servers

Even so, I wouldn't rule out a hardware problem, especially if the servers are identical hardware setups. Have you run memtest86 and badblocks?

regards, tom lane
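Editorial sketch of how one might look at the contents of a complained-of page, as suggested above. It assumes the default 8192-byte block size and that the table is stored as a plain file under the data directory (for 7.3, typically `$PGDATA/base/<database OID>/<relfilenode>`). The function names are hypothetical helpers, not part of PostgreSQL. The sketch deliberately does not interpret the page header layout. It only extracts one block, reports whether it is entirely zeroed, and produces a hex dump for inspection.

```python
BLOCK_SIZE = 8192  # assumption: default PostgreSQL block size (BLCKSZ)


def block_offset(block_no, block_size=BLOCK_SIZE):
    """Byte offset of a block within a relation segment file."""
    return block_no * block_size


def read_block(path, block_no, block_size=BLOCK_SIZE):
    """Return the raw bytes of one block; raise if the file is too short."""
    with open(path, "rb") as f:
        f.seek(block_offset(block_no, block_size))
        data = f.read(block_size)
    if len(data) != block_size:
        raise ValueError(
            f"block {block_no} is incomplete: got {len(data)} bytes"
        )
    return data


def is_all_zero(page):
    """True if every byte of the page is zero."""
    return not any(page)


def hexdump(data, width=16):
    """Format bytes as lines of 'offset  hex bytes'."""
    lines = []
    for off in range(0, len(data), width):
        chunk = data[off:off + width]
        hexpart = " ".join(f"{b:02x}" for b in chunk)
        lines.append(f"{off:08x}  {hexpart}")
    return "\n".join(lines)
```

For the block in the error message, `block_offset(5604)` is 45,907,968 bytes, so the page lies well inside the first segment file of the table. A call such as `print(hexdump(read_block(path, 5604)[:64]))` would show the start of the page. It is safest to run this against a copy of the file, or with the postmaster stopped.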
Tom Lane wrote:
> pginfo <pginfo@t1.unisoftbg.com> writes:
> > Tom Lane wrote:
> >> You have a corrupt table. Can you drop and recreate a_acc?
>
> > Yes, I can. The problem is that it has happened 4 times this week.
>
> Oh? Perhaps you should be looking at the contents of the complained-of
> pages to try to see what's going on. Is there any pattern to the failures?

I do not see any. I will look into the problem, and any additional info is welcome.

> > It has happened on 2 different servers
>
> Even so, I wouldn't rule out a hardware problem, especially if the
> servers are identical hardware setups. Have you run memtest86 and
> badblocks?

No, but on these servers we are running JBoss and so on and have not found any problems. The servers are not identical, but the Linux version and file system are the same. The pg config is also the same.

> regards, tom lane

thanks,
ivan
On Fri, 12 Sep 2003, pginfo wrote:
> Hi,
> I am running Red Hat Linux 7.3 (standard install)
> on a dual Athlon box with 1 GB RAM and pg 7.3.4.
>
> If I try to access one of my tables (it contains ~1 M records)
> with pgAdmin, the Linux box crashes.

WAIT. Do you mean you get a kernel panic? Or the box locks up tight? Or the box reboots?

Or does Postgresql crash?

Or does the linux box complain about a bad hard drive and stop running postgresql?

If linux is crashing, that is NOT postgresql's fault. An app can't crash an OS without root level access, and Postgresql ain't got that.

> In my pg log I can find:
> ERROR: Invalid page header in block 5604 of a_acc
>
> before this error I see also :
> FATAL: Database "template0" is not currently accepting connections
> ERROR: Relation "pg_relcheck" does not exist
>
> and
>
> ERROR: 'ksqo' is not a valid option name

This sounds like you are using an older pgaccess on a newer postgresql database. Once upon a time, apps could connect to template0. That is no longer the case.
scott.marlowe wrote:
> On Fri, 12 Sep 2003, pginfo wrote:
>
> > Hi,
> > I am running Red Hat Linux 7.3 (standard install)
> > on a dual Athlon box with 1 GB RAM and pg 7.3.4.
> >
> > If I try to access one of my tables (it contains ~1 M records)
> > with pgAdmin, the Linux box crashes.
>
> WAIT. Do you mean you get a kernel panic? Or the box locks up tight? Or
> the box reboots?
>
> Or does Postgresql crash?
>
> Or does the linux box complain about a bad hard drive and stop running
> postgresql?
>
> If linux is crashing, that is NOT postgresql's fault. An app can't crash
> an OS without root level access, and Postgresql ain't got that.

It is crashing the Linux box. It does not reboot and there is no kernel panic; it just stops responding. If I type reboot on the console, it does not reboot, and so on. But it crashes only when I start intensive operations on pg. My other problem is these messages in the log.

At the weekend I will run some detailed tests on the OS and hardware to see whether that is the problem. It would be the best case for me if the problem is not in pg.

> > In my pg log I can find:
> > ERROR: Invalid page header in block 5604 of a_acc
> >
> > before this error I see also :
> > FATAL: Database "template0" is not currently accepting connections
> > ERROR: Relation "pg_relcheck" does not exist
> >
> > and
> >
> > ERROR: 'ksqo' is not a valid option name
>
> This sounds like you are using an older pgaccess on a newer postgresql
> database. Once upon a time, apps could connect to template0. That is no
> longer the case.

regards,
ivan.
pginfo <pginfo@t1.unisoftbg.com> writes:
> scott.marlowe wrote:
>> WAIT. Do you mean you get a kernel panic? Or the box locks up tight? Or
>> the box reboots?

> It is crashing the Linux box. It does not reboot and there is no kernel
> panic; it just stops responding. If I type reboot on the console, it does
> not reboot, and so on.
> But it crashes only when I start intensive operations on pg.

In that case you unquestionably have hardware problems (or, less likely, a broken kernel). Postgres runs as an unprivileged program; therefore it is *not* capable of locking up the kernel unless the kernel is broken.

regards, tom lane
> > It is crashing the Linux box. It does not reboot and there is no kernel
> > panic; it just stops responding. If I type reboot on the console, it does
> > not reboot, and so on.
> > But it crashes only when I start intensive operations on pg.

If you can type 'reboot' then surely it hasn't stopped responding?
Hi,

yes, I can type reboot, but the system cannot reboot. I tested the memory (it took ~20 h, the maximum test) without finding any problems. I do not see any reason to think that the problem is in the HW/OS. At the moment I am installing Red Hat 9.0 to be sure that the problem is not the kernel.

regards,
ivan.

Matt Clark wrote:
> > > It is crashing the Linux box. It does not reboot and there is no kernel
> > > panic; it just stops responding. If I type reboot on the console, it does
> > > not reboot, and so on.
> > > But it crashes only when I start intensive operations on pg.
>
> If you can type 'reboot' then surely it hasn't stopped responding?
At 06:14 AM 9/14/2003 , Matt Clark wrote:
>>It is crashing the Linux box. It does not reboot and there is no kernel
>>panic; it just stops responding. If I type reboot on the console, it does
>>not reboot, and so on. But it crashes only when I start intensive
>>operations on pg.
>
>If you can type 'reboot' then surely it hasn't stopped responding?

It may be that he's got such a tight loop running somewhere that it =seems= to have locked up. But if he's getting character echo on the console, some stuff is running.

-crl
--
Chad R. Larson (CRL22) chad@eldocomp.com
Eldorado Computing, Inc. 602-604-3100
5353 North 16th Street, Suite 400
Phoenix, Arizona 85016-3228