Thread: database troubles - various errors

database troubles - various errors

From
"A Palmblad"
Date:
We're having a lot of trouble with one of our new servers.  It's an AMD64
dual cpu system, running on Gentoo Linux, with kernel 2.6.7.  Data is stored
on a JFS partition.  Before we installed the server, memory was fully tested
and no problems were found.  Postgres is version 7.4.3.

    We've had a number of different errors.  One of the most popular seems
to be a "Cannot open segment" error.  This usually occurs on an index,
generally the primary key index.  A reindex will fix the problem.

    Last night we got this one: ERROR:  could not find a feasible split
point for "some_index".  This one seemed funny because it was on a btree
index of one column.  A reindex made the error go away.  Every table in the
database was reindexed at some point last week.  The table with the index
involved uses all standard data types and has no non-btree indexes.

Another one we've had and fixed was ERROR: duplicate key violates unique
constraint "pg_class_oid_index".

One of the most annoying errors is occasional index corruption that is not
reported.  We make use of gist tsearch2 indexes, and sometimes they will
decide to stop working without an error.  The only clue we have that there
is a problem is that search words that generally return big results will
return no results or very few.  A reindex will fix the problem.

Vacuums are done manually, but generally there's at least one every week.

Basically, I'd like to know if anyone has seen any problems similar to this,
and if they have any suggestions on how to fix these issues.

-Adam Palmblad



Re: database troubles - various errors

From
"Scott Marlowe"
Date:
On Mon, 2004-08-23 at 10:14, A Palmblad wrote:
> We're having a lot of trouble with one of our new servers.  It's an AMD64
> dual cpu system, running on Gentoo Linux, with kernel 2.6.7.  Data is stored
> on a JFS partition.  Before we installed the server, memory was fully tested
> and no problems were found.  Postgres is version 7.4.3.
>
>     We've had a number of different errors.  One of the most popular seems
> to be a Cannot open segment.  This usually occurs on an index, generally the
> primary key index.  A reindex will fix the problem.
>
>     Last night we got this one: ERROR:  could not find a feasible split
> point for "some_index".  This one seemed funny because it was on a btree
> index of one column.  A reindex made the error go away.  Every table in the
> database was reindexed at some point last week.  The table with the index
> involved uses all standard data types and has no non-btree indexes.
>
> Another one we've had and fixed was ERROR: duplicate key violates unique
> constraint "pg_class_oid_index".
>
> One of the most annoying errors is occasional index corruption that is not
> reported.  We make use of gist tsearch2 indexes, and sometimes they will
> decide to stop working without an error.  The only clue we have that there
> is a problem is that search words that generally return big results will
> return no results or very few.  A reindex will fix the problem.
>
> Vacuums are done manually, but generally there's at least one every week.
>
> Basically, I'd like to know if anyone has seen any problems similar to this,
> and if they have any suggestions on how to fix these issues.

Have you tested the memory, hard drive and CPU on this machine to ensure
proper operation?  This sounds like a hardware problem to me.


Re: database troubles - various errors

From
"A Palmblad"
Date:
Memory has been tested, and it was okay.  What's the best way to test the
CPU and hard drive / controller?
-Adam
----- Original Message -----
From: "Scott Marlowe" <smarlowe@qwest.net>
To: "A Palmblad" <adampalmblad@yahoo.ca>
Cc: "General Postgres" <pgsql-general@postgresql.org>
Sent: Monday, August 23, 2004 9:41 AM
Subject: Re: [GENERAL] database troubles - various errors


> Have you tested the memory, hard drive and CPU on this machine to ensure
> proper operation?  This sounds like a hardware problem to me.

Re: database troubles - various errors

From
Tom Lane
Date:
"A Palmblad" <adampalmblad@yahoo.ca> writes:
> Basically, I'd like to know if anyone has seen any problems similar to this,
> and if they have any suggestions on how to fix these issues.

It's not going to answer your issue directly, but see Joe Conway's
recent report of similar failures:
http://archives.postgresql.org/pgsql-hackers/2004-08/msg01251.php
and earlier messages in that thread.

Personally I'm wondering about kernel/filesystem bugs.  Gentoo's
emphasis on bleeding-edge code is fine for dev work but I think it's
a poor choice for a production server.

            regards, tom lane

Re: database troubles - various errors

From
"Scott Marlowe"
Date:
On Mon, 2004-08-23 at 11:51, A Palmblad wrote:
> Memory has been tested, and it was okay.  What's the best way to test the
> CPU and hard drive / controller?
> -Adam

What method did you use for testing the memory?

For testing the hard drive I usually write a large file of random /
semi-random garbage that I have the md5 for, then md5 what's on the
disk, over and over, usually for days when testing a new server.  A bad
CPU would cause many other problems if it were really acting up.  While
it's a remote possibility that the CPU is causing the problems, it's more
likely another bit of hardware, if not a kernel / fs bug.
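
A minimal sketch of the write-then-md5 loop described above, in Python.
The function names, the chunk size and the choice to rewrite the file on
every pass are assumptions made for illustration, not the exact procedure
used here.  The scratch file should live on the filesystem under test and
be larger than RAM, otherwise the read-back may be served from the page
cache instead of the disk.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_random_file(path, size_bytes, chunk_size=1 << 20):
    """Write `size_bytes` of random data to `path`; return the MD5 of what was written."""
    digest = hashlib.md5()
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            chunk = os.urandom(min(chunk_size, remaining))
            f.write(chunk)
            digest.update(chunk)
            remaining -= len(chunk)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to commit the data to the device
    return digest.hexdigest()


def verify_disk(path, size_bytes, passes):
    """Write and read back a random file `passes` times.

    Returns the list of pass numbers (1-based) whose read-back MD5
    did not match the MD5 of the data that was written.
    """
    failures = []
    for n in range(1, passes + 1):
        expected = write_random_file(path, size_bytes)
        if md5_of_file(path) != expected:
            failures.append(n)
    return failures
```

Something like verify_disk("/data/scratch.bin", 64 * 2**30, 1000) would
then run for a long time, and an empty result means no mismatch was seen;
the path and sizes are placeholders to adjust for the machine being tested.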

I'd read the article the other poster mentioned, and try a different FS
to see if the problem goes away.  I'm not quite as leery of Gentoo as
the other fellow, but I probably wouldn't be using it in production
either.

Do you have an unusual setup?  HW/RAID/LVM etc?