Re: Limitations on 7.0.3? - Mailing list pgsql-general

From Richard Huxton
Subject Re: Limitations on 7.0.3?
Date
Msg-id 469C69BE.5020109@archonet.com
In response to Re: Limitations on 7.0.3?  (Alvaro Herrera <alvherre@commandprompt.com>)
Responses Re: Limitations on 7.0.3?
List pgsql-general
ARTEAGA Jose wrote:
> I have spent the last month battling and looking deeper into the issue,
> here's a summary of where I'm at:
> - Increasing shared buffers improved performance but did not resolve the
> backend FATAL disconnect error.
> - Dumping and recreating entire database also did not resolve the issue.

OK, so it's not a corrupted index/file then.

> - re-initializing the DB and recreating from the dump also did not
> resolve the issue.
> In both cases above the issue re-occurred within 2-3 days of run-time
> (insert of new records).
>
> I got the issue narrowed down to the point where I was able to re-create
> the issue at will by just inserting enough data, the data content did
> not matter. The issue always occurred while inserting into my
> "teststeprun" table, which is the largest of my tables (~15 Mill rows).
> The issue is that once I got this table to a certain size, then the
> backend system would crash.
>
> Since I was able to reproduce, I then decided to analyze the core dumps.
> Looking at the core dumps I immediately began to see a pattern, even the
> same pattern was there from the initial core dumps I had when the problem
> began occurring back two months ago. In every case the dump indicated
> the last instruction was always in the call to tag_hash(). I also
> noticed that each time the values passed to tag_hash which are used to
> generate the key were just below the 32-bit max value, and tag_hash
> should be returning a uint32 value. Now I'm really suspecting that there
> is some issue with this. Below are the traces of the four core dumps
> which point to the issue I'm suspecting.

I think tag_hash (in /backend/utils/hash/hashfn.c) is responsible for
internal hash-tables (rather than hash indexes). It takes a pointer to a
key to hash and a keysize (in bytes), so either the pointer is bad or
the size is too large and it's reading off the end of the key.
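
To make the two failure modes concrete, here is a rough sketch of what a
function with that shape does. This is not the 7.0.3 source: the names
sketch_tag_hash, SketchTag and sketch_bucket are made up for illustration,
and the FNV-1a mixing is just a stand-in for whatever hashfn.c really does.
The point is only that the function walks keysize bytes starting at the
pointer, so a bad pointer or an oversized keysize means reading memory it
doesn't own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a tag_hash-style function (NOT the PostgreSQL
 * code). It reads exactly keysize bytes starting at key and folds them into
 * a 32-bit value using FNV-1a.
 */
uint32_t sketch_tag_hash(const void *key, size_t keysize)
{
    const unsigned char *p = (const unsigned char *) key;
    uint32_t h = 2166136261u;       /* FNV-1a 32-bit offset basis */
    size_t i;

    assert(key != NULL);            /* a bad pointer would fault in the loop */

    /* If keysize exceeds the real size of *key, this reads off the end. */
    for (i = 0; i < keysize; i++)
    {
        h ^= p[i];
        h *= 16777619u;             /* FNV-1a 32-bit prime; wraps mod 2^32 */
    }
    return h;
}

/* Hypothetical fixed-size key, standing in for an internal lookup tag. */
typedef struct SketchTag
{
    uint32_t rel_id;
    uint32_t block_num;
} SketchTag;

/* Map a tag to a bucket; passing sizeof(*tag) keeps the read in bounds. */
uint32_t sketch_bucket(const SketchTag *tag, uint32_t nbuckets)
{
    return sketch_tag_hash(tag, sizeof(*tag)) % nbuckets;
}
```

Note that large field values in the key (even ones close to the 32-bit
maximum) are harmless in a sketch like this, since unsigned arithmetic just
wraps. That is why I'd look at the pointer and the size rather than the
values themselves.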

At the other end of your call, _bt_insertonpg
(/backend/access/nbtree/nbtinsert.c) is inserting into a btree index. In
one case it's splitting the index page it tries to insert into (because
it's full) but not in the others.

If it's not a hardware-related problem, then it's a bug, but you're
unlikely to get a fix given how old the code is. If an upgrade to 8.2
looks like it will take a lot of effort, perhaps consider an
intermediate upgrade to 7.2 - I think schemas were introduced in 7.3 so
before that should be easier.

There is a chance that you might reduce the problem by REINDEXing the
table concerned every night. That's just a guess though, and your real
solution will be to upgrade to something more recent.

--
   Richard Huxton
   Archonet Ltd
