Re: Limitations on 7.0.3? - Mailing list pgsql-general
From | Richard Huxton |
---|---|
Subject | Re: Limitations on 7.0.3? |
Date | |
Msg-id | 469C69BE.5020109@archonet.com |
In response to | Re: Limitations on 7.0.3? (Alvaro Herrera <alvherre@commandprompt.com>) |
Responses | Re: Limitations on 7.0.3? |
List | pgsql-general |
ARTEAGA Jose wrote:
> I have spent the last month battling and looking deeper into the issue,
> here's a summary of where I'm at:
> - Increasing shared buffers improved performance but did not resolve the
> backend FATAL disconnect error.
> - Dumping and recreating entire database also did not resolve the issue.

OK, so it's not a corrupted index/file then.

> - re-initializing the DB and recreating from the dump also did not
> resolve the issue.
> In both cases above the issue re-occurred within 2-3 days of run-time
> (insert of new records).
>
> I got the issue narrowed down to the point where I was able to re-create
> the issue at will by just inserting enough data, the data content did
> not matter. The issue always occurred while inserting into my
> "teststeprun" table, which is the largest of my tables (~15 Mill rows).
> The issue is that once I got this table to a certain size, then the
> backend system would crash.
>
> Since I was able to reproduce, I then decided to analyze the core dumps.
> Looking at the core dumps I immediately began to see a pattern, even the
> same pattern was there from the initial core dumps I had when the problem
> began occurring back two months ago. In every case the dump indicated
> the last instruction was always in the call to tag_hash(). I also
> noticed that each time the values passed to tag_hash which are used to
> generate the key were just below the 32-bit max value, and tag_hash
> should be returning a uint32 value. Now I'm really suspecting that there
> is some issue with this. Below are the traces of the four core dumps
> which point to the issue I'm suspecting.

I think tag_hash (in /backend/utils/hash/hashfn.c) is responsible for internal hash tables (rather than hash indexes). It takes a pointer to the key to hash and a keysize (in bytes), so either the pointer is bad or the size is too large and it's reading off the end.
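To make that (pointer, keysize) contract concrete, here is a minimal C sketch. It is not the code from hashfn.c: the name sketch_tag_hash is hypothetical, and 32-bit FNV-1a is swapped in purely as a well-known stand-in for whatever mixing the real function does. The only point is that a hash of this shape reads exactly keysize bytes starting at the pointer it is given.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a tag-style hash. This is NOT PostgreSQL's
 * tag_hash; it uses 32-bit FNV-1a only to illustrate the calling contract:
 * the caller supplies a pointer to the key and the key's size in bytes.
 */
uint32_t sketch_tag_hash(const void *key, size_t keysize)
{
    const unsigned char *p = (const unsigned char *) key;
    uint32_t h = 2166136261u;          /* FNV-1a 32-bit offset basis */

    assert(key != NULL);               /* a NULL key pointer is a caller bug */

    for (size_t i = 0; i < keysize; i++)
    {
        /*
         * Reads byte i of the key. If keysize is larger than the object
         * that key points to, this read runs past the end of that object.
         */
        h ^= p[i];
        h *= 16777619u;                /* FNV-1a 32-bit prime; wraps mod 2^32 */
    }
    return h;
}
```

In a function of this shape the numeric value stored in the key does not matter: a 4-byte key holding a value just below the 32-bit maximum is hashed like any other four bytes, and unsigned arithmetic simply wraps. What can make it crash is a pointer that does not refer to valid memory, or a keysize that walks past the end of the key.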
At the other end of your call, _bt_insertonpg (/backend/access/nbtree/nbtinsert.c) is inserting into a btree index. In one case it's splitting the index page it tries to insert into (because it's full) but not in the others.

If it's not a hardware-related problem, then it's a bug, but you're unlikely to get a fix given how old the code is. If an upgrade to 8.2 looks like it will take a lot of effort, perhaps consider an intermediate upgrade to 7.2 - I think schemas were introduced in 7.3, so anything before that should be easier.

There is a chance that you might reduce the problem by REINDEXing the table concerned every night. That's just a guess though, and your real solution will be to upgrade to something more recent.

--
Richard Huxton
Archonet Ltd