High-level of inserts makes database drop core - Mailing list pgsql-hackers

From David Ben-Yaacov
Subject High-level of inserts makes database drop core
Date
Msg-id 007501be0f0f$0b5839d0$66097cc0@itd.sterling.com
Responses Re: [HACKERS] High-level of inserts makes database drop core
List pgsql-hackers
Dear PostgreSQL hackers:
 
I have a problem that I recently posted in the Postgres-ADMIN newsgroup.  It was not answered there, but Mr. Marc Fournier, aka Hermit Hacker, told me that it might be worthwhile (to both myself and the Postgres code hackers) to cross-post my problem. 
 
Without reposting all the previous emails, I will attempt to summarize the problem. Based upon the advice offered on this list, I intend to change my Postgres configuration to aid in debugging this problem. In advance, thank you for your help.
 
The Problem
I have three Perl ingest processes running continuously, inserting data into and deleting data from the database.  Although not consistent, the database sometimes slows to a crawl and CPU usage climbs to just under 100%.  A Perl ingest process may insert into or delete from the same table as another Perl ingest process.

You may be wondering what kind of data we store.  The system is a development system utilizing a real-time alphanumeric weather feed.  So if we drop the tables and thus the weather, we can always wait a couple of hours for it to reload.  On our production/deployed systems, this is not an option.
 
As to the amount of data being ingested: we don't really know the true rate, but we have discussed it and predict a peak of about 1500 inserts in a 3-minute span (roughly 8 inserts per second), coming from two different Perl ingest routines.  The normal rate is probably quite a bit lower.

To try to stop the problem before it starts, I wrote a script that vacuums the database tables once every hour.  The vacuum script runs while the Perl ingest processes are still active (the ingest processes are never shut down).  This may or may not have helped the situation; I believe it does help.  Or does the entire database need to be idle before performing a vacuum, because of potential deadlock scenarios?
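In essence, the vacuum script does something like the following (a simplified sketch using the Pg Perl module; the database and table names are stand-ins for our real ones):

    #!/usr/bin/perl
    # Invoked from cron once an hour, e.g.:  0 * * * * /usr/local/pgsql/bin/vacuum_weather.pl
    use Pg;

    # Connect as the postgres user; "weather" stands in for our real database name.
    my $conn = Pg::connectdb("dbname=weather");
    die "connect failed: ", $conn->errorMessage
        unless $conn->status == PGRES_CONNECTION_OK;

    # VACUUM ANALYZE each heavily updated table while the ingest processes keep running.
    for my $table ("surface_obs", "forecasts") {
        my $result = $conn->exec("VACUUM ANALYZE $table");
        warn "vacuum of $table failed: ", $conn->errorMessage, "\n"
            unless $result->resultStatus == PGRES_COMMAND_OK;
    }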


When I do notice the database taking 99% of CPU, I generally shut down the ingest processes and then try to vacuum the database manually.  I usually find that the indexes I set up on some of the large tables no longer correspond to the actual table data.  In the past, I have dropped the index and then recreated it.  In extreme cases, I have deleted the UNIX file that corresponds to the table, dropped the table definition, and then recreated the table.
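For what it is worth, the index rebuild amounts to this (again a simplified sketch; the table and index names are invented for illustration):

    use Pg;

    # Run after the ingest processes have been shut down.
    my $conn = Pg::connectdb("dbname=weather");
    die $conn->errorMessage unless $conn->status == PGRES_CONNECTION_OK;

    # Drop the suspect index and build it again from the table data.
    for my $sql ("DROP INDEX surface_obs_loc_idx",
                 "CREATE UNIQUE INDEX surface_obs_loc_idx ON surface_obs (location)") {
        my $result = $conn->exec($sql);
        warn "$sql failed: ", $conn->errorMessage, "\n"
            unless $result->resultStatus == PGRES_COMMAND_OK;
    }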
 
A Related Problem
Even under normal operation, when the database is fast, we still have problems inserting data.  We have examined the insert query identified in our error log to see if there is a problem with our code.  Apparently not: the INSERT statement that failed and dropped core under the ingest process runs without problems when entered manually.
 
More Information
As to the wise suggestions of Mr. Fournier and others: we have adequate RAM (256 MB), adequate CPU (two MIPS R10000 processors), adequate swap (260 MB), and the postgres user's data on a separate disk from the swap file.
(SGI IRIX 6.5 dual-processor Octane, Postgres 6.3 built with SGI's C compiler, Pg)
 
We tried building 6.3.2 with both GNU's C compiler and SGI's C compiler, but the problem appeared immediately and was much worse.  We are going to try 6.4.  We do not appear to be running out of memory.

Finally, since we consistently lose the database connection when running our Perl ingest routines, we automatically try to reconnect to the database (per Mr. Fournier's advice).
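The reconnect logic is roughly the following (a sketch only; the database name and the five-second retry interval are arbitrary):

    use Pg;

    # Keep trying until the backend accepts a new connection.
    sub reconnect {
        my $conn = Pg::connectdb("dbname=weather");
        while ($conn->status != PGRES_CONNECTION_OK) {
            sleep 5;                                  # crude back-off before retrying
            $conn = Pg::connectdb("dbname=weather");
        }
        return $conn;
    }

    # Each ingest process calls reconnect() at startup and again whenever
    # a query fails because the connection has gone away.
    my $conn = reconnect();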
 
Another little fact: before inserting, we first check the primary key to see whether there is a previous entry with the same key, and if there is, we delete the old entry.  The primary key is basically a location identifier, so if we already have a weather report for "Denver", we make sure to delete the old report before inserting the new "Denver" weather report.  Mr. Fournier has offered some suggestions to streamline this process, but we have not made any changes yet.  IMHO, these changes should not affect the problem.
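In code, the check-delete-insert sequence looks roughly like this (simplified; the table, columns, and values are invented for illustration):

    use Pg;
    my $conn = Pg::connectdb("dbname=weather");

    # Replace any existing report for the same location before inserting the new one.
    my $loc    = "Denver";
    my $report = "new alphanumeric report text";

    my $check = $conn->exec("SELECT 1 FROM surface_obs WHERE location = '$loc'");
    if ($check->resultStatus == PGRES_TUPLES_OK && $check->ntuples > 0) {
        $conn->exec("DELETE FROM surface_obs WHERE location = '$loc'");
    }
    $conn->exec("INSERT INTO surface_obs (location, report) VALUES ('$loc', '$report')");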
 
We are in the process of changing over to 6.4.  In fact, if you, the PostgreSQL hackers, could tell me what kind of run-time environment would be most useful to you, I will try to comply.  In other words, if you want us to run at a particular debug level, with certain memory buffer allocations, or with any other setup suitable for easy diagnostics, please let me know.

Thanks again for your interest and help.
 
David Ben-Yaacov
MetView Technical Lead
Sterling Software
USA
402 291 8300 x 351 (voice)
402 291 4362 (fax)
