Thread: very slow after a while...
Hello, I have a machine running PostgreSQL 8.0.1. I don't think the
version is relevant, because I had 7.4.1 before and saw the same problem.

I have a PHP script that runs regularly and does this: select a bunch of
rows from an MSSQL database, insert the values into Postgres, and if the
insert fails with a duplicate id (I have a unique index there) perform an
update with those values instead. Pretty straightforward.

The thing is that after I upgraded to 8.0.1, and also (on a separate
occasion) after I recreated the database one day, the script ran
instantly, with hundreds and thousands of rows inserted and updated per
second. After a while, however, the whole process slows down
dramatically, reaching the point of one insert or update per second, or
even one every four (!) seconds.

I have tried VACUUM FULL, reindexing, deleting the table and recreating
it, and changing values in postgresql.conf and in the Linux kernel.
Nothing works except re-initialising the whole database directory.
PLEASE HELP!
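For reference, a minimal sketch of the per-row logic described above,
using the table whose schema is posted later in the thread; the literal
values and the small set of columns are placeholders, not taken from the
actual script:

  -- Try the insert first (values are placeholders):
  INSERT INTO pgconnectionlog (logid, username, usedcredit)
  VALUES (12345, 'someuser', 0.75);

  -- If the INSERT fails on the unique index over logid, the script
  -- falls back to updating the existing row instead:
  UPDATE pgconnectionlog
     SET username   = 'someuser',
         usedcredit = 0.75
   WHERE logid = 12345;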
Costin Manda wrote:
> The thing is that after I upgraded to 8.0.1, and also (on a separate
> occasion) after I recreated the database one day, the script ran
> instantly, with hundreds and thousands of rows inserted and updated
> per second. After a while, however, the whole process slows down
> dramatically, reaching the point of one insert or update per second,
> or even one every four (!) seconds.
>
> I have tried VACUUM FULL, reindexing, deleting the table and
> recreating it, and changing values in postgresql.conf and in the
> Linux kernel. Nothing works except re-initialising the whole database
> directory.

Some more info please:
1. This is one INSERT statement per transaction, yes? If that fails, you
   do an UPDATE.
2. Are there any foreign keys the insert will be checking?
3. What indexes are there on the main table / foreign-key-related tables?

Whatever the answers to these questions, perhaps look into loading your
data into a temporary table, inserting any rows without matching primary
keys and then deleting those and updating what's left.

--
  Richard Huxton
  Archonet Ltd
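One way to implement the staging-table idea Richard describes, sketched
on the assumption that each batch pulled from MSSQL is first loaded into
a temporary table; the staging name is illustrative and only two columns
are shown, the real script would set every column:

  -- Temporary table with the same column layout as the target:
  CREATE TEMP TABLE staging (LIKE pgconnectionlog);
  -- ...COPY or INSERT the batch from MSSQL into staging here...

  -- Insert only the rows whose logid is not already present:
  INSERT INTO pgconnectionlog
  SELECT s.*
    FROM staging s
   WHERE NOT EXISTS (SELECT 1 FROM pgconnectionlog p
                      WHERE p.logid = s.logid);

  -- Update the rows that already existed:
  UPDATE pgconnectionlog
     SET username   = s.username,
         usedcredit = s.usedcredit
    FROM staging s
   WHERE pgconnectionlog.logid = s.logid;

The point of the rearrangement is that both statements work on the whole
batch at once, instead of one round trip (and one possible error) per row.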
Please CC the list as well as replying directly - it means more people
can help.

Costin Manda wrote:
>> Some more info please:
>> 1. This is one INSERT statement per transaction, yes? If that fails,
>>    you do an UPDATE.
>
> Correct.
>
>> 2. Are there any foreign keys the insert will be checking?
>> 3. What indexes are there on the main table / foreign-key-related
>>    tables?
>
> This is the table; the only restriction at the insert is the logid,
> which must be unique.
>
>           Table "public.pgconnectionlog"
>      Column     |         Type          | Modifiers
> ----------------+-----------------------+-----------
>  logid          | integer               | not null
>  username       | character varying(20) |
>  logtime        | integer               |
>  connecttime    | integer               |
>  disconnecttime | integer               |
>  usedcredit     | double precision      |
>  usedtime       | integer               |
>  phonenum       | character varying(30) |
>  prephonenum    | character varying(20) |
>  pricelistname  | character varying(30) |
>  precode        | character varying(20) |
>  effectivetime  | integer               |
>  callerid       | character varying(30) |
>  serialnumber   | character varying(30) |
>  prefix         | character varying(20) |
>  tara           | character varying     |
> Indexes:
>     "pgconnectionlog_pkey" PRIMARY KEY, btree (logid)
>     "connecttime_index" btree (connecttime)
>     "disconnecttime_index" btree (disconnecttime)
>     "logtime_index" btree (logtime)
>     "prefix_index" btree (prefix)
>     "tara_index" btree (tara)
>     "username_index" btree (username)

Hmm - nothing unusual there. I'd be suspicious of a problem with the
indexes, except you say reindexing has no effect.

>> Whatever the answers to these questions, perhaps look into loading
>> your data into a temporary table, inserting any rows without matching
>> primary keys and then deleting those and updating what's left.
>
> You think this will be faster? It does make sense. Anyway, the problem
> is not optimising the script, it is the speed change - dramatic, I
> would say.

Could you monitor what's happening while this slows down? In particular,
could you:
1. Run "vmstat 10" so we can see memory/cpu/disk usage while this is
   happening.
2. See what's happening in pg_xlog - are you creating/cycling through a
   lot of transaction-log files?
3. Keep an eye on the logs - are there any warnings there?

If you vacuum full, it's worth adding "verbose" to that too, to see what
it's doing.

--
  Richard Huxton
  Archonet Ltd
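For the vacuum suggestion at the end, a verbose run against the table
shown above would look something like this; only the statements are
shown, the per-table and per-index output is elided:

  -- VERBOSE reports how many dead row versions were found and removed
  -- in the table and in each of its indexes:
  VACUUM FULL VERBOSE pgconnectionlog;

  -- The daily reindex mentioned later in the thread, for the same table:
  REINDEX TABLE pgconnectionlog;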
> Some more info please:
> 1. This is one INSERT statement per transaction, yes? If that fails,
>    you do an UPDATE.

Correct.

> 2. Are there any foreign keys the insert will be checking?
> 3. What indexes are there on the main table / foreign-key-related
>    tables?

This is the table; the only restriction at the insert is the logid,
which must be unique.

          Table "public.pgconnectionlog"
     Column     |         Type          | Modifiers
----------------+-----------------------+-----------
 logid          | integer               | not null
 username       | character varying(20) |
 logtime        | integer               |
 connecttime    | integer               |
 disconnecttime | integer               |
 usedcredit     | double precision      |
 usedtime       | integer               |
 phonenum       | character varying(30) |
 prephonenum    | character varying(20) |
 pricelistname  | character varying(30) |
 precode        | character varying(20) |
 effectivetime  | integer               |
 callerid       | character varying(30) |
 serialnumber   | character varying(30) |
 prefix         | character varying(20) |
 tara           | character varying     |
Indexes:
    "pgconnectionlog_pkey" PRIMARY KEY, btree (logid)
    "connecttime_index" btree (connecttime)
    "disconnecttime_index" btree (disconnecttime)
    "logtime_index" btree (logtime)
    "prefix_index" btree (prefix)
    "tara_index" btree (tara)
    "username_index" btree (username)

> Whatever the answers to these questions, perhaps look into loading
> your data into a temporary table, inserting any rows without matching
> primary keys and then deleting those and updating what's left.

You think this will be faster? It does make sense. Anyway, the problem
is not optimising the script, it is the speed change - dramatic, I would
say.

It is possible the problem doesn't come from this script, but from
others. The question is why does the database slow down to such a
degree? I repeat: dumping all data into a file, recreating the data
directory and reloading the data results in almost instantaneous inserts
and updates.
I think I found the problem. I was comparing some values wrongly and,
based on that, every time the script ran (that is, once every 5 minutes)
it deleted two tables and repopulated them with about 70 thousand
records.

I still don't know why that affected the speed of the database (even
when the script was not running) or how to fix it. I have a script that
runs VACUUM FULL and reindexes the database every day. Is there
something else I must do?
Costin Manda wrote:
> I think I found the problem. I was comparing some values wrongly and,
> based on that, every time the script ran (that is, once every 5
> minutes) it deleted two tables and repopulated them with about 70
> thousand records.
>
> I still don't know why that affected the speed of the database (even
> when the script was not running) or how to fix it. I have a script
> that runs VACUUM FULL and reindexes the database every day. Is there
> something else I must do?

I'm not sure I understand what you're saying, but if you vacuum at the
wrong time that can cause problems. I've shot myself in the foot before
now doing something like:

  DELETE FROM big_table
  VACUUM ANALYSE big_table
  COPY lots of rows into big_table

Of course, the planner now thinks there are zero rows in big_table.

--
  Richard Huxton
  Archonet Ltd
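A sketch of the ordering that avoids this foot-gun, keeping Richard's
illustrative big_table name: refresh the statistics after the reload
rather than between the DELETE and the COPY.

  DELETE FROM big_table;
  -- ...COPY or INSERT the replacement rows here...
  VACUUM ANALYSE big_table;  -- statistics now reflect the reloaded data,
                             -- and the dead rows left by the DELETE are
                             -- marked reusable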
On Wed, 06 Apr 2005 14:07:36 +0100, Richard Huxton <dev@archonet.com>
wrote:

> Costin Manda wrote:
>> I think I found the problem. I was comparing some values wrongly and,
>> based on that, every time the script ran (that is, once every 5
>> minutes) it deleted two tables and repopulated them with about 70
>> thousand records.
>
> I'm not sure I understand what you're saying, but if you vacuum at the
> wrong time that can cause problems. I've shot myself in the foot
> before now doing something like:
>
> DELETE FROM big_table
> VACUUM ANALYSE big_table
> COPY lots of rows into big_table
>
> Of course, the planner now thinks there are zero rows in big_table.

I mean that every 5 minutes the script does:

  DROP TABLE
  CREATE TABLE
  INSERT 70000 rows into the table

and the vacuuming is done at 2 in the morning; there should be no
activity then...
Costin Manda wrote:
> On Wed, 06 Apr 2005 14:07:36 +0100, Richard Huxton <dev@archonet.com>
> wrote:
>
>> Costin Manda wrote:
>>> I think I found the problem. I was comparing some values wrongly
>>> and, based on that, every time the script ran (that is, once every 5
>>> minutes) it deleted two tables and repopulated them with about 70
>>> thousand records.
>
>> I'm not sure I understand what you're saying, but if you vacuum at
>> the wrong time that can cause problems. I've shot myself in the foot
>> before now doing something like:
>>
>> DELETE FROM big_table
>> VACUUM ANALYSE big_table
>> COPY lots of rows into big_table
>>
>> Of course, the planner now thinks there are zero rows in big_table.
>
> I mean that every 5 minutes the script does:
>
>   DROP TABLE
>   CREATE TABLE
>   INSERT 70000 rows into the table

I thought you were trying an insert, then updating if it failed? You
shouldn't have any duplicates if the table was already empty. Or have I
misunderstood?

--
  Richard Huxton
  Archonet Ltd
On Wed, 06 Apr 2005 15:54:29 +0100, Richard Huxton <dev@archonet.com>
wrote:

>> I mean that every 5 minutes the script does:
>>   DROP TABLE
>>   CREATE TABLE
>>   INSERT 70000 rows into the table
>
> I thought you were trying an insert, then updating if it failed? You
> shouldn't have any duplicates if the table was already empty. Or have
> I misunderstood?

Ok, let's start over :)

The script does the following:
1. Read the count of rows in two tables from the MSSQL database.
2. Read the count of rows of the 'mirror' tables in Postgres. These are
   tables that get updated rarely and have a maximum of 100000 records
   between them.
3. If the counts differ, delete everything from the mirror tables and
   reinsert everything.
4. THEN do the inserts that become updates on error.

I thought the problem lay with step 4, but now I see that step 3 was the
culprit and that, indeed, I did not do DROP TABLE / CREATE TABLE but
DELETE FROM and INSERTs. I think that recreating these two tables should
solve the problem, shouldn't it?

A classical case of figuring out what the problem is while asking for
help, then feeling silly about it :)
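For what it's worth, a sketch of what step 3 amounts to on the Postgres
side (mirror_table is an illustrative name, not one from the thread).
Repeated every five minutes, the DELETE-and-reinsert leaves tens of
thousands of dead row versions behind each time, which fits the gradual
slowdown described at the start of the thread:

  SELECT count(*) FROM mirror_table;  -- step 2: compare with the MSSQL count
  -- if the counts differ:
  DELETE FROM mirror_table;           -- leaves ~70000 dead row versions behind
  -- ...reinsert every row from the MSSQL table...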
Costin Manda wrote:
> On Wed, 06 Apr 2005 15:54:29 +0100, Richard Huxton <dev@archonet.com>
> wrote:
>
>>> I mean that every 5 minutes the script does:
>>>   DROP TABLE
>>>   CREATE TABLE
>>>   INSERT 70000 rows into the table
>>
>> I thought you were trying an insert, then updating if it failed? You
>> shouldn't have any duplicates if the table was already empty. Or have
>> I misunderstood?
>
> Ok, let's start over :)
>
> The script does the following:
> 1. Read the count of rows in two tables from the MSSQL database.
> 2. Read the count of rows of the 'mirror' tables in Postgres. These
>    are tables that get updated rarely and have a maximum of 100000
>    records between them.
> 3. If the counts differ, delete everything from the mirror tables and
>    reinsert everything.
> 4. THEN do the inserts that become updates on error.
>
> I thought the problem lay with step 4, but now I see that step 3 was
> the culprit and that, indeed, I did not do DROP TABLE / CREATE TABLE
> but DELETE FROM and INSERTs. I think that recreating these two tables
> should solve the problem, shouldn't it?

Hmm - try TRUNCATE rather than DELETE. Also, you might drop the indexes,
re-insert the data, then recreate the indexes - that can be faster for
bulk loading.

--
  Richard Huxton
  Archonet Ltd
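A minimal sketch of the refresh Richard suggests, assuming a mirror
table called mirror_table with a single secondary index; both names are
illustrative:

  TRUNCATE TABLE mirror_table;       -- unlike DELETE, leaves no dead rows behind
  DROP INDEX mirror_table_name_idx;  -- drop secondary indexes before the bulk load

  -- ...COPY or INSERT the ~70000 rows from MSSQL here...

  CREATE INDEX mirror_table_name_idx ON mirror_table (name);
  ANALYZE mirror_table;              -- refresh planner statistics for the new contents

TRUNCATE also returns the table to its minimal on-disk size immediately,
which a plain DELETE never does without a later VACUUM.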
Richard Huxton <dev@archonet.com> writes:
> Costin Manda wrote:
>> I thought the problem lay with step 4, but now I see that step 3 was
>> the culprit and that, indeed, I did not do DROP TABLE / CREATE TABLE
>> but DELETE FROM and INSERTs. I think that recreating these two tables
>> should solve the problem, shouldn't it?
>
> Hmm - try TRUNCATE rather than DELETE.

Or VACUUM between deleting and inserting. But TRUNCATE would be better
if you can do it.

			regards, tom lane
On Wed, 2005-04-06 at 18:18 +0300, Costin Manda wrote:
> The script does the following:
> 1. Read the count of rows in two tables from the MSSQL database.
> 2. Read the count of rows of the 'mirror' tables in Postgres. These
>    are tables that get updated rarely and have a maximum of 100000
>    records between them.
> 3. If the counts differ, delete everything from the mirror tables and
>    reinsert everything.
> 4. THEN do the inserts that become updates on error.

If you empty the table in step 3, why do you have to test for the
duplicate id error? Are there duplicates in the MSSQL table?

gnari