Thread: VERY slow after many updates
I am using PostgreSQL on an embedded system which has only 32 or 64 MB RAM (running on a PPC 266 MHz or ARM 266 MHz CPU). I have a table that keeps download tasks. A daemon keeps polling the table and forks a new process to download data from the internet.

Daemon:
. Checks the table every 5 seconds
. Forks a download process if there is a new task

Download process (5 download processes max):
. Updates the download rate and downloaded size every 3 seconds

At the beginning everything is fine and the speed is good. But after 24 hours, access to the database becomes very, very slow. Even if I stop all processes, restart PostgreSQL and use psql to select data, it is still very slow (a single SQL command takes more than 2 seconds). It is a small table; there are only 8 records in it.

The only way I have found to fix it is to remove the whole database, run initdb, create a new database and insert the records again. I tried running vacuumdb, but it is still very slow.

Any idea how to make it faster?

Thanks,
Alex

--
Here is the table schema:

create table download_queue (
    task_id SERIAL,
    username varchar(128),
    pid int,
    url text,
    filename varchar(1024),
    status int,
    created_time int,
    started_time int,
    total_size int8,
    current_size int8,
    current_rate int,
    CONSTRAINT download_queue_pkey PRIMARY KEY(task_id)
);
CREATE INDEX download_queue_user_index ON download_queue USING BTREE (username);
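For reference, a minimal sketch of the statements such a setup would issue (the actual queries are not shown in the thread; the status convention and the literal values below are assumptions based on the schema):

  -- Daemon, every 5 seconds: look for tasks that have not been started yet
  -- (status = 0 as an assumed "new task" marker).
  SELECT task_id, url, filename
    FROM download_queue
   WHERE status = 0;

  -- Download process, every 3 seconds: record progress for its own task.
  UPDATE download_queue
     SET current_size = current_size + 131072,
         current_rate = 43690
   WHERE task_id = 1;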
Alex,

I suppose the table is a kind of 'queue' table, where you insert/get/delete continuously and the life of the records is short. Considering that in Postgres a delete still leaves the record in the table's file and in the indexes and just marks it as dead, your table's actual size can grow quite a lot even if the number of live records stays small (you will have a lot of dead tuples; the more tasks processed, the more dead tuples). So I guess you should vacuum this table very often, so that the dead tuples are reused. I'm not an expert on this, but it might be good to vacuum after every n deletions, where n is roughly half the average size of the queue you expect to have. From time to time you might also want to do a vacuum full on it and a reindex.

Right now I guess a vacuum full + reindex will help you. I think it's best to do:

  vacuum download_queue;
  vacuum full download_queue;
  reindex table download_queue;

I think the non-full vacuum, which is less obtrusive than the full one, will do at least some of the work and will bring the needed pages into the FS cache, so that the full vacuum is as fast as possible (vacuum full takes an exclusive lock on the table). At least I do it this way, with good results, for small queue-like tables...

BTW, I wonder if the download_queue_user_index index is helping you at all on that table. Do you expect it to grow bigger than 1000 rows? Otherwise there is no point in indexing it.

HTH,
Csaba.

On Sat, 2005-11-19 at 08:46, Alex Wang wrote:
> I am using PostgreSQL on an embedded system which has only 32 or 64 MB RAM
> (running on a PPC 266 MHz or ARM 266 MHz CPU). I have a table that keeps
> download tasks. [...]
> At the beginning everything is fine and the speed is good. But after 24 hours,
> access to the database becomes very, very slow. [...] It is a small table;
> there are only 8 records in it.
> [...]
> Any idea how to make it faster?
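A quick way to confirm how badly the table and its indexes have bloated, and to see what a vacuum reclaims, is a sketch like the following (standard catalog columns; the actual numbers will of course differ):

  VACUUM VERBOSE download_queue;   -- reports found/removed dead row versions

  SELECT relname, relpages, reltuples
    FROM pg_class
   WHERE relname IN ('download_queue',
                     'download_queue_pkey',
                     'download_queue_user_index');
  -- For a queue that only ever holds ~8 live rows, relpages should stay very
  -- small; hundreds or thousands of pages here indicate dead-tuple bloat.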
Hi Csaba,

Thanks for your reply.

Yes, it's a "queue" table. But I did not perform many inserts/deletes before it became slow. After inserting 10 records, I just do get/update continuously. After 24 hours the whole database becomes very slow (not only the download_queue table but the other tables, too). But you are right: a full vacuum fixes the problem. Thank you very much!

I expect there will be fewer than 1000 records in the table. The index does give an obvious improvement on "SELECT task_id, username FROM download_queue WHERE username > '%s'" even with only 100 records.

Thanks,
Alex

----- Original Message -----
From: "Csaba Nagy" <nagy@ecircle-ag.com>
To: "Alex Wang" <alex@alexwang.com>
Cc: "postgres performance list" <pgsql-performance@postgresql.org>
Sent: Saturday, November 19, 2005 7:12 PM
Subject: Re: [PERFORM] VERY slow after many updates

> I suppose the table is a kind of 'queue' table, where you insert/get/delete
> continuously and the life of the records is short. [...] So I guess you
> should vacuum this table very often, so that the dead tuples are reused.
> [...]
> Right now I guess a vacuum full + reindex will help you.
> [...]
> BTW, I wonder if the download_queue_user_index index is helping you at all
> on that table. Do you expect it to grow bigger than 1000 rows? Otherwise
> there is no point in indexing it.
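Whether the planner actually uses that index for the query above can be checked with EXPLAIN (a sketch; 'someuser' is a placeholder value):

  EXPLAIN SELECT task_id, username
    FROM download_queue
   WHERE username > 'someuser';
  -- An "Index Scan using download_queue_user_index" line means the index is
  -- being used; a "Seq Scan" means the planner reads the whole table instead.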
Just for clarification: an update is actually equal to delete+insert in Postgres. So if you update rows, it is the same as if you deleted the row and inserted a new version, so the table bloats in this situation too.

I think there is an added problem when you update: to get to a row, Postgres will traverse all dead rows matching the criteria... so even if you have an index, getting 1 row which was updated 10000 times will access 10000 row versions only to find the 1 which is still alive. So in this case vacuuming should happen even more often, to eliminate the dead rows.

And the index was probably only helping because the table was really bloated, so if you vacuum it often enough you will be better off without the index if the row count stays low.

Cheers,
Csaba.

On Sat, 2005-11-19 at 13:05, Alex Wang wrote:
> Yes, it's a "queue" table. But I did not perform many inserts/deletes before
> it became slow. After inserting 10 records, I just do get/update continuously.
> After 24 hours the whole database becomes very slow (not only the
> download_queue table but the other tables, too). But you are right: a full
> vacuum fixes the problem. Thank you very much!
>
> I expect there will be fewer than 1000 records in the table. The index does
> give an obvious improvement on "SELECT task_id, username FROM download_queue
> WHERE username > '%s'" even with only 100 records.
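A minimal sketch of the effect described above (the test table is made up for illustration):

  CREATE TABLE upd_test (id int PRIMARY KEY, n int);
  INSERT INTO upd_test VALUES (1, 0);

  -- Run this update a few thousand times from a script:
  UPDATE upd_test SET n = n + 1 WHERE id = 1;

  -- Before any vacuum, even this single-row lookup has to wade through all the
  -- dead versions left behind by the updates:
  EXPLAIN ANALYZE SELECT * FROM upd_test WHERE id = 1;

  VACUUM upd_test;                 -- marks the dead versions reusable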
On 19.11.2005, at 13:05 Uhr, Alex Wang wrote:

> Yes, it's a "queue" table. But I did not perform many inserts/deletes
> before it became slow. After inserting 10 records, I just do get/update
> continuously.

When PostgreSQL updates a row, it creates a new row with the updated values. So you should be aware that the DB gets bigger and bigger even when you only update your rows. VACUUM FULL reclaims that used space.

The concepts are described in detail in the manual in chapter 12.

cug

--
PharmaLine Essen, GERMANY
and
Big Nerd Ranch Europe - PostgreSQL Training, Dec. 2005, Rome, Italy
http://www.bignerdranch.com/classes/postgresql.shtml
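A small way to see this new-row-version behaviour directly is the hidden ctid system column, which records a row's physical location; after an UPDATE the location changes because a new version was written (a sketch using the thread's table):

  SELECT ctid, task_id, current_size FROM download_queue WHERE task_id = 1;
  UPDATE download_queue SET current_size = current_size + 1 WHERE task_id = 1;
  SELECT ctid, task_id, current_size FROM download_queue WHERE task_id = 1;
  -- The ctid differs between the two SELECTs; the old version remains in the
  -- table file as a dead tuple until a vacuum reclaims it.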
Great information. I didn't know that an update is equal to delete+insert in Postgres. I will be more careful when designing the database access method in this case.

Thanks,
Alex

----- Original Message -----
From: "Csaba Nagy" <nagy@ecircle-ag.com>
To: "Alex Wang" <alex@alexwang.com>
Cc: "postgres performance list" <pgsql-performance@postgresql.org>
Sent: Saturday, November 19, 2005 8:12 PM
Subject: Re: [PERFORM] VERY slow after many updates

> Just for clarification: an update is actually equal to delete+insert in
> Postgres. So if you update rows, it is the same as if you deleted the row
> and inserted a new version, so the table bloats in this situation too.
> [...]
> And the index was probably only helping because the table was really
> bloated, so if you vacuum it often enough you will be better off without
> the index if the row count stays low.
On Sat, 2005-11-19 at 06:29, Alex Wang wrote:
> Great information. I didn't know that an update is equal to delete+insert
> in Postgres. I will be more careful when designing the database access
> method in this case.

Just make sure you have regular vacuums scheduled (or run them from within your app) and you're fine.
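For this workload that matters quite a bit: a progress update every 3 seconds is 24 * 3600 / 3 = 28,800 updates, and therefore dead row versions, per task per day. A plain vacuum issued regularly, for example by the daemon itself, keeps that bounded; a sketch (the interval is an assumption, not something specified in the thread):

  -- Issued by the daemon every few minutes in addition to its normal polling;
  -- a plain VACUUM does not lock out readers or writers.
  VACUUM ANALYZE download_queue;

With a vacuum every 10 minutes, each row accumulates at most about 600 / 3 = 200 dead versions before they can be reused, instead of tens of thousands per day.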