Thread: VERY slow after many updates

VERY slow after many updates

From: "Alex Wang"
I am using PostgreSQL on an embedded system with only 32 or 64 MB of RAM
(running on a 266 MHz PPC or ARM CPU). I have a table that keeps download
tasks. A daemon keeps polling the table and forks a new process to
download data from the internet.

Daemon:
    . Check the table every 5 seconds
    . Fork a download process if there is a new task
Download process (5 download processes max):
    . Update the download rate and downloaded size every 3 seconds
      (roughly the statements sketched below).
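
Roughly, the statements involved look like this (illustrative only; the
status code and literal values are placeholders, not the real ones):

    -- daemon, every 5 seconds: look for new tasks
    SELECT task_id, url, filename FROM download_queue WHERE status = 0;

    -- each download process, every 3 seconds: report progress
    UPDATE download_queue
       SET current_size = 1048576, current_rate = 5000
     WHERE task_id = 42;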

At the beginning everything is fine and the speed is good. But after 24 hours,
access to the database becomes very, very slow. Even if I stop all
processes, restart PostgreSQL, and use psql to select data, it is
still very, very slow (a SQL command takes more than 2 seconds). It is a
small table; there are only 8 records in it.

The only way I have found to fix it is to remove the whole database, run
initdb, create a new database, and insert the records again. I tried
running vacuumdb, but it is still very slow.

Any idea to make it faster?

Thanks,
Alex

--
Here is the table schema:
create table download_queue (
       task_id SERIAL,
       username varchar(128),
       pid int,
       url text,
       filename varchar(1024),
       status int,
       created_time int,
       started_time int,
       total_size int8,
       current_size int8,
       current_rate int,
       CONSTRAINT download_queue_pkey PRIMARY KEY(task_id)
);
CREATE INDEX download_queue_user_index ON download_queue USING BTREE
(username);





Re: VERY slow after many updates

From: Csaba Nagy
Alex,

I suppose the table is a kind of 'queue' table, where you
insert/get/delete continuously and the lifetime of a record is short.
Considering that in Postgres a delete still leaves the record in the
table's file and in the indexes and just marks it as dead, your table's
actual size can grow quite a lot even if the number of live records
stays small (you will have a lot of dead tuples: the more tasks
processed, the more dead tuples). So I guess you should vacuum this
table very often, so that the dead tuples are reused. I'm not an expert
on this, but it might be good to vacuum after every n deletions, where n
is roughly half the average size of the queue you expect to have. From time
to time you might also want to do a vacuum full on it, plus a reindex.

Right now I guess a vacuum full + reindex will help you. I think it's
best to do:

vacuum download_queue;
vacuum full download_queue;
reindex table download_queue;

The non-full vacuum, which is less obtrusive than the full one, will do at
least some of the work and will pull everything needed into the filesystem
cache, so that the subsequent vacuum full is as fast as possible (vacuum
full takes an exclusive lock on the table). At least I do it this way, with
good results, for small queue-like tables...

BTW, I wonder if the download_queue_user_index index is helping you at
all on that table. Do you expect it to grow beyond roughly 1000 rows?
Otherwise there is no point in indexing it.
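
An easy way to check is EXPLAIN ANALYZE (the query below is just an
illustration):

    EXPLAIN ANALYZE
    SELECT task_id, username
      FROM download_queue
     WHERE username = 'someuser';

If the plan still shows a sequential scan, the index is not buying you
anything at that table size.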

HTH,
Csaba.



Re: VERY slow after many updates

From: "Alex Wang"
Hi Csaba,

Thanks for your reply.

Yes, it's a "queue" table. But I did not perform many inserts/deletes before
it became slow. After inserting 10 records, I just do gets/updates
continuously. After 24 hours, the whole database becomes very slow (not only
the download_queue table but other tables, too). But you are right: a full
vacuum fixes the problem. Thank you very much!

I expect there will be fewer than 1000 records in the table. The index does
make an obvious improvement on "SELECT task_id, username FROM download_queue
WHERE username > '%s'" even with only 100 records.

Thanks,
Alex



Re: VERY slow after many updates

From: Csaba Nagy
Just for clarification: an update is effectively a delete plus an insert in
Postgres. So if you update rows, it's the same as if you deleted the row and
inserted a new version, and the table bloats in this situation too. (Five
processes each updating every 3 seconds works out to something on the order
of 100,000 dead row versions per day.)
I think there is an added problem with updates: to get to a row, Postgres
may have to traverse all the dead row versions matching the criteria. So
even if you have an index, fetching 1 row that was updated 10000 times can
touch 10000 row versions just to find the one that is still alive. So in
this case vacuuming should happen even more often, to eliminate the dead
rows.
And the index was probably only helping because the table was really
bloated; if you vacuum it often enough, you will be better off without
the index as long as the row count stays low.
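
If you want to see how bloated the table actually got, something like this
should show it (relpages/reltuples in pg_class are only refreshed by VACUUM
or ANALYZE, so run one first):

    VACUUM VERBOSE download_queue;   -- also reports the dead row versions it finds

    SELECT relname, relpages, reltuples
      FROM pg_class
     WHERE relname IN ('download_queue', 'download_queue_pkey',
                       'download_queue_user_index');

For 8 live rows the table and its indexes should only be a handful of pages;
thousands of pages means the dead versions have piled up.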

Cheers,
Csaba.




Re: VERY slow after many updates

From: Guido Neitzer
On 19.11.2005, at 13:05 Uhr, Alex Wang wrote:

> Yes, it's a "queue" table. But I did not perform many inserts/deletes
> before it became slow. After inserting 10 records, I just do gets/updates
> continuously.

When PostgreSQL updates a row, it creates a new row version with the updated
values. So you should be aware that the database gets bigger and bigger even
when you only ever update your rows. Vacuum full reclaims that used space.

The concepts are described in detail in the manual in chapter 12.
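
A quick way to see the effect for yourself (a throwaway test table; the
names are made up):

    CREATE TABLE bloat_test (id int PRIMARY KEY, n int);
    INSERT INTO bloat_test VALUES (1, 0);
    UPDATE bloat_test SET n = n + 1;   -- run this a few thousand times
    VACUUM VERBOSE bloat_test;         -- reports the dead row versions it removes
    VACUUM FULL bloat_test;            -- compacts the file back down
    DROP TABLE bloat_test;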

cug

--
PharmaLine Essen, GERMANY and
Big Nerd Ranch Europe - PostgreSQL Training, Dec. 2005, Rome, Italy
http://www.bignerdranch.com/classes/postgresql.shtml




Re: VERY slow after many updates

From: "Alex Wang"
Great information. I didn't know that an update is effectively a delete plus
an insert in Postgres. I will be more careful when designing the database
access pattern in this case.

Thanks,
Alex



Re: VERY slow after many updates

From: Scott Marlowe
On Sat, 2005-11-19 at 06:29, Alex Wang wrote:
> Great information. I didn't know that an update is effectively a delete plus
> an insert in Postgres. I will be more careful when designing the database
> access pattern in this case.

Just make sure you have regular vacuums scheduled (or run them from
within your app) and you're fine.
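
For a queue table like this, "regular" can mean every few minutes, or after
every N finished tasks; a plain vacuum issued by the daemon is cheap and does
not lock out readers or writers (the exact interval is just a suggestion):

    VACUUM ANALYZE download_queue;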