Thread: Poor Performance on large Tables

Poor Performance on large Tables

From

"Manuel Rorarius"

Date:

15 November 2002, 11:29:23

Hi!

We are currently running a Community-Site on Postgresql. This community
makes a lot of traffic on the Database. To improve Performance we already
took a lot of actions from the Database into the Java-Cache of our
Application.

Nonetheless we have some very big Tables where we store profiles, messages
and a lot more. I am talking about 50.000 new tuples every day for one
tables.

Now we get very poor results and even difficulties when trying easy selects
on those tables. When using pgAdminII we sometimes even have to wait nearly
a minute until we can edit that table because pgAdminII always does a
"select count(*)" on the table to get the amout of rows in that table. Also
the Server-Load goes very high when issuing this count(*)-Select!

We also do a delete of old data in the table, to keep it "smaller". Once a
Night we have a vacuum-analyze running over the whole database and a second
one over the big tables.

What we get as a result of the Explain on the "select count(*)" for a big
table looks awful for me:

Aggregate  (cost=40407.96..40407.96 rows=1 width=0)
  ->  Seq Scan on userlog  (cost=0.00..37029.37 rows=1351437 width=0)

Aggregate  (cost=114213.24..114213.24 rows=1 width=0)
  ->  Seq Scan on trafficcenter  (cost=0.00..109446.79 rows=1906579 width=0)

What can we do to improve the performance of big tables in our Database ?!
We are currently running postgresql 7.2.1 on Linux with a 1.8 Athlon machine
with 1 GB RAM!

Regards ... Manuel Rorarius ...

Re: Poor Performance on large Tables

From

"Nigel J. Andrews"

Date:

15 November 2002, 13:10:37

First, I've just noticed the date on this message and a quick check shows I'm
getting a right old mixture of time stamped messages at the moment. Got at
least one dated the 8th.

On to the subject now...if you have serial types in your tables perhaps you can
use those to determine the count of tuples. This isn't going to help pgAdminII
but you could  select the min and max of there serial columns and subtract to
find the number of tuples. From the sounds of your setup this may work for you
even though it's not a general solution.

You can optimize the min and max selection using queries of the form:

select min(serialcol) from yourtable order by serialcol desc limit 1;

--
Nigel J. Andrews


 On Thu, 14 Nov 2002, Manuel Rorarius wrote:

> Hi!
>
> We are currently running a Community-Site on Postgresql. This community
> makes a lot of traffic on the Database. To improve Performance we already
> took a lot of actions from the Database into the Java-Cache of our
> Application.
>
> Nonetheless we have some very big Tables where we store profiles, messages
> and a lot more. I am talking about 50.000 new tuples every day for one
> tables.
>
> Now we get very poor results and even difficulties when trying easy selects
> on those tables. When using pgAdminII we sometimes even have to wait nearly
> a minute until we can edit that table because pgAdminII always does a
> "select count(*)" on the table to get the amout of rows in that table. Also
> the Server-Load goes very high when issuing this count(*)-Select!
>
> We also do a delete of old data in the table, to keep it "smaller". Once a
> Night we have a vacuum-analyze running over the whole database and a second
> one over the big tables.
>
> What we get as a result of the Explain on the "select count(*)" for a big
> table looks awful for me:
>
> Aggregate  (cost=40407.96..40407.96 rows=1 width=0)
>   ->  Seq Scan on userlog  (cost=0.00..37029.37 rows=1351437 width=0)
>
> Aggregate  (cost=114213.24..114213.24 rows=1 width=0)
>   ->  Seq Scan on trafficcenter  (cost=0.00..109446.79 rows=1906579 width=0)
>
> What can we do to improve the performance of big tables in our Database ?!
> We are currently running postgresql 7.2.1 on Linux with a 1.8 Athlon machine
> with 1 GB RAM!
>
> Regards ... Manuel Rorarius ...
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>

Re: Poor Performance on large Tables

From

Garrett Bladow

Date:

15 November 2002, 13:17:42

Looks like you need to index those large tables.
CREATE INDEX userlog_idx ON foo_table(userlog);

---- This is what you wrote me ----

:Hi!
:
:We are currently running a Community-Site on Postgresql. This community
:makes a lot of traffic on the Database. To improve Performance we already
:took a lot of actions from the Database into the Java-Cache of our
:Application.
:
:Nonetheless we have some very big Tables where we store profiles, messages
:and a lot more. I am talking about 50.000 new tuples every day for one
:tables.
:
:Now we get very poor results and even difficulties when trying easy selects
:on those tables. When using pgAdminII we sometimes even have to wait nearly
:a minute until we can edit that table because pgAdminII always does a
:"select count(*)" on the table to get the amout of rows in that table. Also
:the Server-Load goes very high when issuing this count(*)-Select!
:
:We also do a delete of old data in the table, to keep it "smaller". Once a
:Night we have a vacuum-analyze running over the whole database and a second
:one over the big tables.
:
:What we get as a result of the Explain on the "select count(*)" for a big
:table looks awful for me:
:
:Aggregate  (cost=40407.96..40407.96 rows=1 width=0)
:  ->  Seq Scan on userlog  (cost=0.00..37029.37 rows=1351437 width=0)
:
:Aggregate  (cost=114213.24..114213.24 rows=1 width=0)
:  ->  Seq Scan on trafficcenter  (cost=0.00..109446.79 rows=1906579 width=0)
:
:What can we do to improve the performance of big tables in our Database ?!
:We are currently running postgresql 7.2.1 on Linux with a 1.8 Athlon machine
:with 1 GB RAM!
:
:Regards ... Manuel Rorarius ...
:
:
:
:---------------------------(end of broadcast)---------------------------
:TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
: