Thread: Poor Performance on large Tables
Hi! We are currently running a Community-Site on Postgresql. This community makes a lot of traffic on the Database. To improve Performance we already took a lot of actions from the Database into the Java-Cache of our Application. Nonetheless we have some very big Tables where we store profiles, messages and a lot more. I am talking about 50.000 new tuples every day for one tables. Now we get very poor results and even difficulties when trying easy selects on those tables. When using pgAdminII we sometimes even have to wait nearly a minute until we can edit that table because pgAdminII always does a "select count(*)" on the table to get the amout of rows in that table. Also the Server-Load goes very high when issuing this count(*)-Select! We also do a delete of old data in the table, to keep it "smaller". Once a Night we have a vacuum-analyze running over the whole database and a second one over the big tables. What we get as a result of the Explain on the "select count(*)" for a big table looks awful for me: Aggregate (cost=40407.96..40407.96 rows=1 width=0) -> Seq Scan on userlog (cost=0.00..37029.37 rows=1351437 width=0) Aggregate (cost=114213.24..114213.24 rows=1 width=0) -> Seq Scan on trafficcenter (cost=0.00..109446.79 rows=1906579 width=0) What can we do to improve the performance of big tables in our Database ?! We are currently running postgresql 7.2.1 on Linux with a 1.8 Athlon machine with 1 GB RAM! Regards ... Manuel Rorarius ...
First, I've just noticed the date on this message and a quick check shows I'm getting a right old mixture of time stamped messages at the moment. Got at least one dated the 8th. On to the subject now...if you have serial types in your tables perhaps you can use those to determine the count of tuples. This isn't going to help pgAdminII but you could select the min and max of there serial columns and subtract to find the number of tuples. From the sounds of your setup this may work for you even though it's not a general solution. You can optimize the min and max selection using queries of the form: select min(serialcol) from yourtable order by serialcol desc limit 1; -- Nigel J. Andrews On Thu, 14 Nov 2002, Manuel Rorarius wrote: > Hi! > > We are currently running a Community-Site on Postgresql. This community > makes a lot of traffic on the Database. To improve Performance we already > took a lot of actions from the Database into the Java-Cache of our > Application. > > Nonetheless we have some very big Tables where we store profiles, messages > and a lot more. I am talking about 50.000 new tuples every day for one > tables. > > Now we get very poor results and even difficulties when trying easy selects > on those tables. When using pgAdminII we sometimes even have to wait nearly > a minute until we can edit that table because pgAdminII always does a > "select count(*)" on the table to get the amout of rows in that table. Also > the Server-Load goes very high when issuing this count(*)-Select! > > We also do a delete of old data in the table, to keep it "smaller". Once a > Night we have a vacuum-analyze running over the whole database and a second > one over the big tables. > > What we get as a result of the Explain on the "select count(*)" for a big > table looks awful for me: > > Aggregate (cost=40407.96..40407.96 rows=1 width=0) > -> Seq Scan on userlog (cost=0.00..37029.37 rows=1351437 width=0) > > Aggregate (cost=114213.24..114213.24 rows=1 width=0) > -> Seq Scan on trafficcenter (cost=0.00..109446.79 rows=1906579 width=0) > > What can we do to improve the performance of big tables in our Database ?! > We are currently running postgresql 7.2.1 on Linux with a 1.8 Athlon machine > with 1 GB RAM! > > Regards ... Manuel Rorarius ... > > > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org >
Looks like you need to index those large tables. CREATE INDEX userlog_idx ON foo_table(userlog); ---- This is what you wrote me ---- :Hi! : :We are currently running a Community-Site on Postgresql. This community :makes a lot of traffic on the Database. To improve Performance we already :took a lot of actions from the Database into the Java-Cache of our :Application. : :Nonetheless we have some very big Tables where we store profiles, messages :and a lot more. I am talking about 50.000 new tuples every day for one :tables. : :Now we get very poor results and even difficulties when trying easy selects :on those tables. When using pgAdminII we sometimes even have to wait nearly :a minute until we can edit that table because pgAdminII always does a :"select count(*)" on the table to get the amout of rows in that table. Also :the Server-Load goes very high when issuing this count(*)-Select! : :We also do a delete of old data in the table, to keep it "smaller". Once a :Night we have a vacuum-analyze running over the whole database and a second :one over the big tables. : :What we get as a result of the Explain on the "select count(*)" for a big :table looks awful for me: : :Aggregate (cost=40407.96..40407.96 rows=1 width=0) : -> Seq Scan on userlog (cost=0.00..37029.37 rows=1351437 width=0) : :Aggregate (cost=114213.24..114213.24 rows=1 width=0) : -> Seq Scan on trafficcenter (cost=0.00..109446.79 rows=1906579 width=0) : :What can we do to improve the performance of big tables in our Database ?! :We are currently running postgresql 7.2.1 on Linux with a 1.8 Athlon machine :with 1 GB RAM! : :Regards ... Manuel Rorarius ... : : : :---------------------------(end of broadcast)--------------------------- :TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org :