Re: poor performance of db? - Mailing list pgsql-performance
From: John Arbash Meinel
Subject: Re: poor performance of db?
Msg-id: 41F5B9D0.2010803@arbash-meinel.com
In response to: Re: poor performance of db? ("SpaceBallOne" <space_ball_one@hotmail.com>)
List: pgsql-performance
SpaceBallOne wrote:

> Thanks for the reply John,
>
> There are approximately 800 rows total in our job table (which stays
> approximately the same because 'completed' jobs get moved to a
> 'job_archive' table). The other jobs not shown by the specific query
> could be on backorder status, temporary deleted status, etc.
>
> You are correct in assuming the _id and _no (stands for 'number')
> fields are unique - this was one of the first pages I built when I
> started learning postgres, so not knowing how to set up primary and
> foreign keys at the time, I did it that way ... it is normalised to a
> point (probably rather sloppy, but it's a juggling act between
> learning on the fly, what I'd like to have, and the time constraints
> of being the only I.T. guy in the company!) ...
>
> I think I will definitely focus on converting my database and php
> pages to using proper primary keys in postgres - especially if they
> automatically index themselves. I didn't do a vacuum analyse on them,
> so that may explain why they didn't seem to do much.

You can probably add them now if you don't want to do a lot of redesign:

    ALTER TABLE job ADD PRIMARY KEY (id);

If the values are not unique this will fail, but as they should be
unique, I think it will work.

I'm not sure how much help indexes will be if you only have 800 rows
and your queries use 300+ of them. You might need to re-think the
query/table design. You might try nested queries, or explicit joins,
rather than one big query with a WHERE clause. Meaning do stuff like:

    SELECT ...
      FROM (job JOIN customer ON job.customer_no = customer.customer_no) AS jc
      JOIN location ON jc.location_no = location.location_no
      ...

I also see that the planner seems to mis-estimate the number of rows
in some cases.
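To make the suggestion above concrete, here is a minimal sketch of the two steps combined: adding primary keys, then rewriting the query with explicit joins. The table and column names (`customer_no`, `location_no`) are taken from the join sketch above; the selected columns and the `status` filter are assumptions for illustration only, not Dave's actual schema.

```sql
-- Hypothetical sketch: column names beyond customer_no/location_no
-- are invented for illustration.
ALTER TABLE job      ADD PRIMARY KEY (id);           -- fails if id is not unique
ALTER TABLE customer ADD PRIMARY KEY (customer_no);
ALTER TABLE location ADD PRIMARY KEY (location_no);

-- Explicit JOIN syntax makes the join structure visible in the query
-- itself, rather than leaving everything to a flat WHERE clause:
SELECT job.job_no, customer.customer_no, location.location_no
FROM job
JOIN customer ON job.customer_no = customer.customer_no
JOIN location ON customer.location_no = location.location_no
WHERE job.status = 'active';
```

In the PostgreSQL versions current at the time of this thread, explicitly written JOINs could also constrain the planner's join order, which is sometimes useful when its estimates are off.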
Like here:

>        ->  Hash  (cost=14.53..14.53 rows=853 width=19) (actual
> time=9.79..9.79 rows=0 loops=1)
>              ->  Seq Scan on street  (cost=0.00..14.53 rows=853
> width=19) (actual time=0.01..5.12 rows=853 loops=1)
>        ->  Hash  (cost=9.91..9.91 rows=491 width=20) (actual
> time=5.73..5.73 rows=0 loops=1)
>              ->  Seq Scan on ubd  (cost=0.00..9.91 rows=491 width=20)
> (actual time=0.02..2.98 rows=491

Where it thinks the hash will return all of the rows from the
sequential scan, when in reality it returns none.

I think problems with the planner fall into 3 categories:

1. You didn't VACUUM ANALYZE.
2. You did, but the planner doesn't keep sufficient statistics
   (ALTER TABLE job ALTER COLUMN no SET STATISTICS <a number>).
3. Your join needs cross-column statistics, which postgres doesn't
   support (yet).

If you only have 800 rows, I don't think you have to worry about
statistics, so that leaves things at 1 or 3. If you did do 1, then I
don't know what to tell you.

John
=:->

PS> I'm not a guru at this stuff, so some of what I say may be wrong,
but hopefully it points you in the right direction.

> Thanks,
> Dave
> space_ball_one@hotmail.com
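For reference, categories 1 and 2 above correspond to concrete commands; a minimal sketch (the column name `no` comes from John's example, and the statistics target of 100 is an arbitrary illustrative value, not a recommendation):

```sql
-- Category 1: refresh planner statistics (and reclaim dead tuples):
VACUUM ANALYZE job;

-- Category 2: keep a larger statistics sample for one column, then
-- re-analyze so the planner sees the new sample:
ALTER TABLE job ALTER COLUMN "no" SET STATISTICS 100;
ANALYZE job;
```

Category 3 has no command-level fix: cross-column statistics were simply not supported in the PostgreSQL versions current when this thread was written.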