Re: PostgreSQL Query Speed Issues - Mailing list pgsql-novice
From | Kevin Grittner |
---|---|
Subject | Re: PostgreSQL Query Speed Issues |
Date | |
Msg-id | 1361638407.67335.YahooMailNeo@web162902.mail.bf1.yahoo.com Whole thread Raw |
In response to | Re: PostgreSQL Query Speed Issues (Joseph Pravato <joseph.pravato@nomagic.com>) |
Responses |
Re: PostgreSQL Query Speed Issues
|
List | pgsql-novice |
Joseph Pravato <joseph.pravato@nomagic.com> wrote: > From: "Kevin Grittner" <kgrittn@ymail.com> > We actually only have 11.37GB on our server, so we've used 3072MB > for shared_buffers & 8400MB for effective_cache_size. Sounds reasonable. Keep in mind that the actual "sweet spot" for a lot of this configuration depends not only on the hardware, but also the database and workload. Once you have a performance baseline with actual workload using these settings, trying incremental changes and monitoring the results will help zero in on ideal settings for you situation. Many settings can take effect with just a reload, and don't disrupt ongoing workload, although shared_buffers is one which requires a restart. You can check that in the docs or the context column in the pg_settings view. >> In addition, with 12GB and apparently not more than 2 million >> rows per table, you seem very likely to have the active portion >> of your database fully cached. So these settings are likely to >> help: >> >> seq_page_cost = 0.1 >> random_page_cost = 0.1 >> cpu_tuple_cost = 0.03 > Wow, the random_page_cost setting made a huge difference. The > query we mentioned yesterday that takes 30 minutes is down to > about 30 seconds. Nice speed improvement, reasonable speed, but > still could use improvements. However, I'm a bit worried about > changing it so drastically. Can there be any future issues from > this change? We have 4 or 5 tables that will continue to grow > fairly rapidly (1 million row increase every 2-3 years). PostgreSQL uses a cost-based optimizer (rather than rules-based) and it goes through the OS and its filesystems, so caching effects need to be considered in setting the cost factors in order to get the best plan for any query. The default configuration is intended to allow PostgreSQL to run on a low-end laptop, so that people don't have the unfortunate experience of installing it and not being able to get it to even start. Tuning is required for serious production work. The original configuration assumed fairly heavy disk access when reading data, so if that's not the case for your environment, you need to make adjustments like the above to more accurately model the costs of each possible plan. If the cache hit rate falls dramatically, you will want to either add RAM to keep data cached, or adjust the cost factors so that it will try to minimize slow disk access. >> If you haven't already done so, run VACUUM ANALYZE at the >> database level. If most of your data was loaded at about the >> same time, run VACUUM FREEZE ANALYZE instead, to prevent a >> read-and-rewrite pass of your entire database at peak OLTP load. > The vacuuming didn't seem to change any performances, and it > didn't take very long to run. Autovacuum may have already taken > care of most potential issues from this. That's good, but if most of the data was loaded at about the same time, you may want to run VACUUM FREEZE ANALYZE during an off-peak period. Without that, there will be an autovacuum run at some point which will be more aggressive than usual, and may affect performance enought to notice. A manual run will let you pick when to do this more aggressive maintenance. >> Your plan shows an index scan with no index conditions to pass >> every row in a table, which is much slower than a seqscan. Did >> you turn off enable_seqscan [...]? > No, enable_seqscan is still turned on. That plan choice strikes me as very odd, and not likely to be optimal. The only other things that I can think of which might cause this plan choice would be if seq_page_cost is higher than random_page_cost, or if the table has a lot of dead space in it. Could you show EXPLAIN ANALYZE output for the current settings, along with the output of running this?: SELECT name, current_setting(name), source FROM pg_settings WHERE source NOT IN ('default', 'override'); SELECT oid, relname, relpages, reltuples FROM pg_class WHERE relname = 'contact'; SELECT * FROM pg_stat_user_tables WHERE relname = 'contact'; -- Kevin Grittner EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-novice by date: