Thread: Enabling and disabling run-time configuration parameters
I have discovered that I can optimize queries by adjusting parameters such as enable_seqscan, enable_hashjoin, enable_mergejoin and enable_nestloop.

Is it a good idea to temporarily adjust those values before running a query to speed up the execution time? I've searched online and wasn't able to find articles about it.

I need to speed up an enterprise application that I'm working on, and I wouldn't want to screw things up. My plan is, for every query that could be optimized by adjusting parameters: enable the parameters that speed it up, run the query, then set the parameters back to their default values.
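For example, something like this (the table and query are just placeholders for one of my real queries):

    SET enable_seqscan = off;    -- discourage sequential scans for this one query
    SELECT * FROM orders WHERE customer_id = 42;
    SET enable_seqscan = on;     -- restore the default afterwards

Thanks in advance.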
On Thu, 5 Jun 2003, Yusuf wrote:

> I have discovered that I can optimize queries by adjusting parameters
> such as enable_seqscan, enable_hashjoin, enable_mergejoin and
> enable_nestloop.

Setting those to get a fast query is the brute force method. It works, but at some cost in flexibility. Have you run VACUUM FULL and ANALYZE? If not, the planner has no idea how to choose between plans.

> Is it a good idea to temporarily adjust those values before running a
> query to speed up the execution time? I've searched online and wasn't
> able to find articles about it.

Yes, it's a great idea to do that in testing. No, it's a bad idea to rely on them in production.

> I need to speed up an enterprise application that I'm working on, and I
> wouldn't want to screw things up.

Then you'll want to tune your database's cost estimates so it makes the right decision.

> My plan is, for every query that could be optimized by adjusting
> parameters: enable the parameters that speed it up, run the query,
> then set the parameters back to their default values.

That's a good plan as long as you go the extra step of adjusting the cost parameters so that the planner chooses correctly between the different options it has. Every server has different performance characteristics. A machine with 1 gig of RAM and 18 drives in a large RAID 1+0 is going to handle random page access a lot better than a machine with 256 meg of RAM and a single IDE hard drive.

The values you need to look at are these:

random_page_cost
cpu_index_tuple_cost
cpu_operator_cost
cpu_tuple_cost
effective_cache_size

They are covered in detail in the docs here:

http://www.postgresql.org/docs/view.php?version=7.3&idoc=0&file=runtime-config.html

I'm gonna go offline and write a quick tutorial on tuning your database to your server; look for a preliminary version today or tomorrow. Set effective_cache_size to approximately (size of the kernel's disk cache) / (page size, which is 8192 bytes for most pgsql setups), then tune the *_cost options so the planner picks the right plan each time.
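For instance (these numbers are purely illustrative, not a recommendation): a box where the kernel is caching roughly 256 megs of disk might end up with something like

    # illustrative postgresql.conf values, assuming ~256 MB of kernel disk cache
    effective_cache_size = 32768    # 256 MB / 8192 bytes per page = 32768 pages
    random_page_cost = 2            # down from the default 4 where random I/O is relatively cheap

Then verify with EXPLAIN ANALYZE that the planner now picks the fast plans on its own.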
On Thu, Jun 05, 2003 at 11:35:22AM -0400, Yusuf wrote:

> I have discovered that I can optimize queries by adjusting parameters
> such as enable_seqscan, enable_hashjoin, enable_mergejoin and
> enable_nestloop.
>
> Is it a good idea to temporarily adjust those values before running a
> query to speed up the execution time? I've searched online and wasn't
> able to find articles about it.

It sounds like you need more general tuning. If the planner is making mistakes, it'd be nice to know about it. Could you post some details?

A

--
Andrew Sullivan <andrew@libertyrms.info>
Liberty RMS
204-4141 Yonge Street, Toronto, Ontario, Canada  M2P 2A8
+1 416 646 3304 x110
Yusuf,

> Is it a good idea to temporarily adjust those values before running a
> query to speed up the execution time? I've searched online and wasn't
> able to find articles about it.

No. The "enable_%" vars are intended as settings for *testing*, to tell you whether you have a problem with your query structure or indexing, with your cost variables, or are in need of a VACUUM. Using them in a production capacity is a bad idea, because you haven't addressed the problem that was causing the query to be slow in the first place, and as your database changes over time your queries will become slow again.

Adjusting the *cost* variables is a good idea. Find you need ENABLE_SEQSCAN=FALSE a lot? Raise your effective_cache_size and lower your random_page_cost, among other adjustments.

For further adjustments, post some of your "bad queries" to this list. Be sure to include *all* of the following (see the sketch after this list):

1) VACUUM FULL ANALYZE before testing.
2) Include the full query.
3) Include the EXPLAIN ANALYZE results of the query.
4) Include (possibly as a text attachment) the schema of relevant tables, including (especially!) indexes.
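In other words, something like this (the table name and query are hypothetical stand-ins for yours):

    VACUUM FULL ANALYZE;                -- refresh the planner's statistics before testing
    EXPLAIN ANALYZE
    SELECT * FROM orders
     WHERE customer_id = 42;            -- the slow query, with its actual plan and timings

and, from psql, \d orders for the table definition including indexes.

--
Josh Berkus
Aglio Database Solutions
San Francisco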
> My plan is, for every query that could be optimized by adjusting
> parameters: enable the parameters that speed it up, run the query,
> then set the parameters back to their default values.

Unless you intend to retest these regularly, or have static data, this may cause you more problems than it fixes. Any change in the data may make the plan you have forced a non-optimal one.

A much better approach is to tweak the cost values that cause the planner to choose that particular plan. random_page_cost will probably have the most effect on the plan chosen. These are the relevant postgresql.conf defaults:

#effective_cache_size = 1000    # typically 8KB each
#random_page_cost = 4           # units are one sequential page fetch cost
#cpu_tuple_cost = 0.01          # (same)
#cpu_index_tuple_cost = 0.001   # (same)
#cpu_operator_cost = 0.0025     # (same)
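As a quick experiment (values and query are illustrative only), you can compare plans in a single session before touching postgresql.conf:

    SET random_page_cost = 2;    -- default is 4; lower values favour index scans
    EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;
    SET random_page_cost = 4;    -- back to the default while experimenting

If the lower random_page_cost consistently produces the plans you were forcing with the enable_* switches, make it permanent in postgresql.conf.

--
Rod Taylor <rbt@rbt.ca>

PGP Key: http://www.rbt.ca/rbtpub.asc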
On Thu, Jun 05, 2003 at 11:35:22 -0400, Yusuf <yusuf0478@netscape.net> wrote:

> I have discovered that I can optimize queries by adjusting parameters
> such as enable_seqscan, enable_hashjoin, enable_mergejoin and
> enable_nestloop.
>
> Is it a good idea to temporarily adjust those values before running a
> query to speed up the execution time? I've searched online and wasn't
> able to find articles about it.

That is a reasonable thing to do. However, you should also look into adjusting some of the costs used by the planner so that it gets the right plan more often. If you manually hack how the query is done, then you have to worry about whether the hack is still right if the data changes significantly.

> I need to speed up an enterprise application that I'm working on, and I
> wouldn't want to screw things up.

The worst that would happen is that the plan you forced it to use is slower than what the planner would have used.

> My plan is, for every query that could be optimized by adjusting
> parameters: enable the parameters that speed it up, run the query,
> then set the parameters back to their default values.

These settings only apply to the current backend session. You can also set them for just the current transaction, which is safer if you are using persistent backend connections (so that if you make a mistake, the setting doesn't apply for a very long time).
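For example, assuming your version supports SET LOCAL (the query is a placeholder):

    BEGIN;
    SET LOCAL enable_nestloop = off;    -- applies only inside this transaction
    SELECT * FROM orders WHERE customer_id = 42;
    COMMIT;                             -- the override is gone after commit

That way a forgotten reset can't leak into later queries on a pooled connection.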