Thread: query optimiser changes 6.5->7.0
Hi, I apologise if I am being stupid or this is the wrong list, but here goes anyway ;-) I have a table of the following form: Table "gazet" Attribute | Type | Modifier -----------+-------------+---------- country | integer | name | varchar(30) | lat | float4 | long | float4 | score | integer | Index: gazet_index gazet_index is an index defined as create index gazet_index on gazet (country) This table has a couple of million rows and I am executing the query:- select * from gazet where country=1 and lower(name) = 'placename'; I have been running this with no problems on Postgres 6.5.1 and this query takes about a second. I have now set-up a dedicated Postgres server using version 7.0 and exported the entire database into it using a pg_dump. I have then run vacuum to recreate indexes etc, but the query takes 7-8 seconds now. I have run explain on the query and it shows that it is just performed a sequential scan on version 7.0 Seq Scan on gazet (cost.....) On the old version (6.5.1) it reports Index Scan using gazet_index on gazet (cost=.... Any suggestions as to how I can improve performance on this databases new server? Many thanks Simon _/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/ _/ Simon Hardingham - netXtra Ltd - UK _/ _/ Tel: +44 (0) 1787 319393 Fax: +44 (0) 1787 319394 _/ _/ http://www.netxtra.co.uk simon@netxtra.co.uk _/ _/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
"Simon Hardingham" <simon@netxtra.net> writes: > I have run explain on the query and it shows that it is just > performed a sequential scan on version 7.0 > Seq Scan on gazet (cost.....) > On the old version (6.5.1) it reports > Index Scan using gazet_index on gazet (cost=.... > Any suggestions as to how I can improve performance on this databases new > server? Unfortunately you have been careful to suppress any information that might actually let someone give you useful advice ;-). There are several threads in the archives about undesirable index-vs-sequential- scan choices in 7.0; check pgsql-sql as well as pgsql-general for the last month or so. Also, 7.0.1, propagating now to an archive near you, contains some fudge-factor twiddling to make it more willing to choose an indexscan. We shall soon find out whether that made things better or worse for typical uses... regards, tom lane
Tom Lane wrote: > > Also, 7.0.1, propagating now to an archive near you, contains some > fudge-factor twiddling to make it more willing to choose an indexscan. > We shall soon find out whether that made things better or worse for > typical uses... > > regards, tom lane OK this may seem like a stupid question, but isn't index scan always better except for the pathalogical simple case where the work to be done is trivial anyway?
Joseph Shraibman <jks@selectacast.net> writes: > OK this may seem like a stupid question, but isn't index scan always > better except for the pathalogical simple case where the work to be done > is trivial anyway? No. If it were, the optimizer would be a whole lot simpler ;-) In practice an indexscan only wins if it will visit a relatively small percentage of the tuples in the table. The $64 questions are how small is small enough, and how can the optimizer guess how many tuples will be hit in advance of doing the query... regards, tom lane
Tom Lane wrote: > > Joseph Shraibman <jks@selectacast.net> writes: > > OK this may seem like a stupid question, but isn't index scan always > > better except for the pathalogical simple case where the work to be done > > is trivial anyway? > > No. If it were, the optimizer would be a whole lot simpler ;-) > > In practice an indexscan only wins if it will visit a relatively > small percentage of the tuples in the table. The $64 questions > are how small is small enough, and how can the optimizer guess > how many tuples will be hit in advance of doing the query... > Isn't there a way to tell the optimizer to use an index scan if one knows this is going to be the best. I have seen lots of posts concerning people who are trying to force the optimizer to use an index scan by rephrasing their SQL statements. Isn't there a possibility to provide some syntax (non SQL compliant I know) to force the optimizer to do an index scan? Regards Wim
* Wim Ceulemans <wim.ceulemans@nice.be> [000602 02:33] wrote: > > Isn't there a way to tell the optimizer to use an index scan if one > knows this is going to be the best. set enable_seqscan=off; -Alfred
Alfred Perlstein wrote: > > * Wim Ceulemans <wim.ceulemans@nice.be> [000602 02:33] wrote: > > > > Isn't there a way to tell the optimizer to use an index scan if one > > knows this is going to be the best. > > set enable_seqscan=off; But doesn't this mean that if there are no appropriate indicies defined on a table, queries will fail? -- Martijn van Oosterhout <kleptog@cupid.suninternet.com> http://cupid.suninternet.com/~kleptog/
Martijn van Oosterhout <kleptog@cupid.suninternet.com> writes: >>>> Isn't there a way to tell the optimizer to use an index scan if one >>>> knows this is going to be the best. >> >> set enable_seqscan=off; > But doesn't this mean that if there are no appropriate > indicies defined on a table, queries will fail? No. It just biases the planner heavily against choosing a seqscan if there are any indexscan alternatives available. A more serious objection is that this switch is a very blunt instrument, since it's going to affect planning for *all* queries as long as it's off. It's not really designed to be used as anything except a debugging tool --- if you run with it routinely, you will almost certainly lose more in performance on queries that shouldn't have been seqscanned than you gain on the ones where the planner would have made the wrong choice. In the long run the right answer is to continue to work on improving the planner. I don't much like the idea of user overrides on planner choices as a standard answer --- the trouble with that is that when you throw "set enable_seqscan=off" or some such into an application, it tends to stay there long after the need for it is gone; perhaps long enough for the database schema to be revised to the point where the command forces use of a plan much worse than what the planner would have picked without such "help". This problem gets rapidly worse with more-specialized user controls, such as a command to force use of a specific index. So, while I have been known to suggest "set enable_seqscan=off" and friends when there seemed no other short-term answer, I don't want to enshrine it as a standard recommendation. It's just a way of plugging holes in the dike until more planner work gets done. regards, tom lane
Martijn van Oosterhout wrote: > > set enable_seqscan=off; > > But doesn't this mean that if there are no appropriate > indicies defined on a table, queries will fail? no, it just makes sequential scans a very bad choice. of course, if there aren't any other choices, you're stuck going with the bad choice. that means that if there aren't any other options (i.e. indexes) you'll always revert to a sequential scan. jeff
Bruce Momjian <pgman@candle.pha.pa.us> writes: >> So, while I have been known to suggest "set enable_seqscan=off" and >> friends when there seemed no other short-term answer, I don't want to >> enshrine it as a standard recommendation. It's just a way of plugging >> holes in the dike until more planner work gets done. > They can certainly turn of off before a query and on after the query, > right? Sure, that's how I'd suggest using it at the moment. > Seems people have been asking for fine-tuned optimizer control. I know, but I think that's the wrong direction to go in, for the reasons I tried to explain. The more specific the command you give to force a particular query plan, the more likely that command is to go wrong as the situation changes. regards, tom lane
> So, while I have been known to suggest "set enable_seqscan=off" and > friends when there seemed no other short-term answer, I don't want to > enshrine it as a standard recommendation. It's just a way of plugging > holes in the dike until more planner work gets done. They can certainly turn of off before a query and on after the query, right? Seems people have been asking for fine-tuned optimizer control. -- Bruce Momjian | http://www.op.net/~candle pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
At 01:09 PM 02-06-2000 -0400, Tom Lane wrote: >Bruce Momjian <pgman@candle.pha.pa.us> writes: >> They can certainly turn of off before a query and on after the query, >> right? > >Sure, that's how I'd suggest using it at the moment. > >> Seems people have been asking for fine-tuned optimizer control. > >I know, but I think that's the wrong direction to go in, for the reasons >I tried to explain. The more specific the command you give to force >a particular query plan, the more likely that command is to go wrong >as the situation changes. True. But in very many cases the humans do better than the optimizer/planner. How about a priority/rating/mode for indexes? And we won't have to deal with legacy SQL statements containing "hints" scattered everywhere... That's less blunt. But it's still so blunt that it may not be that useful or work better than the global seq_scan setting. Is it common for the optimizer to do stupid things? If it is, is there a way for it to figure out it has done something stupid and to try not to do it again? e.g. Ooops, I actually went through 50% of the possible rows this time, maybe that wasn't such a good idea. Try something else. Or oops, my cost estimate was really off, revising estimate. I think this would be preferable- assume the optimizer is not stupid, just ignorant. Ack, it starts to look like a chess algorithm :). Cheerio, Link.