Thread: query optimiser changes 6.5->7.0

query optimiser changes 6.5->7.0

From
"Simon Hardingham"
Date:
Hi,

I apologise if I am being stupid or this is the wrong list, but here goes
anyway ;-)

I have a table of the following form:

           Table "gazet"
 Attribute |    Type     | Modifier
-----------+-------------+----------
 country   | integer     |
 name      | varchar(30) |
 lat       | float4      |
 long      | float4      |
 score     | integer     |
Index: gazet_index

gazet_index is an index defined as

create index gazet_index on gazet (country)

This table has a couple of million rows and I am executing the query:-

select * from gazet where country=1 and lower(name) = 'placename';

I have been running this with no problems on Postgres 6.5.1 and this query
takes about a second. I have now set-up a dedicated Postgres server using
version 7.0 and exported the entire database into it using a pg_dump.  I
have then run vacuum to recreate indexes etc, but the query takes 7-8
seconds now.  I have run explain on the query and it shows that it is just
performed a sequential scan on version 7.0

Seq Scan on gazet  (cost.....)

On the old version (6.5.1) it reports

Index Scan using gazet_index on gazet  (cost=....

Any suggestions as to how I can improve performance on this databases new
server?

Many thanks

Simon


_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
_/         Simon Hardingham - netXtra Ltd - UK          _/
_/ Tel: +44 (0) 1787 319393    Fax: +44 (0) 1787 319394 _/
_/ http://www.netxtra.co.uk         simon@netxtra.co.uk _/
_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/


Re: query optimiser changes 6.5->7.0

From
Tom Lane
Date:
"Simon Hardingham" <simon@netxtra.net> writes:
> I have run explain on the query and it shows that it is just
> performed a sequential scan on version 7.0

> Seq Scan on gazet  (cost.....)

> On the old version (6.5.1) it reports

> Index Scan using gazet_index on gazet  (cost=....

> Any suggestions as to how I can improve performance on this databases new
> server?

Unfortunately you have been careful to suppress any information that
might actually let someone give you useful advice ;-).  There are
several threads in the archives about undesirable index-vs-sequential-
scan choices in 7.0; check pgsql-sql as well as pgsql-general for the
last month or so.

Also, 7.0.1, propagating now to an archive near you, contains some
fudge-factor twiddling to make it more willing to choose an indexscan.
We shall soon find out whether that made things better or worse for
typical uses...

            regards, tom lane

Re: query optimiser changes 6.5->7.0

From
Joseph Shraibman
Date:
Tom Lane wrote:
>
> Also, 7.0.1, propagating now to an archive near you, contains some
> fudge-factor twiddling to make it more willing to choose an indexscan.
> We shall soon find out whether that made things better or worse for
> typical uses...
>
>                         regards, tom lane

OK this may seem like a stupid question, but isn't index scan always
better except for the pathalogical simple case where the work to be done
is trivial anyway?

Re: query optimiser changes 6.5->7.0

From
Tom Lane
Date:
Joseph Shraibman <jks@selectacast.net> writes:
> OK this may seem like a stupid question, but isn't index scan always
> better except for the pathalogical simple case where the work to be done
> is trivial anyway?

No.  If it were, the optimizer would be a whole lot simpler ;-)

In practice an indexscan only wins if it will visit a relatively
small percentage of the tuples in the table.  The $64 questions
are how small is small enough, and how can the optimizer guess
how many tuples will be hit in advance of doing the query...

            regards, tom lane

Re: query optimiser changes 6.5->7.0

From
Wim Ceulemans
Date:
Tom Lane wrote:
>
> Joseph Shraibman <jks@selectacast.net> writes:
> > OK this may seem like a stupid question, but isn't index scan always
> > better except for the pathalogical simple case where the work to be done
> > is trivial anyway?
>
> No.  If it were, the optimizer would be a whole lot simpler ;-)
>
> In practice an indexscan only wins if it will visit a relatively
> small percentage of the tuples in the table.  The $64 questions
> are how small is small enough, and how can the optimizer guess
> how many tuples will be hit in advance of doing the query...
>

Isn't there a way to tell the optimizer to use an index scan if one
knows this is going to be the best.

I have seen lots of posts concerning people who are trying to force the
optimizer to use an index scan by rephrasing their SQL statements. Isn't
there a possibility to provide some syntax (non SQL compliant I know) to
force the optimizer to do an index scan?

Regards
Wim

Re: query optimiser changes 6.5->7.0

From
Alfred Perlstein
Date:
* Wim Ceulemans <wim.ceulemans@nice.be> [000602 02:33] wrote:
>
> Isn't there a way to tell the optimizer to use an index scan if one
> knows this is going to be the best.

set enable_seqscan=off;

-Alfred

Re: query optimiser changes 6.5->7.0

From
Martijn van Oosterhout
Date:
Alfred Perlstein wrote:
>
> * Wim Ceulemans <wim.ceulemans@nice.be> [000602 02:33] wrote:
> >
> > Isn't there a way to tell the optimizer to use an index scan if one
> > knows this is going to be the best.
>
> set enable_seqscan=off;

But doesn't this mean that if there are no appropriate
indicies defined on a table, queries will fail?
--
Martijn van Oosterhout <kleptog@cupid.suninternet.com>
http://cupid.suninternet.com/~kleptog/

Re: query optimiser changes 6.5->7.0

From
Tom Lane
Date:
Martijn van Oosterhout <kleptog@cupid.suninternet.com> writes:
>>>> Isn't there a way to tell the optimizer to use an index scan if one
>>>> knows this is going to be the best.
>>
>> set enable_seqscan=off;

> But doesn't this mean that if there are no appropriate
> indicies defined on a table, queries will fail?

No.  It just biases the planner heavily against choosing a seqscan
if there are any indexscan alternatives available.

A more serious objection is that this switch is a very blunt instrument,
since it's going to affect planning for *all* queries as long as it's
off.  It's not really designed to be used as anything except a debugging
tool --- if you run with it routinely, you will almost certainly lose
more in performance on queries that shouldn't have been seqscanned than
you gain on the ones where the planner would have made the wrong choice.

In the long run the right answer is to continue to work on improving
the planner.  I don't much like the idea of user overrides on planner
choices as a standard answer --- the trouble with that is that when
you throw "set enable_seqscan=off" or some such into an application,
it tends to stay there long after the need for it is gone; perhaps long
enough for the database schema to be revised to the point where the
command forces use of a plan much worse than what the planner would
have picked without such "help".  This problem gets rapidly worse with
more-specialized user controls, such as a command to force use of a
specific index.

So, while I have been known to suggest "set enable_seqscan=off" and
friends when there seemed no other short-term answer, I don't want to
enshrine it as a standard recommendation.  It's just a way of plugging
holes in the dike until more planner work gets done.

            regards, tom lane

Re: query optimiser changes 6.5->7.0

From
Jeff Hoffmann
Date:
Martijn van Oosterhout wrote:

> > set enable_seqscan=off;
>
> But doesn't this mean that if there are no appropriate
> indicies defined on a table, queries will fail?

no, it just makes sequential scans a very bad choice.  of course, if
there aren't any other choices, you're stuck going with the bad choice.
that means that if there aren't any other options (i.e. indexes) you'll
always revert to a sequential scan.

jeff

Re: query optimiser changes 6.5->7.0

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> So, while I have been known to suggest "set enable_seqscan=off" and
>> friends when there seemed no other short-term answer, I don't want to
>> enshrine it as a standard recommendation.  It's just a way of plugging
>> holes in the dike until more planner work gets done.

> They can certainly turn of off before a query and on after the query,
> right?

Sure, that's how I'd suggest using it at the moment.

> Seems people have been asking for fine-tuned optimizer control.

I know, but I think that's the wrong direction to go in, for the reasons
I tried to explain.  The more specific the command you give to force
a particular query plan, the more likely that command is to go wrong
as the situation changes.

            regards, tom lane

Re: query optimiser changes 6.5->7.0

From
Bruce Momjian
Date:
> So, while I have been known to suggest "set enable_seqscan=off" and
> friends when there seemed no other short-term answer, I don't want to
> enshrine it as a standard recommendation.  It's just a way of plugging
> holes in the dike until more planner work gets done.

They can certainly turn of off before a query and on after the query,
right?  Seems people have been asking for fine-tuned optimizer control.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: query optimiser changes 6.5->7.0

From
Lincoln Yeoh
Date:
At 01:09 PM 02-06-2000 -0400, Tom Lane wrote:
>Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> They can certainly turn of off before a query and on after the query,
>> right?
>
>Sure, that's how I'd suggest using it at the moment.
>
>> Seems people have been asking for fine-tuned optimizer control.
>
>I know, but I think that's the wrong direction to go in, for the reasons
>I tried to explain.  The more specific the command you give to force
>a particular query plan, the more likely that command is to go wrong
>as the situation changes.

True.

But in very many cases the humans do better than the optimizer/planner.

How about a priority/rating/mode for indexes? And we won't have to deal
with legacy SQL statements containing "hints" scattered everywhere...

That's less blunt. But it's still so blunt that it may not be that useful
or work better than the global seq_scan setting.

Is it common for the optimizer to do stupid things? If it is, is there a
way for it to figure out it has done something stupid and to try not to do
it again?

e.g. Ooops, I actually went through 50% of the possible rows this time,
maybe that wasn't such a good idea. Try something else.

Or oops, my cost estimate was really off, revising estimate. I think this
would be preferable- assume the optimizer is not stupid, just ignorant.

Ack, it starts to look like a chess algorithm :).

Cheerio,

Link.