Thread: Determining scan types

Determining scan types

From

Philip Molter

Date:

03 July 2001, 10:24:19

Is there any way to force the query optimizer to prefer one scan type
over another?  I have a rather large query which the optimizer sometime
last night decided to switch from using index scans in a variety of
places to using sequential scans.  This has resulted in a doubling of
CPU usage and weird behavior where the number of rows the optimizer
thinks it has to search through is rising slowly, even though new rows
aren't being entered into any of the tables it's searching through.

What causes this and how can I fix it?  The query has not changed at
all.  The data in the rows have changed, but the just values, not the
number of rows nor the indexing of the tables.  It's baffling.  All
tables being joined (10) are being explicitly joined using INNER JOIN
or LEFT JOIN in an order that should (at least it was) making optimal
use of the indexing.

* Philip Molter
* DataFoundry.net
* http://www.datafoundry.net/
* philip@datafoundry.net

Re: Determining scan types

From

Philip Molter

Date:

03 July 2001, 11:17:26

On Tue, Jul 03, 2001 at 10:42:37AM -0400, Sam Tregar wrote:
: On Tue, 3 Jul 2001, Philip Molter wrote:
:
: > What causes this and how can I fix it?
:
: Have you tried a VACUUM ANALYZE?  For some reason Postgres isn't able to
: use its indexes fully unless you VACUUM ANALYZE often.
:
: I consider this a bug but I gather the developers are OK with it.

Yes.  In fact, I have to VACUUM ANALYZE the tables every half hour on
this constantly running system or else kernel CPU usage rises to
unacceptable levels (another thing I consider a bug).  Like I said, in
the middle of the night (probably after one of these analyses), it
switched from using index scans where appropriate to using sequential
scans for everything.  If I turn off sequential scans altogether, it
uses the indices, but I don't get the performance of benefits of using
sequential scans when appropriate.

* Philip Molter
* DataFoundry.net
* http://www.datafoundry.net/
* philip@datafoundry.net

Re: Determining scan types

From

Sam Tregar

Date:

03 July 2001, 11:17:29

On Tue, 3 Jul 2001, Philip Molter wrote:

> What causes this and how can I fix it?

Have you tried a VACUUM ANALYZE?  For some reason Postgres isn't able to
use its indexes fully unless you VACUUM ANALYZE often.

I consider this a bug but I gather the developers are OK with it.

-sam

Re: Determining scan types

From

"Richard Huxton"

Date:

03 July 2001, 13:03:33

From: "Philip Molter" <philip@datafoundry.net>

> On Tue, Jul 03, 2001 at 10:42:37AM -0400, Sam Tregar wrote:
> : On Tue, 3 Jul 2001, Philip Molter wrote:
> :
> : > What causes this and how can I fix it?
> :
> : Have you tried a VACUUM ANALYZE?  For some reason Postgres isn't able to
> : use its indexes fully unless you VACUUM ANALYZE often.
> :
> : I consider this a bug but I gather the developers are OK with it.
>
> Yes.  In fact, I have to VACUUM ANALYZE the tables every half hour on
> this constantly running system or else kernel CPU usage rises to
> unacceptable levels (another thing I consider a bug).  Like I said, in
> the middle of the night (probably after one of these analyses), it
> switched from using index scans where appropriate to using sequential
> scans for everything.  If I turn off sequential scans altogether, it
> uses the indices, but I don't get the performance of benefits of using
> sequential scans when appropriate.

VACUUM ANALYZE frequency depends on numbers of updates. I believe someone
has been looking at a way of doing this in the background.

For the purposes of setting SEQSCAN try something like:

SET ENABLE_SEQSCAN TO OFF;

Can't remember if it applies to this transaction or this connection. Run a
grep on the docs - you'll only find a couple of hits.

HTH

- Richard Huxton

Re: Determining scan types

From

Philip Molter

Date:

03 July 2001, 13:18:39

On Tue, Jul 03, 2001 at 05:12:43PM +0100, Richard Huxton wrote:
: VACUUM ANALYZE frequency depends on numbers of updates. I believe someone
: has been looking at a way of doing this in the background.

Oh yeah, definitely depends on updates, or rather, changes to the
table contents (insertions, deletions).

: For the purposes of setting SEQSCAN try something like:
:
: SET ENABLE_SEQSCAN TO OFF;
:
: Can't remember if it applies to this transaction or this connection. Run a
: grep on the docs - you'll only find a couple of hits.

Well, I turned it off for the entire database (since right now, we're
only using the db for this one application), but I lose the benefit of
seqscans in situations where it's appropriate.  That's why I was
wondering if there's anyway to tell the optimizer to prefer index scans
over sequential scans when it has a choice.  Right now, it's using less
efficient joining methods where it would normally use sequential scans
(not that I'm complaining too much; CPU performance has more than
doubled since taking out sequential scans).

* Philip Molter
* DataFoundry.net
* http://www.datafoundry.net/
* philip@datafoundry.net

Re: Determining scan types

From

Richard Huxton

Date:

04 July 2001, 04:01:00

Philip Molter wrote:

> : For the purposes of setting SEQSCAN try something like:
> :
> : SET ENABLE_SEQSCAN TO OFF;
>
> Well, I turned it off for the entire database (since right now, we're
> only using the db for this one application), but I lose the benefit of
> seqscans in situations where it's appropriate.  That's why I was
> wondering if there's anyway to tell the optimizer to prefer index scans
> over sequential scans when it has a choice.

AFAIK that's exactly what ENABLE_SEQSCAN does.

> Right now, it's using less
> efficient joining methods where it would normally use sequential scans
> (not that I'm complaining too much; CPU performance has more than
> doubled since taking out sequential scans).

Try wrapping the one or two key queries with ...OFF and ...ON - should
give you the best of both worlds.

I've always managed to persuade PG to make the "right" choices, but
there are a whole set of parameters you can tweak if absolutely
necessary. The ENABLE_SEQSCAN is a pretty blunt tool but is the easiest
for (me :-) to understand. See the runtime configuration chapter for
details and check the mail archives for examples of use.

- Richard Huxton

Re: Determining scan types

From

Philip Molter

Date:

04 July 2001, 12:31:22

On Wed, Jul 04, 2001 at 07:31:53AM +0100, Richard Huxton wrote:
: Philip Molter wrote:
:
: > : For the purposes of setting SEQSCAN try something like:
: > :
: > : SET ENABLE_SEQSCAN TO OFF;
: >
: > Well, I turned it off for the entire database (since right now, we're
: > only using the db for this one application), but I lose the benefit of
: > seqscans in situations where it's appropriate.  That's why I was
: > wondering if there's anyway to tell the optimizer to prefer index scans
: > over sequential scans when it has a choice.
:
: AFAIK that's exactly what ENABLE_SEQSCAN does.

It appears that ENABLE_SEQSCAN prevents the optimizer from using
seqscans at all.  That's not really the behavior I want.  I just don't
want it to be using them when it should be using index scans instead.

: Try wrapping the one or two key queries with ...OFF and ...ON - should
: give you the best of both worlds.

Yeah, except I'm trying to write generally database agnostic (SQL92)
code.  It's rather idiotic that I have to tell the optimizer not only
in which order to join the tables (that I can live with), but also what
the best method is to join them with.

: I've always managed to persuade PG to make the "right" choices, but
: there are a whole set of parameters you can tweak if absolutely
: necessary. The ENABLE_SEQSCAN is a pretty blunt tool but is the easiest
: for (me :-) to understand. See the runtime configuration chapter for
: details and check the mail archives for examples of use.

Yeah, I don't mind tweaking, but the query optimizer, even with as many
tables as this join has, should be able to recognize when unique
indices are available and use them.  I mean, it chops the aggregate row
count from 3000+ down to 11 (that's just shutting off seq scans).
That's pretty significant and symbolizes a problem with the optimizer
when it can't figure it out.  Furthermore, it *was* using indices and
something happened to make it stop.  I still have no clue what that
was.

It's rather irrelevant anyway.  Turning off ENABLE_SEQSCAN appears to
make the optimizer work much more like I intended, so off it stays.

* Philip Molter
* DataFoundry.net
* http://www.datafoundry.net/
* philip@datafoundry.net