Thread: Determining scan types
Is there any way to force the query optimizer to prefer one scan type over another? I have a rather large query which the optimizer sometime last night decided to switch from using index scans in a variety of places to using sequential scans. This has resulted in a doubling of CPU usage and weird behavior where the number of rows the optimizer thinks it has to search through is rising slowly, even though new rows aren't being entered into any of the tables it's searching through. What causes this and how can I fix it? The query has not changed at all. The data in the rows have changed, but the just values, not the number of rows nor the indexing of the tables. It's baffling. All tables being joined (10) are being explicitly joined using INNER JOIN or LEFT JOIN in an order that should (at least it was) making optimal use of the indexing. * Philip Molter * DataFoundry.net * http://www.datafoundry.net/ * philip@datafoundry.net
On Tue, Jul 03, 2001 at 10:42:37AM -0400, Sam Tregar wrote: : On Tue, 3 Jul 2001, Philip Molter wrote: : : > What causes this and how can I fix it? : : Have you tried a VACUUM ANALYZE? For some reason Postgres isn't able to : use its indexes fully unless you VACUUM ANALYZE often. : : I consider this a bug but I gather the developers are OK with it. Yes. In fact, I have to VACUUM ANALYZE the tables every half hour on this constantly running system or else kernel CPU usage rises to unacceptable levels (another thing I consider a bug). Like I said, in the middle of the night (probably after one of these analyses), it switched from using index scans where appropriate to using sequential scans for everything. If I turn off sequential scans altogether, it uses the indices, but I don't get the performance of benefits of using sequential scans when appropriate. * Philip Molter * DataFoundry.net * http://www.datafoundry.net/ * philip@datafoundry.net
On Tue, 3 Jul 2001, Philip Molter wrote: > What causes this and how can I fix it? Have you tried a VACUUM ANALYZE? For some reason Postgres isn't able to use its indexes fully unless you VACUUM ANALYZE often. I consider this a bug but I gather the developers are OK with it. -sam
From: "Philip Molter" <philip@datafoundry.net> > On Tue, Jul 03, 2001 at 10:42:37AM -0400, Sam Tregar wrote: > : On Tue, 3 Jul 2001, Philip Molter wrote: > : > : > What causes this and how can I fix it? > : > : Have you tried a VACUUM ANALYZE? For some reason Postgres isn't able to > : use its indexes fully unless you VACUUM ANALYZE often. > : > : I consider this a bug but I gather the developers are OK with it. > > Yes. In fact, I have to VACUUM ANALYZE the tables every half hour on > this constantly running system or else kernel CPU usage rises to > unacceptable levels (another thing I consider a bug). Like I said, in > the middle of the night (probably after one of these analyses), it > switched from using index scans where appropriate to using sequential > scans for everything. If I turn off sequential scans altogether, it > uses the indices, but I don't get the performance of benefits of using > sequential scans when appropriate. VACUUM ANALYZE frequency depends on numbers of updates. I believe someone has been looking at a way of doing this in the background. For the purposes of setting SEQSCAN try something like: SET ENABLE_SEQSCAN TO OFF; Can't remember if it applies to this transaction or this connection. Run a grep on the docs - you'll only find a couple of hits. HTH - Richard Huxton
On Tue, Jul 03, 2001 at 05:12:43PM +0100, Richard Huxton wrote: : VACUUM ANALYZE frequency depends on numbers of updates. I believe someone : has been looking at a way of doing this in the background. Oh yeah, definitely depends on updates, or rather, changes to the table contents (insertions, deletions). : For the purposes of setting SEQSCAN try something like: : : SET ENABLE_SEQSCAN TO OFF; : : Can't remember if it applies to this transaction or this connection. Run a : grep on the docs - you'll only find a couple of hits. Well, I turned it off for the entire database (since right now, we're only using the db for this one application), but I lose the benefit of seqscans in situations where it's appropriate. That's why I was wondering if there's anyway to tell the optimizer to prefer index scans over sequential scans when it has a choice. Right now, it's using less efficient joining methods where it would normally use sequential scans (not that I'm complaining too much; CPU performance has more than doubled since taking out sequential scans). * Philip Molter * DataFoundry.net * http://www.datafoundry.net/ * philip@datafoundry.net
Philip Molter wrote: > : For the purposes of setting SEQSCAN try something like: > : > : SET ENABLE_SEQSCAN TO OFF; > > Well, I turned it off for the entire database (since right now, we're > only using the db for this one application), but I lose the benefit of > seqscans in situations where it's appropriate. That's why I was > wondering if there's anyway to tell the optimizer to prefer index scans > over sequential scans when it has a choice. AFAIK that's exactly what ENABLE_SEQSCAN does. > Right now, it's using less > efficient joining methods where it would normally use sequential scans > (not that I'm complaining too much; CPU performance has more than > doubled since taking out sequential scans). Try wrapping the one or two key queries with ...OFF and ...ON - should give you the best of both worlds. I've always managed to persuade PG to make the "right" choices, but there are a whole set of parameters you can tweak if absolutely necessary. The ENABLE_SEQSCAN is a pretty blunt tool but is the easiest for (me :-) to understand. See the runtime configuration chapter for details and check the mail archives for examples of use. - Richard Huxton
On Wed, Jul 04, 2001 at 07:31:53AM +0100, Richard Huxton wrote: : Philip Molter wrote: : : > : For the purposes of setting SEQSCAN try something like: : > : : > : SET ENABLE_SEQSCAN TO OFF; : > : > Well, I turned it off for the entire database (since right now, we're : > only using the db for this one application), but I lose the benefit of : > seqscans in situations where it's appropriate. That's why I was : > wondering if there's anyway to tell the optimizer to prefer index scans : > over sequential scans when it has a choice. : : AFAIK that's exactly what ENABLE_SEQSCAN does. It appears that ENABLE_SEQSCAN prevents the optimizer from using seqscans at all. That's not really the behavior I want. I just don't want it to be using them when it should be using index scans instead. : Try wrapping the one or two key queries with ...OFF and ...ON - should : give you the best of both worlds. Yeah, except I'm trying to write generally database agnostic (SQL92) code. It's rather idiotic that I have to tell the optimizer not only in which order to join the tables (that I can live with), but also what the best method is to join them with. : I've always managed to persuade PG to make the "right" choices, but : there are a whole set of parameters you can tweak if absolutely : necessary. The ENABLE_SEQSCAN is a pretty blunt tool but is the easiest : for (me :-) to understand. See the runtime configuration chapter for : details and check the mail archives for examples of use. Yeah, I don't mind tweaking, but the query optimizer, even with as many tables as this join has, should be able to recognize when unique indices are available and use them. I mean, it chops the aggregate row count from 3000+ down to 11 (that's just shutting off seq scans). That's pretty significant and symbolizes a problem with the optimizer when it can't figure it out. Furthermore, it *was* using indices and something happened to make it stop. I still have no clue what that was. It's rather irrelevant anyway. Turning off ENABLE_SEQSCAN appears to make the optimizer work much more like I intended, so off it stays. * Philip Molter * DataFoundry.net * http://www.datafoundry.net/ * philip@datafoundry.net