Thread: optimizer tuning/forcing correct index use
Greetings,

I have a query that sometimes refuses to use a sensible index. Some info first:

Version: PostgreSQL 7.2
OS:      SuSE Linux 7.2 w/ kernel 2.4.14
HW:      Dual Athlon MP 1800, 768MB RAM; DB on a single fast SCSI drive.

The DB was loaded with a copy of our production system data (currently running on a SQL Server database).  'vacuumdb -a -z' was run after the load; no rows have been added or deleted since the vacuum.

Here is the query:

explain analyze
select t.tb_order_number, f.account_number, f.clearing_account_id,
       f.symbol_full_name, f.fill_quantity, f.buy_or_sell, f.trader_username,
       f.fill_price, f.fill_quantity, f.fill_ts, f.last_update_ts,
       s.symbol_type, s.base_symbol_full_name, s.symbol_name,
       ca.clearing_account_number, ca.clearing_firm, co.tick_value
from fill f, tb_order t, symbol s, clearing_account ca, contract co
where f.fill_ts > '2002-02-06 00:00:00'
  and f.fill_ts <= '2002-02-08 23:59:59'
  and t.order_id = f.order_id
  and s.full_name = f.symbol_full_name
  and co.contract_name = s.contract_name
  and ca.clearing_account_id = f.clearing_account_id
  and s.symbol_name not in ('ISLD:TESTA', 'ISLD:TESTB', 'ISLD:TESTC');

Total number of rows in the tables of the join:

fill             :  7674725
tb_order         : 20059204
symbol           :      559
clearing_account :       57
contract         :       13

When the time range (defined by fill_ts) is under 48 hours, the following execution plan is used:

Nested Loop  (cost=35.98..477477.54 rows=37403 width=170) (actual time=6.38..7868.65 rows=143582 loops=1)
  ->  Hash Join  (cost=35.98..256958.14 rows=37403 width=152) (actual time=6.21..3937.04 rows=143582 loops=1)
        ->  Hash Join  (cost=1.71..256229.53 rows=45364 width=90) (actual time=0.50..2685.68 rows=143582 loops=1)
              ->  Index Scan using fill_ak2 on fill f  (cost=0.00..255314.58 rows=69239 width=66) (actual time=0.26..1875.38 rows=143582 loops=1)
              ->  Hash  (cost=1.57..1.57 rows=57 width=24) (actual time=0.20..0.20 rows=0 loops=1)
                    ->  Seq Scan on clearing_account ca  (cost=0.00..1.57 rows=57 width=24) (actual time=0.05..0.14 rows=57 loops=1)
        ->  Hash  (cost=32.89..32.89 rows=547 width=62) (actual time=5.66..5.66 rows=0 loops=1)
              ->  Hash Join  (cost=1.16..32.89 rows=547 width=62) (actual time=0.21..4.73 rows=559 loops=1)
                    ->  Seq Scan on symbol s  (cost=0.00..20.78 rows=547 width=48) (actual time=0.05..2.43 rows=559 loops=1)
                    ->  Hash  (cost=1.13..1.13 rows=13 width=14) (actual time=0.11..0.11 rows=0 loops=1)
                          ->  Seq Scan on contract co  (cost=0.00..1.13 rows=13 width=14) (actual time=0.06..0.08 rows=13 loops=1)
  ->  Index Scan using order_pk on tb_order t  (cost=0.00..5.88 rows=1 width=18) (actual time=0.02..0.02 rows=1 loops=143582)
Total runtime: 7934.64 msec

Somewhere above 48 hours the plan changes to this:

Nested Loop  (cost=35.98..644155.97 rows=56104 width=170) (actual time=48504.60..128512.95 rows=196483 loops=1)
  ->  Hash Join  (cost=35.98..313376.24 rows=56104 width=152) (actual time=48450.25..121546.48 rows=196483 loops=1)
        ->  Hash Join  (cost=1.71..312300.45 rows=68045 width=90) (actual time=48444.62..119822.68 rows=196483 loops=1)
              ->  Seq Scan on fill f  (cost=0.00..310928.88 rows=103859 width=66) (actual time=48444.38..118737.56 rows=196483 loops=1)
              ->  Hash  (cost=1.57..1.57 rows=57 width=24) (actual time=0.19..0.19 rows=0 loops=1)
                    ->  Seq Scan on clearing_account ca  (cost=0.00..1.57 rows=57 width=24) (actual time=0.05..0.13 rows=57 loops=1)
        ->  Hash  (cost=32.89..32.89 rows=547 width=62) (actual time=5.57..5.57 rows=0 loops=1)
              ->  Hash Join  (cost=1.16..32.89 rows=547 width=62) (actual time=0.21..4.64 rows=559 loops=1)
                    ->  Seq Scan on symbol s  (cost=0.00..20.78 rows=547 width=48) (actual time=0.05..2.34 rows=559 loops=1)
                    ->  Hash  (cost=1.13..1.13 rows=13 width=14) (actual time=0.11..0.11 rows=0 loops=1)
                          ->  Seq Scan on contract co  (cost=0.00..1.13 rows=13 width=14) (actual time=0.07..0.09 rows=13 loops=1)
  ->  Index Scan using order_pk on tb_order t  (cost=0.00..5.88 rows=1 width=18) (actual time=0.03..0.03 rows=1 loops=196483)
Total runtime: 128598.23 msec

The critical part is driving the fill table query from a sequential scan rather
than a scan of fill_ak2.

I believe I understand why the optimizer changes the plan: it thinks that over the larger time range a scan of the fill_ak2 index will be more expensive than a sequential scan of the fill table.  Can someone recommend a way for me to show the optimizer the error of its ways (short of using 'set enable_seqscan = no', that is)?

Incidentally, in our current SQL Server database I use a hint to tell the optimizer specifically which index to use.  When using Oracle databases in the past, I've had to do the same thing.  I understand the PostgreSQL developers are against this approach, preferring to make the optimizer smarter instead.  I agree with this in principle; however, it seems to me that until the optimizer is perfect, this type of feature is needed.  Comments?

-K
On Tuesday 19 March 2002 09:28 am, Tom Lane wrote:
> Kelly Burkhart <kelly@tradebotsystems.com> writes:
> > The critical part is driving the fill table query from a sequential scan
> > rather than a scan of fill_ak2.
>
> Have you done an ANALYZE or VACUUM ANALYZE lately?  If so, what do you
> get from
>
> 	select * from pg_stats where tablename = 'fill';
> 	select * from pg_class where relname = 'fill';
>
> Offhand I am guessing that the table is fairly well ordered by fill_ts
> and the planner is underestimating the effects of this.  There is a
> provision in there to try to account for data ordering, but it's new
> code in 7.2 and doubtless still needs refinement.

I've attached the results of those queries.  The table was loaded in one-week chunks, mostly but not entirely in order: 11/2001-3/2002 was loaded first, in order, then 1/2001-10/2001 was loaded some time later.  Each dump file is probably in order, if you trust that SQL Server used the index order in the query (which you probably can).

In the future, when/if this database becomes production, each week we will nuke all rows older than 1 year (or maybe 2 years... who knows).  So after a point, the db should grow little.

-K
Kelly Burkhart <kelly@tradebotsystems.com> writes:
>> Offhand I am guessing that the table is fairly well ordered by fill_ts
>> and the planner is underestimating the effects of this.  There is a
>> provision in there to try to account for data ordering, but it's new
>> code in 7.2 and doubtless still needs refinement.

 fill | fill_ts | 0 | 8 | 152655 | {"2001-10-22 15:28:07-05","2001-10-22 15:28:16-05","2001-10-22 12:43:28-05","2001-08-13 08:49:19-05","2001-08-13 08:49:41-05","2001-09-25 16:13:41-05","2001-10-10 09:04:33-05","2001-10-22 14:50:05-05","2001-10-31 14:05:43-06","2002-01-07 13:35:48-06"} | {0.002,0.002,0.001,0.000666667,0.000666667,0.000666667,0.000666667,0.000666667,0.000666667,0.000666667} | {"2001-01-10 14:13:01-06","2001-07-10 13:05:01-05","2001-09-10 09:26:08-05","2001-10-15 09:01:54-05","2001-11-02 09:49:58-06","2001-11-28 09:36:33-06","2001-12-21 08:38:03-06","2002-01-15 09:34:59-06","2002-02-04 11:34:20-06","2002-02-25 09:39:55-06","2002-03-08 14:43:10-06"} | -0.546947

Hmm.  So the correlation of fill_ts with physical position is actually negative, according to the ANALYZE results.  Still, -0.54 represents rather strong correlation, which would reduce the cost of the index scan.

There was some discussion a couple of weeks ago on the pgsql-bugs list about changing the equation the planner uses to estimate the effects of correlation order.  Are you interested in experimenting?  I previously said:

: If you look in cost_index (see approx. lines 270-340 in
: src/backend/optimizer/path/costsize.c) you'll see that it computes
: access cost estimates for both the perfectly sequential case and
: the perfectly uncorrelated case, and then tries to interpolate
: between them.  I have reasonable faith in both of the endpoint
: estimation methods, but very little in the interpolation equation ---
: it was chosen on the spur of the moment and hasn't really been tested.
:
: It might be interesting to replace csquared with just
: fabs(indexCorrelation) to see if the results are better.

			regards, tom lane
Kelly Burkhart <kelly@tradebotsystems.com> writes:
> The critical part is driving the fill table query from a sequential scan
> rather than a scan of fill_ak2.

Have you done an ANALYZE or VACUUM ANALYZE lately?  If so, what do you get from

	select * from pg_stats where tablename = 'fill';
	select * from pg_class where relname = 'fill';

Offhand I am guessing that the table is fairly well ordered by fill_ts and the planner is underestimating the effects of this.  There is a provision in there to try to account for data ordering, but it's new code in 7.2 and doubtless still needs refinement.

			regards, tom lane
On Tuesday 19 March 2002 11:32 am, Tom Lane wrote:
<snip>
> Hmm.  So the correlation of fill_ts with physical position is actually
> negative, according to the analyze results.  Still, -0.54 represents
> rather strong correlation which would reduce the cost of the index scan.
>
> There was some discussion a couple weeks ago on the pgsql-bugs list about
> changing the equation the planner uses to estimate the effects of
> correlation order.  Are you interested in experimenting?  I previously
> said:
> : If you look in cost_index (see approx. lines 270-340 in
> : src/backend/optimizer/path/costsize.c) you'll see that it computes
> : access cost estimates for both the perfectly sequential case and
> : the perfectly uncorrelated case, and then tries to interpolate
> : between them.  I have reasonable faith in both of the endpoint
> : estimation methods, but very little in the interpolation equation ---
> : it was chosen on the spur of the moment and hasn't really been tested.
> :
> : It might be interesting to replace csquared with just
> : fabs(indexCorrelation) to see if the results are better.

At costsize.c:334 I've changed:

	csquared = indexCorrelation * indexCorrelation;

to:

	csquared = fabs(indexCorrelation);

It looks like the estimated cost of the fill index scan is lower, but not enough to change the plan.  I'll be happy to run more tests if you can think of other things to try.

-K