Re: Forcing the use of particular execution plans - Mailing list pgsql-performance

From Tim Truman
Subject Re: Forcing the use of particular execution plans
Date
Msg-id 000001c6e6a0$5903b990$0340050a@sdg
Whole thread Raw
In response to Re: Forcing the use of particular execution plans  ("Dave Dutcher" <dave@tridecap.com>)
Responses Re: Forcing the use of particular execution plans
List pgsql-performance
Here is an "explain analyze" for the query that performs slowly, I hope this
helps unfortunately I can't reproduce the version of the query that ran
quickly and therefore can't provide and 'explain analyze' for it.

"Aggregate  (cost=88256.32..88256.32 rows=1 width=0) (actual
time=55829.000..55829.000 rows=1 loops=1)"
"  ->  Subquery Scan foobar  (cost=0.00..88256.23 rows=35 width=0) (actual
time=19235.000..55829.000 rows=24 loops=1)"
"        ->  Append  (cost=0.00..88255.88 rows=35 width=631) (actual
time=19235.000..55829.000 rows=24 loops=1)"
"              ->  Subquery Scan "*SELECT* 1"  (cost=0.00..1165.12 rows=1
width=631) (actual time=16.000..16.000 rows=0 loops=1)"
"                    ->  Nested Loop  (cost=0.00..1165.11 rows=1 width=631)
(actual time=16.000..16.000 rows=0 loops=1)"
"                          ->  Index Scan using ix_transaction_merchant_id
on "transaction" t  (cost=0.00..1159.98 rows=1 width=349) (actual
time=16.000..16.000 rows=0 loops=1)"
"                                Index Cond: (198 = merchant_id)"
"                                Filter: ((transaction_date >=
'2005-01-01'::date) AND (transaction_date <= '2006-09-25'::date) AND
((credit_card_no)::text ~~ '4564%549'::text))"
"                          ->  Index Scan using pk_merchant on merchant m
(cost=0.00..5.11 rows=1 width=282) (never executed)"
"                                Index Cond: (id = 198)"
"              ->  Subquery Scan "*SELECT* 2"  (cost=20.90..87090.76 rows=34
width=631) (actual time=19219.000..55813.000 rows=24 loops=1)"
"                    ->  Hash Join  (cost=20.90..87090.42 rows=34 width=631)
(actual time=19219.000..55813.000 rows=24 loops=1)"
"                          Hash Cond: ("outer".merchant_id = "inner".id)"
"                          ->  Seq Scan on "transaction" t
(cost=0.00..87061.04 rows=1630 width=349) (actual time=234.000..55797.000
rows=200 loops=1)"
"                                Filter: ((transaction_date >=
'2005-01-01'::date) AND (transaction_date <= '2006-09-25'::date) AND
((credit_card_no)::text ~~ '4564%549'::text))"
"                          ->  Hash  (cost=20.88..20.88 rows=8 width=282)
(actual time=16.000..16.000 rows=0 loops=1)"
"                                ->  Seq Scan on merchant m
(cost=0.00..20.88 rows=8 width=282) (actual time=0.000..16.000 rows=7
loops=1)"
"                                      Filter: (parent_merchant_id = 198)"
"Total runtime: 55829.000 ms"

Once again any help much appreciated.

Tim

-----Original Message-----
From: Dave Dutcher [mailto:dave@tridecap.com]
Sent: Thursday, 28 September 2006 1:21 AM
To: 'Tim Truman'; pgsql-performance@postgresql.org
Subject: RE: [PERFORM] Forcing the use of particular execution plans

> -----Original Message-----
> From: pgsql-performance-owner@postgresql.org
[mailto:pgsql-performance-owner@postgresql.org] On Behalf Of  Tim Truman
>
> Hi,
>
> I have the following query which has been running very slowly
> and after a
> lot of testing/trial and error I found an execution plan that
> ran the query
> in a fraction of the time (and then lost the statistics that
> produced it).
> What I wish to know is how to force the query to use the
> faster execution
> plan.

It would be a bit easier to diagnose the problem if you posted EXPLAIN
ANALYZE rather than just EXPLAIN.  The two plans you posted looked very
similar except for the order of the nested loop in subquery 1 and an index
scan rather than a seq scan in subquery 2.

My guess would be that the order of the nested loop is determined mostly by
estimates of matching rows.  If you ran an EXPLAIN ANALYZE you could tell if
the planner is estimating correctly.  If it is not, you could try increasing
your statistics target and running ANALYZE.

To make the planner prefer an index scan over a seq scan, I would first
check the statistics again, and then you can try setting enable_seqscan to
false (enable_seqscan is meant more for testing than production) or, you
could try reducing random_page_cost, but you should test that against a
range of queries before putting it in production.

Dave

pgsql-performance by date:

Previous
From: "Carlo Stonebanks"
Date:
Subject: Re: Performace Optimization for Dummies
Next
From: Tom Lane
Date:
Subject: Re: Forcing the use of particular execution plans