Re: Slow inner join, but left join is fast - Mailing list pgsql-performance

From Jeremy Haile
Subject Re: Slow inner join, but left join is fast
Date
Msg-id 1168453218.22654.1168592237@webmail.messagingengine.com
Whole thread Raw
In response to Re: Slow inner join, but left join is fast  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Slow inner join, but left join is fast  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
The table should have been analyzed, but to make sure I ran analyze on
the table before executing the explain analyze queries.  Well - problem
solved.  This time the inner join query runs quickly.

I still don't understand why the inner join would be so different from
the left join prior to the analyze.  It looks like the amount of rows
expected in the original query plan for inner join was 1 (not correct
since it was really 8728)  The left join query had the exact same
subqueries but expected 77214 rows to be returned from them, which was
still not correct but resulted in a better query plan.

After the recent analyze, here's the new inner join query plan.  I won't
bother pasting the left join plan, since it is almost identical now
(including row counts)  FYI -the result of the queries is (and always
was) identical for inner join and left join.


QUERY PLAN (inner join)
Merge Join  (cost=279457.86..279479.83 rows=43 width=120) (actual
time=626.771..670.275 rows=8728 loops=1)
  Merge Cond: ((a.dcms_dim_id = b.dcms_dim_id) AND (a.merchant_dim_id =
  b.merchant_dim_id))
  ->  Sort  (cost=139717.30..139722.38 rows=2029 width=64) (actual
  time=265.669..269.878 rows=8728 loops=1)
        Sort Key: a.dcms_dim_id, a.merchant_dim_id
        ->  HashAggregate  (cost=139519.61..139585.56 rows=2029
        width=16) (actual time=211.368..247.429 rows=8728 loops=1)
              ->  Bitmap Heap Scan on transaction_facts
              (cost=4427.62..138316.05 rows=68775 width=16) (actual
              time=21.858..100.998 rows=65789 loops=1)
                    Recheck Cond: ((transaction_date >= '2007-01-09
                    00:00:00'::timestamp without time zone) AND
                    (transaction_date < '2007-01-09 09:30:00'::timestamp
                    without time zone))
                    ->  Bitmap Index Scan on
                    transaction_facts_transaction_date_idx
                    (cost=0.00..4410.42 rows=68775 width=0) (actual
                    time=21.430..21.430 rows=65789 loops=1)
                          Index Cond: ((transaction_date >= '2007-01-09
                          00:00:00'::timestamp without time zone) AND
                          (transaction_date < '2007-01-09
                          09:30:00'::timestamp without time zone))
  ->  Sort  (cost=139740.56..139742.67 rows=843 width=64) (actual
  time=361.083..365.418 rows=8728 loops=1)
        Sort Key: b.dcms_dim_id, b.merchant_dim_id
        ->  Subquery Scan b  (cost=139663.76..139699.59 rows=843
        width=64) (actual time=308.567..346.135 rows=8728 loops=1)
              ->  HashAggregate  (cost=139663.76..139691.16 rows=843
              width=16) (actual time=308.563..337.677 rows=8728 loops=1)
                    ->  HashAggregate  (cost=139347.68..139431.97
                    rows=8429 width=55) (actual time=198.093..246.591
                    rows=48942 loops=1)
                          ->  Bitmap Heap Scan on transaction_facts
                          (cost=4427.62..138316.05 rows=68775 width=55)
                          (actual time=24.080..83.988 rows=65789
                          loops=1)
                                Recheck Cond: ((transaction_date >=
                                '2007-01-09 00:00:00'::timestamp without
                                time zone) AND (transaction_date <
                                '2007-01-09 09:30:00'::timestamp without
                                time zone))
                                ->  Bitmap Index Scan on
                                transaction_facts_transaction_date_idx
                                (cost=0.00..4410.42 rows=68775 width=0)
                                (actual time=23.596..23.596 rows=65789
                                loops=1)
                                      Index Cond: ((transaction_date >=
                                      '2007-01-09 00:00:00'::timestamp
                                      without time zone) AND
                                      (transaction_date < '2007-01-09
                                      09:30:00'::timestamp without time
                                      zone))
Total runtime: 675.638 ms



On Wed, 10 Jan 2007 12:15:44 -0500, "Tom Lane" <tgl@sss.pgh.pa.us> said:
> "Jeremy Haile" <jhaile@fastmail.fm> writes:
> > I have a query made by joining two subqueries where the outer query
> > performing the join takes significantly longer to run than the two
> > subqueries.
>
> Please show EXPLAIN ANALYZE results, not just EXPLAIN.
> Also, have you analyzed your tables recently?
>
>             regards, tom lane

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: Slow inner join, but left join is fast
Next
From: Tom Lane
Date:
Subject: Re: Slow inner join, but left join is fast