"Francisco Reyes" <lists@stringsutils.com> writes:
> SET ENABLE_SEQSCAN TO OFF;
> SET ENABLE_BITMAPSCAN TO OFF;
> Aggregate  (cost=25665216.10..25665216.11 rows=1 width=12) (actual time=3088.894..3088.896 rows=1 loops=1)
>   ->  Nested Loop  (cost=0.00..25662307.70 rows=387785 width=12) (actual time=0.264..2624.680 rows=194734 loops=1)
>         ->  Index Scan using join_ids_join_id on join_ids  (cost=0.00..2867051.21 rows=5020 width=4) (actual time=0.237..1236.019 rows=4437 loops=1)
>               Filter: (customer_id = ANY ('{1014,2485,4636,4635,1255,547,374,580}'::integer[]))
>         ->  Index Scan using historical_join_id_date on historical  (cost=0.00..4522.43 rows=1477 width=16) (actual time=0.010..0.153 rows=44 loops=4437)
>               Index Cond: ((historical.join_id = join_ids.join_id) AND (historical.date > '2007-04-01'::date) AND (historical.date < '2008-05-01'::date))
>               Filter: (trans.f5 > 0::numeric)
> Total runtime: 3091.227 ms  --> 3 seconds
You might be more likely to get a sane plan if you had an index on
join_ids.customer_id. The first indexscan above is really a completely
silly choice, and would never have been used if you weren't holding
a gun to the planner's head. The index isn't contributing any
selectivity at all: the customer_id restriction shows up as a Filter,
not an Index Cond, so the scan walks the whole index and throws rows
away afterward.
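Something along these lines should do it (just a sketch; the index
name is mine, so use whatever fits your conventions):

    -- Plain btree index on the column the planner currently has
    -- nothing useful for:
    CREATE INDEX join_ids_customer_id ON join_ids (customer_id);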
The other part of the problem is the factor-of-thirty overestimate of
the number of rows that the inner indexscan will produce (1477
estimated vs. 44 actual), which means also a factor-of-thirty
overestimate of its cost. Perhaps higher statistics targets for these
two relations would give you a better estimate there.
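For example (column names taken from the plan above; 100 is just a
value to experiment with, not a recommendation):

    -- Raise the per-column statistics targets, then re-ANALYZE so
    -- the planner actually sees the larger samples.
    ALTER TABLE join_ids ALTER COLUMN join_id SET STATISTICS 100;
    ALTER TABLE historical ALTER COLUMN join_id SET STATISTICS 100;
    ALTER TABLE historical ALTER COLUMN date SET STATISTICS 100;
    ANALYZE join_ids;
    ANALYZE historical;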
But there's something else going on, because the estimated rowcount for
the join (387785) is considerably less than the product of the scan
estimates (5020 * 1477 = 7414540), when it should be the same since
there's no additional join condition. What PG version are you running
exactly?
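    SELECT version();

will show the full version string.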
regards, tom lane