Dan Harris <fbsd@drivefaster.net> writes:
> Yes, eventactivity.incidentid is indexed. The datatype is varchar(40).
> Although, by checking this, I noticed that k_h.incidentid was
> varchar(100). Perhaps the difference in length between the keys caused
> the planner to not use the fastest method?
No, the planner wouldn't care about that.
> Here's the EXPLAIN analyze with enable_hashjoin = off and
> enable_mergejoin = off :
OK, so it does consider the "right" plan, but it's estimating it'll take
longer than the other one. One thing that's very strange is that the
estimated number of rows out has changed ... did you re-ANALYZE since
the previous message?
> -> Index Scan using eventactivity1 on
> eventactivity (cost=0.00..5774.81 rows=20 width=52) (actual
> time=29.768..51.334 rows=3 loops=1162)
> Index Cond: (("outer".incidentid)::text =
> (eventactivity.incidentid)::text)
> Filter: ((((' '::text || (recordtext)::text)
> || ' '::text) ~~ '%HAL%'::text) AND (entrydate >= '2006-01-01
> 00:00:00'::timestamp without time zone) AND (entrydate < '2006-04-08
> 00:00:00'::timestamp without time zone))
So it's estimating 5775 cost units per probe into eventactivity, which
is pretty high --- it must think that a lot of rows will be retrieved by
the index (way more than the 20 or so it thinks will get past the filter
condition). What does the pg_stats entry for eventactivity.incidentid
contain? It might be worth increasing the statistics target for that
column to try to get a better estimate.
regards, tom lane