Richard van den Berg wrote:
>We have a table with 1M rows that contain sessions with a start and
>finish timestamps. When joining this table with a 10k table with rounded
>timestamps, explain shows me sequential scans are used, and the join
>takes about 6 hours (2s per seq scan on session table * 10000):
>
> Nested Loop (cost=252.80..233025873.16 rows=1035480320 width=97)
>Join Filter: (("outer".starttime <= "inner".ts) AND ("outer".finishtime
>
>
>>= "inner".ts))
>>
>>
> -> Seq Scan on sessions us (cost=0.00..42548.36 rows=924536
>width=105) -> Materialize (cost=252.80..353.60 rows=10080 width=8)
> -> Seq Scan on duration du (cost=0.00..252.80 rows=10080 width=8)
>
>However, during the initial loading of the data (we first load into text
>tables, then convert to tables using timestamps etc, then run this
>query) the same query took only 12 minutes. While debugging, I increased
>cpu_tuple_cost to 0.1 (from 0.01). Now the explain shows an index scan,
>and the run time comes down to 11 minutes:
>
> Nested Loop (cost=0.00..667700310.42 rows=1035480320 width=97)
> -> Seq Scan on sessions us (cost=0.00..125756.60 rows=924536 width=105)
> -> Index Scan using ix_du_ts on duration du (cost=0.00..604.46
>rows=1120 width=8)
> Index Cond: (("outer".starttime <= du.ts) AND
>("outer".finishtime >= du.ts))
>
>I am glad that I found a way to force the use of the index, but still
>can't explain why in the initial run the planner made the right choice,
>but now I need to give it a hand. Could this have to do with the
>statistics of the tables? I make very sure (during the initial load and
>while testing) that I vacuum analyze all tables after I fill them.
>
>I'm runing postgres 7.4.7.
>
>Any help is appreciated.
>
>
>
I believe the problem is that postgres doesn't recognize how restrictive
a date-range is unless it uses constants.
So saying:
select blah from du WHERE time between '2004-10-10' and '2004-10-15';
Will properly use the index, because it realizes it only returns a few rows.
However
select blah from du, us where du.ts between us.starttime and us.finishtime;
Doesn't know how selective that BETWEEN is.
This has been discussed as a future improvement to the planner (in
8.*). I don't know the current status.
Also, in the future, you really should post your table schema, and
explain analyze instead of just explain. (I realize that with a 6hr
query it is a little painful.)
Notice that in the above plans, the expected number of rows drops from
10k down to 1k (which is probably where the planner decides to switch).
And if you actually did the analyze probably the number of rows is much
lower still.
Probably you should try to find out the status of multi-table
selectivity. It was discussed in the last couple of months.
John
=:->