Edmund Dengler <edmundd@eSentire.com> writes:
> select * from event
> where timestamp > (select now() - '2 hours'::interval)
> and exists (select 1 from hack_pull_sid where sid = event.sid)
> (note: <hack_pull_sid> is a table of SIDs I am interested in so that I
> avoid the issues with IN)
I think you're creating more issues than you're avoiding. With the
above query, the planner has little chance of guessing how many rows
will be retrieved from "event" ... and indeed the EXPLAIN output shows
that its guess is off by more than a factor of 1000:
> -> Index Scan using timestamp_idx on event (cost=0.00..558165.62 rows=237893 width=24) (actual time=0.18..3.05
rows=129loops=1)
With a misestimate of that magnitude at the core of the query, it's
unsurprising that all the other plan choices are bad too.
But actually I suspect the easiest point of attack is not the EXISTS
subquery, but the timestamp comparison. Can you get your application
to supply a simple literal constant to compare to the timestamp, viz
'2003-06-10 21:44' rather than now()-'2 hours'? The former gives the
planner something to compare to its statistics, the latter doesn't.
Oh ... you have done an ANALYZE on event reasonably recently, no?
regards, tom lane