Inner join vs where-clause subquery - Mailing list pgsql-performance

From Jeremy Haile
Subject Inner join vs where-clause subquery
Date
Msg-id 1166538737.31684.281175363@webmail.messagingengine.com
Whole thread Raw
Responses Re: Inner join vs where-clause subquery
List pgsql-performance
I have the following query which performs extremely slow:
select min(nlogid) as start_nlogid,
       max(nlogid) as end_nlogid,
       min(dtCreateDate) as start_transaction_timestamp,
       max(dtCreateDate) as end_transaction_timestamp
from activity_log_facts
where nlogid > ( select max(a.end_nlogid) from
activity_log_import_history a)
and dtCreateDate < '2006-12-18 9:10'


If I change the where clause to have the return value of the subquery it
runs very fast:
select min(nlogid) as start_nlogid,
       max(nlogid) as end_nlogid,
       min(dtCreateDate) as start_transaction_timestamp,
       max(dtCreateDate) as end_transaction_timestamp
from activity_log_facts
where nlogid > 402123456
and dtCreateDate < '2006-12-18 9:10'


If I change the query to the following, it runs fast:
select min(nlogid) as start_nlogid,
       max(nlogid) as end_nlogid,
       min(dtCreateDate) as start_transaction_timestamp,
       max(dtCreateDate) as end_transaction_timestamp
from activity_log_facts
inner join ( select max(end_nlogid) as previous_nlogid from
activity_log_import_history) as a
on activity_log_facts.nlogid > a.previous_nlogid
where dtCreateDate < ${IMPORT_TIMESTAMP}


I am running PG 8.2.  Why is that this the case?  Shouldn't the query
planner be smart enough to know that the first query is the same as the
second and third?  The inner query does not refer to any columns outside
of itself.  I personally find the first query easiest to read and wish
it performed well.

Jeremy Haile

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: Insertion to temp table deteriorating over time
Next
From: "Steven Flatt"
Date:
Subject: Re: Insertion to temp table deteriorating over time