Quoting Tom Lane <tgl@sss.pgh.pa.us>:
> Kim Bisgaard <kib+pg@dmi.dk> writes:
> > SELECT station_id, timeobs,temp_grass, temp_dry_at_2m
> > FROM temp_dry_at_2m a
> > FULL OUTER JOIN temp_grass b
> > USING (station_id, timeobs)
> > WHERE station_id = 52981
> > AND timeobs = '2004-1-1 0:0:0'
>
> > explain analyse SELECT b.station_id, b.timeobs,temp_grass, temp_dry_at_2m
> > FROM temp_dry_at_2m a
> > FULL OUTER JOIN temp_grass b
> > USING (station_id, timeobs)
> > WHERE b.station_id = 52981
> > AND b.timeobs = '2004-1-1 0:0:0'
>
> > Why will PostgreSQL not use the same plan for both these queries - they
> > are virtually identical??
>
> Because they're semantically completely different. The second query is
> effectively a RIGHT JOIN, because join rows in which b is all-null will
> be thrown away by the WHERE. The optimizer sees this (note your second
> plan doesn't use a Full Join step anywhere) and is able to produce a
> much better plan. Full outer join is difficult to optimize, in part
> because we have no choice but to use a merge join for it --- the other
> join types don't support full join.
>
> regards, tom lane
>
Yes I am aware that they are not "identical", they also give different results,
but the data nessesary to compute the results is (0-2 rows, 0-1 row from each
table), and thus ideally have the potential to have similar performance - to my
head anyway, but I may not have grasped the complete picture yet :-)
Regards,
Kim.