On Tue, May 19, 2009 at 6:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
>> On Tue, 2009-05-19 at 08:58 -0400, Tom Lane wrote:
>>> Nonsense. The planner might think some other plan is cheaper, but
>>> it definitely knows how to do this, and has since at least 8.1.
>
>> Please look at Dimitri's plan. If it can remove the pointless sort, why
>> does it not do so?
>
> I haven't followed the whole thread, but the plan in the original post
> is for a hash join. The planner does not trust a hash join to preserve
> the order of its left input, because of possible batching. See the
> discussion a couple of months ago where we considered allowing the
> planner to disable batching so it *could* assume order preservation, and
> decided the risk of hashtable bloat was too great.
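
The batching problem Tom describes can be illustrated with a toy sketch (this is
not PostgreSQL's implementation, just a minimal grace-hash-join analogue): once
the join spills into multiple batches, rows are emitted batch by batch, so even
a perfectly sorted left input comes out interleaved.

```python
# Toy batched ("grace") hash join. Rows are partitioned by hash value,
# then each batch is joined and emitted in turn -- which is exactly why
# a multi-batch hash join cannot be trusted to preserve the order of
# its left input.

def batched_hash_join(outer, inner, key, n_batches):
    """Equi-join `outer` and `inner` on `key`, using n_batches partitions."""
    batches = [([], {}) for _ in range(n_batches)]
    for row in inner:                 # build side: hash into per-batch tables
        out_rows, table = batches[hash(key(row)) % n_batches]
        table.setdefault(key(row), []).append(row)
    for row in outer:                 # probe side: route rows to their batch
        out_rows, table = batches[hash(key(row)) % n_batches]
        out_rows.append(row)
    results = []
    for out_rows, table in batches:   # batches are drained one at a time
        for o in out_rows:
            for i in table.get(key(o), []):
                results.append((o, i))
    return results

outer = [1, 2, 3, 4, 5, 6]            # left input arrives sorted
inner = [2, 3, 5, 6]
joined = batched_hash_join(outer, inner, key=lambda r: r, n_batches=2)
print([o for o, _ in joined])         # keys come out in batch order, not sorted
```

With two batches the sorted outer input [1..6] is emitted as [2, 6, 3, 5]:
correct join results, but the input ordering is gone.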

Hmm, my recollection of that conversation was that we decided the
planner should tell the executor whether or not we are relying on it
to produce sorted output.  The planner would set this flag only when
the hash join is expected to fit comfortably within one batch (that
is, we allow a safety margin).  If the flag is set and a hash join
unexpectedly goes multi-batch, we would perform a final merge pass (a
la merge sort) before returning any results.
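
A hypothetical sketch of that final merge pass: if each batch's output is
individually sorted on the join key (how that per-batch ordering is obtained is
assumed here, not taken from the thread), a k-way merge restores global order
before any rows reach the caller.

```python
# Merge per-batch outputs that are each sorted on the join key.
# heapq.merge performs a lazy k-way merge, so the first row can be
# returned as soon as the smallest head-of-batch key is known.
import heapq

batch_outputs = [
    [(2, 'a'), (6, 'b')],   # batch 0 output, sorted on the key
    [(3, 'c'), (5, 'd')],   # batch 1 output, sorted on the key
]
merged = list(heapq.merge(*batch_outputs, key=lambda t: t[0]))
print(merged)               # globally ordered across all batches
```

The merged stream is [(2, 'a'), (3, 'c'), (5, 'd'), (6, 'b')], i.e. fully
ordered even though no single batch saw the whole input.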

I don't think it's a good idea to write off the idea of implementing
this optimization at some point.  I see a lot of queries that join one
fairly large table against a whole bunch of little tables and then
sort the results by a column that is indexed in the big table.  The
optimizer handles this by sequentially scanning the big table, hash
joining against all of the little tables, and then sorting the output,
which is pretty silly (given that all of the tables fit in RAM and are
in fact cached there).  If there is a LIMIT clause, it might instead
index-scan the big table, do the hash joins, and then sort the
already-ordered results.  This is better because at least we're not
sorting the entire table unnecessarily, but it's still poor.
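
The plan shape being argued for can be sketched as follows (a toy model, not
executor code; the table names and values are made up): scan the big table in
index order, probe small in-memory hash tables for the little tables, and emit
rows that are already sorted, so no final Sort node is needed and a LIMIT can
stop early.

```python
# Sketch: index-ordered scan of the big table, probing in-memory
# (single-batch) hash tables for the small tables.  Because each probe
# preserves the row's position, the output is already sorted on the
# index column and needs no separate sort step.
big = [(1, 'x1'), (2, 'x2'), (3, 'x3')]   # rows in index order on column 1
small_a = {1: 'a1', 2: 'a2', 3: 'a3'}     # small table, fits in one hash batch
small_b = {1: 'b1', 3: 'b3'}              # ditto; key 2 has no match

out = []
for k, x in big:                          # index scan: keys arrive sorted
    a = small_a.get(k)
    b = small_b.get(k)
    if a is not None and b is not None:   # inner join semantics
        out.append((k, x, a, b))
print(out)                                # already ordered on k
```

The output [(1, 'x1', 'a1', 'b1'), (3, 'x3', 'a3', 'b3')] comes out in key
order for free, which is the whole point of preferring this plan when the
hash tables reliably stay in memory.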
...Robert