Re: unnesesary sorting after Merge Full Join - Mailing list pgsql-general

From Gregory Stark
Subject Re: unnesesary sorting after Merge Full Join
Date
Msg-id 87wsos0z9z.fsf@oxford.xeocode.com
Whole thread Raw
In response to Re: unnesesary sorting after Merge Full Join  ("Alexey A. Nalbat" <alexey_nalbat@hotbox.ru>)
Responses Re: unnesesary sorting after Merge Full Join
List pgsql-general
"Alexey A. Nalbat" <alexey_nalbat@hotbox.ru> writes:

> Yes. But may be the FULL MERGE JOIN could be improved, because it
> is ordered, it actually has "outer path's path key": "coalesce(id1,id2)".

Hm, there's more going on here than meets the eye. If you changed the query
slightly Postgres would actually _almost_ do the right thing here.

The immediate blocker is that currently in build_join_pathkeys() for FULL
OUTER JOIN we don't note any path keys at all. We could note COALESCE(id1,id2)
as a path key, though we would have to create an equivalence class and add
COALESCE(id2,id1) to it as well I think.

But if you changed the query to use USING you would get something like this:

postgres=# explain select * from a1 full outer join a2 as a2(i) using (i) order by i;
                               QUERY PLAN
------------------------------------------------------------------------
 Sort  (cost=2914.68..2986.68 rows=28800 width=8)
   Sort Key: (COALESCE(a1.i, a2.i))
   ->  Merge Full Join  (cost=337.49..781.49 rows=28800 width=8)
         Merge Cond: (a1.i = a2.i)
         ->  Sort  (cost=168.75..174.75 rows=2400 width=4)
               Sort Key: a1.i
               ->  Seq Scan on a1  (cost=0.00..34.00 rows=2400 width=4)
         ->  Sort  (cost=168.75..174.75 rows=2400 width=4)
               Sort Key: a2.i
               ->  Seq Scan on a2  (cost=0.00..34.00 rows=2400 width=4)
(10 rows)


Note that it does recognize that "i" is in fact equivalent to COALESCE(). It
just doesn't recognize that the merge join is producing such an ordering.

Even if it wasn't hard to add that at least for this case in
reconsider_outer_join_clauses() we explicitly don't consider join clauses
unless they're against a constant. I haven't quite absorbed the logic here but
I think it only applies to path keys useful for joining, not for ordering:

 * For full-join cases, we can only do something useful if it's a FULL JOIN
 * USING and a merged column has an equivalence MERGEDVAR = CONSTANT.
 * By the time it gets here, the merged column will look like
 *        COALESCE(LEFTVAR, RIGHTVAR)
 * and we will have a full-join clause LEFTVAR = RIGHTVAR that we can match
 * the COALESCE expression to. In this situation we can push LEFTVAR = CONSTANT
 * and RIGHTVAR = CONSTANT into the input relations, since any rows not
 * meeting these conditions cannot contribute to the join result.
 *
 * Again, there isn't any traction to be gained by trying to deal with
 * clauses comparing a mergedvar to a non-pseudoconstant.  So we can make
 * use of the EquivalenceClasses to search for matching variables that were
 * equivalenced to constants.  The interesting outer-join clauses were
 * accumulated for us by distribute_qual_to_rels.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!

pgsql-general by date:

Previous
From: "Andreas Lau"
Date:
Subject: Re: syntax error at or near "PROCEDURAL"
Next
From: "Pavan Deolasee"
Date:
Subject: Re: autovacuum not freeing up unused space on 8.3.0