On Thu, May 15, 2014 at 4:50 PM, David G Johnston
<david.g.johnston@gmail.com> wrote:
> Jon Nelson-14 wrote
>> I was watching a very large recursive CTE get built today and this CTE
>> involves on the order of a dozen or so "loops" joining the initial
>> table against existing tables. It struck me that - every time through
>> the loop the tables were sorted and then joined and that it would be
>> much more efficient if the tables remained in a sorted state and could
>> avoid being re-sorted each time through the loop. Am I missing
>> something here? I am using PG 8.4 if that matters.
>
> I'm not sure what you mean by "watching" but maybe this is a simple as
> changing your CTE to use "UNION ALL" instead of "UNION [DISTINCT]"?
In fact, I'm using UNION ALL.
> If you really think it could be improved upon maybe you can help and provide
> a minimal self-contained example query and data that exhibits the behavior
> you describe so others can see it and test changes? It would be nice to
> know if other versions than one that is basically no longer supported
> exhibits the same behavior.
Pretty much any CTE that looks like this:
with cte AS ( select stuff from A UNION ALL select more_stuff from B, cte WHERE <join conditions>
) SELECT * FROM cte;
*and* where the planner chooses to join B and cte by sorting and doing
a merge join.
I'll see if I can come up with a self-contained example.
--
Jon