Thread: Subquery Factoring ?

Subquery Factoring ?

From
"Adam Rich"
Date:
Are there any plans to support materialized subquery / factoring (sql-99
WITH)
in Postgresql?  I am spoiled with this feature in oracle, and find myself
wishing I had it in postgresql more and more.  It *seems* to an outsider
like a relatively easy addition.

I searched the archives but only found a brief mention of a
syntax-support-only
patch from last winter.

Adam





Re: Subquery Factoring ?

From
Gregory Stark
Date:
"Adam Rich" <adam.r@sbcglobal.net> writes:

> Are there any plans to support materialized subquery / factoring (sql-99
> WITH) in Postgresql? I am spoiled with this feature in oracle, and find
> myself wishing I had it in postgresql more and more. It *seems* to an
> outsider like a relatively easy addition.

We have the patch to support the syntax but we're a bit unsure whether that's
actually helpful. One thing we're missing is exactly what you could help with.

What I would like to know is what the user expectations are when using this
feature.

Are you using it just to avoid retyping a complex subquery? Or do you expect
that the feature will reduce the execution time by avoiding re-executing the
subquery for each call site in the query?

How disappointing would it be if the WITH clause acted as an optimization
barrier, preventing WHERE clauses from being pushed down and potentially using
indexes? Or if the query ended up not needing the data in the WITH subquery
but the query had to execute it anyways?

Another way of looking at this question is: if you called some volatile
function from the subquery such as one which printed diagnostic messages or
accessed some remote service, how many times would you expect it to be called?
Would you expect the feature to guarantee that the function would only be
called once or would it be ok if it were called 0 times if the subquery data
was never needed or many times if the optimizer thought that would be faster?

> I searched the archives but only found a brief mention of a
> syntax-support-only patch from last winter.

Yeah, it was the first step in implementing recursive queries. It was deemed
uninteresting without actual recursive query support, but there was some more
recent emails where someone else was interested in it so that decision may be
revisited for 8.4. And in any case there's still a good chance (though no
guarantee) that recursive queries will be done for 8.4 which will definitely
imply this feature.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

Re: Subquery Factoring ?

From
"Adam Rich"
Date:
Gregory,

> Are you using it just to avoid retyping a complex subquery? Or do you
> expect
> that the feature will reduce the execution time by avoiding re-
> executing the
> subquery for each call site in the query?

The only situation where I rely on this currently is when my main/outer
query references or executes an expensive subquery multiple times.
The main goal is to speed the whole thing up by only executing the
subquery once, but making the query shorter and easier to read is
a nice side effect.

> How disappointing would it be if the WITH clause acted as an
> optimization
> barrier, preventing WHERE clauses from being pushed down and
> potentially using indexes?

Do you mean that WHERE clauses in the main/outer query are not used to
optimize the subquery?  This would be a great feature down the road,
but I would be quite happy without it.

> Or if the query ended up not needing the data in the WITH
> subquery but the query had to execute it anyways?

Hmmm...  I would expect the query be executed once, regardless.
Indeed, I would think that executing it more than once would produce
strange results.  I'm counting on all references to the subquery to
contain the same data.

> Another way of looking at this question is: if you called some volatile
> function from the subquery such as one which printed diagnostic
> messages or
> accessed some remote service, how many times would you expect it to be
> called?
> Would you expect the feature to guarantee that the function would only
> be
> called once or would it be ok if it were called 0 times if the subquery
> data
> was never needed or many times if the optimizer thought that would be
> faster?

I would expect it to be called exactly once.

I think that logically, I expect "WITH" to work as shorthand for creating
a temporary table, filling it with my subquery, and then executing the
rest of the main/outer query.  (As opposed to creating a temporary view)

Adam