Re: Common Table Expressions (WITH RECURSIVE) patch - Mailing list pgsql-hackers

From Tatsuo Ishii
Subject Re: Common Table Expressions (WITH RECURSIVE) patch
Date
Msg-id 20080909.134555.99255049.t-ishii@sraoss.co.jp
Whole thread Raw
In response to Common Table Expressions (WITH RECURSIVE) patch  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: Common Table Expressions (WITH RECURSIVE) patch  ("Robert Haas" <robertmhaas@gmail.com>)
Re: Common Table Expressions (WITH RECURSIVE) patch  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-hackers
Thanks for the review.

[I dropped y-asaba@sraoss.co.jp from the Cc list since he has left our
company and the email address is being deleted.]

I'm going to look into issues which are seem to be bug (of course if
you know what to fix, patches are always welcome:-).

> These are my initial comments about the Common Table Expressions (CTE)
> patch, also known as WITH [RECURSIVE]. These comments are based on the
> patch here:
> 
> http://archives.postgresql.org/pgsql-patches/2008-08/msg00021.php
> 
> This is a semantically complex feature, and the standard is fairly
> complex as well. So I'm approaching this by making my own
> interpretations from the standard first (I included my interpretations
> and section references at the end of this email) and comparing to the
> behavior of the patch.
> 
> The following examples may be inconsistent with the standard. Some
> have already been mentioned, and I don't think they all need to be
> fixed for 8.4, but I mention them here for completeness.
> 
> * Mutual Recursion:
> 
>   with recursive
>     foo(i) as (values(1) union all select i+1 from bar where i < 10),
>     bar(i) as (values(1) union all select i+1 from foo where i < 10)
>   select * from foo;
>   ERROR:  mutual recursive call is not supported
> 
>   The standard allows mutual recursion.

The discussion seems to agree that let it leave for post 8.4.

> * Single Evaluation:
> 
>   with
>     foo(i) as (select random() as i)
>   select * from foo union all select * from foo;
>            i
>   -------------------
>    0.233165248762816
>     0.62126633618027
>   (2 rows)
> 
>   The standard specifies that non-recursive WITH should be evaluated
>   once.

What shall we do? I don't think there's a easy way to fix this. Maybe
we should not allow WITH clause without RECURISVE?

> * RHS only:
> 
>   with recursive
>     foo(i) as (select i+1 from foo where i < 10 union all values(1))
>   select * from foo;
>   ERROR:  Left hand side of UNION ALL must be a non-recursive term in a
> recursive query
> 
>   The standard does not require that the recursive term be on the RHS.

The discussion seems to agree that let it leave for post 8.4.

> * UNION ALL only:
> 
>   with recursive
>     foo(i) as (values(1) union select i+1 from foo where i < 10)
>   select * from foo;
>   ERROR:  non-recursive term and recursive term must be combined with
> UNION ALL
> 
>   The standard seems to allow UNION ALL, UNION, INTERSECT, and EXCEPT
>   (when the recursive term is not on the RHS of the EXCEPT).

The discussion seems to agree that let it leave for post 8.4.

> * Binary recursion and subselect strangeness:
> 
>   with recursive foo(i) as
>     (values (1)
>     union all
>     select * from
>       (select i+1 from foo where i < 10
>       union all
>       select i+1 from foo where i < X) t)
>   select * from foo;
> 
>   Produces 10 rows of output regardless of what "X" is. This should be
> fixed for 8.4.
>   Also, this is non-linear recursion, which the standard seems to
> disallow.

I will try to fix this. However detecting the query being not a
non-linear one is not so easy.

> * Multiple recursive references:
> 
>   with recursive foo(i) as
>     (values (1)
>     union all
>     select i+1 from foo where i < 10
>     union all
>     select i+1 from foo where i < 20)
>   select * from foo;
>   ERROR:  Left hand side of UNION ALL must be a non-recursive term in a
> recursive query
> 
>   If we're going to allow non-linear recursion (which the standard
>   does not), this seems like it should be a valid case.

I will try to disallow this.

> * Strange result with except:
> 
>   with recursive foo(i) as
>     (values (1)
>     union all
>     select * from
>     (select i+1 from foo where i < 10
>     except
>     select i+1 from foo where i < 5) t)
>   select * from foo;
>   ERROR:  table "foo" has 0 columns available but 1 columns specified
> 
>   This query works if you replace "except" with "union". This should be
> fixed for 8.4.

I will try to fix this.

> * Aggregates allowed:
> 
>   with recursive foo(i) as
>     (values(1)
>     union all
>     select max(i)+1 from foo where i < 10)
>   select * from foo;
> 
>   Aggregates should be blocked according to the standard.
>   Also, causes an infinite loop. This should be fixed for 8.4.

I will try to fix this.

> * DISTINCT should supress duplicates:
> 
>   with recursive foo(i) as
>     (select distinct * from (values(1),(2)) t
>     union all
>     select distinct i+1 from foo where i < 10)
>   select * from foo;
> 
>   This outputs a lot of duplicates, but they should be supressed
> according to the standard. This query is essentially the same as
> supporting UNION for recursive queries, so we should either fix both for
> 8.4 or block both for consistency.

I'm not sure if it's possible to fix this. Will look into.

> * outer joins on a recursive reference should be blocked:
> 
>   with recursive foo(i) as
>     (values(1)
>     union all
>     select i+1 from foo left join (values(1)) t on (i=column1))
>   select * from foo;
> 
>   Causes an infinite loop, but the standard says using an outer join
>   in this situation should be prohibited. This should be fixed for 8.4.

Not an issue, I think.

> * ORDER BY, LIMIT, and OFFSET are rejected for recursive queries. The
> standard does not seem to say that these should be rejected.
> 
> 
> The following are my interpretations of relevant parts of the SQL
> standard (200n), and the associated sections. These are only my
> interpretation, so let me know if you interpret the standard
> differently.
> 
> Non-linear recursion forbidden:
>   7.13: Syntax Rules: 2.g.ii
>     My interpretation of 2.g.ii.2 is that WQN[k] and WQN[l] may be the
>     same <query name>.
>   7.13: Syntax Rules: 2.g.iv
> 
> EXCEPT can't be used for recursive queries if a recursive reference
> appears on the RHS:
>   7.13: Syntax Rules: 2.g.iii.1
> 
> INTERSECT ALL/EXCEPT ALL can't be used for recursive queries:
>   7.13: Syntax Rules: 2.g.iii.5
> 
> recursive references must appear in FROM clause:
>   7.13: Syntax Rules: 2.g.iii.3
>     My interpretation of this rule is that it does not allow a
>     recursive reference in a subquery in the targetlist or a subquery
>     in the where clause.
> 
> stratum defined:
>   7.13: Syntax Rules: 2.f
>   7.13: Syntax Rules: 2.g.i.1
> 
> recursive query must have anchor for every stratum:
>   7.13: Syntax Rules: 2.g.i.3
> 
> outer joins not allowed to join recursive references:
>   7.13: Syntax Rules: 2.g.iii.6
>   7.13: Syntax Rules: 2.g.iii.7
> 
> Aggregates/HAVING disallowed:
>   7.13: Syntax Rules: 2.g.iii.4.B
> 
> Mutual recursion defined:
>   7.13: Syntax Rules: 2.g.i.1
> 
> Evaluate each WITH entry once, even if it's referenced multiple times:
>   7.13: General Rules: 1
>   7.13: General Rules: 2.b
>   See also:
> http://archives.postgresql.org/pgsql-hackers/2008-07/msg01292.php
> 
> Evaluation order with mutual recursion:
>   7.13: General Rules: 2.a
>   7.13: General Rules: 2.b
> 
> Evaluation semantics:
>   7.13: General Rules: 2.c
> 
> DISTINCT:
>   7.13: General Rules: 2.c.iv
>   7.13: General Rules: 2.c.ix.3.A
> 
> I will provide comments about the code and documentation soon. This is a
> very useful feature.

Thanks. Enclosed is the latest patch to adopt CVS HEAD.
--
Tatsuo Ishii
SRA OSS, Inc. Japan

pgsql-hackers by date:

Previous
From: Jeff Davis
Date:
Subject: Re: SQL standard question about Common Table Expressions
Next
From: Tatsuo Ishii
Date:
Subject: Re: Common Table Expressions (WITH RECURSIVE) patch