Re: SQL standard question about Common Table Expressions - Mailing list pgsql-hackers

From Andrew Gierth
Subject Re: SQL standard question about Common Table Expressions
Date
Msg-id 87abeijge4.fsf@news-spur.riddles.org.uk
Whole thread Raw
In response to SQL standard question about Common Table Expressions  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: SQL standard question about Common Table Expressions  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-hackers
>>>>> "Jeff" == Jeff Davis <pgsql@j-davis.com> writes:
Jeff> I am looking into the SQL standard to try to determineJeff> precisely how the CTE feature should behave.
Jeff> Taking a simple case like:
Jeff>   with recursiveJeff>     foo(i) asJeff>       (values(1)Jeff>       union allJeff>       select i+1 from foo
wherei < 5)Jeff>   select * from foo;
 
Jeff> And looking at the SQL standard 200n 7.13: General Rules: 2.c,Jeff> it provides an algorithm for evaluating the
recursivequery.
 
Jeff> In this algorithm, AQEk is a <query expression>. Syntactically,Jeff> I only see two <query expression>s, and one
isthe entireJeff> query. The other is: "(values(1) union all select i+1 from fooJeff> where i < 5)", so I'll assume
thatAQEk must be equal to that*.
 

The "contains" language in the spec is tricky. And I think there's some
issue here with the spec confusing <query expression> and <query
expression body>; some of the language refers to "If a <query expression>
AQEk not marked as recursive is immediately contained in a <query
expression body>" which is not possible as the syntax production for
<query expression body> does not contain <query expression>.

Working through the spec, we have here only one WLE and therefore only
one partition. So WLE1 is the only element, and WQN1 is "foo" and
WQE1 is "(values(1) union all select i+1 from foo where i < 5)". The
way I suspect it's _intended_ to work is like this (following the order
of steps in the spec):

AQE1 = "values(1) union all select i+1 from foo where i < 5"
AQE2 = "values (1)"
AQE3 = "select i+1 from foo where i < 5"

AQE1 and AQE3 are recursive but AQE2 is not.

AQE1 is associated with WQE1 and therefore RT1 and WT1 are the result
and working tables for WQN1.

AQE2 is not marked as recursive and is contained in a recursive union,
therefore it is marked iteration-ignorable.

None of the AQEn suppress duplicates.

Let RT2 and WT2 be the result of AQE2, so both now contain (1)

AQE2 now becomes "TABLE RTN2" (and hence AQE1/WQE1 are now "TABLE RTN2
union all select i+1 from foo where i < 5"

Evaluate WQE1:
 RT1 and WT1 become (1) RT2/WT2 unchanged RT3 and WT3 remain empty

AQE2 is iteration ignorable, so RT2 is emptied.

WT1 and WT2 are nonempty, so:

associate "foo" with WT1

evaluate WQE1:
 WT1 becomes (2) WT2 is empty WT3 becomes (2)
 RT1 becomes (RT1 UNION ALL WT1) i.e. (1),(2) RT3 becomes (RT3 UNION ALL WT3) i.e. (1),(2)

WT1 and WT2 are nonempty, so:

associate "foo" with WT1

evaluate WQE1:
 WT1 becomes (3) WT2 is empty WT3 becomes (3)
 RT1 becomes (RT1 UNION ALL WT1) i.e. (1),(2),(3) RT3 becomes (RT3 UNION ALL WT3) i.e. (1),(2),(3)

etc.

[The main differences between this and the logic used by the existing
patch are the fact that the spec's version is keeping around many more
working and result tables in order to do the duplicate-suppression logic,
which requires knowing not just whether you've already produced a given
row, but also _which branch of the query_ produced it. Within the
constraints of the patch, it's not possible to trigger the duplicate
elimination, and therefore RT2/WT2 and RT3/WT3 become redundant; also all
non-recursive clauses are iteration-ignorable.]

-- 
Andrew (irc:RhodiumToad)


pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: Verbosity of Function Return Type Checks
Next
From: Alvaro Herrera
Date:
Subject: Re: sql2008 diff sql2003