Thread: WITH RECUSIVE patches 0723

WITH RECUSIVE patches 0723

From
Tatsuo Ishii
Date:
> Hi,
>
> Here is the lastest WITH RECURSIVE patches against 2007/07/17 CVS (CVS
> HEAD won't compile for me).
>
> This version includes regression tests and is almost ready for commit
> IMO.

I pulled fresh CVS HEAD and it seems the problem is gone.  Here is the
lastest WITH RECURSIVE patches against CVS HEAD.

Reviewers, please let me know if you find problems with the
patches. If none, I would like to commit this weekend.
--
Tatsuo Ishii
SRA OSS, Inc. Japan

Re: WITH RECUSIVE patches 0723

From
Tom Lane
Date:
Tatsuo Ishii <ishii@postgresql.org> writes:
> Reviewers, please let me know if you find problems with the
> patches. If none, I would like to commit this weekend.

Has this patch actually been reviewed yet?  The only reports I've
seen are from testing; nothing from anyone actually reading the
code.  I know I've not looked at it yet.

            regards, tom lane

Re: WITH RECUSIVE patches 0723

From
Tatsuo Ishii
Date:
> Tatsuo Ishii <ishii@postgresql.org> writes:
> > Reviewers, please let me know if you find problems with the
> > patches. If none, I would like to commit this weekend.
>
> Has this patch actually been reviewed yet?  The only reports I've
> seen are from testing; nothing from anyone actually reading the
> code.  I know I've not looked at it yet.

The reviewer registered at the Wiki is David Fetter and I believe he
is reading the patches. Michael Makes has contributed the ecpg
part. So apparently he is knowing the ecpg part at least.

I know the patch is huge. Reviewers, please let me know if you have
any question about the code. I would like to do anything for helping
the review.
--
Tatsuo Ishii
SRA OSS, Inc. Japan

Re: [PATCHES] WITH RECUSIVE patches 0723

From
David Fetter
Date:
On Wed, Jul 23, 2008 at 10:59:20AM -0400, Tom Lane wrote:
> Tatsuo Ishii <ishii@postgresql.org> writes:
> > Reviewers, please let me know if you find problems with the
> > patches. If none, I would like to commit this weekend.
>
> Has this patch actually been reviewed yet?  The only reports I've
> seen are from testing; nothing from anyone actually reading the
> code.  I know I've not looked at it yet.

I've read the code, for what that's worth, which isn't much.  I just
tried out this patch on a fresh checkout of CVS TIP and found:

EXPLAIN WITH RECURSIVE t(i) AS (VALUES(1::int) UNION ALL SELECT i+1 FROM t WHERE i < 5) SELECT * FROM t AS t1 JOIN t AS
t2USING(i); 
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Hash Join  (cost=0.08..0.16 rows=2 width=4)
   Hash Cond: (t1.i = t2.i)
   ->  Recursion on t1  (cost=0.00..0.06 rows=2 width=4)
         ->  Append  (cost=0.00..0.04 rows=2 width=4)
               ->  Values Scan on "*VALUES*"  (cost=0.00..0.01 rows=1 width=4)
               ->  Recursive Scan on t  (cost=0.00..0.00 rows=1 width=4)
                     Filter: (i < 5)
   ->  Hash  (cost=0.06..0.06 rows=2 width=4)
         ->  Recursion on t2  (cost=0.00..0.06 rows=2 width=4)
               ->  Append  (cost=0.00..0.04 rows=2 width=4)
                     ->  Values Scan on "*VALUES*"  (cost=0.00..0.01 rows=1 width=4)
                     ->  Recursive Scan on t  (cost=0.00..0.00 rows=1 width=4)
                           Filter: (i < 5)
(13 rows)

When I try to execute the query without the EXPLAIN, having attached a debugger
to the back-end, I get.

(gdb) continue
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0x08192dcd in ExecQual (qual=0xa183824, econtext=0xa183230, resultForNull=0 '\0') at execQual.c:4513
4513            expr_value = ExecEvalExpr(clause, econtext, &isNull, NULL);
(gdb) i s
#0  0x08192dcd in ExecQual (qual=0xa183824, econtext=0xa183230, resultForNull=0 '\0') at execQual.c:4513
#1  0x08199b23 in ExecScan (node=0xa1831a4, accessMtd=0x81a6bb0 <RecursivescanNext>) at execScan.c:131
#2  0x081a6ba9 in ExecRecursiveScan (node=0xa1831a4) at nodeRecursivescan.c:48
#3  0x08192320 in ExecProcNode (node=0xa1831a4) at execProcnode.c:380
#4  0x081a6923 in RecursionNext (node=0xa17fe18) at nodeRecursion.c:68
#5  0x08199a83 in ExecScan (node=0xa17fe18, accessMtd=0x81a6840 <RecursionNext>) at execScan.c:68
#6  0x081a6839 in ExecRecursion (node=0xa17fe18) at nodeRecursion.c:116
#7  0x081923e0 in ExecProcNode (node=0xa17fe18) at execProcnode.c:339
#8  0x081a1c13 in MultiExecHash (node=0xa17fcc4) at nodeHash.c:94
#9  0x081a28b9 in ExecHashJoin (node=0xa17b654) at nodeHashjoin.c:159
#10 0x081922d8 in ExecProcNode (node=0xa17b654) at execProcnode.c:395
#11 0x081901db in standard_ExecutorRun (queryDesc=0xa15c334, direction=ForwardScanDirection, count=0) at
execMain.c:1271
#12 0x08240dc4 in PortalRunSelect (portal=0xa15631c, forward=1 '\001', count=0, dest=0xa1733d8) at pquery.c:937
#13 0x082420e6 in PortalRun (portal=0xa15631c, count=2147483647, isTopLevel=1 '\001', dest=0xa1733d8,
altdest=0xa1733d8, 
    completionTag=0xbfcacaea "") at pquery.c:793
#14 0x0823d0a7 in exec_simple_query (
    query_string=0xa12fc9c "WITH RECURSIVE t(i) AS (VALUES(1::int) UNION ALL SELECT i+1 FROM t WHERE i < 5) SELECT *
FROMt AS t1 JOIN t AS t2 USING(i);") at postgres.c:977 
#15 0x0823e84c in PostgresMain (argc=4, argv=0xa0d0dac, username=0xa0d0d7c "shackle") at postgres.c:3559
#16 0x0820957f in ServerLoop () at postmaster.c:3238
#17 0x0820a4e0 in PostmasterMain (argc=3, argv=0xa0ced50) at postmaster.c:1023
#18 0x081b69d6 in main (argc=3, argv=0xa0ced50) at main.c:188

What other information could help track down this problem?

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: [PATCHES] WITH RECUSIVE patches 0723

From
Tatsuo Ishii
Date:
> On Wed, Jul 23, 2008 at 10:59:20AM -0400, Tom Lane wrote:
> > Tatsuo Ishii <ishii@postgresql.org> writes:
> > > Reviewers, please let me know if you find problems with the
> > > patches. If none, I would like to commit this weekend.
> >
> > Has this patch actually been reviewed yet?  The only reports I've
> > seen are from testing; nothing from anyone actually reading the
> > code.  I know I've not looked at it yet.
>
> I've read the code, for what that's worth, which isn't much.  I just
> tried out this patch on a fresh checkout of CVS TIP and found:
>
> EXPLAIN WITH RECURSIVE t(i) AS (VALUES(1::int) UNION ALL SELECT i+1 FROM t WHERE i < 5) SELECT * FROM t AS t1 JOIN t
ASt2 USING(i); 
>                                      QUERY PLAN
> -------------------------------------------------------------------------------------
>  Hash Join  (cost=0.08..0.16 rows=2 width=4)
>    Hash Cond: (t1.i = t2.i)
>    ->  Recursion on t1  (cost=0.00..0.06 rows=2 width=4)
>          ->  Append  (cost=0.00..0.04 rows=2 width=4)
>                ->  Values Scan on "*VALUES*"  (cost=0.00..0.01 rows=1 width=4)
>                ->  Recursive Scan on t  (cost=0.00..0.00 rows=1 width=4)
>                      Filter: (i < 5)
>    ->  Hash  (cost=0.06..0.06 rows=2 width=4)
>          ->  Recursion on t2  (cost=0.00..0.06 rows=2 width=4)
>                ->  Append  (cost=0.00..0.04 rows=2 width=4)
>                      ->  Values Scan on "*VALUES*"  (cost=0.00..0.01 rows=1 width=4)
>                      ->  Recursive Scan on t  (cost=0.00..0.00 rows=1 width=4)
>                            Filter: (i < 5)
> (13 rows)
>
> When I try to execute the query without the EXPLAIN, having attached a debugger
> to the back-end, I get.

Thanks for the report. We will look into this.
--
Tatsuo Ishii
SRA OSS, Inc. Japan

> (gdb) continue
> Continuing.
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x08192dcd in ExecQual (qual=0xa183824, econtext=0xa183230, resultForNull=0 '\0') at execQual.c:4513
> 4513            expr_value = ExecEvalExpr(clause, econtext, &isNull, NULL);
> (gdb) i s
> #0  0x08192dcd in ExecQual (qual=0xa183824, econtext=0xa183230, resultForNull=0 '\0') at execQual.c:4513
> #1  0x08199b23 in ExecScan (node=0xa1831a4, accessMtd=0x81a6bb0 <RecursivescanNext>) at execScan.c:131
> #2  0x081a6ba9 in ExecRecursiveScan (node=0xa1831a4) at nodeRecursivescan.c:48
> #3  0x08192320 in ExecProcNode (node=0xa1831a4) at execProcnode.c:380
> #4  0x081a6923 in RecursionNext (node=0xa17fe18) at nodeRecursion.c:68
> #5  0x08199a83 in ExecScan (node=0xa17fe18, accessMtd=0x81a6840 <RecursionNext>) at execScan.c:68
> #6  0x081a6839 in ExecRecursion (node=0xa17fe18) at nodeRecursion.c:116
> #7  0x081923e0 in ExecProcNode (node=0xa17fe18) at execProcnode.c:339
> #8  0x081a1c13 in MultiExecHash (node=0xa17fcc4) at nodeHash.c:94
> #9  0x081a28b9 in ExecHashJoin (node=0xa17b654) at nodeHashjoin.c:159
> #10 0x081922d8 in ExecProcNode (node=0xa17b654) at execProcnode.c:395
> #11 0x081901db in standard_ExecutorRun (queryDesc=0xa15c334, direction=ForwardScanDirection, count=0) at
execMain.c:1271
> #12 0x08240dc4 in PortalRunSelect (portal=0xa15631c, forward=1 '\001', count=0, dest=0xa1733d8) at pquery.c:937
> #13 0x082420e6 in PortalRun (portal=0xa15631c, count=2147483647, isTopLevel=1 '\001', dest=0xa1733d8,
altdest=0xa1733d8, 
>     completionTag=0xbfcacaea "") at pquery.c:793
> #14 0x0823d0a7 in exec_simple_query (
>     query_string=0xa12fc9c "WITH RECURSIVE t(i) AS (VALUES(1::int) UNION ALL SELECT i+1 FROM t WHERE i < 5) SELECT *
FROMt AS t1 JOIN t AS t2 USING(i);") at postgres.c:977 
> #15 0x0823e84c in PostgresMain (argc=4, argv=0xa0d0dac, username=0xa0d0d7c "shackle") at postgres.c:3559
> #16 0x0820957f in ServerLoop () at postmaster.c:3238
> #17 0x0820a4e0 in PostmasterMain (argc=3, argv=0xa0ced50) at postmaster.c:1023
> #18 0x081b69d6 in main (argc=3, argv=0xa0ced50) at main.c:188
>
> What other information could help track down this problem?
>
> Cheers,
> David.
> --
> David Fetter <david@fetter.org> http://fetter.org/
> Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
> Skype: davidfetter      XMPP: david.fetter@gmail.com
>
> Remember to vote!
> Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: [PATCHES] WITH RECUSIVE patches 0723

From
Tatsuo Ishii
Date:
> On Wed, Jul 23, 2008 at 10:59:20AM -0400, Tom Lane wrote:
> > Tatsuo Ishii <ishii@postgresql.org> writes:
> > > Reviewers, please let me know if you find problems with the
> > > patches. If none, I would like to commit this weekend.
> >
> > Has this patch actually been reviewed yet?  The only reports I've
> > seen are from testing; nothing from anyone actually reading the
> > code.  I know I've not looked at it yet.
>
> I've read the code, for what that's worth, which isn't much.  I just
> tried out this patch on a fresh checkout of CVS TIP and found:
>
> EXPLAIN WITH RECURSIVE t(i) AS (VALUES(1::int) UNION ALL SELECT i+1 FROM t WHERE i < 5) SELECT * FROM t AS t1 JOIN t
ASt2 USING(i); 
>                                      QUERY PLAN
> -------------------------------------------------------------------------------------
>  Hash Join  (cost=0.08..0.16 rows=2 width=4)
>    Hash Cond: (t1.i = t2.i)
>    ->  Recursion on t1  (cost=0.00..0.06 rows=2 width=4)
>          ->  Append  (cost=0.00..0.04 rows=2 width=4)
>                ->  Values Scan on "*VALUES*"  (cost=0.00..0.01 rows=1 width=4)
>                ->  Recursive Scan on t  (cost=0.00..0.00 rows=1 width=4)
>                      Filter: (i < 5)
>    ->  Hash  (cost=0.06..0.06 rows=2 width=4)
>          ->  Recursion on t2  (cost=0.00..0.06 rows=2 width=4)
>                ->  Append  (cost=0.00..0.04 rows=2 width=4)
>                      ->  Values Scan on "*VALUES*"  (cost=0.00..0.01 rows=1 width=4)
>                      ->  Recursive Scan on t  (cost=0.00..0.00 rows=1 width=4)
>                            Filter: (i < 5)
> (13 rows)
>
> When I try to execute the query without the EXPLAIN, having attached a debugger
> to the back-end, I get.
>
> (gdb) continue
> Continuing.
>
> Program received signal SIGSEGV, Segmentation fault.

Thanks for the report. Here is the new patches from Yoshiyuki. It
appeared that addRangeTableEntryForRecursive() needs to do deep copy
for the subquery and ref in the RangeTblEntry to avoid double free
bug (remember that your example is a self join case).

Also I added your query to the regression test case with minor
modifications.
--
Tatsuo Ishii
SRA OSS, Inc. Japan

Re: [PATCHES] WITH RECUSIVE patches 0723

From
Tom Lane
Date:
Tatsuo Ishii <ishii@postgresql.org> writes:
> Reviewers, please let me know if you find problems with the
> patches. If none, I would like to commit this weekend.

Given that everyone who has tested this has found a different way to
crash it, and that the frequency of crash reports shows no signs of
slowing down, I have to think that committing it is premature.

I tried to look through the patch just now and failed to make any
sense of it, because of the complete absence of documentation.
Two unexplained examples added to the SELECT reference page don't
do it for me.  I want to see an explanation of exactly what behaviors
are intended to be provided (and, in view of the long TODO list that
was posted awhile back, what isn't provided).  And there needs to be
more than zero internal documentation.  A README file, or perhaps
a very long file header comment, needs to be provided to explain
what's supposed to happen, when, and where when processing a recursive
query.  (For comparison look at the README.HOT file that was created
to explain the HOT patch --- something at about that level of detail
would help this patch a lot.  Or consider adding a section to
chapter 43 in the SGML docs.)

We really can't accept a patch that is so poorly documented as to
be unreviewable.  Unreviewable also means it'll be unmaintainable
going forward.

            regards, tom lane

Re: [PATCHES] WITH RECUSIVE patches 0723

From
David Fetter
Date:
On Thu, Jul 24, 2008 at 01:55:37PM +0900, Tatsuo Ishii wrote:
> > Program received signal SIGSEGV, Segmentation fault.
>
> Thanks for the report. Here is the new patches from Yoshiyuki.

Thanks for the patch :)

Now, I get a different problem, this time with the following code
intended to materialize paths on the fly and summarize down to a
certain depth in a tree:

CREATE TABLE tree(
    id INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES tree(id)
);

INSERT INTO tree
VALUES (1, NULL), (2, 1), (3,1), (4,2), (5,2), (6,2), (7,3), (8,3),
       (9,4), (10,4), (11,7), (12,7), (13,7), (14, 9), (15,11), (16,11);

WITH RECURSIVE t(id, path) AS (
    VALUES(1,ARRAY[NULL::integer])
UNION ALL
    SELECT tree.id, t.path || tree.id
    FROM tree JOIN t ON (tree.parent_id = t.id)
)
SELECT
    t1.id, count(t2.*)
FROM
    t t1
JOIN
    t t2
ON (
    t1.path[1:2] = t2.path[1:2]
AND
    array_upper(t1.path,1) = 2
AND
    array_upper(t2.path,1) > 2
)
GROUP BY t1.id;
ERROR: unrecognized node type: 203

Please apply the attached patch to help out with tab
completion in psql.

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachment

Re: [PATCHES] WITH RECUSIVE patches 0723

From
Tatsuo Ishii
Date:
> Now, I get a different problem, this time with the following code
> intended to materialize paths on the fly and summarize down to a
> certain depth in a tree:
>
> CREATE TABLE tree(
>     id INTEGER PRIMARY KEY,
>     parent_id INTEGER REFERENCES tree(id)
> );
>
> INSERT INTO tree
> VALUES (1, NULL), (2, 1), (3,1), (4,2), (5,2), (6,2), (7,3), (8,3),
>        (9,4), (10,4), (11,7), (12,7), (13,7), (14, 9), (15,11), (16,11);
>
> WITH RECURSIVE t(id, path) AS (
>     VALUES(1,ARRAY[NULL::integer])
> UNION ALL
>     SELECT tree.id, t.path || tree.id
>     FROM tree JOIN t ON (tree.parent_id = t.id)
> )
> SELECT
>     t1.id, count(t2.*)
> FROM
>     t t1
> JOIN
>     t t2
> ON (
>     t1.path[1:2] = t2.path[1:2]
> AND
>     array_upper(t1.path,1) = 2
> AND
>     array_upper(t2.path,1) > 2
> )
> GROUP BY t1.id;
> ERROR: unrecognized node type: 203

Thanks for the report. We will look into this.

> Please apply the attached patch to help out with tab
> completion in psql.

Ok, it will appear in the next patches.
--
Tatsuo Ishii
SRA OSS, Inc. Japan

Re: [PATCHES] WITH RECUSIVE patches 0723

From
Tatsuo Ishii
Date:
> Thanks for the patch :)
>
> Now, I get a different problem, this time with the following code
> intended to materialize paths on the fly and summarize down to a
> certain depth in a tree:
>
> CREATE TABLE tree(
>     id INTEGER PRIMARY KEY,
>     parent_id INTEGER REFERENCES tree(id)
> );
>
> INSERT INTO tree
> VALUES (1, NULL), (2, 1), (3,1), (4,2), (5,2), (6,2), (7,3), (8,3),
>        (9,4), (10,4), (11,7), (12,7), (13,7), (14, 9), (15,11), (16,11);
>
> WITH RECURSIVE t(id, path) AS (
>     VALUES(1,ARRAY[NULL::integer])
> UNION ALL
>     SELECT tree.id, t.path || tree.id
>     FROM tree JOIN t ON (tree.parent_id = t.id)
> )
> SELECT
>     t1.id, count(t2.*)
> FROM
>     t t1
> JOIN
>     t t2
> ON (
>     t1.path[1:2] = t2.path[1:2]
> AND
>     array_upper(t1.path,1) = 2
> AND
>     array_upper(t2.path,1) > 2
> )
> GROUP BY t1.id;
> ERROR: unrecognized node type: 203

Thanks for the report. Here is the new patches from Yoshiyuki against
CVS HEAD. Also I have added your test case to the regression test.

> Please apply the attached patch to help out with tab
> completion in psql.

Thanks. Your patches has been included.
--
Tatsuo Ishii
SRA OSS, Inc. Japan

Re: WITH RECUSIVE patches 0723

From
Andrew Gierth
Date:
At David's request I've been looking through this patch.

Regarding documentation: if it would help, I can write some; I have
already made a start on writing down what is going on internally in
order to understand it myself.

I've found three more bugs so far:

1)

create view v2(id) as values (1);
with recursive t(id) as (select id from v2                        union all select id+1 from t where id < 5)
select * from t;
ERROR:  could not open relation 1663/16384/24588: No such file or directory

Here it seems that rewriting is simply not being applied to CTEs where
a recursive clause is present; the reference to "v2" remains in the
query up until execution time, at which point it errors out (in
ExecInitSeqScan called from InitPlan).

2)

with recursive t(id) as (values (1)                        union all select id+1 from t where id < 5
   union all values (2))
 
select * from t;
ERROR:  table "t" has 0 columns available but 1 columns specified

This seems to be caused by incorrect assumptions in checkWellFormedCte
and checkCteSelectStmt (which should have been rejecting the query).
The query tree as seen by checkWellFormedCte here is (values(1) union
all select ...) union all (values (2)), and when the left subtree is
passed to checkCteSelectStmt, it believes it to be non-recursive due
to the lack of any From clause. The unexpected error is produced
later.

3)

with recursive t(id) as (values (1)     union all select t.id+1                 from t left join (values (1)) as s(x)
on(false)                where t.id < 5)
 
select * from t;id 
---- 1 2
(2 rows)

This behaviour is clearly intentional, since the entire mechanism of
estate->es_disallow_tuplestore exists for no other reason, but it
seems to me to be clearly wrong. What is the justification for it?

-- 
Andrew (irc:RhodiumToad)


Re: WITH RECUSIVE patches 0723

From
Tatsuo Ishii
Date:
> At David's request I've been looking through this patch.
> 
> Regarding documentation: if it would help, I can write some; I have
> already made a start on writing down what is going on internally in
> order to understand it myself.

Thanks. There was some docs written in Japanese by Yoshiyuki. Recently
he updagted it. I will translate into English and post here.

> I've found three more bugs so far:
> 
> 1)
> 
> create view v2(id) as values (1);
> with recursive t(id) as (select id from v2
>                          union all select id+1 from t where id < 5)
> select * from t;
> ERROR:  could not open relation 1663/16384/24588: No such file or directory
> 
> Here it seems that rewriting is simply not being applied to CTEs where
> a recursive clause is present; the reference to "v2" remains in the
> query up until execution time, at which point it errors out (in
> ExecInitSeqScan called from InitPlan).

Yes, we need to make the rewrite system to understand CTEs. Probably
fireRIRrules() needs to have lines something like:
    if (rte->rtekind == RTE_RECURSIVE)    {        rte->non_recursive_query = fireRIRrules(rte->non_recursive_query,
activeRIRs);       continue;    }
 

But I still see the error message. Will look into more.

For below, I will ask Yoshiyuki.
--
Tatsuo Ishii
SRA OSS, Inc. Japan

> 2)
> 
> with recursive t(id) as (values (1)
>                          union all select id+1 from t where id < 5
>                          union all values (2))
> select * from t;
> ERROR:  table "t" has 0 columns available but 1 columns specified
> 
> This seems to be caused by incorrect assumptions in checkWellFormedCte
> and checkCteSelectStmt (which should have been rejecting the query).
> The query tree as seen by checkWellFormedCte here is (values(1) union
> all select ...) union all (values (2)), and when the left subtree is
> passed to checkCteSelectStmt, it believes it to be non-recursive due
> to the lack of any From clause. The unexpected error is produced
> later.
> 
> 3)
> 
> with recursive t(id)
>   as (values (1)
>       union all select t.id+1
>                   from t left join (values (1)) as s(x) on (false)
>                  where t.id < 5)
> select * from t;
>  id 
> ----
>   1
>   2
> (2 rows)
> 
> This behaviour is clearly intentional, since the entire mechanism of
> estate->es_disallow_tuplestore exists for no other reason, but it
> seems to me to be clearly wrong. What is the justification for it?
> 
> -- 
> Andrew (irc:RhodiumToad)
> 
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


Re: WITH RECUSIVE patches 0723

From
"Pavel Stehule"
Date:
Hello

I played with CTE and I have to say, it's great feature - great work.

One questions - can I enforce materialisation of query?

It would be usefull for some analytical queries like:

with tmp as (select a, sum(b) as b from test) select * from tmp union
all select 'all', sum(b) from tmp;

regards
Pavel Stehule

2008/7/27 Tatsuo Ishii <ishii@postgresql.org>:
>> At David's request I've been looking through this patch.
>>
>> Regarding documentation: if it would help, I can write some; I have
>> already made a start on writing down what is going on internally in
>> order to understand it myself.
>
> Thanks. There was some docs written in Japanese by Yoshiyuki. Recently
> he updagted it. I will translate into English and post here.
>
>> I've found three more bugs so far:
>>
>> 1)
>>
>> create view v2(id) as values (1);
>> with recursive t(id) as (select id from v2
>>                          union all select id+1 from t where id < 5)
>> select * from t;
>> ERROR:  could not open relation 1663/16384/24588: No such file or directory
>>
>> Here it seems that rewriting is simply not being applied to CTEs where
>> a recursive clause is present; the reference to "v2" remains in the
>> query up until execution time, at which point it errors out (in
>> ExecInitSeqScan called from InitPlan).
>
> Yes, we need to make the rewrite system to understand CTEs. Probably
> fireRIRrules() needs to have lines something like:
>
>                if (rte->rtekind == RTE_RECURSIVE)
>                {
>                        rte->non_recursive_query = fireRIRrules(rte->non_recursive_query, activeRIRs);
>                        continue;
>                }
>
> But I still see the error message. Will look into more.
>
> For below, I will ask Yoshiyuki.
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
>
>> 2)
>>
>> with recursive t(id) as (values (1)
>>                          union all select id+1 from t where id < 5
>>                          union all values (2))
>> select * from t;
>> ERROR:  table "t" has 0 columns available but 1 columns specified
>>
>> This seems to be caused by incorrect assumptions in checkWellFormedCte
>> and checkCteSelectStmt (which should have been rejecting the query).
>> The query tree as seen by checkWellFormedCte here is (values(1) union
>> all select ...) union all (values (2)), and when the left subtree is
>> passed to checkCteSelectStmt, it believes it to be non-recursive due
>> to the lack of any From clause. The unexpected error is produced
>> later.
>>
>> 3)
>>
>> with recursive t(id)
>>   as (values (1)
>>       union all select t.id+1
>>                   from t left join (values (1)) as s(x) on (false)
>>                  where t.id < 5)
>> select * from t;
>>  id
>> ----
>>   1
>>   2
>> (2 rows)
>>
>> This behaviour is clearly intentional, since the entire mechanism of
>> estate->es_disallow_tuplestore exists for no other reason, but it
>> seems to me to be clearly wrong. What is the justification for it?
>>
>> --
>> Andrew (irc:RhodiumToad)
>>
>> --
>> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-hackers
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: WITH RECUSIVE patches 0723

From
Tatsuo Ishii
Date:
> At David's request I've been looking through this patch.
> 
> Regarding documentation: if it would help, I can write some; I have
> already made a start on writing down what is going on internally in
> order to understand it myself.
> 
> I've found three more bugs so far:
> 
> 1)
> 
> create view v2(id) as values (1);
> with recursive t(id) as (select id from v2
>                          union all select id+1 from t where id < 5)
> select * from t;
> ERROR:  could not open relation 1663/16384/24588: No such file or directory
> 
> Here it seems that rewriting is simply not being applied to CTEs where
> a recursive clause is present; the reference to "v2" remains in the
> query up until execution time, at which point it errors out (in
> ExecInitSeqScan called from InitPlan).
> 
> 2)
> 
> with recursive t(id) as (values (1)
>                          union all select id+1 from t where id < 5
>                          union all values (2))
> select * from t;
> ERROR:  table "t" has 0 columns available but 1 columns specified
> 
> This seems to be caused by incorrect assumptions in checkWellFormedCte
> and checkCteSelectStmt (which should have been rejecting the query).
> The query tree as seen by checkWellFormedCte here is (values(1) union
> all select ...) union all (values (2)), and when the left subtree is
> passed to checkCteSelectStmt, it believes it to be non-recursive due
> to the lack of any From clause. The unexpected error is produced
> later.

Included patches from Yoshiyuki should fix 1) and 2). I also add your
SQLs to the regression test. Thanks.

> 3)
> 
> with recursive t(id)
>   as (values (1)
>       union all select t.id+1
>                   from t left join (values (1)) as s(x) on (false)
>                  where t.id < 5)
> select * from t;
>  id 
> ----
>   1
>   2
> (2 rows)
> 
> This behaviour is clearly intentional, since the entire mechanism of
> estate->es_disallow_tuplestore exists for no other reason, but it
> seems to me to be clearly wrong. What is the justification for it?

Yes, this is due to prevent infinit recursion caused by following
case for example.

CREATE TABLE test (a TEXT, b TEXT);

INSERT INTO test VALUES ('aaa', 'bbb');
INSERT INTO test VALUES ('bbb', 'ccc');
INSERT INTO test VALUES ('ddd', 'eee');
INSERT INTO test VALUES ('ccc', 'qqq');

WITH RECURSIVE x AS ( SELECT * FROM test WHERE a = 'aaa'
 UNION ALL
 SELECT test.* FROM x LEFT JOIN test on test.a = x.b
) SELECT * FROM x;

Now we think that we were wrong. This type of query should run into
infinit recursion and it's user's responsibility that he does not make
such a query.

Another idea would be prohibiting *any* outer joins in the recursive
term (DB2 style), but this may be overkill.
--
Tatsuo Ishii
SRA OSS, Inc. Japan

Re: WITH RECUSIVE patches 0723

From
"Robert Haas"
Date:
> Now we think that we were wrong. This type of query should run into
> infinit recursion and it's user's responsibility that he does not make
> such a query.
>
> Another idea would be prohibiting *any* outer joins in the recursive
> term (DB2 style), but this may be overkill.

Even if you were to do that, I'm pretty sure that you'd find that it
is insufficient to prevent infinite recursion.  I suspect it's not
difficult to show that SQL with WITH RECURSIVE is Turing-complete, and
therefore detecting infinite recursion is equivalent to the Halting
problem.

http://en.wikipedia.org/wiki/Halting_problem

Even if it isn't, someone can always call a function written in any of
the procedural languages, which is definitely sufficient to make it
Turing-complete.  Then you don't even need WITH RECURSIVE:

rhaas=# create or replace function keep_going() returns setof integer as $$
begin
loop
return next 1;
end loop;
end
$$ language plpgsql;
CREATE FUNCTION
rhaas=# select * from keep_going();
<...>

The way to attack this is by putting in some kind of logic that cuts
the query off when the result-set gets too large or consumes too much
memory or CPU time or something along those lines.  Actually detecting
or preventing infinite loops is impossible, and not the real goal
anyway, since a query that generates 10^100 rows is for all practical
purposes just as bad.

...Robert


Re: WITH RECUSIVE patches 0723

From
Andrew Gierth
Date:
>>>>> "Tatsuo" == Tatsuo Ishii <ishii@postgresql.org> writes:
>> This behaviour is clearly intentional, since the entire mechanism of>> estate-> es_disallow_tuplestore exists for no
otherreason, but it>> seems to me to be clearly wrong. What is the justification for it?
 
Tatsuo> Yes, this is due to prevent infinit recursion caused byTatsuo> following case for example.

[...]
Tatsuo> WITH RECURSIVE x AS (Tatsuo>   SELECT * FROM test WHERE a = 'aaa'
Tatsuo>   UNION ALL
Tatsuo>   SELECT test.* FROM x LEFT JOIN test on test.a = x.bTatsuo> ) SELECT * FROM x;
Tatsuo> Now we think that we were wrong. This type of query shouldTatsuo> run into infinit recursion and it's user's
responsibilityTatsuo>that he does not make such a query.
 

I agree.
Tatsuo> Another idea would be prohibiting *any* outer joins in theTatsuo> recursive term (DB2 style), but this may be
overkill.

There are legitimate cases for wanting to do a left join in the
recursion - for example, to use the content of another table to
prune the tree where matching records exist (consider the standard
bill-of-materials example with the addition of another table listing
components already in stock).

-- 
Andrew (irc:RhodiumToad)


Re: WITH RECUSIVE patches 0723

From
Tom Lane
Date:
I spent some time reading the SQL spec over the weekend, and I believe
I've identified a fairly serious problem in the WITH patch.  SQL99
7.12 <query expression> General Rule 1 is
        1) If a non-recursive <with clause> is specified, then:
           a) For every <with list element> WLE, let WQN be the <query             name> immediately contained in WLE.
LetWQE be the <query             expression> immediately contained in WLE. Let WLT be the             table resulting
fromevaluation of WQE, with each column name             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^             replaced by
thecorresponding element of the <with column             list>, if any, immediately contained in WLE.
 
           b) Every <table reference> contained in <query expression> that             specifies WQN identifies WLT.

I think what this is saying is that the subquery defined by a WITH
clause is to be evaluated only once, even if it is referenced in
multiple places in the upper query.  This is sensible because if there
is no such rule, then there really is no semantic difference between
non-recursive WITH and ordinary subqueries; and the SQL committee is not
known for inventing duplicate syntax.  It is a useful property for users
because (1) it lets them prevent duplicate evaluations of an expensive
subquery, and (2) it lets them prevent multiple evaluations of volatile
functions in a subquery.  (Right now we tell people to use OFFSET 0 as
an optimization fence, but that's an unportable hack, and it doesn't
cover all cases anyway.)  Another thing in the back of my head is that
having these semantics could enable using INSERT ... RETURNING etc
as WITH subexpressions, whereas we can't really allow them as arbitrary
subqueries because of the lack of guarantees about one-time execution.
That's something for later, though.

I think this is a "must fix" because of the point about volatile
functions --- changing it later will result in user-visible semantics
changes, so we have to get it right the first time.

This isn't going to be a particularly simple fix :-(.  The basic
implementation clearly ought to be to dump the result of the subquery
into a tuplestore and then have the upper level read out from that.
However, we don't have any infrastructure for having multiple
upper-level RTEs reference the same tuplestore.  (Perhaps the InitPlan
infrastructure could be enhanced in that direction, but it's not ready
for non-scalar outputs today.)  Also, I think we'd have to teach
tuplestore how to support multiple readout cursors.  For example,
considerWITH foo AS (SELECT ...) SELECT ... FROM foo a, foo b WHERE ...
If the planner chooses to do the join as a nested loop then each
Scan node needs to keep track of its own place in the tuplestore,
concurrently with the other node having a different place.
        regards, tom lane


Re: WITH RECUSIVE patches 0723

From
Andrew Gierth
Date:
>>>>> "Tatsuo" == Tatsuo Ishii <ishii@postgresql.org> writes:
Tatsuo> Included patches from Yoshiyuki should fix 1) and 2). I alsoTatsuo> add your SQLs to the regression test.
Thanks.

I think it needs this change in addition; without it, incorrect
results are returned when you reference a recursive view from within
the recursive query, due to the RecursionScan nodes becoming linked to
the wrong tuplestores.

The testcase for this would be something like

create view v0 as with recursive t1(id) as (values (1)                           union all select id+1 from t1 where id
<5) select * from t1;
 

with recursive t(id) as (select * from v0                        union all select id+1 from t where t.id < 8)
select count(*) from t;

-- expected output is 30, not 5


*** src/backend/executor/nodeRecursion.c~       Mon Jul 28 14:41:38 2008
--- src/backend/executor/nodeRecursion.c        Mon Jul 28 15:22:55 2008
***************
*** 124,129 ****
--- 124,130 ---- ExecInitRecursion(Recursion *node, EState *estate, int eflags) {       RecursionState
*recursionstate;
+       Tuplestorestate **save_tuplestorestate;        /* check for unsupported flags */       Assert(!(eflags &
EXEC_FLAG_MARK));
***************
*** 149,154 ****
--- 150,156 ----       /*        * Save the reference for the working table to share        */
+       save_tuplestorestate = estate->es_tuplestorestate;       estate->es_tuplestorestate =
&recursionstate->working_table;       /*
 
***************
*** 196,201 ****
--- 198,205 ----       ExecAssignResultTypeFromTL(&recursionstate->ss.ps);
ExecAssignScanProjectionInfo(&recursionstate->ss);
 
+       estate->es_tuplestorestate = save_tuplestorestate;
+        return recursionstate; } 


-- 
Andrew (irc:RhodiumToad)


Re: WITH RECUSIVE patches 0723

From
Gregory Stark
Date:
"Tom Lane" <tgl@sss.pgh.pa.us> writes:

"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> I think what this is saying is that the subquery defined by a WITH
> clause is to be evaluated only once, even if it is referenced in
> multiple places in the upper query.  This is sensible because if there
> is no such rule, then there really is no semantic difference between
> non-recursive WITH and ordinary subqueries; and the SQL committee is not
> known for inventing duplicate syntax.  

Well I guess that answers the question of whether to apply the partial patch
before full recursive queries come along. In any case since that work seems to
be making a lot of progress (no thanks to me:( ) I don't think it's a problem
to wait.

> This isn't going to be a particularly simple fix :-(.  The basic
> implementation clearly ought to be to dump the result of the subquery
> into a tuplestore and then have the upper level read out from that.
> However, we don't have any infrastructure for having multiple
> upper-level RTEs reference the same tuplestore.

Uhm, indeed, isn't that the whole point of the work needed to make recursive
queries work? Once we have that we'll just use those executor nodes even for
non-recursive queries.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services!


Re: WITH RECUSIVE patches 0723

From
Tom Lane
Date:
Gregory Stark <stark@enterprisedb.com> writes:
> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>> This isn't going to be a particularly simple fix :-(.  The basic
>> implementation clearly ought to be to dump the result of the subquery
>> into a tuplestore and then have the upper level read out from that.
>> However, we don't have any infrastructure for having multiple
>> upper-level RTEs reference the same tuplestore.

> Uhm, indeed, isn't that the whole point of the work needed to make recursive
> queries work?

No, I don't think so.  The present patch (so far as I've gathered what
it can do, in the absence of any documentation) knows how to feed back
tuples collected at an upper level to a scan node at a lower level,
which is needed to implement recursion.  However, if you do
WITH RECURSIVE foo AS (...) SELECT ... FROM foo a, foo b WHERE

then you will still get two independent evaluations of duplicate
querytrees.  Which is at least a performance bug, and if there are
any volatile functions inside foo I claim it's a spec violation too.
        regards, tom lane


Re: WITH RECUSIVE patches 0723

From
Tom Lane
Date:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> "Tatsuo" == Tatsuo Ishii <ishii@postgresql.org> writes:
>  Tatsuo> Included patches from Yoshiyuki should fix 1) and 2). I also
>  Tatsuo> add your SQLs to the regression test. Thanks.

> I think it needs this change in addition; without it, incorrect
> results are returned when you reference a recursive view from within
> the recursive query, due to the RecursionScan nodes becoming linked to
> the wrong tuplestores.

That whole business of using the EState to pass tuplestores back and
forth looks fundamentally broken to me anyway; there's just no way it'll
be certain to link the right nodes together in complicated cases with
multiple recursions.  The nodes should be carrying IDs (such as the name
of the WITH item) which they use to search a lookaside list.
        regards, tom lane


Re: WITH RECUSIVE patches 0723

From
Andrew Gierth
Date:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>> I think it needs this change in addition; without it, incorrect>> results are returned when you reference a
recursiveview from>> within the recursive query, due to the RecursionScan nodes>> becoming linked to the wrong
tuplestores.
Tom> That whole business of using the EState to pass tuplestores backTom> and forth looks fundamentally broken to me
anyway;there's justTom> no way it'll be certain to link the right nodes together inTom> complicated cases with multiple
recursions.

Mutual recursion is not allowed; as far as I can determine, every
remaining case is such that any RecursiveScan node should be
referencing the nearest parent Recursion node, which is what the patch
(with the above fix) does.

If you have a counterexample I'd be interested to see it; I've spent a
significant amount of time looking at this code, and I can't find one.
Tom> The nodes should be carrying IDs (such as the name of the WITHTom> item) which they use to search a lookaside
list.

create view v0 as with recursive t(id) as (select ...);
with recursive t(id) as (select ... from v0 ...) select ...;

That gives you two WITH items with the same name. You would need
additional qualification of some sort.

-- 
Andrew (irc:RhodiumToad)


Re: WITH RECUSIVE patches 0723

From
Tom Lane
Date:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>  Tom> That whole business of using the EState to pass tuplestores back
>  Tom> and forth looks fundamentally broken to me anyway; there's just
>  Tom> no way it'll be certain to link the right nodes together in
>  Tom> complicated cases with multiple recursions.

> Mutual recursion is not allowed;

Well, the spec allows it, so we're going to have to fix this kluge
sooner or later, even if it chances not to fail on the subset we
currently support.
        regards, tom lane


Re: WITH RECUSIVE patches 0723

From
Andrew Gierth
Date:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
[snip spec]

Just out of curiosity, since I don't have a copy of the spec handy, how
does the language for WITH compare to that for views?
Tom> I think this is a "must fix" because of the point about volatileTom> functions --- changing it later will result
inuser-visibleTom> semantics changes, so we have to get it right the first time.
 

I strongly disagree that this should be a blocking issue - the patch
as it stands is an insanely useful feature, allowing many real-world
queries to work which simply were not possible before without
resorting to procedural code or awkward database designs.
Tom> This isn't going to be a particularly simple fix :-(.  The basicTom> implementation clearly ought to be to dump
theresult of theTom> subquery into a tuplestore and then have the upper level readTom> out from that.
 

Which will be a serious pessimization in many common cases if you do
it all the time. Googling for examples of non-recursive WITH queries
shows that it is very widely used for clarity or convenience, in
contexts where you _don't_ want materialization.

Recursive WITH queries that self-join the recursion result seem to be
rare in practice.

-- 
Andrew (irc:RhodiumToad)


Re: WITH RECUSIVE patches 0723

From
Tom Lane
Date:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> Recursive WITH queries that self-join the recursion result seem to be
> rare in practice.

We are not in the business of getting spec-required semantics 80% right;
and as I took pains to point out to start with, there are good
functionality reasons to want it to work per spec, even without getting
into "interesting" cases like self-joins.  If you don't think this is
grounds for rejecting the patch, think again.
        regards, tom lane


Re: WITH RECUSIVE patches 0723

From
Andrew Gierth
Date:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
Tom> We are not in the business of getting spec-required semanticsTom> 80% right;

I guess that's why we still fold identifiers to lowercase, why our
timestamp implementation differs from the standard, why we used to
require AS for select-list aliases, why our views aren't updateable,
why we violate the spec for UNIQUE, why we don't have schema-unique
constraint names, why we don't implement MATCH PARTIAL, ...

The SQL spec is a Platonic ideal to which _none_ of the implementations
are more than a vague approximation.

-- 
Andrew (irc:RhodiumToad)


Re: WITH RECUSIVE patches 0723

From
Gregory Stark
Date:
"Andrew Gierth" <andrew@tao11.riddles.org.uk> writes:

>>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>
>  Tom> This isn't going to be a particularly simple fix :-(.  The basic
>  Tom> implementation clearly ought to be to dump the result of the
>  Tom> subquery into a tuplestore and then have the upper level read
>  Tom> out from that.
>
> Which will be a serious pessimization in many common cases if you do
> it all the time. Googling for examples of non-recursive WITH queries
> shows that it is very widely used for clarity or convenience, in
> contexts where you _don't_ want materialization.

I just wonder where all these examples of real-world queries were when I
posted this patch and asked for such feedback originally. sigh.

In any case I think we've already made this decision. If we wanted the 80%
solution it was ready for Postgres 8.3. It wouldn't make much sense to skip it
then but put it in now when that there's time to finish it and a lot of the
work's already done.

I think the spec-compliant approach is clearly-superior. If we have the choice
there's no question we should do it properly. 

In an ideal world we would then have logic to check if the semantics are
maintained if the subquery is inlined and detect cases where that would be an
advantage. One case that comes to mind would be if there's an indexable qual
that could be pushed down into it such as:
WITH foo(a) as (SELECT a                  FROM tab                 WHERE long complex condition
youonly want to write once) SELECT a from foo where a = 1 UNION ALL SELECT a from foo where a = 2 UNION ALL ...
 

So I disagree with Tom that we should advertise this as the approved way to
disable subquery inlining. I would still suggest using OFFSET 0 for that. But
I also don't agree with you that this is more common than the converse. I
think if we have a choice between always materializing and always inlining
then always materializing is much better.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!


Re: WITH RECUSIVE patches 0723

From
Andrew Gierth
Date:
>>>>> "Gregory" == Gregory Stark <stark@enterprisedb.com> writes:
Gregory> I just wonder where all these examples of real-world queriesGregory> were when I posted this patch and asked
forsuch feedbackGregory> originally. sigh.
 
Gregory> In any case I think we've already made this decision. If weGregory> wanted the 80% solution it was ready for
Postgres8.3. ItGregory> wouldn't make much sense to skip it then but put it in nowGregory> when that there's time to
finishit and a lot of the work'sGregory> already done.
 

I wasn't paying much attention in the 8.3 development phases so I have
no idea what patch you had then.

The patch under review is for WITH RECURSIVE, not just for WITH, and it
handles a large enough proportion of the use-cases for recursive queries
(I've never seen a requirement for mutual recursion in the wild, though
I've several times had to kludge around pg lack of recursion support)
that I think it should be included in 8.4 either as-is, or with whatever
improvements are possible between now and release.

-- 
Andrew (irc:RhodiumToad)


Re: WITH RECUSIVE patches 0723

From
Martijn van Oosterhout
Date:
On Mon, Jul 28, 2008 at 07:57:16PM +0100, Andrew Gierth wrote:
> Which will be a serious pessimization in many common cases if you do
> it all the time. Googling for examples of non-recursive WITH queries
> shows that it is very widely used for clarity or convenience, in
> contexts where you _don't_ want materialization.

Since the problem is using the result of a WITH clause more than once,
would it be sufficient to simply detect that case and bail? You don't
want materialisation is most cases, there's just a few where it is
needed.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.

Re: WITH RECUSIVE patches 0723

From
Tom Lane
Date:
Martijn van Oosterhout <kleptog@svana.org> writes:
> Since the problem is using the result of a WITH clause more than once,
> would it be sufficient to simply detect that case and bail? You don't
> want materialisation is most cases, there's just a few where it is
> needed.

Really?  I tried googling to see what other people thought that the
WITH clause was for, and the first relevant hit I got was this one:
http://www.oracle-developer.net/display.php?id=212
which certainly treats it as a key part of the feature.

My thought is that we could optimize away materialization in cases where
we can tell it's not needed (no volatile functions and/or no multiple
scans of the subquery).  But not being able to do it means we've
implemented the feature incorrectly.
        regards, tom lane


Re: WITH RECUSIVE patches 0723

From
Andrew Dunstan
Date:

Tom Lane wrote:
> My thought is that we could optimize away materialization in cases where
> we can tell it's not needed (no volatile functions and/or no multiple
> scans of the subquery).  But not being able to do it means we've
> implemented the feature incorrectly.
>
>             
>   

I'm not sure how much work that would involve, but none of this means we 
can't have the feature for 8.4, right? Just that there is more work to do.

cheers

andrew


Re: WITH RECUSIVE patches 0723

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> Tom Lane wrote:
>> My thought is that we could optimize away materialization in cases where
>> we can tell it's not needed (no volatile functions and/or no multiple
>> scans of the subquery).  But not being able to do it means we've
>> implemented the feature incorrectly.

> I'm not sure how much work that would involve, but none of this means we 
> can't have the feature for 8.4, right? Just that there is more work to do.

I would be *extremely* surprised if we don't find ourselves improving
the optimization of WITH clauses long after 8.4.  We're still working on
outer joins, remember ;-).  My point here is just that the base case
before optimization has to behave per spec.  Optimizing more later is
good, fixing deliberately introduced non-compliance not so good.
        regards, tom lane


Re: WITH RECUSIVE patches 0723

From
Andrew Gierth
Date:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
> Martijn van Oosterhout <kleptog@svana.org> writes:>> Since the problem is using the result of a WITH clause more
than>>once, would it be sufficient to simply detect that case and bail?>> You don't want materialisation is most cases,
there'sjust a few>> where it is needed.
 
Tom> Really?  I tried googling to see what other people thought thatTom> the WITH clause was for, and the first
relevanthit I got wasTom> this one: http://www.oracle-developer.net/display.php?id=212Tom> which certainly treats it as
akey part of the feature.
 

Try searching for "common table expression".

-- 
Andrew (irc:RhodiumToad)


Re: WITH RECUSIVE patches 0723

From
"Pavel Stehule"
Date:
Hello

2008/7/29 Martijn van Oosterhout <kleptog@svana.org>:
> On Mon, Jul 28, 2008 at 07:57:16PM +0100, Andrew Gierth wrote:
>> Which will be a serious pessimization in many common cases if you do
>> it all the time. Googling for examples of non-recursive WITH queries
>> shows that it is very widely used for clarity or convenience, in
>> contexts where you _don't_ want materialization.
>
> Since the problem is using the result of a WITH clause more than once,
> would it be sufficient to simply detect that case and bail? You don't
> want materialisation is most cases, there's just a few where it is
> needed.
>

I thing so materialisation is more important than you thing. Without
materialisation I could use derived tables, but materialisation in
WITH statement is unique feature usefull for analytical queries. I am
sure, so materialisation should be one from possible strategies.

I like to see this feature in core, with/without materialisation is
usefull for recursive queries and I thing so materialisation should be
add in next months. I don't see it as mayor break. I prefere early
commit of this patch (with neccessary documentation), because there
will be some work that cannot be commited concurently - analytic
queries and my implementation of rollup and cube operator.

Regards
Pavel Stehule

> Have a nice day,
> --
> Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
>> Please line up in a tree and maintain the heap invariant while
>> boarding. Thank you for flying nlogn airlines.
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.6 (GNU/Linux)
>
> iD8DBQFIjkcuIB7bNG8LQkwRAto4AJwPkKlCWD/yBjAyEBL/LXMLK08LPwCfZ2dq
> qSHGPoPPGwGIQOP62eQimGE=
> =Yog+
> -----END PGP SIGNATURE-----
>
>


Re: WITH RECUSIVE patches 0723

From
David Fetter
Date:
On Mon, Jul 28, 2008 at 02:49:01PM -0400, Tom Lane wrote:
> Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> > "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
> >  Tom> That whole business of using the EState to pass tuplestores back
> >  Tom> and forth looks fundamentally broken to me anyway; there's just
> >  Tom> no way it'll be certain to link the right nodes together in
> >  Tom> complicated cases with multiple recursions.
> 
> > Mutual recursion is not allowed;
> 
> Well, the spec allows it, so we're going to have to fix this kluge
> sooner or later, even if it chances not to fail on the subset we
> currently support.

Any ideas how to approach this?

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: WITH RECUSIVE patches 0723

From
Andrew Gierth
Date:
>>>>> "Tatsuo" == Tatsuo Ishii <ishii@postgresql.org> writes:
>> At David's request I've been looking through this patch.>> >> Regarding documentation: if it would help, I can write
some;I>> have already made a start on writing down what is going on>> internally in order to understand it myself.
 
Tatsuo> Thanks. There was some docs written in Japanese byTatsuo> Yoshiyuki. Recently he updagted it. I will translate
intoTatsuo>English and post here.
 

(Still waiting on this)

One more oversight: the patch isn't updating the ECPG preproc.y other
than trivially, so WITH queries aren't working in ecpg:

test.pgc:111: ERROR: syntax error at or near "recursive"

-- 
Andrew (irc:RhodiumToad)


Re: WITH RECUSIVE patches 0723

From
Tom Lane
Date:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> One more oversight: the patch isn't updating the ECPG preproc.y other
> than trivially, so WITH queries aren't working in ecpg:

> test.pgc:111: ERROR: syntax error at or near "recursive"

By and large we don't expect core patches to worry about fixing the ecpg
grammar.  Right now the situation is that Michael Meskes makes a manual
cleanup pass every so often :-(.  I would like to see that get automated
sooner rather than later, but in the near term it is not the
responsibility of core-patch authors to update ecpg.
        regards, tom lane


Re: WITH RECUSIVE patches 0723

From
Michael Meskes
Date:
On Sat, Aug 02, 2008 at 12:35:55AM +0100, Andrew Gierth wrote:
> One more oversight: the patch isn't updating the ECPG preproc.y other
> than trivially, so WITH queries aren't working in ecpg:

I'll take care of this as soon as the patch settles down enough, gets
included into our CVS or into its own accessible source archive. It
doesn't make sense IMO to bloat the patch with code not needed for the
backend functionality at this point in time.

Michael
-- 
Michael Meskes
Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: meskes@jabber.org
Go VfL Borussia! Go SF 49ers! Use Debian GNU/Linux! Use PostgreSQL!


Re: WITH RECUSIVE patches 0723

From
Michael Meskes
Date:
On Sat, Aug 02, 2008 at 12:33:38AM -0400, Tom Lane wrote:
> grammar.  Right now the situation is that Michael Meskes makes a manual
> cleanup pass every so often :-(.  I would like to see that get automated
> sooner rather than later, but in the near term it is not the

I received a very promising first version of an automation script from
Mike Auberry. Waiting for the next one now. I'm hopeful that we are not
too far away.

Michael

-- 
Michael Meskes
Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: meskes@jabber.org
Go VfL Borussia! Go SF 49ers! Use Debian GNU/Linux! Use PostgreSQL!