Thread: SEARCH and CYCLE clauses

Re: SEARCH and CYCLE clauses

From

Vik Fearing

Date:

20 May 2020, 13:04:20

On 5/20/20 1:46 PM, Peter Eisentraut wrote:
> I have implemented the SEARCH and CYCLE clauses.

YES!

> This is standard SQL syntax attached to a recursive CTE to compute a
> depth- or breadth-first ordering and cycle detection, respectively. This
> is just convenience syntax for what you can already do manually. The
> original discussion about recursive CTEs briefly mentioned these as
> something to do later but then it was never mentioned again.
> 
> SQL specifies these in terms of syntactic transformations, and so that's
> how I have implemented them also, mainly in the rewriter.

I've attempted to do this several times but didn't get anywhere with it.
 I'm looking forward to reviewing this.

(And maybe it will put me on the right path for implementing <unique
predicate>.)
-- 
Vik Fearing

Re: SEARCH and CYCLE clauses

From

Vik Fearing

Date:

20 May 2020, 19:28:04

On 5/20/20 3:04 PM, Vik Fearing wrote:

> I'm looking forward to reviewing this.
A few quick things I've noticed so far:

1)
There are some smart quotes in the comments that should be converted to
single quotes.


2)
This query is an infinite loop, as expected:

  with recursive a as (select 1 as b union all select b from a)
  table a;

But it becomes an error when you add a cycle clause to it:

  with recursive a as (select 1 as b union all table a)
    cycle b set c to true default false using p
  table a;

  ERROR:  each UNION query must have the same number of columns

The same error occurs with a search clause.


3)
If I take the same infinite loop query but replace the TABLE syntax with
a SELECT and add a cycle clause, it's not an infinite loop anymore.

  with recursive a as (select 1 as b union all select b from a)
    cycle b set c to true default false using p
  table a;

 b | c |     p
---+---+-----------
 1 | f | {(1)}
 1 | t | {(1),(1)}
(2 rows)

Why does it stop?  It should still be an infinite loop.


4)
If I use NULL instead of false, I only get one row back.

  with recursive a as (select 1 as b union all select b from a)
    cycle b set c to true default false using p
  table a;

 b | c |   p
---+---+-------
 1 |   | {(1)}
(1 row)


5)
I can set both states to the same value.

  with recursive a as (select 1 as b union all select b from a)
    cycle b set c to true default true using p
  table a;

 b | c |   p
---+---+-------
 1 | t | {(1)}
(1 row)

This is a direct violation of 7.18 SR 2.b.ii.3 as well as common sense.
 BTW, I applaud your decision to violate the other part of that rule and
allowing any data type here.


5)
The same rule as above says that the value and the default value must be
literals but not everything that a human might consider a literal is
accepted.  In particular:

  with recursive a as (select 1 as b union all select b from a)
    cycle b set c to 1 default -1 using p
  table a;

  ERROR:  syntax error at or near "-"

Can we just accept a full a_expr here instead of AexprConst?  Both
DEFAULT and USING are fully reserved keywords.



That's all for now; will test more later.
Thanks for working on this!
-- 
Vik Fearing

Re: SEARCH and CYCLE clauses

From

Peter Eisentraut

Date:

22 May 2020, 09:24:55

On 2020-05-20 21:28, Vik Fearing wrote:
> 1)
> There are some smart quotes in the comments that should be converted to
> single quotes.

ok, fixing that

> 2)
> This query is an infinite loop, as expected:
> 
>    with recursive a as (select 1 as b union all select b from a)
>    table a;
> 
> But it becomes an error when you add a cycle clause to it:
> 
>    with recursive a as (select 1 as b union all table a)
>      cycle b set c to true default false using p
>    table a;
> 
>    ERROR:  each UNION query must have the same number of columns

table a expands to select * from a, and if you have a cycle clause, then 
a has three columns, but the other branch of the union only has one, so 
that won't work anymore, will it?

> 3)
> If I take the same infinite loop query but replace the TABLE syntax with
> a SELECT and add a cycle clause, it's not an infinite loop anymore.
> 
>    with recursive a as (select 1 as b union all select b from a)
>      cycle b set c to true default false using p
>    table a;
> 
>   b | c |     p
> ---+---+-----------
>   1 | f | {(1)}
>   1 | t | {(1),(1)}
> (2 rows)
> 
> Why does it stop?  It should still be an infinite loop.

If you specify the cycle clause, then the processing will stop if it 
sees the same row more than once, which it did here.

> 4)
> If I use NULL instead of false, I only get one row back.
> 
>    with recursive a as (select 1 as b union all select b from a)
>      cycle b set c to true default false using p
>    table a;
> 
>   b | c |   p
> ---+---+-------
>   1 |   | {(1)}
> (1 row)

If you specify null, then the cycle check expression will always fail, 
so it will abort after the first row.  (We should perhaps prohibit 
specifying null, but see below.)

> 5)
> I can set both states to the same value.
> 
>    with recursive a as (select 1 as b union all select b from a)
>      cycle b set c to true default true using p
>    table a;

> This is a direct violation of 7.18 SR 2.b.ii.3 as well as common sense.
>   BTW, I applaud your decision to violate the other part of that rule and
> allowing any data type here.
> 
> 
> 5)
> The same rule as above says that the value and the default value must be
> literals but not everything that a human might consider a literal is
> accepted.  In particular:
> 
>    with recursive a as (select 1 as b union all select b from a)
>      cycle b set c to 1 default -1 using p
>    table a;
> 
>    ERROR:  syntax error at or near "-"
> 
> Can we just accept a full a_expr here instead of AexprConst?  Both
> DEFAULT and USING are fully reserved keywords.

This is something we need to think about.  If we want to check at parse 
time whether the two values are not the same (and perhaps not null), 
then we either need to restrict the generality of what we can specify, 
or we need to be prepared to do full expression evaluation in the 
parser.  A simple and practical way might be to only allow string and 
boolean literal.  I don't have a strong opinion here.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: SEARCH and CYCLE clauses

From

Pavel Stehule

Date:

22 May 2020, 09:33:30

pá 22. 5. 2020 v 11:25 odesílatel Peter Eisentraut <peter.eisentraut@2ndquadrant.com> napsal:

On 2020-05-20 21:28, Vik Fearing wrote:
> 1)
> There are some smart quotes in the comments that should be converted to
> single quotes.

ok, fixing that

> 2)
> This query is an infinite loop, as expected:
>
> with recursive a as (select 1 as b union all select b from a)
> table a;
>
> But it becomes an error when you add a cycle clause to it:
>
> with recursive a as (select 1 as b union all table a)
> cycle b set c to true default false using p
> table a;
>
> ERROR: each UNION query must have the same number of columns

table a expands to select * from a, and if you have a cycle clause, then
a has three columns, but the other branch of the union only has one, so
that won't work anymore, will it?

> 3)
> If I take the same infinite loop query but replace the TABLE syntax with
> a SELECT and add a cycle clause, it's not an infinite loop anymore.
>
> with recursive a as (select 1 as b union all select b from a)
> cycle b set c to true default false using p
> table a;
>
> b | c | p
> ---+---+-----------
> 1 | f | {(1)}
> 1 | t | {(1),(1)}
> (2 rows)
>
> Why does it stop? It should still be an infinite loop.

If you specify the cycle clause, then the processing will stop if it
sees the same row more than once, which it did here.

> 4)
> If I use NULL instead of false, I only get one row back.
>
> with recursive a as (select 1 as b union all select b from a)
> cycle b set c to true default false using p
> table a;
>
> b | c | p
> ---+---+-------
> 1 | | {(1)}
> (1 row)

If you specify null, then the cycle check expression will always fail,
so it will abort after the first row. (We should perhaps prohibit
specifying null, but see below.)

> 5)
> I can set both states to the same value.
>
> with recursive a as (select 1 as b union all select b from a)
> cycle b set c to true default true using p
> table a;

> This is a direct violation of 7.18 SR 2.b.ii.3 as well as common sense.
> BTW, I applaud your decision to violate the other part of that rule and
> allowing any data type here.
>
>
> 5)
> The same rule as above says that the value and the default value must be
> literals but not everything that a human might consider a literal is
> accepted. In particular:
>
> with recursive a as (select 1 as b union all select b from a)
> cycle b set c to 1 default -1 using p
> table a;
>
> ERROR: syntax error at or near "-"
>
> Can we just accept a full a_expr here instead of AexprConst? Both
> DEFAULT and USING are fully reserved keywords.

This is something we need to think about. If we want to check at parse
time whether the two values are not the same (and perhaps not null),
then we either need to restrict the generality of what we can specify,
or we need to be prepared to do full expression evaluation in the
parser. A simple and practical way might be to only allow string and
boolean literal. I don't have a strong opinion here.

if you check it in parse time, then you disallow parametrization there.

Is any reason to do this check in parse time?

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: SEARCH and CYCLE clauses

From

Vik Fearing

Date:

22 May 2020, 10:40:12

On 5/22/20 11:24 AM, Peter Eisentraut wrote:
> On 2020-05-20 21:28, Vik Fearing wrote:
>> 1)
>> There are some smart quotes in the comments that should be converted to
>> single quotes.
> 
> ok, fixing that
> 
>> 2)
>> This query is an infinite loop, as expected:
>>
>>    with recursive a as (select 1 as b union all select b from a)
>>    table a;
>>
>> But it becomes an error when you add a cycle clause to it:
>>
>>    with recursive a as (select 1 as b union all table a)
>>      cycle b set c to true default false using p
>>    table a;
>>
>>    ERROR:  each UNION query must have the same number of columns
> 
> table a expands to select * from a, and if you have a cycle clause, then
> a has three columns, but the other branch of the union only has one, so
> that won't work anymore, will it?

It seems there was a copy/paste error here.  The first query should have
been the same as the second but without the cycle clause.

It seems strange to me that adding a <search or cycle clause> would
break a previously working query.  I would rather see the * expanded
before adding the new columns.  This is a user's opinion, I don't know
how hard that would be to implement.

>> 3)
>> If I take the same infinite loop query but replace the TABLE syntax with
>> a SELECT and add a cycle clause, it's not an infinite loop anymore.
>>
>>    with recursive a as (select 1 as b union all select b from a)
>>      cycle b set c to true default false using p
>>    table a;
>>
>>   b | c |     p
>> ---+---+-----------
>>   1 | f | {(1)}
>>   1 | t | {(1),(1)}
>> (2 rows)
>>
>> Why does it stop?  It should still be an infinite loop.
> 
> If you specify the cycle clause, then the processing will stop if it
> sees the same row more than once, which it did here.

Yes, this was a misplaced expectation on my part.

>> 4)
>> If I use NULL instead of false, I only get one row back.
>>
>>    with recursive a as (select 1 as b union all select b from a)
>>      cycle b set c to true default false using p
>>    table a;
>>
>>   b | c |   p
>> ---+---+-------
>>   1 |   | {(1)}
>> (1 row)
> 
> If you specify null, then the cycle check expression will always fail,
> so it will abort after the first row.  (We should perhaps prohibit
> specifying null, but see below.)

I would rather make the cycle check expression work with null.

>> 5)
>> I can set both states to the same value.
>>
>>    with recursive a as (select 1 as b union all select b from a)
>>      cycle b set c to true default true using p
>>    table a;
> 
>> This is a direct violation of 7.18 SR 2.b.ii.3 as well as common sense.
>>   BTW, I applaud your decision to violate the other part of that rule and
>> allowing any data type here.
>>
>>
>> 5)
>> The same rule as above says that the value and the default value must be
>> literals but not everything that a human might consider a literal is
>> accepted.  In particular:
>>
>>    with recursive a as (select 1 as b union all select b from a)
>>      cycle b set c to 1 default -1 using p
>>    table a;
>>
>>    ERROR:  syntax error at or near "-"
>>
>> Can we just accept a full a_expr here instead of AexprConst?  Both
>> DEFAULT and USING are fully reserved keywords.
> 
> This is something we need to think about.  If we want to check at parse
> time whether the two values are not the same (and perhaps not null),
> then we either need to restrict the generality of what we can specify,
> or we need to be prepared to do full expression evaluation in the
> parser.  A simple and practical way might be to only allow string and
> boolean literal.  I don't have a strong opinion here.


I'm with Pavel on this one.  Why does it have to be a parse-time error?
 Also, I regularly see people write functions as sort of pauper's
variables, but a function call isn't allowed here.

----

Another bug I found.  If I try to do your regression test as an
autonomous query, I get this:

    with recursive

    graph (f, t, label) as (
        values (1, 2, 'arc 1 -> 2'),
               (1, 3, 'arc 1 -> 3'),
               (2, 3, 'arc 2 -> 3'),
               (1, 4, 'arc 1 -> 4'),
               (4, 5, 'arc 4 -> 5'),
               (5, 1, 'arc 5 -> 1')
    ),

    search_graph (f, t, label) as (
        select * from graph g
        union all
        select g.*
        from graph g, search_graph sg
        where g.f = sg.t
    )
    cycle f, t set is_cycle to true default false using path

    select * from search_graph;

    ERROR:  could not find CTE "graph"

----

As an improvement over the spec, I think the vast majority of people
will be using simple true/false values.  Can we make that optional?

    CYCLE f, t SET is_cycle USING path

would be the same as

    CYCLE f, t SET is_cycle TO true DEFAULT false USING path
-- 
Vik Fearing

Re: SEARCH and CYCLE clauses

From

Peter Eisentraut

Date:

22 May 2020, 12:32:41

On 2020-05-22 12:40, Vik Fearing wrote:
>> If you specify null, then the cycle check expression will always fail,
>> so it will abort after the first row.  (We should perhaps prohibit
>> specifying null, but see below.)
> 
> I would rather make the cycle check expression work with null.

It works correctly AFAICT.  What is your expectation?

>> This is something we need to think about.  If we want to check at parse
>> time whether the two values are not the same (and perhaps not null),
>> then we either need to restrict the generality of what we can specify,
>> or we need to be prepared to do full expression evaluation in the
>> parser.  A simple and practical way might be to only allow string and
>> boolean literal.  I don't have a strong opinion here.
> 
> 
> I'm with Pavel on this one.  Why does it have to be a parse-time error?
>   Also, I regularly see people write functions as sort of pauper's
> variables, but a function call isn't allowed here.

If not parse-time error, at what time do you want to check it?

> As an improvement over the spec, I think the vast majority of people
> will be using simple true/false values.  Can we make that optional?
> 
>      CYCLE f, t SET is_cycle USING path
> 
> would be the same as
> 
>      CYCLE f, t SET is_cycle TO true DEFAULT false USING path

I was also considering that.  It would be an easy change to make.

(Apparently, in DB2 you can omit the USING path clause.  Not sure how to 
make that work, however.)

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: SEARCH and CYCLE clauses

From

Peter Eisentraut

Date:

03 July 2020, 07:08:26

While the larger patch is being considered, I think some simpler and 
separable pieces could be addressed.

Here is a patch that adjusts the existing cycle detection example and 
test queries to put the cycle column before the path column.  The CYCLE 
clause puts them in that order, and so if we added that feature that 
would make the sequence of examples more consistent and easier to follow.

(And while the order of columns has no semantic meaning, for a human 
left-to-right reader it also makes a bit more sense because the cycle 
flag is computed against the previous path value, so it happens "before" 
the path column.)

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

0001-Adjust-cycle-detection-examples-and-tests.patch

Re: SEARCH and CYCLE clauses

From

Pavel Stehule

Date:

22 September 2020, 18:29:33

út 22. 9. 2020 v 20:01 odesílatel Peter Eisentraut <peter.eisentraut@2ndquadrant.com> napsal:

I have implemented the SEARCH and CYCLE clauses.

This is standard SQL syntax attached to a recursive CTE to compute a
depth- or breadth-first ordering and cycle detection, respectively.
This is just convenience syntax for what you can already do manually.
The original discussion about recursive CTEs briefly mentioned these as
something to do later but then it was never mentioned again.

SQL specifies these in terms of syntactic transformations, and so that's
how I have implemented them also, mainly in the rewriter.

I have successfully tested this against examples I found online that
were aimed at DB2.

The contained documentation and the code comment in rewriteHandler.c
explain the details.

I am playing with this patch. It looks well, but I found some issues (example is from attached data.sql)

WITH recursive destinations (departure, arrival, connections, cost) AS
(SELECT f.departure, f.arrival, 0, price
FROM flights f
WHERE f.departure = 'New York'
UNION ALL
SELECT r.departure, b.arrival, r.connections + 1,
r.cost + b.price
FROM destinations r, flights b
WHERE r.arrival = b.departure) cycle departure, arrival set is_cycle to true default false using path

SELECT *
FROM destinations ;
;

The result is correct. When I tried to use UNION instead UNION ALL, the pg crash

Program received signal SIGABRT, Aborted.
0x00007f761338ebc5 in raise () from /lib64/libc.so.6
(gdb) bt
#0 0x00007f761338ebc5 in raise () from /lib64/libc.so.6
#1 0x00007f76133778a4 in abort () from /lib64/libc.so.6
#2 0x000000000090e7eb in ExceptionalCondition (conditionName=<optimized out>, errorType=<optimized out>, fileName=<optimized out>,
lineNumber=<optimized out>) at assert.c:67
#3 0x00000000007205e7 in generate_setop_grouplist (targetlist=targetlist@entry=0x7f75fce5d018, op=<optimized out>, op=<optimized out>)
at prepunion.c:1412
#4 0x00000000007219d0 in generate_recursion_path (pTargetList=0x7fff073ee728, refnames_tlist=<optimized out>, root=0xf90bd8, setOp=0xf90840)
at prepunion.c:502
#5 plan_set_operations (root=0xf90bd8) at prepunion.c:156
#6 0x000000000070f79b in grouping_planner (root=0xf90bd8, inheritance_update=false, tuple_fraction=<optimized out>) at planner.c:1886
#7 0x0000000000712ea7 in subquery_planner (glob=<optimized out>, parse=<optimized out>, parent_root=<optimized out>, hasRecursion=<optimized out>,
tuple_fraction=0) at planner.c:1015
#8 0x000000000071a614 in SS_process_ctes (root=0xf7abd8) at subselect.c:952
#9 0x00000000007125d4 in subquery_planner (glob=glob@entry=0xf8a010, parse=parse@entry=0xf6cf20, parent_root=parent_root@entry=0x0,
hasRecursion=hasRecursion@entry=false, tuple_fraction=tuple_fraction@entry=0) at planner.c:645
#10 0x000000000071425b in standard_planner (parse=0xf6cf20, query_string=<optimized out>, cursorOptions=256, boundParams=<optimized out>)
at planner.c:405
#11 0x00000000007e5f68 in pg_plan_query (querytree=0xf6cf20,
query_string=query_string@entry=0xea6370 "WITH recursive destinations (departure, arrival, connections, cost) AS \n (SELECT f.departure, f.arrival, 0, price\n", ' ' <repeats 12 times>, "FROM flights f \n", ' ' <repeats 12 times>, "WHERE f.departure = 'New York' \n UNION "...,
cursorOptions=cursorOptions@entry=256, boundParams=boundParams@entry=0x0) at postgres.c:875
#12 0x00000000007e6061 in pg_plan_queries (querytrees=0xf8b690,
query_string=query_string@entry=0xea6370 "WITH recursive destinations (departure, arrival, connections, cost) AS \n (SELECT f.departure, f.arrival, 0, price\n", ' ' <repeats 12 times>, "FROM flights f \n", ' ' <repeats 12 times>, "WHERE f.departure = 'New York' \n UNION "...,
cursorOptions=cursorOptions@entry=256, boundParams=boundParams@entry=0x0) at postgres.c:966
#13 0x00000000007e63b8 in exec_simple_query (
query_string=0xea6370 "WITH recursive destinations (departure, arrival, connections, cost) AS \n (SELECT f.departure, f.arrival, 0, price\n", ' ' <repeats 12 times>, "FROM flights f \n", ' ' <repeats 12 times>, "WHERE f.departure = 'New York' \n UNION "...) at postgres.c:1158
#14 0x00000000007e81e4 in PostgresMain (argc=<optimized out>, argv=<optimized out>, dbname=<optimized out>, username=<optimized out>) at postgres.c:4309
#15 0x00000000007592b9 in BackendRun (port=0xecaf20) at postmaster.c:4541
#16 BackendStartup (port=0xecaf20) at postmaster.c:4225
#17 ServerLoop () at postmaster.c:1742
#18 0x000000000075a0ed in PostmasterMain (argc=<optimized out>, argv=0xea0c90) at postmaster.c:1415
#19 0x00000000004832ec in main (argc=3, argv=0xea0c90) at main.c:209

looks so clause USING in cycle detection is unsupported for DB2 and Oracle - the examples from these databases doesn't work on PG without modifications

Regards

Pavel

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

data.sql

Re: SEARCH and CYCLE clauses

From

Pavel Stehule

Date:

22 September 2020, 18:43:58

I found another bug

create view xx as WITH recursive destinations (departure, arrival, connections, cost, itinerary) AS
(SELECT f.departure, f.arrival, 1, price,
CAST(f.departure || f.arrival AS VARCHAR(2000))
FROM flights f
WHERE f.departure = 'New York'
UNION ALL
SELECT r.departure, b.arrival, r.connections + 1 ,
r.cost + b.price, CAST(r.itinerary || b.arrival AS VARCHAR(2000))
FROM destinations r, flights b
WHERE r.arrival = b.departure)
CYCLE arrival SET cyclic_data TO '1' DEFAULT '0' using path
SELECT departure, arrival, itinerary, cyclic_data
FROM destinations ;

postgres=# select * from xx;
ERROR: attribute number 6 exceeds number of columns 5

Regards

Pavel

Re: SEARCH and CYCLE clauses

From

Peter Eisentraut

Date:

09 October 2020, 09:40:40

On 2020-09-22 20:29, Pavel Stehule wrote:
> The result is correct. When I tried to use UNION instead UNION ALL, the 
> pg crash

I fixed the crash, but UNION [DISTINCT] won't actually work here because 
row/record types are not hashable.  I'm leaving the partial support in, 
but I'm documenting it as currently not supported.

> looks so clause USING in cycle detection is unsupported for DB2 and 
> Oracle - the examples from these databases doesn't work on PG without 
> modifications

Yeah, the path clause is actually not necessary from a user's 
perspective, but it's required for internal bookkeeping.  We could 
perhaps come up with a mechanism to make it invisible coming out of the 
CTE (maybe give the CTE a target list internally), but that seems like a 
separate project.

The attached patch fixes the issues you have reported (also the view 
issue from the other email).  I have also moved the whole rewrite 
support to a new file to not blow up rewriteHandler.c so much.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

v3-0001-SEARCH-and-CYCLE-clauses.patch

Re: SEARCH and CYCLE clauses

From

Pavel Stehule

Date:

09 October 2020, 10:17:32

pá 9. 10. 2020 v 11:40 odesílatel Peter Eisentraut <peter.eisentraut@2ndquadrant.com> napsal:

On 2020-09-22 20:29, Pavel Stehule wrote:
> The result is correct. When I tried to use UNION instead UNION ALL, the
> pg crash

I fixed the crash, but UNION [DISTINCT] won't actually work here because
row/record types are not hashable. I'm leaving the partial support in,
but I'm documenting it as currently not supported.

I think so UNION is a common solution against the cycles. So missing support for this specific case is not a nice thing. How much work is needed for hashing rows. It should not be too much code.

> looks so clause USING in cycle detection is unsupported for DB2 and
> Oracle - the examples from these databases doesn't work on PG without
> modifications

Yeah, the path clause is actually not necessary from a user's
perspective, but it's required for internal bookkeeping. We could
perhaps come up with a mechanism to make it invisible coming out of the
CTE (maybe give the CTE a target list internally), but that seems like a
separate project.

The attached patch fixes the issues you have reported (also the view
issue from the other email). I have also moved the whole rewrite
support to a new file to not blow up rewriteHandler.c so much.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: SEARCH and CYCLE clauses

From

Pavel Stehule

Date:

10 October 2020, 05:25:25

pá 9. 10. 2020 v 12:17 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:

pá 9. 10. 2020 v 11:40 odesílatel Peter Eisentraut <peter.eisentraut@2ndquadrant.com> napsal:
On 2020-09-22 20:29, Pavel Stehule wrote:
> The result is correct. When I tried to use UNION instead UNION ALL, the
> pg crash

I fixed the crash, but UNION [DISTINCT] won't actually work here because
row/record types are not hashable. I'm leaving the partial support in,
but I'm documenting it as currently not supported.

I think so UNION is a common solution against the cycles. So missing support for this specific case is not a nice thing. How much work is needed for hashing rows. It should not be too much code.

> looks so clause USING in cycle detection is unsupported for DB2 and
> Oracle - the examples from these databases doesn't work on PG without
> modifications

Yeah, the path clause is actually not necessary from a user's
perspective, but it's required for internal bookkeeping. We could
perhaps come up with a mechanism to make it invisible coming out of the
CTE (maybe give the CTE a target list internally), but that seems like a
separate project.

The attached patch fixes the issues you have reported (also the view
issue from the other email). I have also moved the whole rewrite
support to a new file to not blow up rewriteHandler.c so much.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

This patch is based on transformation CYCLE and SEARCH clauses to specific expressions - it is in agreement with ANSI SQL

There is not a problem with compilation

Nobody had objections in discussion

There are enough regress tests and documentation

check-world passed

doc build passed

I'll mark this patch as ready for committer

Possible enhancing for this feature (can be done in next steps)

1. support UNION DISTINCT

2. better compatibility with Oracle and DB2 (USING clause can be optional)

Regards

Pavel

Re: SEARCH and CYCLE clauses

From

Robert Haas

Date:

12 October 2020, 20:24:00

On Fri, May 22, 2020 at 5:25 AM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> This is something we need to think about.  If we want to check at parse
> time whether the two values are not the same (and perhaps not null),
> then we either need to restrict the generality of what we can specify,
> or we need to be prepared to do full expression evaluation in the
> parser.  A simple and practical way might be to only allow string and
> boolean literal.  I don't have a strong opinion here.

I don't have an opinion on this feature, but I think doing expression
evaluation in the raw parser would be a pretty bad idea. I think we're
not supposed to do anything in the parser that involves catalog access
or even references to GUC values. It might be OK if it happens in
parse analysis rather than gram.y, but even that sounds a bit sketchy
to me. We'd need to think carefully about what effects such things
woul have on the plan cache, and whether they introduce any security
holes, and maybe some other things.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: SEARCH and CYCLE clauses

From

Anastasia Lubennikova

Date:

27 October 2020, 19:31:19

On 10.10.2020 08:25, Pavel Stehule wrote:

Hi

pá 9. 10. 2020 v 12:17 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:

pá 9. 10. 2020 v 11:40 odesílatel Peter Eisentraut <peter.eisentraut@2ndquadrant.com> napsal:
On 2020-09-22 20:29, Pavel Stehule wrote:
> The result is correct. When I tried to use UNION instead UNION ALL, the
> pg crash

I fixed the crash, but UNION [DISTINCT] won't actually work here because
row/record types are not hashable. I'm leaving the partial support in,
but I'm documenting it as currently not supported.

I think so UNION is a common solution against the cycles. So missing support for this specific case is not a nice thing. How much work is needed for hashing rows. It should not be too much code.

> looks so clause USING in cycle detection is unsupported for DB2 and
> Oracle - the examples from these databases doesn't work on PG without
> modifications

Yeah, the path clause is actually not necessary from a user's
perspective, but it's required for internal bookkeeping. We could
perhaps come up with a mechanism to make it invisible coming out of the
CTE (maybe give the CTE a target list internally), but that seems like a
separate project.

The attached patch fixes the issues you have reported (also the view
issue from the other email). I have also moved the whole rewrite
support to a new file to not blow up rewriteHandler.c so much.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

This patch is based on transformation CYCLE and SEARCH clauses to specific expressions - it is in agreement with ANSI SQL

There is not a problem with compilation
Nobody had objections in discussion
There are enough regress tests and documentation
check-world passed
doc build passed

I'll mark this patch as ready for committer

Possible enhancing for this feature (can be done in next steps)

1. support UNION DISTINCT
2. better compatibility with Oracle and DB2 (USING clause can be optional)

Regards

Pavel

Status update for a commitfest entry.
According to cfbot patch no longer applies. So I moved it to waiting on author.

-- 
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: SEARCH and CYCLE clauses

From

Peter Eisentraut

Date:

25 November 2020, 13:06:03

On 2020-10-10 07:25, Pavel Stehule wrote:
> This patch is based on transformation CYCLE and SEARCH clauses to 
> specific expressions - it is in agreement with ANSI SQL
> 
> There is not a problem with compilation
> Nobody had objections in discussion
> There are enough regress tests and documentation
> check-world passed
> doc build passed
> 
> I'll mark this patch as ready for committer
> 
> Possible enhancing for this feature (can be done in next steps)
> 
> 1. support UNION DISTINCT
> 2. better compatibility with Oracle and DB2 (USING clause can be optional)

Here is an updated patch.  New since last time:

- UNION DISTINCT is now supported (since hash_record() was added)

- Some code has been cleaned up.

- Some code has been moved from the rewriter to the parser so that 
certain errors are properly detected at parse time.

- Added more syntax checks and more tests.

- Support for dependency tracking was added (the type and operator for 
the cycle mark need to be added as dependencies).

I found a bug that nested UNIONs (foo UNION bar UNION baz) were not 
handled (would crash) in the rewriter code.  For now, I have just 
changed that to error out.  This could be fixed, it would be a localized 
change in the rewriter code in any case.  Doesn't seem important for the 
first pass, though.

-- 
Peter Eisentraut
2ndQuadrant, an EDB company
https://www.2ndquadrant.com/

Attachment

v4-0001-SEARCH-and-CYCLE-clauses.patch

Re: SEARCH and CYCLE clauses

From

Pavel Stehule

Date:

25 November 2020, 19:35:30

st 25. 11. 2020 v 14:06 odesílatel Peter Eisentraut <peter.eisentraut@2ndquadrant.com> napsal:

On 2020-10-10 07:25, Pavel Stehule wrote:
> This patch is based on transformation CYCLE and SEARCH clauses to
> specific expressions - it is in agreement with ANSI SQL
>
> There is not a problem with compilation
> Nobody had objections in discussion
> There are enough regress tests and documentation
> check-world passed
> doc build passed
>
> I'll mark this patch as ready for committer
>
> Possible enhancing for this feature (can be done in next steps)
>
> 1. support UNION DISTINCT
> 2. better compatibility with Oracle and DB2 (USING clause can be optional)

Here is an updated patch. New since last time:

- UNION DISTINCT is now supported (since hash_record() was added)

- Some code has been cleaned up.

- Some code has been moved from the rewriter to the parser so that
certain errors are properly detected at parse time.

- Added more syntax checks and more tests.

- Support for dependency tracking was added (the type and operator for
the cycle mark need to be added as dependencies).

I found a bug that nested UNIONs (foo UNION bar UNION baz) were not
handled (would crash) in the rewriter code. For now, I have just
changed that to error out. This could be fixed, it would be a localized
change in the rewriter code in any case. Doesn't seem important for the
first pass, though.

I checked this patch, and I didn't find any issue.

make check-world passed

make doc passed

I'll mark it as ready for committer

Regards

Pavel

--
Peter Eisentraut
2ndQuadrant, an EDB company
https://www.2ndquadrant.com/

Re: SEARCH and CYCLE clauses

From

Vik Fearing

Date:

08 December 2020, 16:31:09

On 5/22/20 12:40 PM, Vik Fearing wrote:
>>> 2)
>>> This query is an infinite loop, as expected:
>>>
>>>    with recursive a as (select 1 as b union all select b from a)
>>>    table a;
>>>
>>> But it becomes an error when you add a cycle clause to it:
>>>
>>>    with recursive a as (select 1 as b union all table a)
>>>      cycle b set c to true default false using p
>>>    table a;
>>>
>>>    ERROR:  each UNION query must have the same number of columns
>> table a expands to select * from a, and if you have a cycle clause, then
>> a has three columns, but the other branch of the union only has one, so
>> that won't work anymore, will it?
> It seems there was a copy/paste error here.  The first query should have
> been the same as the second but without the cycle clause.
> 
> It seems strange to me that adding a <search or cycle clause> would
> break a previously working query.  I would rather see the * expanded
> before adding the new columns.  This is a user's opinion, I don't know
> how hard that would be to implement.


After thinking about it quite a bit more, I have changed my mind on
this.  The transformation does add columns to the <with list element>
and so TABLE or SELECT * should see them.  Especially since they see
them from outside of the wle.
-- 
Vik Fearing

Re: SEARCH and CYCLE clauses

From

Vik Fearing

Date:

08 December 2020, 16:33:55

On 6/15/20 11:49 AM, Peter Eisentraut wrote:
> To fix the star expansion I had to add a little bit of infrastructure
> that could also be used as a more general facility "don't include this
> column in star expansion", so this could perhaps use some consideration
> from a more general angle as well.

Could this work be salvaged to add the ability to ALTER a column to hide
it from star expansion?  That's a feature I've often seen requested,
especially from people working with PostGIS's geometry.

Totally off-topic for this thread, though.
-- 
Vik Fearing

Re: SEARCH and CYCLE clauses

From

Peter Eisentraut

Date:

01 February 2021, 18:02:16

On 2020-11-25 20:35, Pavel Stehule wrote:
> I checked this patch, and I didn't find any issue.
> 
> make check-world passed
> make doc passed
> 
> I'll mark it as ready for committer

This has been committed.  Thanks.

-- 
Peter Eisentraut
2ndQuadrant, an EDB company
https://www.2ndquadrant.com/

Re: SEARCH and CYCLE clauses

From

Pavel Stehule

Date:

01 February 2021, 18:16:01

po 1. 2. 2021 v 19:02 odesílatel Peter Eisentraut <peter.eisentraut@2ndquadrant.com> napsal:

On 2020-11-25 20:35, Pavel Stehule wrote:
> I checked this patch, and I didn't find any issue.
>
> make check-world passed
> make doc passed
>
> I'll mark it as ready for committer

This has been committed. Thanks.

great!

Pavel

--
Peter Eisentraut
2ndQuadrant, an EDB company
https://www.2ndquadrant.com/