Thread: Proposing WITH ITERATIVE

Proposing WITH ITERATIVE

From
"Jonah H. Harris"
Date:
Hey, everyone.

It's been quite a while since I last contributed a patch but, as this is a new feature, I wanted to gather feedback before doing so. I've found this functionality, already in use at Universität Tübingen, to be both highly beneficial in many of my queries as well as a small change to Postgres - a good trade-off IMO.

As you know, Postgres currently supports SQL:1999 recursive common table expressions, constructed using WITH RECURSIVE, which permit the computation of growing relations (e.g., transitive closures.) Although it is possible to use this recursion for general-purpose iterations, doing so is a deviation from its intended use case. Accordingly, an iterative-only use of WITH RECURSIVE often results in sizable overhead and poor optimizer decisions. In this email, I'd like to propose a similar but iterative-optimized form of CTE - WITH ITERATIVE.

In that it can reference a relation within its definition, this iterative variant has similar capabilities as recursive CTEs. In contrast to recursive CTEs, however, rather than appending new tuples, this variant performs a replacement of the intermediate relation. As such, the final result consists of a relation containing tuples computed during the last iteration only. Why would we want this?

This type of computation pattern is often found in data and graph mining algorithms. In PageRank, for example, the initial ranks are updated in each iteration. In clustering algorithms, assignments of data tuples to clusters are updated in every iteration. Something these types of algorithms have in common is that they operate on fixed-size datasets, where only specific values (ranks, assigned clusters, etc.) are updated.

Rather than stopping when no new tuples are generated, WITH ITERATIVE stops when a user-defined predicate evaluates to true. By providing a non-appending iteration concept with a while-loop-style stop criterion, we can massively speed up query execution due to better optimizer decisions. Although it is possible to implement the types of algorithms mentioned above using recursive CTEs, this feature has two significant advantages:

First, iterative CTEs consume significantly less memory. Consider a CTE of N tuples and I iterations. With recursive CTEs, such a relation would grow to (N * I) tuples. With iterative CTEs, however, all prior iterations are discarded early. As such, the size of the relation would remain N, requiring only a maximum of (2 * N) tuples for comparisons of the current and the prior iteration. Furthermore, in queries where the number of executed iterations represents the stop criterion, recursive CTEs require even more additional per-tuple overhead - to carry along the iteration counter.

Second, iterative CTEs have lower query response times. Because of its smaller size, the time to scan and process the intermediate relation, to re-compute ranks, clusters, etc., is significantly reduced.

In short, iterative CTEs retain the flexibility of recursive CTEs, but offer a significant advantage for algorithms which don't require results computed throughout all iterations. As this feature deviates only slightly from WITH RECURSIVE, there is very little code required to support it (~250 loc.)

If there's any interest, I'll clean-up their patch and submit. Thoughts?

--
Jonah H. Harris

Re: Proposing WITH ITERATIVE

From
Jeff Davis
Date:
Hi,

You might get better feedback in a month or so; right now we just got
into feature freeze.

On Mon, 2020-04-27 at 12:52 -0400, Jonah H. Harris wrote:
> In that it can reference a relation within its definition, this
> iterative variant has similar capabilities as recursive CTEs. In
> contrast to recursive CTEs, however, rather than appending new
> tuples, this variant performs a replacement of the intermediate
> relation. As such, the final result consists of a relation containing
> tuples computed during the last iteration only. Why would we want
> this?

Can you illustrate with some examples? I get that one is appending and
the other is modifying in-place, but how does this end up looking in
the query language?

> This type of computation pattern is often found in data and graph
> mining algorithms.

It certainly sounds useful.

> Rather than stopping when no new tuples are generated, WITH ITERATIVE
> stops when a user-defined predicate evaluates to true.

Why stop when it evaluates to true, and not false?

> First, iterative CTEs consume significantly less memory. Consider a
> CTE of N tuples and I iterations. With recursive CTEs, such a
> relation would grow to (N * I) tuples. With iterative CTEs, however,
> all prior iterations are discarded early. As such, the size of the
> relation would remain N, requiring only a maximum of (2 * N) tuples
> for comparisons of the current and the prior iteration. Furthermore,
> in queries where the number of executed iterations represents the
> stop criterion, recursive CTEs require even more additional per-tuple 
> overhead - to carry along the iteration counter.

It seems like the benefit comes from carrying information along within
tuples (by adding to scores or counters) rather than appending tuples.
Is it possible to achieve this in other ways? The recursive CTE
implementation is a very direct implementation of the standard, perhaps
there are smarter approaches?

Regards,
    Jeff Davis




Re: Proposing WITH ITERATIVE

From
David Fetter
Date:
On Mon, Apr 27, 2020 at 12:52:48PM -0400, Jonah H. Harris wrote:
> Hey, everyone.

> If there's any interest, I'll clean-up their patch and submit. Thoughts?

Where's the current patch?

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: Proposing WITH ITERATIVE

From
"Jonah H. Harris"
Date:
On Mon, Apr 27, 2020 at 10:32 PM David Fetter <david@fetter.org> wrote:
On Mon, Apr 27, 2020 at 12:52:48PM -0400, Jonah H. Harris wrote:
> Hey, everyone.

> If there's any interest, I'll clean-up their patch and submit. Thoughts?

Where's the current patch?

It’s private. Hence, why I’m inquiring as to interest.

--
Jonah H. Harris

Re: Proposing WITH ITERATIVE

From
David Fetter
Date:
On Mon, Apr 27, 2020 at 10:44:04PM -0400, Jonah H. Harris wrote:
> On Mon, Apr 27, 2020 at 10:32 PM David Fetter <david@fetter.org> wrote:
> > On Mon, Apr 27, 2020 at 12:52:48PM -0400, Jonah H. Harris wrote:
> > > Hey, everyone.
> >
> > > If there's any interest, I'll clean-up their patch and submit. Thoughts?
> >
> > Where's the current patch?
> 
> 
> It’s private. Hence, why I’m inquiring as to interest.

Have the authors agreed to make it available to the project under a
compatible license?

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: Proposing WITH ITERATIVE

From
"Jonah H. Harris"
Date:
On Mon, Apr 27, 2020 at 11:33 PM David Fetter <david@fetter.org> wrote:

Have the authors agreed to make it available to the project under a
compatible license?

If there’s interest, obviously. Otherwise I wouldn’t be asking.

I said from the start why I wasn’t attaching a patch and that I was seeing feedback. Honestly, David, stop wasting my, and list time, asking pointless off-topic questions.

--
Jonah H. Harris

Re: Proposing WITH ITERATIVE

From
Oleksandr Shulgin
Date:
On Tue, Apr 28, 2020 at 5:49 AM Jonah H. Harris <jonah.harris@gmail.com> wrote:
On Mon, Apr 27, 2020 at 11:33 PM David Fetter <david@fetter.org> wrote:

Have the authors agreed to make it available to the project under a
compatible license?

If there’s interest, obviously. Otherwise I wouldn’t be asking.

I said from the start why I wasn’t attaching a patch and that I was seeing feedback. Honestly, David, stop wasting my, and list time, asking pointless off-topic questions.

Jonah,

I see it the other way round—it could end up as a waste of everyone's time discussing the details, if the authors don't agree to publish their code in the first place.  Of course, it could also be written from scratch, in which case I guess someone else from the community (who haven't seen that private code) would have to take a stab at it, but I believe it helps to know this in advance.

I also don't see how it "obviously" follows from your two claims: "I've found this functionality" and "I'll clean-up their patch and submit", that you have even asked (or, for that matter—found) the authors of that code.

Finally, I'd like to suggest you adopt a more constructive tone and become familiar with the project's Code of Conduct[1], if you haven't yet.  I am certain that constructive, respectful communication from your side will help the community to focus on the actual details of your proposal.

Kind regards,
--

Re: Proposing WITH ITERATIVE

From
Andreas Karlsson
Date:
On 4/27/20 6:52 PM, Jonah H. Harris wrote:> If there's any interest, 
I'll clean-up their patch and submit. Thoughts?

Hi,

Do you have any examples of queries where it would help? It is pretty 
hard to say how much value some new syntax adds without seeing how it 
improves an intended use case.

Andreas



Re: Proposing WITH ITERATIVE

From
"Jonah H. Harris"
Date:
On Tue, Apr 28, 2020 at 6:16 AM Oleksandr Shulgin <oleksandr.shulgin@zalando.de> wrote:
will help the community to focus on the actual details of your proposal.

I'd like it if the community would focus on the details of the proposal as well :)

--
Jonah H. Harris

Re: Proposing WITH ITERATIVE

From
"Jonah H. Harris"
Date:
On Tue, Apr 28, 2020 at 6:19 AM Andreas Karlsson <andreas@proxel.se> wrote:
Do you have any examples of queries where it would help? It is pretty
hard to say how much value some new syntax adds without seeing how it
improves an intended use case.

Hey, Andreas.

Thanks for the response. I'm currently working on a few examples per Jeff's response along with some benchmark information including improvements in response time and memory usage of the current implementation. In the meantime, as this functionality has been added to a couple of other databases and there's academic research on it, if you're interested, here's a few papers with examples:


--
Jonah H. Harris

Re: Proposing WITH ITERATIVE

From
Stephen Frost
Date:
Greetings Jonah!

* Jonah H. Harris (jonah.harris@gmail.com) wrote:
> On Tue, Apr 28, 2020 at 6:19 AM Andreas Karlsson <andreas@proxel.se> wrote:
>
> > Do you have any examples of queries where it would help? It is pretty
> > hard to say how much value some new syntax adds without seeing how it
> > improves an intended use case.
>
> Thanks for the response. I'm currently working on a few examples per Jeff's
> response along with some benchmark information including improvements in
> response time and memory usage of the current implementation. In the
> meantime, as this functionality has been added to a couple of other
> databases and there's academic research on it, if you're interested, here's
> a few papers with examples:
>
> http://faculty.neu.edu.cn/cc/zhangyf/papers/2018-ICDCS2018-sqloop.pdf
> http://db.in.tum.de/~passing/publications/dm_in_hyper.pdf

Nice!

One of the first question that we need to ask though, imv anyway, is- do
the other databases use the WITH ITERATIVE syntax?  How many of them?
Are there other approaches?  Has this been brought up to the SQL
committee?

In general, we really prefer to avoid extending SQL beyond the standard,
especially in ways that the standard is likely to be expanded.  In
other words, we'd really prefer to avoid the risk that the SQL committee
declares WITH ITERATIVE to mean something else in the future, causing us
to have to have a breaking change to become compliant.  Now, if all the
other major DB vendors have WITH ITERATIVE and they all work the same
way, then we can have at least some confidence that the SQL committee
will end up defining it in that way and there won't be any issue.

We do have someone on the committee who watches these lists, hopefully
they'll have a chance to comment on this.  Perhaps it's already in
discussion in the committee.

Thanks!

Stephen

Attachment

Re: Proposing WITH ITERATIVE

From
"Jonah H. Harris"
Date:
On Mon, Apr 27, 2020 at 8:50 PM Jeff Davis <pgsql@j-davis.com> wrote:
Hi,

Hey, Jeff. Long time no talk. Good to see you're still on here.

You might get better feedback in a month or so; right now we just got
into feature freeze.

Yep. No hurry. I've just been playing with this and wanted to start getting feedback. It's a side-project for me anyway, so time is limited.
 
Can you illustrate with some examples? I get that one is appending and
the other is modifying in-place, but how does this end up looking in
the query language?

I'm putting together a few concrete real-world examples.

> Rather than stopping when no new tuples are generated, WITH ITERATIVE
> stops when a user-defined predicate evaluates to true.

Why stop when it evaluates to true, and not false?

It's how they implemented it. A few other databases have implemented similar functionality but, as it's not standard, it's kinda just up to each implementor. I'm not married to that idea, but it has worked well for me so far.

It seems like the benefit comes from carrying information along within
tuples (by adding to scores or counters) rather than appending tuples.
Is it possible to achieve this in other ways? The recursive CTE
implementation is a very direct implementation of the standard, perhaps
there are smarter approaches?

Yeah, in that specific case, one of the other implementations seems to carry the counters along in the executor itself. But, as not all uses of this functionality are iteration-count-based, I think that's a little limiting. Using a terminator expression (of some kind) seems most adaptable, I think. I'll give some examples of both types of cases.

--
Jonah H. Harris

Re: Proposing WITH ITERATIVE

From
"Jonah H. Harris"
Date:
On Tue, Apr 28, 2020 at 11:51 AM Stephen Frost <sfrost@snowman.net> wrote:
Greetings Jonah!

Hey, Stephen. Hope you're doing well, man!
 
One of the first question that we need to ask though, imv anyway, is- do
the other databases use the WITH ITERATIVE syntax?  How many of them?
Are there other approaches?  Has this been brought up to the SQL
committee?

There are four that I'm aware of, but I'll put together a full list. I don't believe it's been proposed to the standards committee, but I'll see if I can find transcripts/proposals. I imagine those are all still public.

In general, we really prefer to avoid extending SQL beyond the standard,
especially in ways that the standard is likely to be expanded.  In
other words, we'd really prefer to avoid the risk that the SQL committee
declares WITH ITERATIVE to mean something else in the future, causing us
to have to have a breaking change to become compliant.

Agreed. That's my main concern as well. 
 
Now, if all the other major DB vendors have WITH ITERATIVE and they all work the same
way, then we can have at least some confidence that the SQL committee
will end up defining it in that way and there won't be any issue.

This is where it sucks, as each vendor has done it differently. One uses WITH ITERATE, one uses WITH ITERATIVE, others use, what I consider to be, a nasty operator-esque style... I think ITERATIVE makes the most sense, but it does create a future contention as that definitely seems like the syntax the committee would use as well.

Oracle ran into this issue as well, which is why they added an additional clause to WITH that permits selection of depth/breadth-first search et al. While it's kinda nasty, we could always similarly amend WITH RECURSIVE to add an additional ITERATE or something to the tail of the with_clause rule. But, that's why I'm looking for feedback :)

We do have someone on the committee who watches these lists, hopefully
they'll have a chance to comment on this.  Perhaps it's already in
discussion in the committee.

That would be awesome.

-- 
Jonah H. Harris

Re: Proposing WITH ITERATIVE

From
"Jonah H. Harris"
Date:
On Mon, Apr 27, 2020 at 8:50 PM Jeff Davis <pgsql@j-davis.com> wrote:
Can you illustrate with some examples? I get that one is appending and
the other is modifying in-place, but how does this end up looking in
the query language?

To ensure credit is given to the original authors, the initial patch I'm working with (against Postgres 11 and 12) came from Denis Hirn, Torsten Grust, and Christian Duta. Attached is a quick-and-dirty merge of that patch that applies cleanly against 13-devel. It is not solid, at all, but demonstrates the functionality. At present, my updates can be found at https://github.com/jonahharris/postgres/tree/with-iterative

As the patch makes use of additional booleans for iteration, if there's interest in incorporating this functionality, I'd like to discuss with Tom, Robert, et al regarding the current use of booleans for CTE recursion differentiation in parsing and planning. In terms of making it production-ready, I think the cleanest method would be to use a bitfield (rather than booleans) to differentiate recursive and iterative CTEs. Though, as that would touch more code, it's obviously up for discussion.

I'm working on a few good standalone examples of PageRank, shortest path, etc. But, the simplest Fibonacci example can be found here:

EXPLAIN ANALYZE
WITH RECURSIVE fib_sum (iteration, previous_number, new_number)
  AS (SELECT 1, 0::numeric, 1::numeric
       UNION ALL
      SELECT (iteration + 1), new_number, (previous_number + new_number)
        FROM fib_sum
       WHERE iteration <= 10000)
SELECT r.iteration, r.new_number
  FROM fib_sum r
 ORDER BY 1 DESC
 LIMIT 1;

                                                        QUERY PLAN                                                        
--------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=3.81..3.81 rows=1 width=36) (actual time=44.418..44.418 rows=1 loops=1)
   CTE fib_sum
     ->  Recursive Union  (cost=0.00..3.03 rows=31 width=68) (actual time=0.005..14.002 rows=10001 loops=1)
           ->  Result  (cost=0.00..0.01 rows=1 width=68) (actual time=0.004..0.004 rows=1 loops=1)
           ->  WorkTable Scan on fib_sum  (cost=0.00..0.24 rows=3 width=68) (actual time=0.001..0.001 rows=1 loops=10001)
                 Filter: (iteration <= 10000)
                 Rows Removed by Filter: 0
   ->  Sort  (cost=0.78..0.85 rows=31 width=36) (actual time=44.417..44.417 rows=1 loops=1)
         Sort Key: r.iteration DESC
         Sort Method: top-N heapsort  Memory: 27kB
         ->  CTE Scan on fib_sum r  (cost=0.00..0.62 rows=31 width=36) (actual time=0.009..42.660 rows=10001 loops=1)
 Planning Time: 0.331 ms
 Execution Time: 45.949 ms
(13 rows)

-- No ORDER/LIMIT is required with ITERATIVE as only a single tuple is present
EXPLAIN ANALYZE
WITH ITERATIVE fib_sum (iteration, previous_number, new_number)
  AS (SELECT 1, 0::numeric, 1::numeric
       UNION ALL
      SELECT (iteration + 1), new_number, (previous_number + new_number)
        FROM fib_sum
       WHERE iteration <= 10000)
SELECT r.iteration, r.new_number
  FROM fib_sum r;

                                                        QUERY PLAN                                                        
--------------------------------------------------------------------------------------------------------------------------
 CTE Scan on fib_sum r  (cost=3.03..3.65 rows=31 width=36) (actual time=11.626..11.627 rows=1 loops=1)
   CTE fib_sum
     ->  Recursive Union  (cost=0.00..3.03 rows=31 width=68) (actual time=11.621..11.621 rows=1 loops=1)
           ->  Result  (cost=0.00..0.01 rows=1 width=68) (actual time=0.001..0.001 rows=1 loops=1)
           ->  WorkTable Scan on fib_sum  (cost=0.00..0.24 rows=3 width=68) (actual time=0.001..0.001 rows=1 loops=10001)
                 Filter: (iteration <= 10000)
                 Rows Removed by Filter: 0
 Planning Time: 0.068 ms
 Execution Time: 11.651 ms
(9 rows)


--
Jonah H. Harris

Attachment

Re: Proposing WITH ITERATIVE

From
Fabien COELHO
Date:
Hello Jonah,

Nice.

> -- No ORDER/LIMIT is required with ITERATIVE as only a single tuple is
> present
> EXPLAIN ANALYZE
> WITH ITERATIVE fib_sum (iteration, previous_number, new_number)
>  AS (SELECT 1, 0::numeric, 1::numeric
>       UNION ALL
>      SELECT (iteration + 1), new_number, (previous_number + new_number)
>        FROM fib_sum
>       WHERE iteration <= 10000)
> SELECT r.iteration, r.new_number
>  FROM fib_sum r;

Nice.

My 0,02€ about the feature design:

I'm wondering about how to use such a feature in the context of WITH query 
with several queries having different behaviors. Currently "WITH" 
introduces a named-query (like a view), "WITH RECURSIVE" introduces a mix 
of recursive and named queries, pg really sees whether each one is 
recursive or not, and "RECURSIVE" is required but could just be guessed.

Now that there could be 3 variants in the mix, and for the feature to be 
orthogonal I think that it should be allowed. However, there is no obvious 
way to distinguish a RECURSIVE from an ITERATIVE, as it is more a 
behavioral thing than a structural one. This suggests allowing to tag each 
query somehow, eg before, which would be closer to the current approach:

   WITH
     foo(i) AS (simple select),
     RECURSIVE bla(i) AS (recursive select),
     ITERATIVE blup(i) AS (iterative select),

or maybe after AS, which may be clearer because closer to the actual 
query, which looks better to me:

   WITH
     foo(i) AS (simple select),
     bla(i) AS RECURSIVE (recursive select),
     boo(i) AS ITERATIVE (iterative select),
     …


Also, with 3 cases I would prefer that the default has a name so someone 
can talk about it otherwise than saying "default". Maybe SIMPLE would 
suffice, or something else. ISTM that as nothing is expected between AS 
and the open parenthesis, there is no need to have a reserved keyword, 
which is a good thing for the parser.

-- 
Fabien.

Re: Proposing WITH ITERATIVE

From
Peter Eisentraut
Date:
On 2020-04-29 07:09, Fabien COELHO wrote:
> I'm wondering about how to use such a feature in the context of WITH query
> with several queries having different behaviors. Currently "WITH"
> introduces a named-query (like a view), "WITH RECURSIVE" introduces a mix
> of recursive and named queries, pg really sees whether each one is
> recursive or not, and "RECURSIVE" is required but could just be guessed.

Yeah the RECURSIVE vs ITERATIVE is a bit of a red herring here.  As you 
say, the RECURSIVE keyword doesn't specify the processing but marks the 
fact that the specification of the query is recursive.

I think a syntax that would fit better within the existing framework 
would be something like

WITH RECURSIVE t AS (
     SELECT base case
   REPLACE ALL  -- instead of UNION ALL
     SELECT recursive case
)

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Proposing WITH ITERATIVE

From
"Jonah H. Harris"
Date:
On Wed, Apr 29, 2020 at 7:22 AM Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
Yeah the RECURSIVE vs ITERATIVE is a bit of a red herring here.  As you
say, the RECURSIVE keyword doesn't specify the processing but marks the
fact that the specification of the query is recursive.

Agreed. I started thinking through Fabien's response last night.

I think a syntax that would fit better within the existing framework
would be something like

WITH RECURSIVE t AS (
     SELECT base case
   REPLACE ALL  -- instead of UNION ALL
     SELECT recursive case
)

I was originally thinking more along the lines of Fabien's approach, but this is similarly interesting.

--
Jonah H. Harris

Re: Proposing WITH ITERATIVE

From
Corey Huinker
Date:
On Wed, Apr 29, 2020 at 10:34 AM Jonah H. Harris <jonah.harris@gmail.com> wrote:
On Wed, Apr 29, 2020 at 7:22 AM Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
Yeah the RECURSIVE vs ITERATIVE is a bit of a red herring here.  As you
say, the RECURSIVE keyword doesn't specify the processing but marks the
fact that the specification of the query is recursive.

Agreed. I started thinking through Fabien's response last night.

I think a syntax that would fit better within the existing framework
would be something like

WITH RECURSIVE t AS (
     SELECT base case
   REPLACE ALL  -- instead of UNION ALL
     SELECT recursive case
)

I was originally thinking more along the lines of Fabien's approach, but this is similarly interesting.

Obviously I'm very concerned about doing something that the SQL Standard will clobber somewhere down the road. Having said that, the recursive syntax always struck me as awkward even by SQL standards.

Perhaps something like this would be more readable

WITH t AS (
    UPDATE ( SELECT 1 AS ctr, 'x' as val )
    SET ctr = ctr + 1, val = val || 'x'
    WHILE ctr <= 100
    RETURNING ctr, val
)

The notion of an UPDATE on an ephemeral subquery isn't that special, see "subquery2" in  https://docs.oracle.com/cd/B19306_01/appdev.102/b14261/update_statement.htm , so the only syntax here without precedence is dropping a WHILE into an UPDATE statement. 

Re: Proposing WITH ITERATIVE

From
Fabien COELHO
Date:
Hello Corey, Hello Peter,

My 0.02 € about the alternative syntaxes:

Peter:

> I think a syntax that would fit better within the existing framework
> would be something like
>
> WITH RECURSIVE t AS (
>      SELECT base case
>    REPLACE ALL  -- instead of UNION ALL
>      SELECT recursive case
> )

A good point about this approach is that the replacement semantics is 
clear, whereas using ITERATIVE with UNION is very misleading, as it is 
*not* a union at all.

This said I'm wondering how the parser would react.

Moreover, having a different syntax for normal queries and inside WITH 
query looks very undesirable from a language design point of view. This
suggests that the user should be able to write it anywhere:

   SELECT 1 REPLACE SELECT 2;

Well, maybe.

I'm unclear whether "REPLACE ALL" vs "REPLACE" makes sense, ISTM that 
there could be only one replacement semantics (delete the content and 
insert a new one)?

REPLACE should have an associativity defined wrt other operators:

   SELECT 1 UNION SELECT 2 REPLACE SELECT 3; -- how many rows?

I do not see anything obvious. Probably 2 rows.

Corey:

> Obviously I'm very concerned about doing something that the SQL Standard
> will clobber somewhere down the road. Having said that, the recursive
> syntax always struck me as awkward even by SQL standards.

Indeed!

> Perhaps something like this would be more readable
>
> WITH t AS (
>    UPDATE ( SELECT 1 AS ctr, 'x' as val )
>    SET ctr = ctr + 1, val = val || 'x'
>    WHILE ctr <= 100
>    RETURNING ctr, val
> )
>
> The notion of an UPDATE on an ephemeral subquery isn't that special, see
> "subquery2" in
> https://docs.oracle.com/cd/B19306_01/appdev.102/b14261/update_statement.htm ,

I must admit that I do not like much needing another level of subquery, 
but maybe it could just be another named query in the WITH statement.

ISTM that UPDATE is quite restrictive as the number of rows cannot 
change, which does not seem desirable at all? How could I add or remove 
rows from one iteration to the next?

ISTM that the WHILE would be checked before updating, so that WHILE FALSE 
does nothing, in which case its position after SET is odd.

Having both WHERE and WHILE might look awkward.

Also it looks much more procedural this way, which is the point, but also 
depart from the declarative SELECT approach of WITH RECURSIVE.

-- 
Fabien.

Re: Proposing WITH ITERATIVE

From
Corey Huinker
Date:

> Perhaps something like this would be more readable
>
> WITH t AS (
>    UPDATE ( SELECT 1 AS ctr, 'x' as val )
>    SET ctr = ctr + 1, val = val || 'x'
>    WHILE ctr <= 100
>    RETURNING ctr, val
> )
>
> The notion of an UPDATE on an ephemeral subquery isn't that special, see
> "subquery2" in
> https://docs.oracle.com/cd/B19306_01/appdev.102/b14261/update_statement.htm ,

I must admit that I do not like much needing another level of subquery,
but maybe it could just be another named query in the WITH statement.

So like this:
WITH initial_conditions as (SELECT 1 as ctr, 'x' as val)
UPDATE initial_conditions
SET ctr = ctr + 1, val = val || 'x'
WHILE ctr <= 100
RETURNING ctr, val
 
ISTM that UPDATE is quite restrictive as the number of rows cannot
change, which does not seem desirable at all? How could I add or remove
rows from one iteration to the next?

My understanding was that maintaining a fixed number of rows was a desired feature.
 
ISTM that the WHILE would be checked before updating, so that WHILE FALSE
does nothing, in which case its position after SET is odd.

True, but having the SELECT before the FROM is equally odd.
 
Having both WHERE and WHILE might look awkward.

Maybe an UNTIL instead of WHILE?
 

Also it looks much more procedural this way, which is the point, but also
depart from the declarative SELECT approach of WITH RECURSIVE.

Yeah, just throwing it out as a possibility. Looking again at what I suggested, it looks a bit like the Oracle "CONNECT BY level <= x" idiom.

I suspect that the SQL standards body already has some preliminary work done, and we should ultimately follow that. 

Re: Proposing WITH ITERATIVE

From
"Jonah H. Harris"
Date:
On Wed, Apr 29, 2020 at 4:50 PM Corey Huinker <corey.huinker@gmail.com> wrote:
Having both WHERE and WHILE might look awkward.

Maybe an UNTIL instead of WHILE?

While I'm not a huge fan of it, one of the other databases implementing this functionality does so using the syntax:

WITH ITERATIVE R AS '(' R0 ITERATE Ri UNTIL N (ITERATIONS | UPDATES) ')' Qf

Where N in ITERATIONS represents termination at an explicit count and, in UPDATES, represents termination after Ri updates more than n rows on table R.

--
Jonah H. Harris

Re: Proposing WITH ITERATIVE

From
"Jonah H. Harris"
Date:
On Wed, Apr 29, 2020 at 5:54 PM Jonah H. Harris <jonah.harris@gmail.com> wrote:
On Wed, Apr 29, 2020 at 4:50 PM Corey Huinker <corey.huinker@gmail.com> wrote:
Having both WHERE and WHILE might look awkward.

Maybe an UNTIL instead of WHILE?

While I'm not a huge fan of it, one of the other databases implementing this functionality does so using the syntax:

WITH ITERATIVE R AS '(' R0 ITERATE Ri UNTIL N (ITERATIONS | UPDATES) ')' Qf

Where N in ITERATIONS represents termination at an explicit count and, in UPDATES, represents termination after Ri updates more than n rows on table R.

Sent too soon :) One of the main reasons I dislike the above is that it assumes N is known. In some cases, however, you really need termination upon a condition.

-- 
Jonah H. Harris

Re: Proposing WITH ITERATIVE

From
Fabien COELHO
Date:
Hello,

more random thoughts about syntax, semantics, and keeping it relational.

> While I'm not a huge fan of it, one of the other databases implementing
> this functionality does so using the syntax:
>
> WITH ITERATIVE R AS '(' R0 ITERATE Ri UNTIL N (ITERATIONS | UPDATES) ')' Qf
>
> Where N in ITERATIONS represents termination at an explicit count and, in
> UPDATES, represents termination after Ri updates more than n rows on table
> R.
>
> One of the main reasons I dislike the above is that it assumes N is 
> known. In some cases, however, you really need termination upon a 
> condition.

Yes, definitely, a (boolean?) condition is really needed, but possibly 
above N could be an expression, maybe with some separator before the 
query.

ISTM that using SELECT iterations is relational and close to the currently 
existing RECURSIVE. Separating the initialization and iterations with 
ITERATE is kind of the same approach than Peter's REPLACE, somehow, i.e. a 
new marker.

The above approach bothers me because it changes the query syntax a lot. 
The inside-WITH syntax should be the same as the normal query syntax.

First try. If we go to new markers, maybe the following, which kind of 
reuse Corey explicit condition, but replacing UPDATE with SELECT which 
makes it more generic:

  WITH R AS (
    ITERATE [STARTING] FROM R0
    WHILE/UNTIL condition REPEAT Ri
  );

Ok, it is quite procedural. It is really just a reordering of the syntax 
shown above, with a boolean condition thrown in and a heavy on (key)words 
SQL-like look and feel. It seems to make sense on a simple example:

  -- 1 by 1 count
  WITH counter(n) (
    ITERATE STARTING FROM
      SELECT 1
    WHILE n < 10 REPEAT
      SELECT n+1 FROM counter
  );

However I'm very unclear about the WHILE stuff, it makes some sense here 
because there is just one row, but what if there are severals?

  -- 2 by 2 count
  WITH counter(n) (
    ITERATE [STARTING FROM? OVER? nothing?]
      SELECT 1 UNION SELECT 2 -- cannot be empty? why not?
    WHILE n < 10 REPEAT
      -- which n it is just above?
      -- shoult it add a ANY/ALL semantics?
      -- should it really be a sub-query returning a boolean?
      -- eg: WHILE TRUE = ANY/ALL (SELECT n < 10 FROM counter)
      -- which I find pretty ugly.
      -- what else could it be?
      SELECT n+2 FROM counter
  );

Also, the overall syntax does not make much sense outside a WITH because 
one cannot reference the initial query which has no name.

Hmmm. Not very convincing:-) Let us try again.

Basically iterating is a 3 select construct: one for initializing, one for 
iterating, one for the stopping condition, with naming issues, the last 
point being exactly what WITH should solve.

by restricting the syntax to normal existing selects and moving things 
out:

   WITH stuff(n) AS
     ITERATE OVER/FROM/STARTING FROM '(' initial-sub-query ')' -- or a table?
     WHILE/UNTIL '(' condition-sub-query ')'
     -- what is TRUE/FALSE? non empty? other?
     -- WHILE/UNTIL [NOT] EXISTS '(' query ')' ??
     REPEAT/DO/LOOP/... '(' sub-query-over-stuff ')'
   );

At least the 3 sub-queries are just standard queries, only the wrapping 
around (ITERATE ... WHILE/UNTIL ... REPEAT ...) is WITH specific, which is 
somehow better than having new separators in the query syntax itself. It 
is pretty relational inside, and procedural on the outside, the two levels 
are not mixed, which is the real win from my point of view.

ISTM that the key take away from the above discussion is to keep the 
overhead syntax in WITH, it should not be moved inside the query in any 
way, like adding REPLACE or WHILE or whatever there. This way there is 
minimal interference with future query syntax extensions, there is only a 
specific WITH-level 3-query construct with pretty explicit markers.

-- 
Fabien.



Re: Proposing WITH ITERATIVE

From
Jeff Davis
Date:
On Tue, 2020-04-28 at 11:57 -0400, Jonah H. Harris wrote:
> Yeah, in that specific case, one of the other implementations seems
> to carry the counters along in the executor itself. But, as not all
> uses of this functionality are iteration-count-based, I think that's
> a little limiting. Using a terminator expression (of some kind) seems
> most adaptable, I think. I'll give some examples of both types of
> cases.

In my experience, graph algorithms or other queries doing more
specialized analysis tend to get pretty complex with lots of special
cases. Users may want to express these algorithms in a more familiar
language (R, Python, etc.), and to version the code (probably in an
extension).

Have you considered taking this to the extreme and implementing
something like User-Defined Table Operators[1]? Or is there a
motivation for writing such algorithms inline in SQL?

Regards,
    Jeff Davis

[1] http://www.vldb.org/conf/1999/P47.pdf