Re: Proposing WITH ITERATIVE - Mailing list pgsql-hackers

From Jeff Davis
Subject Re: Proposing WITH ITERATIVE
Date
Msg-id 60d9dc6e7022e9d9b19701457ab4537e033fb6eb.camel@j-davis.com
Whole thread Raw
In response to Proposing WITH ITERATIVE  ("Jonah H. Harris" <jonah.harris@gmail.com>)
Responses Re: Proposing WITH ITERATIVE  ("Jonah H. Harris" <jonah.harris@gmail.com>)
Re: Proposing WITH ITERATIVE  ("Jonah H. Harris" <jonah.harris@gmail.com>)
List pgsql-hackers
Hi,

You might get better feedback in a month or so; right now we just got
into feature freeze.

On Mon, 2020-04-27 at 12:52 -0400, Jonah H. Harris wrote:
> In that it can reference a relation within its definition, this
> iterative variant has similar capabilities as recursive CTEs. In
> contrast to recursive CTEs, however, rather than appending new
> tuples, this variant performs a replacement of the intermediate
> relation. As such, the final result consists of a relation containing
> tuples computed during the last iteration only. Why would we want
> this?

Can you illustrate with some examples? I get that one is appending and
the other is modifying in-place, but how does this end up looking in
the query language?

> This type of computation pattern is often found in data and graph
> mining algorithms.

It certainly sounds useful.

> Rather than stopping when no new tuples are generated, WITH ITERATIVE
> stops when a user-defined predicate evaluates to true.

Why stop when it evaluates to true, and not false?

> First, iterative CTEs consume significantly less memory. Consider a
> CTE of N tuples and I iterations. With recursive CTEs, such a
> relation would grow to (N * I) tuples. With iterative CTEs, however,
> all prior iterations are discarded early. As such, the size of the
> relation would remain N, requiring only a maximum of (2 * N) tuples
> for comparisons of the current and the prior iteration. Furthermore,
> in queries where the number of executed iterations represents the
> stop criterion, recursive CTEs require even more additional per-tuple 
> overhead - to carry along the iteration counter.

It seems like the benefit comes from carrying information along within
tuples (by adding to scores or counters) rather than appending tuples.
Is it possible to achieve this in other ways? The recursive CTE
implementation is a very direct implementation of the standard, perhaps
there are smarter approaches?

Regards,
    Jeff Davis




pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: [HACKERS] Restricting maximum keep segments by repslots
Next
From: James Coleman
Date:
Subject: Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays