Thread: Abitity to identify the current iteration in a recursive SELECT (feature request)

Hi,

the algorithm present here for the resolution of WITH RECURSIVE queries is, broadly speaking, iterative, with the iteration sequence number indicated in the first row item of the field generated by the SEARCH BREADTH FIRST clause, i.e. 0, 1, 2 etc.

Is there a way to obtain directly this iteration sequence number within the SELECT statement following the UNION keyword in a recursive construction? I know it can by obtained by maintaining its value in a working table column, but that's suboptimal - I need it as a "magic" variable, akin, for example, the excluded variable available inside the ON CONFLICT DO UPDATE clause of an INSERT statement.

Background
I'm using the a recursive SELECT in order to join iteratively several (virtual) tables computed dinamically based on the iteration number. Why? In order to implement a set intersection, with each set extracted from a jsonb column of a table.
If the iteration number was a distinct column of the working table, the table joined in each iteration would need use LATERAL, thus building it for as many times as the number of rows in the working table instead of just once.

Workaround
I simulated the said missing variable with a sequence, which is cumbersome because;
  • the sequence is single-use, so it must be named randomly, created before and dropped after the WITH RECURSIVE ... SELECT statement;
  • nextval('sequence_name') cannot be used directly within WHERE clauses, because it is re-evaluated for each row; the next sequence value must be reached through a phrase like join (select nextval('sequence_name')) in order to ensure a single evaluation per iteration.
On Wed, Dec 18, 2024 at 5:00 AM <sulfinu@gmail.com> wrote:
Is there a way to obtain directly this iteration sequence number within the SELECT statement following the UNION keyword in a recursive construction? I know it can by obtained by maintaining its value in a working table column, but that's suboptimal - I need it as a "magic" variable, akin, for example, the excluded variable available inside the ON CONFLICT DO UPDATE clause of an INSERT statement.

Do you mean something like "... WHERE pg_magic_iteration_number < 10"? Looking at the source code, I don't see a trivial way to accomplish that. Maintaining the count as a column in your select is still the canonical way. As someone who writes a lot of recursive CTEs (especially each December!), I'm not sure how useful this feature would be, as the number of loops is rarely the criteria for ending the iterations.

I'm using the a recursive SELECT in order to join iteratively several (virtual) tables computed dinamically based on the iteration number.

Certainly the best solution is to use pl/pgsql, which gets you iterative loops, lots of introspection and ways to break out of the loop, and even true recursion.

Cheers,
Greg

Please read among your lines.

În mie., 18 dec. 2024 la 16:07, Greg Sabino Mullane <htamfids@gmail.com> a scris:
Do you mean something like "... WHERE pg_magic_iteration_number < 10"? Looking at the source code, I don't see a trivial way to accomplish that.

Oh, I disagree, just told you that the iteration number is readily available in the field computed by the SEARCH BREADTH FIRST clause.
 
Maintaining the count as a column in your select is still the canonical way. As someone who writes a lot of recursive CTEs (especially each December!), I'm not sure how useful this feature would be, as the number of loops is rarely the criteria for ending the iterations.
 
I never said I use the iteration number to end the process, I need it to pick the right table to be joined. If the iteration number was stored in a working table column, I would be forced to perform a LATERAL join, which recomputes the same joined table again and again for every row in the working table.
 

Certainly the best solution is to use pl/pgsql, which gets you iterative loops, lots of introspection and ways to break out of the loop, and even true recursion.

Thought about it, of course, but I'm pretty sure that plain JOINs are quicker than linear search loops written in pl/pgsql (remember I need to intersect an dynamic number of arrays) .
Am 16.12.24 um 11:21 schrieb sulfinu@gmail.com:
> the algorithm present here <https://www.postgresql.org/docs/current/
> queries-with.html#QUERIES-WITH-RECURSIVE> for the resolution of WITH
> RECURSIVE queries is, broadly speaking, iterative, with the
> iteration sequence number indicated in the first row item of the
> field generated by the SEARCH BREADTH FIRST clause,
> i.e. /0/, /1/, /2/ etc.
>
> Is there a way to obtain directly this iteration sequence number
> within the SELECT statement following the UNION keyword in a
> recursive construction? I know it can by obtained by maintaining its
> value in a working table column, but that's suboptimal - I need it
> as a "magic" variable, akin, for example, the excluded variable
> available inside the ON CONFLICT DO UPDATE clause of an INSERT
> statement.
>

I assume you don't want to do something like the following when you
refer to "working table column"?

with recursive cte as (
    select ..., 1 as level
    from ...
    union all
    select ...., parent.level + 1
    from ...
      join cte as parent on ...
)