On 23/08/16 09:40, Andres Freund wrote:
> Hi,
>
> as noted in [1] I started hacking on removing the current implementation
> of SRFs in the targetlist (tSRFs henceforth). IM discussion brought the
> need for a description of the problem, need and approach to light.
>
> There are several reasons for wanting to get rid of tSRFs. The primary
> ones in my opinion are that the current behaviour of several SRFs in one
> targetlist is confusing, and that the implementation burden currently is
> all over the executor. Especially the latter is what is motivating me
> working on this, because it blocks my work on making the executor faster
> for queries involving significant amounts of tuples. Batching is hard
> if random places in the querytree can icnrease the number of tuples.
>
> The basic idea, hinted at in several threads, is, at plan time, to convert a query like
> SELECT generate_series(1, 10);
> into
> SELECT generate_series FROM ROWS FROM(generate_series(1, 10));
>
> thereby avoiding the complications in the executor (c.f. execQual.c
> handling of isDone/ExprMultipleResult and supporting code in many
> executor nodes / node->*.ps.ps_TupFromTlist).
>
> There are several design questions along the way:
>
> 1) How to deal with the least-common-multiple behaviour of tSRFs. E.g.
> =# SELECT generate_series(1, 3), generate_series(1,2);
> returning
> ┌─────────────────┬─────────────────┐
> │ generate_series │ generate_series │
> ├─────────────────┼─────────────────┤
> │ 1 │ 1 │
> │ 2 │ 2 │
> │ 3 │ 1 │
> │ 1 │ 2 │
> │ 2 │ 1 │
> │ 3 │ 2 │
> └─────────────────┴─────────────────┘
> (6 rows)
> but
> =# SELECT generate_series(1, 3), generate_series(5,7);
> returning
> ┌─────────────────┬─────────────────┐
> │ generate_series │ generate_series │
> ├─────────────────┼─────────────────┤
> │ 1 │ 5 │
> │ 2 │ 6 │
> │ 3 │ 7 │
> └─────────────────┴─────────────────┘
>
> discussion in this thread came, according to my reading, to the
> conclusion that that behaviour is just confusing and that the ROWS FROM
> behaviour of
> =# SELECT * FROM ROWS FROM(generate_series(1, 3), generate_series(1,2));
> ┌─────────────────┬─────────────────┐
> │ generate_series │ generate_series │
> ├─────────────────┼─────────────────┤
> │ 1 │ 1 │
> │ 2 │ 2 │
> │ 3 │ (null) │
> └─────────────────┴─────────────────┘
> (3 rows)
>
> makes more sense.
I had always implicitly assumed that having 2 generated sequences would
act as equivalent to:
SELECT sa, sb
FROM ROWS FROM(generate_series(1, 3)) AS sa, ROWS FROM(generate_series(5, 7)) AS sb
ORDER BY sa, sb;
sa | sb
----+---- 1 | 5 1 | 6 1 | 7 2 | 5 2 | 6 2 | 7 3 | 5 3 | 6 3 | 7
Obviously I was wrong - but to me, my implicit assumption makes more sense!
[...]
Cheers,
Gavin