Thread: Undocumented array_val[generate_series(...)] functionality?
Hey,
A post over in Reddit had an expression form I've never seen before:
select (array[1,2,3,4]::integer[])[generate_series(1, 3)];
===
1
2
3
Looking at subscripting in the SQL syntax this example doesn't seem to be documented.
Is it documented somewhere else, and, regardless, shouldn't it be documented in the linked location?
I get this is basically a variant of the slice syntax [n:m], and it is working as expected, but nothing indicates I can put an SRF inside the brackets.
David J.
On Jul 11, 2021, at 5:52 PM, David G. Johnston <david.g.johnston@gmail.com> wrote:
Is that expression saying something like select the first second third element from array 1,2,3,4?Hey,A post over in Reddit had an expression form I've never seen before:select (array[1,2,3,4]::integer[])[generate_series(1, 3)];===123Looking at subscripting in the SQL syntax this example doesn't seem to be documented.Is it documented somewhere else, and, regardless, shouldn't it be documented in the linked location?I get this is basically a variant of the slice syntax [n:m], and it is working as expected, but nothing indicates I can put an SRF inside the brackets.David J.
On Mon, 12 Jul 2021 at 11:52, David G. Johnston <david.g.johnston@gmail.com> wrote: > A post over in Reddit had an expression form I've never seen before: > > select (array[1,2,3,4]::integer[])[generate_series(1, 3)]; > Looking at subscripting in the SQL syntax this example doesn't seem to be documented. > > https://www.postgresql.org/docs/current/sql-expressions.html#SQL-EXPRESSIONS-SUBSCRIPTS > > Is it documented somewhere else, and, regardless, shouldn't it be documented in the linked location? Isn't this implied by "Each subscript is itself an expression"? There's nothing special here with the SRF. That just produces 3 rows and passes the subscript as 1, 2 then 3. David
On Sun, Jul 11, 2021 at 5:43 PM David Rowley <dgrowleyml@gmail.com> wrote:
Isn't this implied by "Each subscript is itself an expression"?
There's nothing special here with the SRF. That just produces 3 rows
and passes the subscript as 1, 2 then 3.
One can indeed infer that if the expression chosen for subscript is an SRF that the resultant output will also be an SRF. It seems less than helpful to force the reader to infer that, IMO. Especially when the slice syntax, which could conceivably return either a sub-array OR a set, the later being the undocumented reality (our function examples document the actual outputs, these examples probably should do the same. I would be OK if we simply added an SRF example and showed the SRF result it produces without changing the wording - though this doesn't seem like a hard improvement to make. I'll offer something up if a committer at least agrees it could use some work).
David J.
On Mon, 12 Jul 2021 at 12:58, David G. Johnston <david.g.johnston@gmail.com> wrote: > > On Sun, Jul 11, 2021 at 5:43 PM David Rowley <dgrowleyml@gmail.com> wrote: >> >> Isn't this implied by "Each subscript is itself an expression"? >> There's nothing special here with the SRF. That just produces 3 rows >> and passes the subscript as 1, 2 then 3. >> > > One can indeed infer that if the expression chosen for subscript is an SRF that the resultant output will also be an SRF. I'd say in your example the array with the subscript does not become an SRF anymore than abs() becomes an SRF in the following: select abs(generate_series(-3,-1)); abs() is simply called once per output value of the generate_series SRF. That seems fairly equivalent to me to what's going on in your example case. David
On Sun, Jul 11, 2021 at 6:06 PM David Rowley <dgrowleyml@gmail.com> wrote:
select abs(generate_series(-3,-1));
abs() is simply called once per output value of the generate_series
SRF. That seems fairly equivalent to me to what's going on in your
example case.
Fair point. Both of these are premised on two related facts:
One, the select generate_series(1,3) function call causes multiple rows to be generated where there would usually be only one. In short, SRF function calls and non-SRF function calls exhibit different behaviors on the output.
Two, composition results in an inside-to-outside execution order: the SRF is evaluated first, the additional rows added, then the outer function (abs or the subscript function respectively in these examples) is evaluated for whatever rows are now present in the result.
Is the above something one can learn from our documentation?
Is this syntax we are discouraging users from using and thus intentionally not documenting it? I do get this impression but, frankly, given the utility of " [ generate_series(...) ] " I find it hard to recommend something different to get the same result yet don't have a place to point and say "here is how and why it works".
David J.
On Mon, 12 Jul 2021 at 02:39, David G. Johnston <david.g.johnston@gmail.com> wrote: > > One, the select generate_series(1,3) function call causes multiple rows to be generated where there would usually be onlyone. Yes. > Two, composition results in an inside-to-outside execution order: the SRF is evaluated first, the additional rows added,then the outer function (abs or the subscript function respectively in these examples) is evaluated for whatever rowsare now present in the result. Yes. > Is the above something one can learn from our documentation? Yes, but only if you know where to look. > Is this syntax we are discouraging users from using and thus intentionally not documenting it? On the contrary, I would say that this is the expected behaviour, and that it is documented, though not in the most obvious place: https://www.postgresql.org/docs/current/xfunc-sql.html#XFUNC-SQL-FUNCTIONS-RETURNING-SET That's probably not the first place one would go looking for it, and might also (wrongly) imply that it only works for functions written in language SQL. BTW, this is something that started working in PG10 (in 9.6, an error is thrown), and I think that it's a result of this release note item, which matches your conclusions: Change the implementation of set-returning functions appearing in a query's SELECT list (Andres Freund) Set-returning functions are now evaluated before evaluation of scalar expressions in the SELECT list, much as though they had been placed in a LATERAL FROM-clause item. This allows saner semantics for cases where multiple set-returning functions are present. If they return different numbers of rows, the shorter results are extended to match the longest result by adding nulls. Previously the results were cycled until they all terminated at the same time, producing a number of rows equal to the least common multiple of the functions' periods. In addition, set-returning functions are now disallowed within CASE and COALESCE constructs. For more information see Section 37.4.8. Another interesting consequence of that is that it's possible to do a similar thing with the array slice syntax, and a pair of generate_series() calls, for example: SELECT (array[1,2,3,4]::int[])[generate_series(1,4) : generate_series(2,4)]; array ------- {1,2} {2,3} {3,4} (4 rows) Note: there are 4 rows in that result, and the last one is NULL, which is also consistent with the documentation, and the fact that the array slice function returns NULL if either subscript is NULL. I'd agree that there's an opportunity to improve the docs here. Regards, Dean
Dean Rasheed <dean.a.rasheed@gmail.com> writes: > BTW, this is something that started working in PG10 (in 9.6, an error > is thrown), and I think that it's a result of this release note item, > which matches your conclusions: > Change the implementation of set-returning functions appearing in > a query's SELECT list (Andres Freund) Interesting. When I first looked at this thread I figured "oh, that's always worked, nothing to see here". But you're right, we didn't use to allow SRFs in subscripts. Still, I'm with David that no new docs are needed. IMO the former restriction was the surprising thing, and the current behavior is simply what one would expect from assembling those parts in that order. regards, tom lane
On Monday, July 12, 2021, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Still, I'm with David that no new docs
are needed. IMO the former restriction was the surprising thing, and
the current behavior is simply what one would expect from assembling
those parts in that order.
I agree the material in Extending SQL - SQL Functions, plus normal expectations, mean that we do indeed cover the topic sufficiently. The location and specifics of the material and use of cross-references could use some attention though its also not an area that gets too many questions so people are figuring this all out one way or another (or not finding a need and just don’t know what they are missing…)
It doesn’t really fit that well in a syntax chapter since the calling syntax is indeed the same. But the Queries Chapter is where I was looking this past time (I knew about and read the SQL Functions Chapter previously but didn’t get there is this skim).
David J.
"David G. Johnston" <david.g.johnston@gmail.com> writes: > I agree the material in Extending SQL - SQL Functions, plus normal > expectations, mean that we do indeed cover the topic sufficiently. The > location and specifics of the material and use of cross-references could > use some attention though its also not an area that gets too many questions > so people are figuring this all out one way or another (or not finding a > need and just don’t know what they are missing…) Yeah, there is a lot of material in that chapter that is about using functions, not just writing them. I don't think anyone's entirely happy about the current factorization of those parts of the manual; but it's not entirely clear how to make it better, either. regards, tom lane