Thread: [HACKERS] generate_series regression 9.6->10
The behaviour of generate_series seems to have changed a little, at least in conjunction w/ CTEs. Under 9.6 (and prior) this query returns 2127 rows, with no nulls:
with
ij as ( select i, j from generate_series(1, 10) i, generate_series(1, 10) j),
iijj as (select generate_series(1, i) as a, generate_series(1, j) b from ij)
select a, b from iijj;
Under 10, it returns only 715 rows, with many nulls.
On 2017-05-24 10:09:19 -0700, Paul Ramsey wrote: > The behaviour of generate_series seems to have changed a little, at least > in conjunction w/ CTEs. Under 9.6 (and prior) this query returns 2127 rows, > with no nulls: > > with > ij as ( select i, j from generate_series(1, 10) i, generate_series(1, 10) > j), > iijj as (select generate_series(1, i) as a, generate_series(1, j) b from ij) > select a, b from iijj; > > Under 10, it returns only 715 rows, with many nulls. Right, that's expected - we probably need to expand on that in the release notes. Before v10 targetlist with multiple SRFs were evaluated using on a "least common multiple" logic. I.e. if you have SELECT generate_series(1,2), generate_series(1,4); once the first SRFs is exhausted it was restarted. Only once all SRFs stopped returning rows at the same time, things were stopped. Going on forward, once either SRF stops returning rows, it'll return NULL until all SRFs are exhausted. Makes sense? Is that a problem for you? If so, what do you use the LCM logic for in practical terms? Greetings, Andres Freund
Paul Ramsey <pramsey@cleverelephant.ca> writes: > The behaviour of generate_series seems to have changed a little, at least > in conjunction w/ CTEs. What's changed is the behavior of multiple SRFs in a SELECT's targetlist, cf https://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=69f4b9c85f168ae006929eec44fc44d569e846b9 specifically this comment: While moving SRF evaluation to ProjectSet would allow to retain the old "least common multiple" behavior when multipleSRFs are present in one targetlist (i.e. continue returning rows until all SRFs are at the end of their inputat the same time), we decided to instead only return rows till all SRFs are exhausted, returning NULL for alreadyexhausted ones. We deemed the previous behavior to be too confusing, unexpected and actually not particularlyuseful. I see the current v10 release notes have failed miserably at documenting this :-(. Will try to improve that. regards, tom lane
Thanks Tom. This showed up in a regression test of ours that built the test data using generate_series, so it's not a critical production issue or anything, just a surprise change in behaviour.
P.
On Wed, May 24, 2017 at 10:28 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Paul Ramsey <pramsey@cleverelephant.ca> writes:
> The behaviour of generate_series seems to have changed a little, at least
> in conjunction w/ CTEs.
What's changed is the behavior of multiple SRFs in a SELECT's targetlist,
cf
https://git.postgresql.org/gitweb/?p=postgresql.git&a= commitdiff&h= 69f4b9c85f168ae006929eec44fc44 d569e846b9
specifically this comment:
While moving SRF evaluation to ProjectSet would allow to retain the old
"least common multiple" behavior when multiple SRFs are present in one
targetlist (i.e. continue returning rows until all SRFs are at the end of
their input at the same time), we decided to instead only return rows till
all SRFs are exhausted, returning NULL for already exhausted ones. We
deemed the previous behavior to be too confusing, unexpected and actually
not particularly useful.
I see the current v10 release notes have failed miserably at documenting
this :-(. Will try to improve that.
regards, tom lane