Thread: Why does the sequence skip a number with generate_series?
I expected these numbers to be in sync, but was suprised to see that the sequence skips a values after every generate series. CREATE TABLE jefftest ( id serial, num int ); INSERT INTO jefftest (num) values (generate_series(1,10)); INSERT INTO jefftest (num) values (generate_series(11,20)); INSERT INTO jefftest (num) values (generate_series(21,30)); id | num ----+----- 1 | 1 2 | 2 3 | 3 4 | 4 5 | 5 6 | 6 7 | 7 8 | 8 9 | 9 10 | 10 12 | 11 13 | 12 14| 13 15 | 14 16 | 15 17 | 16 18 | 17 19 | 18 20 | 19 21 | 20 23 | 21 24 | 22 25 | 23 26 | 24 27 | 25 28 | 26 29 | 27 30 | 28 31 | 29 32 | 30 But, if I just use single inserts, the sequence increments by one like I expect: jefftest=# INSERT INTO jefftest (num) VALUES (1); INSERT 0 1 jefftest=# INSERT INTO jefftest (num) VALUES (2); INSERT 0 1 jefftest=# INSERT INTO jefftest (num) VALUES (3); INSERT 0 1 jefftest=# INSERT INTO jefftest (num) VALUES (4); INSERT 0 1 jefftest=# INSERT INTO jefftest (num) VALUES (5); INSERT 0 1 jefftest=# select * from jefftest; id | num ----+----- 1 | 1 2 | 2 3 | 3 4 | 4 5 | 5 (5 rows) Obviously, this doesn't hurt anything, I'm just curious why it skips one after every generate_series insert? -- Jeff Frost, Owner <jeff@frostconsultingllc.com> Frost Consulting, LLC http://www.frostconsultingllc.com/ Phone: 650-780-7908 FAX: 650-649-1954
On Tue, 2 Oct 2007, Jeff Frost wrote: > I expected these numbers to be in sync, but was suprised to see that the > sequence skips a values after every generate series. > > CREATE TABLE jefftest ( id serial, num int ); > INSERT INTO jefftest (num) values (generate_series(1,10)); > INSERT INTO jefftest (num) values (generate_series(11,20)); > INSERT INTO jefftest (num) values (generate_series(21,30)); It seems to do what you'd expect if you doINSERT INTO jefftest(num) select a from generate_series(1,10) as foo(a);INSERTINTO jefftest(num) select a from generate_series(11,20) as foo(a);INSERT INTO jefftest(num) select a from generate_series(21,30)as foo(a); I tried a function that raises a notice and called it asselect f1(1), generate_series(1,10); and got 11 notices so it looks like there's some kind of phantom involved.
Stephan Szabo wrote: > On Tue, 2 Oct 2007, Jeff Frost wrote: > >> I expected these numbers to be in sync, but was suprised to see that the >> sequence skips a values after every generate series. >> >> CREATE TABLE jefftest ( id serial, num int ); >> INSERT INTO jefftest (num) values (generate_series(1,10)); >> INSERT INTO jefftest (num) values (generate_series(11,20)); >> INSERT INTO jefftest (num) values (generate_series(21,30)); > > It seems to do what you'd expect if you do > INSERT INTO jefftest(num) select a from generate_series(1,10) as foo(a); > INSERT INTO jefftest(num) select a from generate_series(11,20) as foo(a); > INSERT INTO jefftest(num) select a from generate_series(21,30) as foo(a); > > I tried a function that raises a notice and called it as > select f1(1), generate_series(1,10); > and got 11 notices so it looks like there's some kind of phantom involved. > That's interesting - might need an answer from the core hackers. I am posting this to pgsql-hackers to get their comments and feedback. I wouldn't count it as a bug but it could be regarded as undesirable side effects. My guess is that what appears to happen is that the sequence is created by incrementing as part of the insert steps and the test to check the end of the sequence is - if last_inserted_number > end_sequence_numberrollback_last_insert This would explain the skip in sequence numbers. My thoughts are that - if last_inserted_number < end_sequence_numberinsert_again would be a better way to approach this. Of course you would also need to check that the (last_insert + step_size) isn't greater than the end_sequence_number when the step_size is given. I haven't looked at the code so I don't know if that fits easily into the flow of things. The as foo(a) test would fit this as the sequence is generated into the equivalent of a temporary table the same as a subselect, then used as insert data. The rollback would be applied during the temporary table generation so won't show when the data is copied across to fulfill the insert. Maybe the planner or the generate series function could use a temporary table to give the same results as select from generate_series() -- Shane Ambler pgSQL@Sheeky.Biz Get Sheeky @ http://Sheeky.Biz
Shane Ambler wrote: > Stephan Szabo wrote: >> On Tue, 2 Oct 2007, Jeff Frost wrote: >>> I expected these numbers to be in sync, but was suprised to see that the >>> sequence skips a values after every generate series. >>> >>> CREATE TABLE jefftest ( id serial, num int ); >>> INSERT INTO jefftest (num) values (generate_series(1,10)); >>> INSERT INTO jefftest (num) values (generate_series(11,20)); >>> INSERT INTO jefftest (num) values (generate_series(21,30)); >> It seems to do what you'd expect if you do >> INSERT INTO jefftest(num) select a from generate_series(1,10) as foo(a); >> INSERT INTO jefftest(num) select a from generate_series(11,20) as foo(a); >> INSERT INTO jefftest(num) select a from generate_series(21,30) as foo(a); >> I tried a function that raises a notice and called it as >> select f1(1), generate_series(1,10); >> and got 11 notices so it looks like there's some kind of phantom involved. > > That's interesting - might need an answer from the core hackers. > I am posting this to pgsql-hackers to get their comments and feedback. > I wouldn't count it as a bug but it could be regarded as undesirable side > effects. Don't use set-returning functions in "scalar context". If you put them in the FROM clause, as Stephan says above, it works fine. Anywhere else they have strange behavior and they are supported only because of backwards compatibility. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes: > Shane Ambler wrote: >>> CREATE TABLE jefftest ( id serial, num int ); >>> INSERT INTO jefftest (num) values (generate_series(1,10)); >>> INSERT INTO jefftest (num) values (generate_series(11,20)); >>> INSERT INTO jefftest (num) values (generate_series(21,30)); > Don't use set-returning functions in "scalar context". I think what is actually happening is that the expanded targetlist is nextval('seq'), generate_series(1,10) On the eleventh iteration, generate_series() returns ExprEndResult to show that it's done ... but the 11th nextval() call already happened. If you switched the columns around, you wouldn't get the extra call. If you think that's bad, the behavior with multiple set-returning functions in the same targetlist is even stranger. The whole thing is a mess and certainly not something we would've invented if we hadn't inherited it from Berkeley. regards, tom lane