Thread: INSERT INTO .. SELECT nextval() ORDER BY - returns unexpectedly ordered values

INSERT INTO .. SELECT nextval() ORDER BY - returns unexpectedly ordered values

From
Sjon Hortensius
Date:
It seems I have found a bug in the way postgres combines sequences and
ORDER BY with internal data ordering.

I have a table that has an `id`, where values were inserted somewhat
randomly. I wanted to re-order the rows a assign a new `id`, so I created a
sequence and did INSERT INTO .. SELECT. What I didn't understand is the
rows came out ordered correctly, but the new id's didn't. Instead of
incrementing correctly they seemed to follow the original ordering of the
rows.

I have reduced this to the following testcase:

CREATE TABLE test (
    name character varying(4),
    id smallint NOT NULL
);
CREATE TABLE test2 (like test);
ALTER TABLE test2 ADD old_id smallint;

INSERT INTO test VALUES ('c', 13);
INSERT INTO test VALUES ('d', 14);
INSERT INTO test VALUES ('a', 11);
INSERT INTO test VALUES ('b', 12);

CREATE TEMPORARY SEQUENCE tmp START 1;
INSERT INTO test2 SELECT name, nextval('tmp'), id FROM test ORDER BY id ASC;

SELECT * FROM test2;

What I expected:

name  id  old_id
a    1    11
b    2    12
c    3    13
d    4    14

What I got:

name  id  old_id
a    3    11
b    4    12
c    1    13
d    2    14

I have worked around this by clustering the old table on the new id before
SELECTing but this behavior doesn't seem to be documented, is this a bug?

Thanks,
Sjon
Sjon Hortensius <sjon@hortensius.net> writes:
> INSERT INTO test2 SELECT name, nextval('tmp'), id FROM test ORDER BY id ASC;

> I have worked around this by clustering the old table on the new id before
> SELECTing but this behavior doesn't seem to be documented, is this a bug?

No.  You're assuming that the nextval() happens after the row ordering,
but this is not necessarily so --- indeed, a strict reading of the SQL
standard would imply that it should *never* be so, because the standard
execution model is that ORDER BY happens after computing the SELECT list.
(Without that, locutions like "ORDER BY 1" would make no sense.)  It might
accidentally work if the ORDER BY were done via an indexscan rather than
an explicit sort step, but that's not possible in your test case.

Try something like this to force the evaluation order:

INSERT INTO test2
  SELECT name, nextval('tmp'), id FROM
    (SELECT name, id FROM test ORDER BY id ASC) ss;

            regards, tom lane

Re: INSERT INTO .. SELECT nextval() ORDER BY - returns unexpectedly ordered values

From
Sjon Hortensius
Date:
Thanks, that makes sense!

On Thu, Sep 3, 2015 at 2:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Sjon Hortensius <sjon@hortensius.net> writes:
> > INSERT INTO test2 SELECT name, nextval('tmp'), id FROM test ORDER BY id
> ASC;
>
> > I have worked around this by clustering the old table on the new id
> before
> > SELECTing but this behavior doesn't seem to be documented, is this a bug?
>
> No.  You're assuming that the nextval() happens after the row ordering,
> but this is not necessarily so --- indeed, a strict reading of the SQL
> standard would imply that it should *never* be so, because the standard
> execution model is that ORDER BY happens after computing the SELECT list.
> (Without that, locutions like "ORDER BY 1" would make no sense.)  It might
> accidentally work if the ORDER BY were done via an indexscan rather than
> an explicit sort step, but that's not possible in your test case.
>
> Try something like this to force the evaluation order:
>
> INSERT INTO test2
>   SELECT name, nextval('tmp'), id FROM
>     (SELECT name, id FROM test ORDER BY id ASC) ss;
>
>                         regards, tom lane
>