Thread: SELECT do not return all rows depending on selected columns

SELECT do not return all rows depending on selected columns

From
Fabrice LEGROS
Date:

Hello,

 

In attachment, there is 2 files :

  • init.sql  to create and fill a table
  • request.sql with a select on this table

 

The request return 2122 rows, if I remove the second column (v2.journalcode) from the select I get 2801 rows (which is the expected result).

Tested on "PostgreSQL 9.6.10 on x86_64-pc-linux-gnu (Debian 9.6.10-1.pgdg90+1), compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit"

 

Also tested on "PostgreSQL 10.5 (Debian 10.5-1.pgdg90+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit"

Same behavior, but with 2133 rows for the select with 2 columns.

 

Regards,

Fabrice Legros.

Attachment

Re: SELECT do not return all rows depending on selected columns

From
Tom Lane
Date:
Fabrice LEGROS <fabrice.legros@abbd.fr> writes:
> The request return 2122 rows, if I remove the second column (v2.journalcode) from the select I get 2801 rows (which
isthe expected result). 

I think your problem here is an underspecified query.  The ecriturenum
column is not unique, hence "LAG(ecriturenum) OVER ( ORDER BY ecriturenum
ASC )" is not entirely determinate, hence the set of rows selected by
"WHERE ecriturenum - ecriturenumprev != 1" isn't determinate.  I get
slightly different counts than you mention, and I imagine the apparent
dependency on the number of output columns has to do with the volume
of data passing through the sort steps being different, causing the
sorts to do different arbitrary things with equal-keyed rows.

If I change those OVER clauses to be like

LAG(ecriturenum) OVER ( ORDER BY ecriturenum ASC, line_number )

then I get determinate results (since line_number is unique).

            regards, tom lane