Re: BUG #6131: Query Returning Incorrect Results - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #6131: Query Returning Incorrect Results
Date
Msg-id 19689.1311723714@sss.pgh.pa.us
Whole thread Raw
In response to BUG #6131: Query Returning Incorrect Results  ("David Johnston" <polobo@yahoo.com>)
Responses Re: BUG #6131: Query Returning Incorrect Results
List pgsql-bugs
"David Johnston" <polobo@yahoo.com> writes:
> The embedded script exhibits some strange behavior.  When the query is run
> the [adjustment_paid] column for [technician] "6338B" should show +/- 25.00.
>  Instead, if I run the last query immediately after creating the schema and
> inserting the data the results I get for that technician is 0.00 for both
> records.  However, if I continue to run the query, or explicitly run ANALYZE
> , I then get the correct results.

Interesting example.  I don't believe it's a Postgres bug though, but
rather indeterminism in your query.  If you pull out the subquery that
is being fed to the window aggregate (row_number()):

SELECT s_id, date_reference, accountnumber, technician, rate_paid, COUNT(*) AS rate_count
FROM laborwip_payroll_ticket
GROUP BY s_id, date_reference, accountnumber, technician, rate_paid
ORDER BY s_id, date_reference, accountnumber, technician, rate_count DESC;

you will find that it produces slightly different output row ordering
before and after the ANALYZE.  I get

      s_id       | date_reference | accountnumber | technician | rate_paid | rate_count
-----------------+----------------+---------------+------------+-----------+------------
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 500A       |     15.00 |          2
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 500B       |     15.00 |          1
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6016       |     10.00 |          1
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6120       |     10.00 |          3
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6124       |     15.00 |          5
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6126       |     20.00 |          9
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6126       |     25.00 |          1
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6287       |     10.00 |          4
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6338       |     25.00 |          2
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6338B      |      0.00 |          2
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6338B      |     25.00 |          2
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6338B      |     10.00 |          1
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 7001       |     10.00 |          1
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 7002       |     10.00 |          2
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 7003       |     10.00 |          2
 [TESTSG]STORE02 | 1976-06-15     | LABORWIP      | 600A       |     10.00 |          4
 [TESTSG]STORE02 | 1976-06-15     | LABORWIP      | 600A       |      0.00 |          2
 [TESTSG]STORE02 | 1976-06-15     | LABORWIP      | 600B       |      0.00 |          1
(18 rows)

versus

      s_id       | date_reference | accountnumber | technician | rate_paid | rate_count
-----------------+----------------+---------------+------------+-----------+------------
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 500A       |     15.00 |          2
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 500B       |     15.00 |          1
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6016       |     10.00 |          1
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6120       |     10.00 |          3
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6124       |     15.00 |          5
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6126       |     20.00 |          9
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6126       |     25.00 |          1
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6287       |     10.00 |          4
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6338       |     25.00 |          2
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6338B      |     25.00 |          2
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6338B      |      0.00 |          2
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6338B      |     10.00 |          1
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 7001       |     10.00 |          1
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 7002       |     10.00 |          2
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 7003       |     10.00 |          2
 [TESTSG]STORE02 | 1976-06-15     | LABORWIP      | 600A       |     10.00 |          4
 [TESTSG]STORE02 | 1976-06-15     | LABORWIP      | 600A       |      0.00 |          2
 [TESTSG]STORE02 | 1976-06-15     | LABORWIP      | 600B       |      0.00 |          1
(18 rows)

Note in particular that the first two rows for technician 6338B are
presented in different orders.  This is not a bug because they are not
distinct so far as the ORDER BY clause is concerned.  However, when
you feed them to the window aggregate stuff:

SELECT ...,
    (row_number() OVER (PARTITION BY s_id, date_reference, accountnumber,
     technician ORDER BY rate_count DESC)) AS rate_index
    FROM the-query-above
WHERE rate_index = 1

... you get only the first one of those two rows.  And then your upper
query's results vary depending on which rate_paid you got.  So basically
you need to add more columns to the window aggregate PARTITION/ORDER BY
clauses to make the result more deterministic.

BTW, the reason ANALYZE causes the row order to change is that it
affects the size of the hash table that the HashAggregate step uses.
That's not tremendously relevant to fixing the problem, but just in
case you were wondering.

            regards, tom lane

pgsql-bugs by date:

Previous
From: noordsij
Date:
Subject: Re: BUG #6086: [SOLVED] Segmentation fault
Next
From: "David Johnston"
Date:
Subject: Re: BUG #6131: Query Returning Incorrect Results