Thread: BUG #6131: Query Returning Incorrect Results

BUG #6131: Query Returning Incorrect Results

From
"David Johnston"
Date:
The following bug has been logged online:

Bug reference:      6131
Logged by:          David Johnston
Email address:      polobo@yahoo.com
PostgreSQL version: 9.0.4
Operating system:   Windows 7 64-bit
Description:        Query Returning Incorrect Results
Details:

The embedded script exhibits some strange behavior.  When the query is run
the [adjustment_paid] column for [technician] "6338B" should show +/- 25.00.
 Instead, if I run the last query immediately after creating the schema and
inserting the data the results I get for that technician is 0.00 for both
records.  However, if I continue to run the query, or explicitly run ANALYZE
, I then get the correct results.

BTW, the other two records (techs 500A and 600A) show the expected values
each time the query is run.  Also, only those four records are expected.

While I understand that ANALYZE should be run after loading in data I should
not get blatantly incorrect results - at worse I should suffer a performance
hit; though with only 40 or so records in the table performance is not
really a concern.

I can get this behavior to manifest consistently on my machine:

PostgreSQL 9.0.4, compiled by Visual C++ build 1500, 64-bit

FWIW, I execute all the following from a commercial GUI called PostgreSQL
Maestro.

Thanks,

David J.

--Begin Script

DROP VIEW IF EXISTS laborwip_payroll_entry_adjusted_ticket;
DROP VIEW IF EXISTS laborwip_payroll_technician_effectiverate;
DROP TABLE IF EXISTS laborwip_payroll_ticket;

CREATE TABLE laborwip_payroll_ticket (

    s_id varchar NOT NULL,

    accountnumber varchar NOT NULL,
    date_reference date NOT NULL,
    ticketnumber varchar NOT NULL,
    technician varchar NOT NULL,
    date_paid date NOT NULL,
    detail_category varchar NOT NULL CHECK (detail_category IN
('++','--','+0','0+','-0','0-')),

    hours_paid numeric(7,2) NOT NULL,
    amount_paid numeric(7,2) NOT NULL,
    rate_paid numeric(7,2) NULL,

    CONSTRAINT laborwip_payroll_ticket_houramountconsistency
        CHECK ((amount_paid >= 0 AND hours_paid >= 0) OR
               (amount_paid <= 0 AND hours_paid <= 0)),

    PRIMARY KEY (s_id, accountnumber, date_reference, ticketnumber, technician,
date_paid, detail_category)

);



INSERT INTO laborwip_payroll_ticket (s_id, accountnumber, date_reference,
ticketnumber, technician, date_paid, detail_category, amount_paid,
hours_paid, rate_paid) VALUES
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '318421', '7001',
'1976-06-04', '++', 25, 2.5, 10.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '318426', '6120',
'1976-06-01', '++', 10, 1, 10.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '318427', '6120',
'1976-06-02', '++', 15, 1.5, 10.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '318210', '6124',
'1976-06-02', '++', 30, 2, 15.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '318322', '6126',
'1976-06-01', '++', 10, 0.5, 20.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '318361', '6126',
'1976-06-01', '++', 30, 1.5, 20.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '318285', '6126',
'1976-06-01', '++', 10, 0.5, 20.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '318227', '6124',
'1976-06-01', '++', 45, 3, 15.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '318396', '6126',
'1976-06-02', '++', 80, 4, 20.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '318431', '6126',
'1976-06-02', '++', 50, 2.5, 20.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '317873', '6124',
'1976-06-02', '++', 15, 1, 15.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '318421', '6126',
'1976-06-02', '++', 50, 2.5, 20.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '318262', '6287',
'1976-06-01', '++', 2, 0.2, 10.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '318352', '6126',
'1976-06-01', '++', 25, 1, 25.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '318364', '6126',
'1976-06-02', '++', 30, 1.5, 20.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '318210', '6124',
'1976-06-01', '++', 45, 3, 15.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '318267', '7003',
'1976-06-04', '--', -10, -1, 10.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '318274', '6287',
'1976-06-01', '++', 2, 0.2, 10.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '318412', '6120',
'1976-06-02', '++', 5, 0.5, 10.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '318182', '6338',
'1976-06-02', '++', 25, 1, 25.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '318279', '6338B',
'1976-06-01', '++', 25, 1, 25.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '318274', '7002',
'1976-06-04', '--', -30, -3, 10.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '318379', '6126',
'1976-06-02', '++', 30, 1.5, 20.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '318349', '6126',
'1976-06-01', '++', 20, 1, 20.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '318257', '6287',
'1976-06-01', '++', 4, 0.4, 10.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '318213', '6016',
'1976-06-03', '++', 20, 1, 10.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '318254', '6124',
'1976-06-01', '++', 45, 3, 15.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '318431', '7002',
'1976-06-04', '++', 20, 2, 10.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '318262', '7003',
'1976-06-04', '--', -20, -2, 10.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '318182', '6338',
'1976-06-01', '++', 75, 3, 25.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '318279', '6338B',
'1976-06-02', '++', 25, 1, 25.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '318267', '6287',
'1976-06-01', '++', 2, 0.2, 10.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '400000', '6338B',
'1976-06-01', '+0', 0, 1, 0.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '400001', '6338B',
'1976-06-01', '-0', 0, -1, 0.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '400002', '6338B',
'1976-06-05', '++', 50, 5, 10.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '400003', '500A',
'1976-06-05', '++', 120, 8, 15.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '400003', '500A',
'1976-06-05', '+0', 0, 2, 15.00),
  ('[TESTSG]STORE01', 'LABORWIP', '1976-06-15', '400003', '500B',
'1976-06-05', '++', 150, 10, 15.00),

  ('[TESTSG]STORE02', 'LABORWIP', '1976-06-15', '6000', '600A',
'1976-06-05', '++', 100.00, 10, 10.00),
  ('[TESTSG]STORE02', 'LABORWIP', '1976-06-15', '6001', '600A',
'1976-06-06', '++', 20.00, 2, 10.00),
  ('[TESTSG]STORE02', 'LABORWIP', '1976-06-15', '6002', '600A',
'1976-06-07', '++', 30.00, 3, 10.00),
  ('[TESTSG]STORE02', 'LABORWIP', '1976-06-15', '6000', '600A',
'1976-06-05', '--', -30.00, -3, 10.00),
  ('[TESTSG]STORE02', 'LABORWIP', '1976-06-15', '6000', '600A',
'1976-06-05', '+0', 0.00, 1, 0.00),
  ('[TESTSG]STORE02', 'LABORWIP', '1976-06-15', '6000', '600A',
'1976-06-05', '0+', 20.00, 0, 0.00),
  ('[TESTSG]STORE02', 'LABORWIP', '1976-06-15', '6003', '600B',
'1976-06-05', '0+', 20.00, 0, 0.00)
  ;

CREATE OR REPLACE VIEW laborwip_payroll_technician_effectiverate AS
    SELECT s_id, date_reference, accountnumber, technician,
rate_paid::numeric(7,2) AS technician_rate
    FROM (
        SELECT s_id, date_reference, accountnumber, technician, rate_paid,
rate_count,
                (row_number() OVER (PARTITION BY s_id, date_reference, accountnumber,
technician ORDER BY rate_count DESC)) AS rate_index
        FROM (
            SELECT s_id, date_reference, accountnumber, technician, rate_paid,
COUNT(*) AS rate_count
            FROM laborwip_payroll_ticket
            --WHERE detail_category IN ( '++', '--' )
            GROUP BY s_id, date_reference, accountnumber, technician, rate_paid
        ) calc
    ) ranking
    WHERE rate_index = 1
    ;

;


CREATE VIEW laborwip_payroll_entry_adjusted_ticket AS
    SELECT
      s_id, date_reference, accountnumber, ticketnumber, technician,
      (SUM(hours_paid) * technician_rate)::numeric(10,2) AS
adjustment_paid,
      'ADJ' || CASE WHEN SUM(hours_paid) < 0.00 THEN '-' ELSE '+' END ||
'H'::varchar AS description

    FROM (SELECT s_id, date_reference, accountnumber, technician, ticketnumber,

                 hours_paid, detail_category
                 FROM laborwip_payroll_ticket
                 WHERE detail_category IN ('-0','+0')
    ) ticket_info
    NATURAL LEFT JOIN laborwip_payroll_technician_effectiverate
    GROUP BY s_id, date_reference, accountnumber, ticketnumber,
detail_category, technician, technician_rate

;

--DO NOT RUN ANALYZE BEFORE EXECUTING THE FOLLOWING
SELECT * FROM laborwip_payroll_entry_adjusted_ticket;
--You should see the incorrect (0.00) results here
ANALYZE;

SELECT * FROM laborwip_payroll_entry_adjusted_ticket;
--And now you should have the correct (+/- 25.00) results
--End Script

Re: BUG #6131: Query Returning Incorrect Results

From
Tom Lane
Date:
"David Johnston" <polobo@yahoo.com> writes:
> The embedded script exhibits some strange behavior.  When the query is run
> the [adjustment_paid] column for [technician] "6338B" should show +/- 25.00.
>  Instead, if I run the last query immediately after creating the schema and
> inserting the data the results I get for that technician is 0.00 for both
> records.  However, if I continue to run the query, or explicitly run ANALYZE
> , I then get the correct results.

Interesting example.  I don't believe it's a Postgres bug though, but
rather indeterminism in your query.  If you pull out the subquery that
is being fed to the window aggregate (row_number()):

SELECT s_id, date_reference, accountnumber, technician, rate_paid, COUNT(*) AS rate_count
FROM laborwip_payroll_ticket
GROUP BY s_id, date_reference, accountnumber, technician, rate_paid
ORDER BY s_id, date_reference, accountnumber, technician, rate_count DESC;

you will find that it produces slightly different output row ordering
before and after the ANALYZE.  I get

      s_id       | date_reference | accountnumber | technician | rate_paid | rate_count
-----------------+----------------+---------------+------------+-----------+------------
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 500A       |     15.00 |          2
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 500B       |     15.00 |          1
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6016       |     10.00 |          1
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6120       |     10.00 |          3
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6124       |     15.00 |          5
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6126       |     20.00 |          9
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6126       |     25.00 |          1
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6287       |     10.00 |          4
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6338       |     25.00 |          2
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6338B      |      0.00 |          2
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6338B      |     25.00 |          2
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6338B      |     10.00 |          1
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 7001       |     10.00 |          1
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 7002       |     10.00 |          2
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 7003       |     10.00 |          2
 [TESTSG]STORE02 | 1976-06-15     | LABORWIP      | 600A       |     10.00 |          4
 [TESTSG]STORE02 | 1976-06-15     | LABORWIP      | 600A       |      0.00 |          2
 [TESTSG]STORE02 | 1976-06-15     | LABORWIP      | 600B       |      0.00 |          1
(18 rows)

versus

      s_id       | date_reference | accountnumber | technician | rate_paid | rate_count
-----------------+----------------+---------------+------------+-----------+------------
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 500A       |     15.00 |          2
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 500B       |     15.00 |          1
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6016       |     10.00 |          1
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6120       |     10.00 |          3
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6124       |     15.00 |          5
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6126       |     20.00 |          9
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6126       |     25.00 |          1
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6287       |     10.00 |          4
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6338       |     25.00 |          2
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6338B      |     25.00 |          2
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6338B      |      0.00 |          2
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 6338B      |     10.00 |          1
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 7001       |     10.00 |          1
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 7002       |     10.00 |          2
 [TESTSG]STORE01 | 1976-06-15     | LABORWIP      | 7003       |     10.00 |          2
 [TESTSG]STORE02 | 1976-06-15     | LABORWIP      | 600A       |     10.00 |          4
 [TESTSG]STORE02 | 1976-06-15     | LABORWIP      | 600A       |      0.00 |          2
 [TESTSG]STORE02 | 1976-06-15     | LABORWIP      | 600B       |      0.00 |          1
(18 rows)

Note in particular that the first two rows for technician 6338B are
presented in different orders.  This is not a bug because they are not
distinct so far as the ORDER BY clause is concerned.  However, when
you feed them to the window aggregate stuff:

SELECT ...,
    (row_number() OVER (PARTITION BY s_id, date_reference, accountnumber,
     technician ORDER BY rate_count DESC)) AS rate_index
    FROM the-query-above
WHERE rate_index = 1

... you get only the first one of those two rows.  And then your upper
query's results vary depending on which rate_paid you got.  So basically
you need to add more columns to the window aggregate PARTITION/ORDER BY
clauses to make the result more deterministic.

BTW, the reason ANALYZE causes the row order to change is that it
affects the size of the hash table that the HashAggregate step uses.
That's not tremendously relevant to fixing the problem, but just in
case you were wondering.

            regards, tom lane

Re: BUG #6131: Query Returning Incorrect Results

From
"David Johnston"
Date:
-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Tuesday, July 26, 2011 7:42 PM
To: David Johnston
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] BUG #6131: Query Returning Incorrect Results

"David Johnston" <polobo@yahoo.com> writes:
> The embedded script exhibits some strange behavior.  When the query is
> run the [adjustment_paid] column for [technician] "6338B" should show +/-
25.00.
>  Instead, if I run the last query immediately after creating the
> schema and inserting the data the results I get for that technician is
> 0.00 for both records.  However, if I continue to run the query, or
> explicitly run ANALYZE , I then get the correct results.

Interesting example.  I don't believe it's a Postgres bug though, but rather
indeterminism in your query.  If you pull out the subquery that is being fed
to the window aggregate (row_number()):

SELECT s_id, date_reference, accountnumber, technician, rate_paid, COUNT(*)
AS rate_count FROM laborwip_payroll_ticket GROUP BY s_id, date_reference,
accountnumber, technician, rate_paid ORDER BY s_id, date_reference,
accountnumber, technician, rate_count DESC;

you will find that it produces slightly different output row ordering before
and after the ANALYZE.  I get

... you get only the first one of those two rows.  And then your upper
query's results vary depending on which rate_paid you got.  So basically you
need to add more columns to the window aggregate PARTITION/ORDER BY clauses
to make the result more deterministic.

-----------------------------------------

Now I feel like a schmuck...sorry for the noise.  I should/do know better
but my mind is fried.  Thank you so much for the quick response.

David J.