BUG #13592: Optimizer throws out join constraint causing incorrect result - Mailing list pgsql-bugs

From kristoffer.gustafsson@yves-rocher.se
Subject BUG #13592: Optimizer throws out join constraint causing incorrect result
Date
Msg-id 20150826195031.2091.40681@wrigleys.postgresql.org
Whole thread Raw
Responses Re: BUG #13592: Optimizer throws out join constraint causing incorrect result  ("David G. Johnston" <david.g.johnston@gmail.com>)
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      13592
Logged by:          Kristoffer Gustafsson
Email address:      kristoffer.gustafsson@yves-rocher.se
PostgreSQL version: 9.4.3
Operating system:   Linux
Description:

"A program produces the wrong output for any given input."

I’m including a description of the encountered issue because I can’t provide
you with a reproducible set of data. Both because of the data in question
but also because the last step ”corrected” the problem at hand and removed
the information showcasing the issue.

Three tables:
TableA: A_PK_BI, B_PK_BI_FK, A_DEF01_BI, A_VAL01_DP, A_VAL02_DP, A_VAL03_DP,
A_TOT_DP
TableB: B_PK_BI, C_PK_BI_FK
TableC: C_PK_BI, C_VAL01_DP, C_VAL02_DP

TableA is very large and reside in schema S01.
TableB small, around 6-9 entries and reside in schema S01.
TableC small, around 30-60 entries and reside in schema S02.

PK = Primary key
FK = Foreign key
BI = BigInt
DP = Double precision

select coalesce(sum(a.A_TOT_DP) / 1000, 0) as someVal
from
    TableA a
    inner join TableB b on a.B_PK_BI = b.B_PK_BI
    inner join TableC c on b.C_PK_BI = c.C_PK_BI
where
    a.A_DEF01_BI in (:LIST_OF_VALUES)
    and a.B_PK_BI_FK = :B_PRIMARY_KEY
    and (
        (((a.A_VAL01_DP + a.A_VAL02_DP) / c.C_VAL01_DP) > :THRESHOLD_01)
    or
        (((a.A_VAL03_DP - a.A_VAL02_DP) / c.C_VAL01_DP) > :THRESHOLD_02)
    )

Described as:
The selected entries from TableA
which can be linked to an entry in TableB
which has related information in TableC
And that fulfills these criteria.

Except suddenly that resulted in division by zero. Division is done in three
places, one fixed value and twice from TableC where each entry linked by
TableB has values greater than 0. I.e. none of the supposedly included
entries should be able to result in division by zero.

But, it seems the optimizer decided to restructure the whole thing to skip
the join relationship and explode TableA with TableC before using TableB as
a filter for the final aggregate.

I.e. appear to change the from clause to this in an early step:
select *
from
    TableA a,
    TableC c
where
    a.A_DEF01_BI in (:LIST_OF_VALUES)
    and a.B_PK_BI_FK = :B_PRIMARY_KEY
    and (
        (((a.A_VAL01_DP + a.A_VAL02_DP) / c.C_VAL01_DP) > :THRESHOLD_01)
    or
        (((a.A_VAL03_DP - a.A_VAL02_DP) / c.C_VAL01_DP) > :THRESHOLD_02)
)

TableA and TableC have no direct relation to each other. Only the entries in
TableC which can be bound via the bridge of TableB contain valid values for
TableA and the query as a whole. Meaning when the optimizer throws away the
relation and combines TableA with TableC it is using values which contain
invalid values for the where criteria. TableC in this case contains multiple
entries of 0 which in the calculated threshold criteria results in division
by zero error, but those entries should not be touched.

During investigation when TableB was checked in pgAdmin it indicated it was
in need of vaccum/analyze after which the query which had been throwing
division by zero was re-arranged by the optimizer to again work as intended
by the original description. Regardless of TableB requiring vacuum/analyze,
having the optimizer basically throw out the specified relationship and then
use the incorrectly gathered result in calculations seems rather incorrect.

pgsql-bugs by date:

Previous
From: "David G. Johnston"
Date:
Subject: Re: BUG #13586: Repository/Distribution issues
Next
From: "David G. Johnston"
Date:
Subject: Re: BUG #13592: Optimizer throws out join constraint causing incorrect result