Thread: BUG #13592: Optimizer throws out join constraint causing incorrect result

BUG #13592: Optimizer throws out join constraint causing incorrect result

From
kristoffer.gustafsson@yves-rocher.se
Date:
The following bug has been logged on the website:

Bug reference:      13592
Logged by:          Kristoffer Gustafsson
Email address:      kristoffer.gustafsson@yves-rocher.se
PostgreSQL version: 9.4.3
Operating system:   Linux
Description:

"A program produces the wrong output for any given input."

I’m including a description of the encountered issue because I can’t provide
you with a reproducible set of data. Both because of the data in question
but also because the last step ”corrected” the problem at hand and removed
the information showcasing the issue.

Three tables:
TableA: A_PK_BI, B_PK_BI_FK, A_DEF01_BI, A_VAL01_DP, A_VAL02_DP, A_VAL03_DP,
A_TOT_DP
TableB: B_PK_BI, C_PK_BI_FK
TableC: C_PK_BI, C_VAL01_DP, C_VAL02_DP

TableA is very large and reside in schema S01.
TableB small, around 6-9 entries and reside in schema S01.
TableC small, around 30-60 entries and reside in schema S02.

PK = Primary key
FK = Foreign key
BI = BigInt
DP = Double precision

select coalesce(sum(a.A_TOT_DP) / 1000, 0) as someVal
from
    TableA a
    inner join TableB b on a.B_PK_BI = b.B_PK_BI
    inner join TableC c on b.C_PK_BI = c.C_PK_BI
where
    a.A_DEF01_BI in (:LIST_OF_VALUES)
    and a.B_PK_BI_FK = :B_PRIMARY_KEY
    and (
        (((a.A_VAL01_DP + a.A_VAL02_DP) / c.C_VAL01_DP) > :THRESHOLD_01)
    or
        (((a.A_VAL03_DP - a.A_VAL02_DP) / c.C_VAL01_DP) > :THRESHOLD_02)
    )

Described as:
The selected entries from TableA
which can be linked to an entry in TableB
which has related information in TableC
And that fulfills these criteria.

Except suddenly that resulted in division by zero. Division is done in three
places, one fixed value and twice from TableC where each entry linked by
TableB has values greater than 0. I.e. none of the supposedly included
entries should be able to result in division by zero.

But, it seems the optimizer decided to restructure the whole thing to skip
the join relationship and explode TableA with TableC before using TableB as
a filter for the final aggregate.

I.e. appear to change the from clause to this in an early step:
select *
from
    TableA a,
    TableC c
where
    a.A_DEF01_BI in (:LIST_OF_VALUES)
    and a.B_PK_BI_FK = :B_PRIMARY_KEY
    and (
        (((a.A_VAL01_DP + a.A_VAL02_DP) / c.C_VAL01_DP) > :THRESHOLD_01)
    or
        (((a.A_VAL03_DP - a.A_VAL02_DP) / c.C_VAL01_DP) > :THRESHOLD_02)
)

TableA and TableC have no direct relation to each other. Only the entries in
TableC which can be bound via the bridge of TableB contain valid values for
TableA and the query as a whole. Meaning when the optimizer throws away the
relation and combines TableA with TableC it is using values which contain
invalid values for the where criteria. TableC in this case contains multiple
entries of 0 which in the calculated threshold criteria results in division
by zero error, but those entries should not be touched.

During investigation when TableB was checked in pgAdmin it indicated it was
in need of vaccum/analyze after which the query which had been throwing
division by zero was re-arranged by the optimizer to again work as intended
by the original description. Regardless of TableB requiring vacuum/analyze,
having the optimizer basically throw out the specified relationship and then
use the incorrectly gathered result in calculations seems rather incorrect.

Re: BUG #13592: Optimizer throws out join constraint causing incorrect result

From
"David G. Johnston"
Date:
On Wed, Aug 26, 2015 at 3:50 PM, <kristoffer.gustafsson@yves-rocher.se>
wrote:

> The following bug has been logged on the website:
>
> Bug reference:      13592
> Logged by:          Kristoffer Gustafsson
> Email address:      kristoffer.gustafsson@yves-rocher.se
> PostgreSQL version: 9.4.3
> Operating system:   Linux
> Description:
>
> "A program produces the wrong output for any given input."
>
> I=E2=80=99m including a description of the encountered issue because I ca=
n=E2=80=99t
> provide
> you with a reproducible set of data. Both because of the data in question
> but also because the last step =E2=80=9Dcorrected=E2=80=9D the problem at=
 hand and removed
> the information showcasing the issue.
>
> =E2=80=8B[...]=E2=80=8B


>
> Except suddenly that resulted in division by zero. Division is done in
> three
> places, one fixed value and twice from TableC where each entry linked by
> TableB has values greater than 0. I.e. none of the supposedly included
> entries should be able to result in division by zero.
>
> But, it seems the optimizer decided to restructure the whole thing to ski=
p
> the join relationship and explode TableA with TableC before using TableB =
as
> a filter for the final aggregate.
>
> =E2=80=8B[...]
=E2=80=8B


> TableA and TableC have no direct relation to each other. Only the entries
> in
> TableC which can be bound via the bridge of TableB contain valid values f=
or
> TableA and the query as a whole. Meaning when the optimizer throws away t=
he
> relation and combines TableA with TableC it is using values which contain
> invalid values for the where criteria. TableC in this case contains
> multiple
> entries of 0 which in the calculated threshold criteria results in divisi=
on
> by zero error, but those entries should not be touched.
>
> During investigation when TableB was checked in pgAdmin it indicated it w=
as
> in need of vaccum/analyze after which the query which had been throwing
> division by zero was re-arranged by the optimizer to again work as intend=
ed
> by the original description. Regardless of TableB requiring vacuum/analyz=
e,
> having the optimizer basically throw out the specified relationship and
> then
> use the incorrectly gathered result in calculations seems rather incorrec=
t.
>

=E2=80=8BI do not follow but the fact that a division-by-zero exception occ=
urs in
some execution plans but not others is not a bug.=E2=80=8B

=E2=80=8BIf you do not want any rows where (c.C_VAL01_DP =3D 0) to be consi=
dered you
should alter the query so that instead of linking to everything in "c" you
only consider those rows having a non-zero C_VAL01_DP attribute.

At worse this is a performance-related issue that happens to manifest as a
division-by-zero.  however, your report is inadequate to consider that
particular dynamic.

David J.



=E2=80=8B

Re: BUG #13592: Optimizer throws out join constraint causing incorrect result

From
Kristoffer Gustafsson
Date:
REo6DQrigItJIGRvIG5vdCBmb2xsb3cgYnV0IHRoZSBmYWN0IHRoYXQgYSBkaXZpc2lvbi1ieS16
ZXJvIGV4Y2VwdGlvbiBvY2N1cnMgaW4gc29tZSBleGVjdXRpb24gcGxhbnMgYnV0IG5vdCBvdGhl
cnMgaXMgbm90IGEgYnVnLuKAiyAgSWYgeW91IGRvIG5vdCB3YW50IGFueSByb3dzIHdoZXJlIChj
LkNfVkFMMDFfRFAgPSAwKSB0byBiZSBjb25zaWRlcmVkIHlvdSBzaG91bGQgYWx0ZXIgdGhlIHF1
ZXJ5IHNvIHRoYXQgaW5zdGVhZCBvZiBsaW5raW5nIHRvIGV2ZXJ5dGhpbmcgaW4gImMiIHlvdSBv
bmx5IGNvbnNpZGVyIHRob3NlIHJvd3MgaGF2aW5nIGEgbm9uLXplcm8gQ19WQUwwMV9EUCBhdHRy
aWJ1dGUuDQoNCg0KS0c6DQpTb3JyeSwgSSBjYW4ndCBwcm92aWRlIG1vcmUgaW5mby93b3JraW5n
IHNhbXBsZS4gVGhlIGFuYWx5emUgZml4ZWQgdGhlIGV2YWx1YXRpb24gb3JkZXIuDQoNClNob3Vs
ZCBJIGludGVycHJldCB5b3VyIHJlc3BvbnNlIHRoYXQgdGhlIGlubmVyIGpvaW5zIGFyZSBvZiBu
byBhY3R1YWwgdmFsdWUgaW4gZGV0ZXJtaW5pbmcgdGhlIHJlc3VsdCBzZXRzIHVzZWQgYnkgdGhl
IG9wdGltaXplcj8NCg0KVGhlIEZST00gaGFzIHNldCBBIHdoaWNoIGlzIGpvaW5lZCB3aXRoIHNl
dCBCIHdoaWNoIGxpbmtzIHNldCBDLg0KVGhlcmUgaXMgYSBwYXRoIHdpdGgga2V5cyBnb2luZyBm
cm9tIEEgdG8gQiB0byBDLg0KVGhlcmUgaXMgbm8gbGluayBmcm9tIEEgdG8gQy4NCkMgaXMgb25s
eSB2YWxpZCBhcyBwYXJ0IG9mIEIuDQpTZWxlY3RpbmcgdGhlIGZ1bGwgc2V0IG9mIEEgYW5kIEMg
d2l0aG91dCB0aGUgbGltaXQgZnJvbSBCIGluY2x1ZGVzIGFsbCBlbnRyaWVzIGZyb20gQyBldmVu
IHdoZW4gb25seSB0aGUgc3Vic2V0IGxpbmtlZCBieSBCIHNob3VsZCBiZSBjb25zaWRlcmVkIGFz
IHZhbGlkIGFjY29yZGluZyB0byBGUk9NLg0KDQpPZiBjb3Vyc2UgeW91IGNhbiB3cml0ZSB0aGUg
V0hFUkUgY3JpdGVyaWEgdG8gbm90IGNvbnRhaW4gZXhwcmVzc2lvbnMvY2FsY3VsYXRpb25zIGFu
ZCBkdXBsaWNhdGUgdGhlIGNvbnN0cmFpbnQgZXhwcmVzc2VkIGJ5IHRoZSBGUk9NIHNlY3Rpb24s
IGJ1dCBkb2Vzbid0IHRoYXQgZGVmZWF0IHRoZSBwdXJwb3NlIG9mIGRlY2xhcmluZyB0aGUgam9p
bnMgaW4gRlJPTT8NCg0KImMuQ19WQUwwMV9EUCA9IDAiIHNob3VsZCBwZXIgZGVmaW5pdGlvbiBv
ZiB0aGUgRlJPTSBuZXZlciBiZSBpbmNsdWRlZCBzaW5jZSBub25lIG9mIHRoZSB2YWxpZCBlbnRy
aWVzIHNwZWNpZmllZCBieSB0aGUgcmVsYXRpb25zaGlwIGhhcyB0aGF0IHZhbHVlLg0KDQpEb2lu
ZyB0aGUgZXZhbHVhdGlvbiBpbiB0aGlzIG90aGVyIG9yZGVyIGlzIG9uZSB3YXkgb2YgZ2VuZXJh
dGluZyB0aGUgcmVzdWx0LCBidXQgaXQgaXMgaWdub3JpbmcgdGhlIHJlbGF0aW9uc2hpcCBleHBy
ZXNzZWQgYnkgdGhlIEZST00gc2VjdGlvbiBhbmQgdXNpbmcgaXQgYXMgYW55IG90aGVyIGZpbHRl
ci4NCg0KQnV0IGlmIHRoaXMgaXMgd29ya2luZyBhcyBpbnRlbmRlZCBzbyBiZSBpdCwgd2lsbCBq
dXN0IGhhdmUgdG8gY292ZXIgZm9yIGl0IHdpdGggYWRkaXRpb25hbCBjcml0ZXJpYSBpbiBxdWVy
aWVzLg0KDQoNCg0K4oCLDQpfX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fXw0KDQoNCg0K
RGV0dGEgbWVkZGVsYW5kZSBvY2ggYWxsYSBiaWxhZ29yIMOkciBrb25maWRlbnRpZWxsYSBvY2gg
YXZzZWRkYSBmw7ZyIGRlbiBuYW1uZ2l2bmEgbW90dGFnYXJlbi4gT20gZHUgaGFyIGbDpXR0IGRl
dHRhIG1lZGRlbGFuZGUgYXYgbWlzc3RhZywgdsOkbmxpZ2VuIG1lZGRlbGEgb21lZGVsYmFydCBh
dnPDpG5kYXJlbiBvY2ggdGEgc2VkYW4gYm9ydCBtZWRkZWxhbmRldC4gQWxsIG90aWxsw6V0ZW4g
bW9kaWZpZXJpbmcsIGFudsOkbmRuaW5nIGVsbGVyIHNwcmlkbmluZyDDpHIgZsO2cmJqdWRlbi4g
QXZzw6RuZGFyZW4gw6RyIGludGUgYW5zdmFyaWcgZsO2ciBkZXR0YSBtZWRkZWxhbmRlIG9tIGRl
dCBoYXIgw6RuZHJhdHMsIGbDtnJmYWxza2F0cywgcmVkaWdlcmF0cywgc21pdHRhdCBhdiBldHQg
dmlydXMgZWxsZXIgc3ByaWRpdHMgdXRhbiB0aWxsc3TDpW5kLiBTa3JpdiBpbnRlIHV0IGRldHRh
IG1lZGRlbGFuZGUgb20gZGV0IGludGUgw6RyIG7DtmR2w6RuZGlndCwgdMOkbmsgcMOlIG1pbGrD
tm4uDQoNClRoaXMgbWVzc2FnZSBhbmQgYW55IGF0dGFjaG1lbnRzIGFyZSBjb25maWRlbnRpYWwg
YW5kIGludGVuZGVkIGZvciB0aGUgbmFtZWQgYWRkcmVzc2VlKHMpIG9ubHkuIElmIHlvdSBoYXZl
IHJlY2VpdmVkIHRoaXMgbWVzc2FnZSBpbiBlcnJvciwgcGxlYXNlIG5vdGlmeSBpbW1lZGlhdGVs
eSB0aGUgc2VuZGVyLCB0aGVuIGRlbGV0ZSB0aGUgbWVzc2FnZS4gQW55IHVuYXV0aG9yaXplZCBt
b2RpZmljYXRpb24sIGVkaXRpb24sIHVzZSBvciBkaXNzZW1pbmF0aW9uIGlzIHByb2hpYml0ZWQu
IFRoZSBzZW5kZXIgaXMgbm90IGxpYWJsZSBmb3IgdGhpcyBtZXNzYWdlIGlmIGl0IGhhcyBiZWVu
IG1vZGlmaWVkLCBhbHRlcmVkLCBmYWxzaWZpZWQsIGluZmVjdGVkIGJ5IGEgdmlydXMgb3IgZXZl
biBlZGl0ZWQgb3IgZGlzc2VtaW5hdGVkIHdpdGhvdXQgYXV0aG9yaXphdGlvbi4gRG8gbm90IHBy
aW50IHRoaXMgbWVzc2FnZSB1bmxlc3MgaXQgaXMgbmVjZXNzYXJ5LCBjb25zaWRlciB0aGUgZW52
aXJvbm1lbnQuDQo=
Kristoffer Gustafsson <kristoffer.gustafsson@yves-rocher.se> writes:
> The FROM has set A which is joined with set B which links set C.
> There is a path with keys going from A to B to C.
> There is no link from A to C.

Actually, there is: the WHERE clause involving A and C is itself a join
clause, since it allows filtering out some pairs of A and C rows, albeit
only after performing a nestloop join.  It seems somewhat unlikely that
the optimizer would choose that approach in preference to equijoins, but
since you've not shown us any concrete details, it can't be ruled out.
For example, if both A and C are small and both could usefully be used in
an indexscan on a large B table, joining A to C first would make perfect
sense.

> Of course you can write the WHERE criteria to not contain expressions/calculations and duplicate the constraint
expressedby the FROM section, but doesn't that defeat the purpose of declaring the joins in FROM? 

There is *no* semantic difference between writing a join clause in WHERE
and writing it in an (inner) JOIN/ON clause.  There is certainly no
promise about the execution order.  See
http://www.postgresql.org/docs/9.4/static/sql-expressions.html#SYNTAX-EXPRESS-EVAL

Personally I'd dodge the whole problem by reformulating the WHERE to avoid
division, that is

    (
        ((a.A_VAL01_DP + a.A_VAL02_DP) > (:THRESHOLD_01 * c.C_VAL01_DP))
    or
        ((a.A_VAL03_DP - a.A_VAL02_DP) > (:THRESHOLD_02 * c.C_VAL01_DP))
    )

If you can't fix it in that sort of way, the traditional approach to
forcing the join order in Postgres is to put the desired innermost join
in a sub-SELECT with OFFSET 0, which works as an optimization fence.
(The optimizer can't push joins or WHERE clauses past a LIMIT/OFFSET
for fear of changing the set of rows returned.)  But JOIN/ON is most
certainly not an optimization fence.

            regards, tom lane

Re: BUG #13592: Optimizer throws out join constraint causing incorrect result

From
Kristoffer Gustafsson
Date:
VG9tLCBEYXZpZCwNCllvdSBhcmUgY29ycmVjdC4gSSdtIHNvcnJ5LCBnb3R0ZW4gYWNjdXN0b21l
ZCB0byB0aGUgam9pbiBjcml0ZXJpYS90YWJsZSByZWxhdGlvbnNoaXAgdGFraW5nIGZpcnN0IHNl
YXQgb3ZlciBvdGhlciBleHByZXNzaW9uLg0KDQpKdXN0IHRvIGZpbGwgaW4sDQo+IEZvciBleGFt
cGxlLCBpZiBib3RoIEEgYW5kIEMgYXJlIHNtYWxsIGFuZCBib3RoIGNvdWxkIHVzZWZ1bGx5IGJl
IHVzZWQgaW4gYW4gaW5kZXhzY2FuIG9uIGEgbGFyZ2UgQiB0YWJsZSwgam9pbmluZyBBIHRvIEMg
Zmlyc3Qgd291bGQgbWFrZSBwZXJmZWN0IHNlbnNlLg0KDQpBIGlzIG11Y2ggbGFyZ2VyIHRoYW4g
QiBhbmQgQzsgQiBjb250YWlucyB0aGUgbGVhc3QgbnVtYmVyIG9mIGVudHJpZXMgKH42IGluIHRo
aXMgREIpOyBDIHNsaWdodGx5IGxhcmdlciB0aGFuIEIgKH42MCkuIFJlcG9ydCBpbmRpY2F0ZWQg
QiByZXF1aXJlZCBWYWN1dW0vQW5hbHl6ZSBzbyBJIGd1ZXNzIG9wdGltaXplciBzYXcgbW9yZSB0
aGFuIHRob3NlIH42Lg0KDQpHdWVzcyBmaXggd2lsbCBiZSBlaXRoZXIgc2ltaWxhciB0byBleGFt
cGxlIG9yIGZvcmNlIHRoZSBCL0MgcmVsYXRpb24gdG8gZXZhbHVhdGUgZmlyc3QgaW50byBELg0K
DQpBbHNvLCBJIGFwb2xvZ2l6ZSBmb3IgdGhlIGJsb2IgbWVzc2FnZSBpbmNsdWRlZCBhdCBlbmQg
b2YgbWFpbCwgaXQgaXMgYXR0YWNoZWQgYnkgc2VydmVyIG9uIG91dGdvaW5nIG1haWwuDQoNClJl
Z2FyZHMNCktyaXN0b2ZmZXIgR3VzdGFmc3Nvbg0KDQpfX19fX19fX19fX19fX19fX19fX19fX19f
X19fX19fXw0KDQoNCg0KRGV0dGEgbWVkZGVsYW5kZSBvY2ggYWxsYSBiaWxhZ29yIMOkciBrb25m
aWRlbnRpZWxsYSBvY2ggYXZzZWRkYSBmw7ZyIGRlbiBuYW1uZ2l2bmEgbW90dGFnYXJlbi4gT20g
ZHUgaGFyIGbDpXR0IGRldHRhIG1lZGRlbGFuZGUgYXYgbWlzc3RhZywgdsOkbmxpZ2VuIG1lZGRl
bGEgb21lZGVsYmFydCBhdnPDpG5kYXJlbiBvY2ggdGEgc2VkYW4gYm9ydCBtZWRkZWxhbmRldC4g
QWxsIG90aWxsw6V0ZW4gbW9kaWZpZXJpbmcsIGFudsOkbmRuaW5nIGVsbGVyIHNwcmlkbmluZyDD
pHIgZsO2cmJqdWRlbi4gQXZzw6RuZGFyZW4gw6RyIGludGUgYW5zdmFyaWcgZsO2ciBkZXR0YSBt
ZWRkZWxhbmRlIG9tIGRldCBoYXIgw6RuZHJhdHMsIGbDtnJmYWxza2F0cywgcmVkaWdlcmF0cywg
c21pdHRhdCBhdiBldHQgdmlydXMgZWxsZXIgc3ByaWRpdHMgdXRhbiB0aWxsc3TDpW5kLiBTa3Jp
diBpbnRlIHV0IGRldHRhIG1lZGRlbGFuZGUgb20gZGV0IGludGUgw6RyIG7DtmR2w6RuZGlndCwg
dMOkbmsgcMOlIG1pbGrDtm4uDQoNClRoaXMgbWVzc2FnZSBhbmQgYW55IGF0dGFjaG1lbnRzIGFy
ZSBjb25maWRlbnRpYWwgYW5kIGludGVuZGVkIGZvciB0aGUgbmFtZWQgYWRkcmVzc2VlKHMpIG9u
bHkuIElmIHlvdSBoYXZlIHJlY2VpdmVkIHRoaXMgbWVzc2FnZSBpbiBlcnJvciwgcGxlYXNlIG5v
dGlmeSBpbW1lZGlhdGVseSB0aGUgc2VuZGVyLCB0aGVuIGRlbGV0ZSB0aGUgbWVzc2FnZS4gQW55
IHVuYXV0aG9yaXplZCBtb2RpZmljYXRpb24sIGVkaXRpb24sIHVzZSBvciBkaXNzZW1pbmF0aW9u
IGlzIHByb2hpYml0ZWQuIFRoZSBzZW5kZXIgaXMgbm90IGxpYWJsZSBmb3IgdGhpcyBtZXNzYWdl
IGlmIGl0IGhhcyBiZWVuIG1vZGlmaWVkLCBhbHRlcmVkLCBmYWxzaWZpZWQsIGluZmVjdGVkIGJ5
IGEgdmlydXMgb3IgZXZlbiBlZGl0ZWQgb3IgZGlzc2VtaW5hdGVkIHdpdGhvdXQgYXV0aG9yaXph
dGlvbi4gRG8gbm90IHByaW50IHRoaXMgbWVzc2FnZSB1bmxlc3MgaXQgaXMgbmVjZXNzYXJ5LCBj
b25zaWRlciB0aGUgZW52aXJvbm1lbnQuDQo=