Re: BUG #1878: Different execution plans for the same query. - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #1878: Different execution plans for the same query.
Date
Msg-id 7272.1127516164@sss.pgh.pa.us
Whole thread Raw
In response to BUG #1878: Different execution plans for the same query.  ("Alexei" <alexei_novakov@yahoo.com>)
List pgsql-bugs
Alexei Novakov <alexei-novakov@rogers.com> writes:
> Attaching the dumps of the tables. And here is the
> query:

> select T1.MAIN_ID, T6.LEVEL_IND, T12.MAIN_ID
> from TMP_1 T1, TMP_2 T2, TMP_2 T3, TMP_2 T4, TMP_2 T5,
> TMP_3 T6,
>      TMP_2 T7, TMP_2 T8, TMP_2 T9, TMP_2 T10, TMP_2
> T11, TMP_1 T12
> where T2.ID_2 = T1.MAIN_ID
> and T2.ID_TO_ID_TYPE = 201269
> and 990765472621084681 = T2.ID_1
> and T4.ID_1 = T12.MAIN_ID
> and T4.ID_TO_ID_TYPE = 201143
> and T3.ID_TO_ID_TYPE = 201143
> and 990765472621084681 = T3.ID_1
> and T4.ID_2 = T3.ID_2
> and T7.ID_1 = T12.MAIN_ID
> and T7.ID_TO_ID_TYPE = 201224
> and T6.ID_TO_ID_TYPE = 201209
> and T6.LEVEL_IND != 0
> and T5.ID_TO_ID_TYPE = 201224
> and 990765472621084681 = T5.ID_1
> and T6.ID_1 = T5.ID_2
> and T7.ID_2 = T6.ID_2
> and T9.ID_1 = T12.MAIN_ID
> and T9.ID_TO_ID_TYPE = 201125
> and T8.ID_TO_ID_TYPE = 201125
> and 990765472621084681 = T8.ID_1
> and T9.ID_2 = T8.ID_2
> and T11.ID_1 = T12.MAIN_ID
> and T11.ID_TO_ID_TYPE = 201071
> and T10.ID_TO_ID_TYPE = 201071
> and 990765472621084681 = T10.ID_1
> and T11.ID_2 = T10.ID_2
> and T12.ID_TYPE = 301286

I looked into this and found that the problem is bad rowcount estimates,
which lead the planner to make plans that are good only by chance.
Since you have enough tables in the query to exceed geqo_threshold, the
random aspect of the behavior isn't surprising --- but increasing
geqo_threshold isn't likely to solve your problem, it'll probably just
produce repeatable not-very-good behavior :-(

The reason why the rowcount estimates are so far off is that there's a
lot of correlation that the planner doesn't recognize.  For instance
you've got

    and T4.ID_TO_ID_TYPE = 201143
    and T3.ID_TO_ID_TYPE = 201143
    and T4.ID_2 = T3.ID_2

where T3 and T4 are actually the same table and so the number of rows
matching the above conditions is much higher than a naive guess would
suggest.  It's possible that the planner could do better if it had
cross-column correlation statistics, but it doesn't.

It's hard for me to tell whether any of these conditions are redundant
given your schema, but if they are I'd recommend removing the redundant
conditions --- they aren't doing much for you except fooling the
planner.

            regards, tom lane

pgsql-bugs by date:

Previous
From: Tom Lane
Date:
Subject: Re: suggestion: fix 'now' -> CURRENT_TIMESTAMP
Next
From: Mark Stosberg
Date:
Subject: Re: suggestion: fix 'now' -> CURRENT_TIMESTAMP