Thread: I got bit by that darn GEQO setting again...

I got bit by that darn GEQO setting again...

From: Mike Mascari
I know Tom would like definitive and thorough testing to determine the
proper GEQO threshold limit, and that is the right thing to do, of course.

But just as a quick notice to those upgrading from 7.3 to 7.4 with fully
normalized databases requiring > 11 joins, the GEQO setting can be a
killer...

How about a TIP:

"For large number of joins, test whether the GEQO settings are right for
you"

Mike Mascari



Re: I got bit by that darn GEQO setting again...

From: Tom Lane
Mike Mascari <mascarm@mascari.com> writes:
> But just as a quick notice to those upgrading from 7.3 to 7.4 with fully
> normalized databases requiring > 11 joins, the GEQO setting can be a
> killer...

Uh ... dare I ask whether you think it's too high?  Or too low?

            regards, tom lane

Re: I got bit by that darn GEQO setting again...

From: David Helgason
> Mike Mascari <mascarm@mascari.com> writes:
>> But just as a quick notice to those upgrading from 7.3 to 7.4 with
>> fully
>> normalized databases requiring > 11 joins, the GEQO setting can be a
>> killer...
>
> Uh ... dare I ask whether you think it's too high?  Or too low?

Just a data point: With a fresh 7.4 and

    geqo=on
    geqo_threshold=11
    geqo_generations=0
    geqo_effort=1
    geqo_pool_size=0
    geqo_selection_bias=2

a 12-table join was taking a whole second to plan until I manually
connected the tables (now it's on the order of a few ms).

I figure geqo had kicked in at that point.
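
Presumably "manually connected the tables" here means rewriting the
comma-separated FROM list as explicit JOIN syntax so the planner follows
the written join order instead of searching for one. A rough sketch of the
idea (t1 .. t4 are invented names; in 7.4, setting join_collapse_limit to 1
is what makes the written order binding):

    -- Unconstrained form: the planner (or GEQO) searches join orders itself.
    -- t1 .. t4 are placeholder tables.
    EXPLAIN SELECT *
      FROM t1, t2, t3, t4
     WHERE t1.id = t2.t1_id
       AND t2.id = t3.t2_id
       AND t3.id = t4.t3_id;

    -- "Manually connected" form: spell out the join order and have the
    -- planner take it literally, so there is little left to search.
    SET join_collapse_limit TO 1;
    EXPLAIN SELECT *
      FROM t1
      JOIN t2 ON t2.t1_id = t1.id
      JOIN t3 ON t3.t2_id = t2.id
      JOIN t4 ON t4.t3_id = t3.id;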



d.


Re: I got bit by that darn GEQO setting again...

From: Mike Mascari
Tom Lane wrote:

> Mike Mascari <mascarm@mascari.com> writes:
>> But just as a quick notice to those upgrading from 7.3 to 7.4 with fully
>> normalized databases requiring > 11 joins, the GEQO setting can be a
>> killer...
>
> Uh ... dare I ask whether you think it's too high?  Or too low?

Too low. In fact, after testing some of my queries which are a bit large
(in # of tables), I usually just wind up turning it off. I know that's
insufficient evidence to do anything, but I have yet to stumble upon a
query where GEQO performs better. Do the OSDL folks run their performance
tests with it on? off? both? I'm sorry I haven't had the time to develop
a rigorous test... :-(
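
Turning it off globally (or raising the threshold past the widest join
actually run) is just a postgresql.conf edit; the threshold value below is
only illustrative:

    # disable GEQO outright...
    geqo = false

    # ...or keep it available but raise the cutover point instead
    #geqo_threshold = 25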

Mike Mascari



Re: I got bit by that darn GEQO setting again...

From: Tom Lane
Mike Mascari <mascarm@mascari.com> writes:
> Tom Lane wrote:
>> Uh ... dare I ask whether you think it's too high?  Or too low?
>>
> Too low. In fact, after testing some of my queries which are a bit large
> (in # of tables), I usually just wind up turning it off.

Well, that's why it's configurable ;-).  But don't you find that it
takes a long time to plan the larger queries?  How many tables are
involved, exactly?

            regards, tom lane

Re: I got bit by that darn GEQO setting again...

From: Mike Mascari
Tom Lane wrote:

> Mike Mascari <mascarm@mascari.com> writes:
>> Tom Lane wrote:
>>> Uh ... dare I ask whether you think it's too high?  Or too low?
>>
>> Too low. In fact, after testing some of my queries which are a bit large
>> (in # of tables), I usually just wind up turning it off.
>
> Well, that's why it's configurable ;-).

True!

> But don't you find that it takes a long time to plan the larger queries?
> How many tables are involved, exactly?

Well, this particular query uses a UNION where the first SELECT is
composed of an 11-way join and the second of a 5-way join. So it is not
testing the 64-way join scenario. From memory, the most joins I execute
in a single query is in the low twenties. In those scenarios, I had used
explicit join syntax to improve planning time.

I ran a crude script to test the differences in planning time (EXPLAIN)
and execution time (EXPLAIN ANALYZE). I wanted to do the ANALYZE as well
since the plans generated were different and I feared the GEQO-generated
plan might be the actual cause of the sluggishness, rather than the
planning itself. The script just feeds the SQL to psql, so I know it is
also timing the client, connection costs, psql formatting, etc. ...
a.k.a. crude. But as you can see, over ten runs for each category, GEQO
loses big. I get (in seconds):

      label      | count |  avg   |       stddev
-----------------+-------+--------+---------------------
 NO GEQO PLAN    |    10 | 0.8809 | 0.00564604089409752
 NO GEQO ANALYZE |    10 | 1.1534 |  0.0093594871654564
 GEQO PLAN       |    10 | 3.0127 |   0.119783183757633
 GEQO ANALYZE    |    10 | 6.0446 |    2.50524499578163
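
For what it's worth, the flavor of the thing: one SQL file per category,
fed to psql by a wrapper that records wall-clock time (so connection and
formatting costs are included). Roughly, with a small placeholder join
standing in for the real UNION query:

    -- "GEQO PLAN" category; t1 .. t3 are placeholder tables standing in
    -- for the real 11-way + 5-way UNION query.
    SET geqo TO on;
    SET geqo_threshold TO 3;   -- low enough that GEQO definitely fires
    EXPLAIN SELECT *
      FROM t1, t2, t3
     WHERE t1.id = t2.t1_id
       AND t2.id = t3.t2_id;

    -- The other categories differ only in the setup lines:
    --   "GEQO ANALYZE"    : EXPLAIN ANALYZE instead of EXPLAIN
    --   "NO GEQO PLAN"    : SET geqo TO off;
    --   "NO GEQO ANALYZE" : SET geqo TO off; plus EXPLAIN ANALYZE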

Mike Mascari



Re: I got bit by that darn GEQO setting again...

From: Tom Lane
Mike Mascari <mascarm@mascari.com> writes:
> as you can see, over ten runs for each category, GEQO loses big. I get
> (in seconds):

>       label      | count |  avg   |       stddev
> -----------------+-------+--------+---------------------
>  NO GEQO PLAN    |    10 | 0.8809 | 0.00564604089409752
>  NO GEQO ANALYZE |    10 | 1.1534 |  0.0093594871654564
>  GEQO PLAN       |    10 | 3.0127 |   0.119783183757633
>  GEQO ANALYZE    |    10 | 6.0446 |    2.50524499578163

Interesting.  I'd expect that GEQO might sometimes not pick a good plan
(it is a quasi-random search after all, and could miss the best plan).
The large stddev for the last row shows that you don't always get the
same plan, which is expected.

But those numbers say it is actually slower in choosing a plan than the
regular planner.  This should definitely not be the case --- there's no
point at all in GEQO if it doesn't save planning time.

Can you send me the exact query being tested here, as well as the
database schema (pg_dump -s)?

            regards, tom lane