Re: [HACKERS] nested loops in joins, ambiguous rewrite rules - Mailing list pgsql-hackers

From Charles Hornberger
Subject Re: [HACKERS] nested loops in joins, ambiguous rewrite rules
Date
Msg-id 3.0.5.32.19990130213507.00b28b60@k4azl.net
Whole thread Raw
In response to Re: [HACKERS] nested loops in joins, ambiguous rewrite rules  (Bruce Momjian <maillist@candle.pha.pa.us>)
Responses Re: [HACKERS] nested loops in joins, ambiguous rewrite rules  (Bruce Momjian <maillist@candle.pha.pa.us>)
List pgsql-hackers
>We turn on geqo at 8 relations.  Try:
>
>    SET GEQO TO 4
>
>and try the query again.  Let us know.

Well isn't that something!  Thanks so much for your help!

I set the GEQO variable to 4 and now the 11.5 minute query executes in 6 seconds with this query plan:

Hash Join  (cost=21.99 size=152 width=124) ->  Hash Join  (cost=17.48 size=38 width=108)       ->  Hash Join
(cost=13.48size=16 width=92)             ->  Hash Join  (cost=10.09 size=8 width=76)                   ->  Hash Join
(cost=6.66size=7 width=60)                         ->  Nested Loop  (cost=3.26 size=6 width=44)
     ->  Seq Scan on volume g  (cost=1.07 size=2 width=16)                               ->  Seq Scan on article a
(cost=1.10size=3 width=28)                         ->  Hash  (cost=0.00 size=0 width=0)
-> Seq Scan on article_text d  (cost=1.10 size=3 width=16)                   ->  Hash  (cost=0.00 size=0 width=0)
                 ->  Seq Scan on locale f  (cost=1.10 size=3 width=16)             ->  Hash  (cost=0.00 size=0 width=0)
                 ->  Seq Scan on issue e  (cost=1.07 size=2 width=16)       ->  Hash  (cost=0.00 size=0 width=0)
    ->  Seq Scan on section b  (cost=1.23 size=7 width=16) ->  Hash  (cost=0.00 size=0 width=0)       ->  Seq Scan on
article_sourcec  (cost=1.13 size=4 width=16)
 


Are there any recommendations about what value *ought* to be set for GEQO?  It seems to me like 'ON=8' is pretty high
--for us, it meant that UNLESS we explicity set that variable for every JOIN query of 6-7 tables, the joins were going
tobog down to a total crawl, while sending memory and CPU consumption through the roof (roughly 22MB and 90-95%,
respectively,for the entire query-processing period).
 

What we've done is change the default setting in /src/include/optimizer/internals.h and recompiled. (It's the very last
linein that file.)  Maybe it'd be nice to add that as a command-line option to postmaster?
 

Also, we couldn't find the GEQO README, which was mentioned several times in comments in the source code but doesn't
appearto have made its way into the distribution tarball.  (AFAIK, we don't have a copy anywhere beneath
/usr/local/pgsql/.)Maybe it got overlooked when the tarball was balled up?
 

Thanks again.  If you'd like me to submit any more information about this "problem", please let me know.

Charlie

At 10:12 PM 1/30/99 -0500, Bruce Momjian wrote:
>See the SET options of psql.
>
>test=> show geqo\g
>NOTICE:  GEQO is ON beginning with 8 relations
>SHOW VARIABLE
>test=> \q
>
>
>
>> At 04:07 PM 1/30/99 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> >First, you're assuming that a merge-join plan is necessarily better than
>> >a nested-loop plan.  That should be true for large queries, but it is
>> >*not* necessarily true for small tables --- when there are only a few
>> >tuples in the tables being scanned, a simple nested loop wins because it
>> >has much less startup overhead.  (Or at least that's what our optimizer
>> >thinks; I have not tried to measure this for myself.)
>> 
>> OK, I understand that I don't understand whether merge-join plans are
>> necessarily better than nested-loop plans, and that it could make sense to
>> pick one or the other depending on the size of the tables and the number of
>> rows in them.  Also, your explanation of how 'vacuum analyze' updates the
>> statistics in pg_class and pg_statistic makes it very clear why I'm seeing
>> one query plan in one DB, and different plan in the other.  Thanks for the
>> quick lesson, and my apologies for making it happen on the hackers list.
>
>
>-- 
>  Bruce Momjian                        |  http://www.op.net/~candle
>  maillist@candle.pha.pa.us            |  (610) 853-3000
>  +  If your life is a hard drive,     |  830 Blythe Avenue
>  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
>
>


pgsql-hackers by date:

Previous
From: "D'Arcy" "J.M." Cain
Date:
Subject: Patches
Next
From: Bruce Momjian
Date:
Subject: Re: [HACKERS] Patches