Re: [GENERAL] Cost: Big Tables vs. Organized Separation of Data - Mailing list pgsql-general

From Jeff Hoffmann
Subject Re: [GENERAL] Cost: Big Tables vs. Organized Separation of Data
Date
Msg-id 36B867DD.A0EC47C@remapcorp.com
Whole thread Raw
In response to Re: [GENERAL] Cost: Big Tables vs. Organized Separation of Data  (Bruce Momjian <maillist@candle.pha.pa.us>)
Responses Re: [GENERAL] Cost: Big Tables vs. Organized Separation of Data
Re: [GENERAL] Cost: Big Tables vs. Organized Separation of Data
List pgsql-general
Bruce Momjian wrote:
>
> > 1. If I re-organize the data, I would be able to perform my queries
> > without executing joins on multiple tables per query.
> >
> > 2. As I re-organize the data, the database becomes less and less
> > intuitive and (seemingly) less "normal".
> >
> > So, I guess my question is:  how costly are joins?  I've heard that
> > Postgres pretty much "pukes" (in terms of speed) when you're trying
> > to do anything more than 6 table joins in one query.  This leads
> > me to believe that joins are fairly costly... ????
> >
> > Does anyone have any words of advice for me as I battle this?
>
> We are working speeding up large table joins right now.  Try doing SET
> GEQO to set the value lower and see if that helps.

i've noticed a pretty drastic slowdown going from a 4 table join
(instantaneous) to a 5 table join (15-20 seconds) i don't think i've
tried 6 tables yet with the same database.  is there a lower limit to
where using GEQO doesn't really make sense in terms of speed (i.e.,
would setting GEQO to 5 be a toss up and 4 not make sense?)  i'm
guessing that the number of plans the optimizer checks without GEQO goes
up factorially, while GEQO goes up fairly linearly (from my limited
knowledge of genetic algorithms, basically you make a series of first
guesses and gradually refine them and throw away the losers until you
end up with the right one.)  if my join is based purely on primary keys,
shouldn't just about any plan work well, or at least well enough that it
doesn't pay to make an exhaustive search of the plan space, making GEQO
the best choice?  i guess my question is "is there a rule of thumb for
setting GEQO?"  is there a reason it was set to 8 by default?  does GEQO
work better in some cases than others? (small tables, joins on
non-indexed fields, etc.)

it seems like i'm always learning something new with postgres -- i've
never thought about this before; it's just one more thing for me to play
around with...

pgsql-general by date:

Previous
From: Herouth Maoz
Date:
Subject: Re: [GENERAL] Unexpected input in plpgsql script.
Next
From: "Alex P. Rudnev"
Date:
Subject: Re: [GENERAL] Cost: Big Tables vs. Organized Separation of Data