Re: Re: [COMMITTERS] pgsql: Rewrite GEQO's gimme_tree function so that it always finds a - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Re: [COMMITTERS] pgsql: Rewrite GEQO's gimme_tree function so that it always finds a
Date
Msg-id 603c8f070912021924j3893ba16wcf9538eeb9038a15@mail.gmail.com
In response to Re: Re: [COMMITTERS] pgsql: Rewrite GEQO's gimme_tree function so that it always finds a  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Re: [COMMITTERS] pgsql: Rewrite GEQO's gimme_tree function so that it always finds a  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Wed, Dec 2, 2009 at 9:55 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> One other thing I'm noticing about the current implementation is that
>> it seems to spend an entirely excessive amount of brain power
>> considering the best order in which to execute cross-joins.  If I do
>> X, A JOIN B ON Pab JOIN C ON Pac JOIN D ON Pad JOIN E ON Pae, it looks
>> to me like join_search_one_level() will try joining X to each of A-E.
>> That seems fairly pointless; why would I ever want to join X to
>> anything other than {A B C D E}?
>
> Not sure that a lot of cross joins with no conditions is the case to
> design around.  Usually queries aren't that devoid of features of
> interest, and so different join paths are actually usefully different.

Not sure what you mean.  There's already a special-case code path for
cross joins, but I think it's probably considering a lot of silly
paths.  Is there a case where it makes sense to do cross joins at some
stage of the process other than last?
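For reference, the shape of the query being discussed might be written out like this (table and column names here are hypothetical, chosen only to make the join predicates concrete):

```sql
-- x shares no join predicate with the other relations, so it can only
-- ever be cross-joined; a..e are all linked through a's foreign keys.
SELECT *
FROM x,
     a
     JOIN b ON a.b_id = b.id   -- Pab
     JOIN c ON a.c_id = c.id   -- Pac
     JOIN d ON a.d_id = d.id   -- Pad
     JOIN e ON a.e_id = e.id;  -- Pae
```

Since x shares no predicate with a through e, intermediate joinrels like {x a} or {x b} are unconstrained cross products; the only plausibly useful plan joins {a b c d e} first and cross-joins x at the end.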

>> ...  We should maybe also
>> think about raising the default value for work_mem.  It's hard for me
>> to believe that the average Postgres user wants a sort that takes more
>> than 1MB of memory to spill to disk; there certainly are people who
>> probably want that, but I doubt there are very many.  I believe we've
>> been using that value for a decade, and memory size has increased a
>> lot in that time.
>
> Maybe.  I'll certainly grant that machines have more memory, but is the
> average Postgres installation using that to run bigger sorts, or to run
> more sorts (either more concurrent queries or more complex queries
> containing more sorts)?  We know that increasing work_mem too much
> can be counterproductive, and much sooner than one might think.

A further confounding factor is that work_mem also controls memory
usage for hash tables - whereas the original sort_mem did not - and at
least in my experience it's more common to have multiple hashes in a
query than multiple sorts.  It would be nice to have some data on
this rather than just hand-waving, but I'm not sure how to get it.
For default_statistics_target, *_collapse_threshold, and
geqo_threshold, we were able to construct worst-case queries and
benchmark them.  I have no idea how to do something comparable for
work_mem.
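One way to see the effect of the current 1MB default on a particular sort is to compare EXPLAIN ANALYZE output at different settings; the "Sort Method" line reports whether the sort stayed in memory or spilled (the table and column names below are made up):

```sql
-- With the longstanding 1MB default, a modest sort can spill to disk.
SET work_mem = '1MB';
EXPLAIN ANALYZE SELECT * FROM big_table ORDER BY some_col;
-- "Sort Method: external merge  Disk: ..." means the sort spilled.

-- Raising the setting for just this session:
SET work_mem = '32MB';
EXPLAIN ANALYZE SELECT * FROM big_table ORDER BY some_col;
-- "Sort Method: quicksort  Memory: ..." means it fit in memory.
```

That shows the per-query effect easily enough; what it doesn't capture is the aggregate cost across many concurrent backends, which is the hard part of choosing a default.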

...Robert

