Home > mailing lists

Re: why hash on the primary key? - Mailing list pgsql-general

From	Robert Haas
Subject	Re: why hash on the primary key?
Date	November 29, 2008 15:42:40
Msg-id	603c8f070811290842r214628a8h5f1503b44a07c037@mail.gmail.com Whole thread Raw
In response to	Re: why hash on the primary key? (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-general

Tree view

>>> Well, that just says your cost parameters need a bit of adjustment
>>> if you'd like the planner to get the crossover point exactly right.
>
>> Any sense of which ones might be worth fiddling with?
>
> random_page_cost, effective_cache_size, maybe the cpu_xxx parameters.

I fiddled with this some more on a more complex query with more than
20 joins.  It appears that the culprit is from_collapse_limit.  If I
raise from_collapse_limit, most of the hash joins turn into Nested
Loops or Nested Loop Left Joins with Index Scans.  The estimated cost
is about 20% of the cost of the original plan, but the planning takes
so much longer that the actual time is 60% larger.  This is really the
pits.

I had hoped that the work Simon Riggs was doing on join removal would
hit 8.4, but that doesn't seem likely at this point.  The problem is
that the application is a web app where the user can select which
columns they want to see.  Most of the time only ~6-8 columns of 40+
that are available are selected, and the joins needed to generate the
unselected columns can be tossed (which would hopefully both avoid
unnecessary join steps and also speed up planning the remaining
joins).

...Robert

pgsql-general by date:

From: Tom Lane
Date: 29 November 2008, 14:43:11
Subject: Re: why hash on the primary key?

From: "Andrus"
Date: 29 November 2008, 17:12:26
Subject: db backup script in gentoo

Re: why hash on the primary key? - Mailing list pgsql-general

Previous

Next