Re: Nested loop vs merge join: inconsistencies between estimated and actual time - Mailing list pgsql-performance

From	Tom Lane
Subject	Re: Nested loop vs merge join: inconsistencies between estimated and actual time
Date	March 7, 2008 02:35:58
Msg-id	7386.1204871746@sss.pgh.pa.us
In response to	Nested loop vs merge join: inconsistencies between estimated and actual time (Vlad Arkhipov <arhipov@dc.baikal.ru>)
Responses	Re: Nested loop vs merge join: inconsistencies between estimated and actual time
List	pgsql-performance

Vlad Arkhipov <arhipov@dc.baikal.ru> writes:
> I've come across this issue while writing a report-like query for 2 not
> very large tables. I've tried several methods to resolve it (see
> below), but now I'm really stuck...

It looks like you are wishing to optimize for all-in-memory situations,
in which case the traditional advice is to reduce random_page_cost to
something close to 1.  AFAICS all the rowcount estimates you're seeing
are spot on, or as close to spot on as you could realistically hope for,
and so the problem lies with the cost parameters.  Fooling with the
statistics is not going to help if the rowcount estimates are already
good.
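
A minimal sketch of trying this in a single session, so that nothing
changes server-wide. The value 1.1 is only an illustrative choice for
"close to 1", not a figure from this thread, and the report query itself
is left as a placeholder:

```sql
-- Check the current setting (the stock default is 4.0).
SHOW random_page_cost;

-- Assumed value for an all-in-memory workload; applies to this session only.
SET random_page_cost = 1.1;

-- Re-run EXPLAIN ANALYZE on the report query here and compare the chosen
-- join method and the estimated vs. actual times against the earlier plan.

-- Return to the configured value when done experimenting.
RESET random_page_cost;
```

If the plans improve, the same value can be made permanent in
postgresql.conf; since SET affects only the current session, the
experiment is safe to run on a live system.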

(Note: the apparent undercounts you're seeing on indexscans on the outer
side of a mergejoin seem to be because the mergejoin terminates early
due to limited range of the other input join key.  The planner is
expecting this, as we can see because the predicted cost of the join is
actually much less than the predicted cost of running the input
indexscan to completion.  The cost ratio is about consistent with the
rowcount ratio, which makes me think it got these right too.)

            regards, tom lane
