Re: slower merge join on sorted data chosen over - Mailing list pgsql-hackers

From Kevin Grittner
Subject Re: slower merge join on sorted data chosen over
Date
Msg-id s34a8581.076@gwmta.wicourts.gov
Whole thread Raw
Responses Re: slower merge join on sorted data chosen over
List pgsql-hackers
Thanks, Tom.
I spent a few hours trying different searches in the archives, and
found three very interesting threads on the topic.  All were from
2003.  Should I keep digging for more recent threads, or would
these probably represent the current state of the issue?
These left me somewhat concerned, since people were reporting
queries which ran orders of magnitude slower using merge joins
than when they forced them to nested loop index scans.  In our
first brush with the issue it caused our query to run only two to
three times slower than it would if the planner could more
accurately cost the nested index scans, and it was in an area with
minimal impact to the organization, so this one is relatively benign.
We are looking at doing much more with PostgreSQL over the
next two years, and it seems likely that this issue will come up
again where it is more of a problem.  It sounded like there was
some agreement on HOW this was to be fixed, yet I don't see
any mention of doing it in the TODO list.  Is there any sort of
estimate for how much programming work would be involved?
Any chance of an incremental improvement, such as only
counting leaf access in a nested select?  (While that's not
perfect, it seems likely to be closer, and therefore beneficial
overall.)
Thanks,
-Kevin
>>> Tom Lane <tgl@sss.pgh.pa.us> 10/06/05 9:28 PM >>>

There's a known issue that the planner tends to overestimate the cost of
inner-index-scan nestloops, because it doesn't allow for the strong
caching effects associated with repeated scans of the same index (in
particular, that all the upper index levels tend to stay in cache).
See the archives for past inconclusive discussions about how to fix
this.



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: pg_dumpall --clean versus roles and shared dependencies
Next
From: Andrew Dunstan
Date:
Subject: Re: Need A Suggestion