Re: add_path optimization - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: add_path optimization
Date
Msg-id 201002270307.o1R370d14100@momjian.us
Whole thread Raw
In response to add_path optimization  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: add_path optimization
List pgsql-hackers
Did this ever get applied/resolved?

---------------------------------------------------------------------------

Robert Haas wrote:
> I've been doing some benchmarking and profiling on the PostgreSQL
> query analyzer, and it seems that (at least for the sorts of queries
> that I typically run) the dominant cost is add_path().  I've been able
> to find two optimizations that seem to help significantly:
> 
> 1. add_path() often calls compare_fuzzy_path_costs() twice on the same
> pair of paths, and when the paths compare equal on one criterion, some
> comparisons are duplicated.  I've refactored this function to return
> the results of both calculations without repeating any floating-point
> arithmetic.
> 
> 2. match_unsorted_outer() adds as many as 5 nested loop joins at a
> time with the same set of pathkeys.  In my tests, it tended to be ~3 -
> cheapest inner, cheapest inner materialized, and cheapest inner index.
>  Since these all have the same pathkeys, clearly only the one with the
> cheapest total cost is in the running for cheapest total cost for that
> set of pathkeys, and likewise for startup cost (and the two may be the
> same).  Yet we compare all of them against the whole pathlist, one
> after the other, including (for the most part) the rather expensive
> pathkey comparison.  I've added a function add_similar_paths() and
> refactored match_unsorted_outer() to use it.
> 
> On a couple of complex (and proprietary) queries with 12+ joins each,
> I measure a planning time improvement of 8-12% with the attached patch
> applied.  It would be interesting to try to replicate this on a
> publicly available data set, but I don't know of a good one to use.
> Suggestions welcome - results of performance testing on your own
> favorite big queries even more welcome.  Simple test harness also
> attached.  I took the approach of dropping caches, starting the
> server, and then running this 5 times each on several queries,
> dropping top and bottom results.
> 
> ...Robert

[ Attachment, skipping... ]

[ Attachment, skipping... ]

> 
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.comPG East:  http://www.enterprisedb.com/community/nav-pg-east-2010.do + If your life is a hard
drive,Christ can be your backup. +
 


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Was partitioned table row estimation fixed in 9.0?
Next
From: Bruce Momjian
Date:
Subject: Re: Testing of parallel restore with current snapshot