Thread: add_path optimization
I've been doing some benchmarking and profiling on the PostgreSQL query analyzer, and it seems that (at least for the sorts of queries that I typically run) the dominant cost is add_path(). I've been able to find two optimizations that seem to help significantly:

1. add_path() often calls compare_fuzzy_path_costs() twice on the same pair of paths, and when the paths compare equal on one criterion, some comparisons are duplicated. I've refactored this function to return the results of both calculations without repeating any floating-point arithmetic.

2. match_unsorted_outer() adds as many as 5 nested loop joins at a time with the same set of pathkeys. In my tests, it tended to be ~3 - cheapest inner, cheapest inner materialized, and cheapest inner index. Since these all have the same pathkeys, clearly only the one with the cheapest total cost is in the running for cheapest total cost for that set of pathkeys, and likewise for startup cost (and the two may be the same). Yet we compare all of them against the whole pathlist, one after the other, including (for the most part) the rather expensive pathkey comparison. I've added a function add_similar_paths() and refactored match_unsorted_outer() to use it.

On a couple of complex (and proprietary) queries with 12+ joins each, I measure a planning time improvement of 8-12% with the attached patch applied. It would be interesting to try to replicate this on a publicly available data set, but I don't know of a good one to use. Suggestions welcome - results of performance testing on your own favorite big queries even more welcome.

Simple test harness also attached. I took the approach of dropping caches, starting the server, and then running this 5 times each on several queries, dropping top and bottom results.

...Robert
Attachment
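(For illustration: a minimal, self-contained sketch of the two ideas described above, using simplified stand-in names and types - SimplePath, compare_both_costs, pick_survivors - rather than the real pathnode.c/joinpath.c code. It is only meant to show the shape of the changes, not the patch itself.)

    #include <stdio.h>

    /* Simplified stand-in for the planner's Path: just the two compared costs. */
    typedef struct SimplePath
    {
        const char *name;
        double      startup_cost;
        double      total_cost;
    } SimplePath;

    #define FUZZ_FACTOR 1.01        /* costs within 1% count as "fuzzily equal" */

    /*
     * Idea 1: perform both fuzzy comparisons in a single call so that no
     * floating-point comparison has to be repeated.  Returns -1/0/+1 for the
     * total-cost criterion; the startup-cost result comes back via *startup_cmp.
     */
    static int
    compare_both_costs(const SimplePath *p1, const SimplePath *p2, int *startup_cmp)
    {
        int         s,
                    t;

        if (p1->startup_cost > p2->startup_cost * FUZZ_FACTOR)
            s = +1;
        else if (p2->startup_cost > p1->startup_cost * FUZZ_FACTOR)
            s = -1;
        else
            s = 0;

        if (p1->total_cost > p2->total_cost * FUZZ_FACTOR)
            t = +1;
        else if (p2->total_cost > p1->total_cost * FUZZ_FACTOR)
            t = -1;
        else
            t = 0;

        *startup_cmp = s;
        return t;
    }

    /*
     * Idea 2: among several candidate paths that all share the same pathkeys,
     * only the cheapest-total and cheapest-startup members can possibly
     * survive, so pick those two out first; only the winners would then be
     * handed to add_path(), instead of running every candidate through the
     * full (pathkey-comparing) pathlist scan.
     */
    static void
    pick_survivors(SimplePath *paths[], int npaths)
    {
        SimplePath *cheapest_total = paths[0];
        SimplePath *cheapest_startup = paths[0];
        int         i;

        for (i = 1; i < npaths; i++)
        {
            int         scmp;

            if (compare_both_costs(paths[i], cheapest_total, &scmp) < 0)
                cheapest_total = paths[i];
            (void) compare_both_costs(paths[i], cheapest_startup, &scmp);
            if (scmp < 0)
                cheapest_startup = paths[i];
        }

        /* In the planner these two would now go to add_path(); here we print. */
        printf("cheapest total:   %s\n", cheapest_total->name);
        printf("cheapest startup: %s\n", cheapest_startup->name);
    }

    int
    main(void)
    {
        SimplePath  a = {"nestloop, cheapest inner", 0.0, 120.0};
        SimplePath  b = {"nestloop, materialized inner", 25.0, 90.0};
        SimplePath  c = {"nestloop, inner indexscan", 5.0, 100.0};
        SimplePath *candidates[] = {&a, &b, &c};

        pick_survivors(candidates, 3);
        return 0;
    }

Note that the loop tracks the startup winner independently of the total winner; as it turns out later in the thread, getting that detail wrong is exactly the defect found in the posted patch during review.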
On Sat, Jan 31, 2009 at 11:37:39PM -0500, Robert Haas wrote:
> I've been doing some benchmarking and profiling on the PostgreSQL
> query analyzer, and it seems that (at least for the sorts of queries
> that I typically run) the dominant cost is add_path(). I've been
> able to find two optimizations that seem to help significantly:

Are there any cases you've found where this change significantly impairs performance, and if so, how did you find them? If not, would you be up for trying to find some?

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter  Skype: davidfetter
XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
On Sun, Feb 1, 2009 at 12:03 PM, David Fetter <david@fetter.org> wrote:
> On Sat, Jan 31, 2009 at 11:37:39PM -0500, Robert Haas wrote:
>> I've been doing some benchmarking and profiling on the PostgreSQL
>> query analyzer, and it seems that (at least for the sorts of queries
>> that I typically run) the dominant cost is add_path(). I've been
>> able to find two optimizations that seem to help significantly:
>
> Are there any cases you've found where this change significantly
> impairs performance, and if so, how did you find them? If not, would
> you be up for trying to find some?

Basically, the patch is just performing the same operations with less overhead. For example, add_similar_path() is pretty much the same thing as repeated calls to add_path(), but you save the cost of unnecessary pathkey comparisons and maybe some ListCell alloc/free cycles. So I'm not really sure how it could make things worse, but I'd be interested in knowing if there's a case that you're worried about. It's pretty low-level code, so I don't think there's room for a lot of surprises.

...Robert
On Sat, Jan 31, 2009 at 11:37 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> I've been doing some benchmarking and profiling on the PostgreSQL
> query analyzer, and it seems that (at least for the sorts of queries
> that I typically run) the dominant cost is add_path(). I've been able
> to find two optimizations that seem to help significantly:
>
> 1. add_path() often calls compare_fuzzy_path_costs() twice on the same
>
> 2. match_unsorted_outer() adds as many as 5 nested loop joins at a

if there are two optimizations maybe two different patches are better to test them individually

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157
On Sun, Feb 1, 2009 at 1:34 PM, Jaime Casanova <jcasanov@systemguards.com.ec> wrote:
> On Sat, Jan 31, 2009 at 11:37 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> I've been doing some benchmarking and profiling on the PostgreSQL
>> query analyzer, and it seems that (at least for the sorts of queries
>> that I typically run) the dominant cost is add_path(). I've been able
>> to find two optimizations that seem to help significantly:
>>
>> 1. add_path() often calls compare_fuzzy_path_costs() twice on the same
>>
>> 2. match_unsorted_outer() adds as many as 5 nested loop joins at a
>
> if there are two optimizations maybe two different patches are better
> to test them individually

I did test the changes independently and either one alone has a measurable benefit (with sufficiently careful measuring), but they're closely related changes so I think it makes more sense as one patch. It's only 84 insertions and 46 deletions, so we're not talking about some massive patch that will be difficult to review. There's also some synergy between the two changes: I don't think either works as well without the other.

But please feel free to test it for yourself and let me know what you find. The changes are all very simple - the hard part was figuring out which changes would actually produce a benefit.

...Robert
Disclaimer: I don't know that bit of postgresql code; in fact, this is the first time I see it.

    *** a/src/backend/optimizer/path/joinpath.c
    --- b/src/backend/optimizer/path/joinpath.c
    ***************
    *** 473,478 **** match_unsorted_outer(PlannerInfo *root,
    --- 473,481 ----
          if (nestjoinOK)
          {
    +         Path       *paths[5];

I don't like the fact that you hardcoded that here. I know that you are trying to pass on a few calls in one go here, but still... ugly.

    static int
    compare_fuzzy_path_costs(Path *path1, Path *path2, int *startup_cost)
    {
        ....
        *startup_cost = (s == 0) ? t : s;

Why not *startup_cost = s, and let the caller decide which value it wants to use? Or just return both, from a single call (which would ?

        ...
        return t;
    }

To be fair, I don't see the compare_fuzzy_path_costs change saving too much time in the planner. I would myself probably convert that function into two defines/inline funcs, but that's just me.
On Sun, Feb 1, 2009 at 3:25 PM, Grzegorz Jaskiewicz <gj@pointblue.com.pl> wrote:
> I don't like the fact that you hardcoded that here. I know that you are
> trying to pass on a few calls in one go here, but still... ugly.

Well, I think you'll find that using a dynamically sized data structure destroys the possibility of squeezing any additional performance out of this part of the code. The nice thing about fixed-size data structures is that they cost essentially nothing to stack-allocate; you just move the stack pointer and away you go. We should in fact be looking for MORE places where we can avoid the use of constructs like List, since the second-highest CPU hog in my tests was AllocSetAlloc(), beaten out only by add_path(). With this patch applied, AllocSetAlloc() moves up to first.

(It would really be rather better to include all the paths generated in each pass of the loop in the call to add_similar_path(), but that looked rather more complicated because we can't predict how many of them there will be, and so adding a fixed-size data structure is not so easy. Plus, the code would all have to be rewritten not to assume that "continue" was the right way to move on to the next iteration of the loop. What would potentially be better still is to try to figure out which nested loop will be the winner without allocating all of the NestPath nodes in the first place, but that didn't seem possible without much more invasive changes, and it's not clear that you would actually still have a winner by the time you got done beating on it.)

I am also somewhat mystified as to why using an array of size 5 to hold up to 5 data structures allocated in nearly-consecutive lines of C code would qualify as ugly (completely apart from the fact that it's a clear performance win).

> static int
> compare_fuzzy_path_costs(Path *path1, Path *path2, int *startup_cost)
> {
>     ....
>     *startup_cost = (s == 0) ? t : s;
>
> Why not *startup_cost = s, and let the caller decide which value it wants to
> use? Or just return both, from a single call (which would ?
>     ...
>     return t;
> }

You're essentially suggesting removing logic from compare_fuzzy_path_costs() and duplicating it at the two call sites of that function. If there's a point to that, I don't see it. You might also take a look at the widely used function compare_path_costs().

> To be fair, I don't see the compare_fuzzy_path_costs change saving too much
> time in the planner.

Hmm, well I didn't either, but there's this handy tool called gprof that you might want to try out. I wouldn't be wasting my time patching this part of the code if it didn't make a difference, and in fact if you do 10% of the amount of benchmarking that I did in the process of creating this patch, you will find that it in fact does make a difference.

> I would myself probably convert that function into two defines/inline funcs,
> but that's just me.

It's already static to that .c file, so the compiler likely will inline it. In fact, I suspect you will find that removing the static keyword from the implementation of that function in CVS HEAD is itself sufficient to produce a small but measurable slowdown in planning of large join trees, exactly because it will defeat inlining.

...Robert
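(For illustration: a rough sketch of the shape of the match_unsorted_outer() change being described - a fixed, stack-allocated array collecting the handful of same-pathkey nestloop candidates and handing them over in one batch. The typedef and build_nestloop_candidate() are placeholders invented for this sketch, not the actual planner code; add_similar_paths() is the function the patch adds.)

    typedef struct Path Path;       /* opaque stand-in for the planner's Path */

    /* Hypothetical helpers standing in for the real path-construction calls. */
    extern Path *build_nestloop_candidate(int variant);
    extern void add_similar_paths(Path **paths, int npaths);

    void
    consider_nestloop_candidates(void)
    {
        /*
         * At most 5 variants are ever built for one set of pathkeys (cheapest
         * inner, materialized inner, inner indexscans, ...), so a fixed array
         * in the stack frame suffices: "allocating" it is just part of setting
         * up the frame, with no palloc/AllocSetAlloc traffic and no ListCells
         * to free afterwards.
         */
        Path       *paths[5];
        int         npaths = 0;
        int         variant;

        for (variant = 0; variant < 5; variant++)
        {
            Path       *p = build_nestloop_candidate(variant);

            if (p != NULL)
                paths[npaths++] = p;    /* a plain store, no heap allocation */
        }

        if (npaths > 0)
            add_similar_paths(paths, npaths);   /* one batched pathlist pass */
    }

A List-based version of the same loop would instead allocate a cell per candidate, which is where the AllocSetAlloc() time mentioned above comes from.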
On 1 Feb 2009, at 21:35, Robert Haas wrote:
> On Sun, Feb 1, 2009 at 3:25 PM, Grzegorz Jaskiewicz <gj@pointblue.com.pl> wrote:
>> I don't like the fact that you hardcoded that here. I know that you are
>> trying to pass on a few calls in one go here, but still... ugly.
>
> Well, I think you'll find that using a dynamically sized data
> structure destroys the possibility of squeezing any additional
> performance out of this part of the code. The nice thing about
> fixed-size data structures is that they cost essentially nothing to
> stack-allocate; you just move the stack pointer and away you go. We
> should in fact be looking for MORE places where we can avoid the use
> of constructs like List, since the second-highest CPU hog in my tests
> was AllocSetAlloc(), beaten out only by add_path(). With this patch
> applied, AllocSetAlloc() moves up to first.

Well, true - but a statically allocated array, without any predefined size (with #DEFINE) and no boundary check, is bad as well. I suppose this code is easy enough to let it be with your changes, but I would still call it not pretty.

> Hmm, well I didn't either, but there's this handy tool called gprof
> that you might want to try out. I wouldn't be wasting my time
> patching this part of the code if it didn't make a difference, and in
> fact if you do 10% of the amount of benchmarking that I did in the
> process of creating this patch, you will find that it in fact does
> make a difference.

To be honest, I really didn't have time to run it down with your patch and gprof. I believe that you did that already, hence your suggestion, right? Actually, if you did profile postgresql with a bunch of queries, I wouldn't mind seeing the results - I don't know whether it makes sense to send that to the list (I would think it does), but if it is too big or something, you could send it to me in private.

> It's already static to that .c file, so the compiler likely will
> inline it. In fact, I suspect you will find that removing the static
> keyword from the implementation of that function in CVS HEAD is itself
> sufficient to produce a small but measurable slowdown in planning of
> large join trees, exactly because it will defeat inlining.

That depends on many things, including whether optimizations are on or not. Because that function basically consists of two ifs, it could easily be turned into two separate inlines/macros - that would remove any function-call overhead (stack alloc, etc.).
> Well, true - but a statically allocated array, without any predefined
> size (with #DEFINE) and no boundary check, is bad as well.
> I suppose this code is easy enough to let it be with your changes, but I
> would still call it not pretty.

Well, it might merit a comment.

> Actually, if you did profile postgresql with a bunch of queries, I wouldn't
> mind seeing the results - I don't know whether it makes sense to send that
> to the list (I would think it does), but if it is too big or something,
> you could send it to me in private.

What I'd really like to do is develop some tests based on a publicly available dataset. Any suggestions?

...Robert
On 2 Feb 2009, at 14:50, Robert Haas wrote:
>> Well, true - but a statically allocated array, without any predefined
>> size (with #DEFINE) and no boundary check, is bad as well.
>> I suppose this code is easy enough to let it be with your changes, but I
>> would still call it not pretty.
>
> Well, it might merit a comment.

:)

> What I'd really like to do is develop some tests based on a publicly
> available dataset. Any suggestions?

I would say it wouldn't hurt to do benchmarking/profiling regression tests on real hardware - but someone will have to generate a quite substantial amount of data, so we could test everything from small queries up to 20+ join/sort/window function/aggregation queries, with various indexes and data types. The more real the data, the better. I could make some of my stuff public - but without the lookup tables (id -> some real data, like names, surnames, MAC addresses, etc).
>>> Robert Haas <robertmhaas@gmail.com> wrote:
> running this 5 times each on several queries,
> dropping top and bottom results.

Running a complex query (posted in previous threads, runs about 300,000 times per day in a production web application), I got these timings on a production-quality machine (4 quad-core CPU chips, that is 16 CPUs like this: Intel(R) Xeon(R) CPU X7350 @ 2.93GHz, 128 GB RAM, big RAID with BBU). I ran explain in each environment 5 times, tossed high and low, and averaged. The 8.4devel was from today's (2009-02-02) snapshot, built the same way we did 8.3.5.

8.3.5, statistics target 10: 36.188 ms
8.4devel without patch, statistics target 100: 109.862 ms
8.4devel with patch, statistics target 100: 104.015 ms

After seeing that, I re-analyzed to eliminate the statistics target as the cause of the 8.4 increase.

8.4devel with patch, statistics target 10: 99.421 ms

-Kevin
On Mon, Feb 2, 2009 at 8:10 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
>>>> Robert Haas <robertmhaas@gmail.com> wrote:
>> running this 5 times each on several queries,
>> dropping top and bottom results.
>
> Running a complex query (posted in previous threads, runs about
> 300,000 times per day in a production web application), I got these
> timings on a production-quality machine (4 quad-core CPU chips, that is 16
> CPUs like this: Intel(R) Xeon(R) CPU X7350 @ 2.93GHz, 128 GB RAM, big
> RAID with BBU). I ran explain in each environment 5 times, tossed
> high and low, and averaged. The 8.4devel was from today's
> (2009-02-02) snapshot, built the same way we did 8.3.5.
>
> 8.3.5, statistics target 10: 36.188 ms
> 8.4devel without patch, statistics target 100: 109.862 ms
> 8.4devel with patch, statistics target 100: 104.015 ms
>
> After seeing that, I re-analyzed to eliminate the statistics target as
> the cause of the 8.4 increase.
>
> 8.4devel with patch, statistics target 10: 99.421 ms

Yikes! The impact of the patch is about what I'd expect, but the fact that planning time has nearly tripled is... way poor. Can you repost the query and the EXPLAIN output for 8.3.5 and CVS HEAD?

...Robert
>> Running a complex query (posted in previous threads, runs about
>> 300,000 times per day in a production web application), I got these
>> timings on a production-quality machine (4 quad-core CPU chips, that is 16
>> CPUs like this: Intel(R) Xeon(R) CPU X7350 @ 2.93GHz, 128 GB RAM, big
>> RAID with BBU). I ran explain in each environment 5 times, tossed
>> high and low, and averaged. The 8.4devel was from today's
>> (2009-02-02) snapshot, built the same way we did 8.3.5.
>>
>> 8.3.5, statistics target 10: 36.188 ms
>> 8.4devel without patch, statistics target 100: 109.862 ms
>> 8.4devel with patch, statistics target 100: 104.015 ms
>>
>> After seeing that, I re-analyzed to eliminate the statistics target as
>> the cause of the 8.4 increase.
>>
>> 8.4devel with patch, statistics target 10: 99.421 ms
>
> Yikes! The impact of the patch is about what I'd expect, but the fact
> that planning time has nearly tripled is... way poor. Can you repost
> the query and the EXPLAIN output for 8.3.5 and CVS HEAD?

FYI, I retested my queries on REL8_3_STABLE and the results were not all that different from CVS HEAD. So the problem is apparently specific to something your query is doing that mine isn't, rather than a general slowdown in planning (or else one of us goofed up the testing).

...Robert
Robert Haas <robertmhaas@gmail.com> writes:
> Yikes! The impact of the patch is about what I'd expect, but the fact
> that planning time has nearly tripled is... way poor.

We're going to need to see the test case, because I don't see that in some simple tests here.

			regards, tom lane
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Yikes! The impact of the patch is about what I'd expect, but the fact
>> that planning time has nearly tripled is... way poor.
>
> We're going to need to see the test case, because I don't see that in
> some simple tests here.

A good data set, plus complex queries against it, might be the data from the US Census, specifically the TIGER data and the TIGER geocoder. I've been following this thread with the intention of putting together a large-data test set, but I just haven't found the time to yet. Right now there are a lot of dependencies on PostGIS (which aren't really required to just do the queries to pull out the street segment) which I figure people would want ripped out. It'd also be nice to include the other Census data besides just the road data.

If people really are interested, I'll see what I can put together. It's *a lot* of data (around 23G total in PG), though perhaps just doing 1 state would be enough for a good test; I keep the states split up anyway using CHECK constraints. Don't think that would change this case, though there might be cases where it does affect things..

Thanks,

	Stephen
> A good data set, plus complex queries against it, might be the data from
> the US Census, specifically the TIGER data and the TIGER geocoder. I've
> been following this thread with the intention of putting together a
> large-data test set, but I just haven't found the time to yet. Right now
> there are a lot of dependencies on PostGIS (which aren't really required to
> just do the queries to pull out the street segment) which I figure
> people would want ripped out. It'd also be nice to include the other
> Census data besides just the road data.
>
> If people really are interested, I'll see what I can put together. It's
> *a lot* of data (around 23G total in PG), though perhaps just doing 1
> state would be enough for a good test; I keep the states split up
> anyway using CHECK constraints. Don't think that would change this
> case, though there might be cases where it does affect things..

I'm interested, but I need maybe a 1GB data set, or smaller. The thing that we are benchmarking is the planner, and planning times are related to the complexity of the database and the accompanying queries, not the raw volume of data. (It's not size that matters, it's how you use it?) In fact, in a large database, one could argue that there is less reason to care about the planner, because the execution time will dominate anyway. I'm interested in complex queries in web/OLTP type applications, where you need the query to be planned and executed in 400 ms at the outside (and preferably less than half of that).

...Robert
* Robert Haas (robertmhaas@gmail.com) wrote:
> I'm interested, but I need maybe a 1GB data set, or smaller. The
> thing that we are benchmarking is the planner, and planning times are
> related to the complexity of the database and the accompanying
> queries, not the raw volume of data. (It's not size that matters,
> it's how you use it?) In fact, in a large database, one could argue
> that there is less reason to care about the planner, because the
> execution time will dominate anyway. I'm interested in complex
> queries in web/OLTP type applications, where you need the query to be
> planned and executed in 400 ms at the outside (and preferably less
> than half of that).

We prefer that our geocoding be fast... :) Doing 1 state should give you about the right size (half to 1G of data). I'll try to put together a good test set this week.

	Stephen
Robert Haas <robertmhaas@gmail.com> writes:
> I'm interested, but I need maybe a 1GB data set, or smaller. The
> thing that we are benchmarking is the planner, and planning times are
> related to the complexity of the database and the accompanying
> queries, not the raw volume of data.

In fact, the only reason to care whether there is any data in the DB *at all* is that you need some realistic content in pg_statistic. So it should be possible to set up a planner test DB with very little data bulk, which would surely make testing a lot less painful.

			regards, tom lane
>>> Tom Lane <tgl@sss.pgh.pa.us> wrote:
> In fact, the only reason to care whether there is any data in the DB
> *at all* is that you need some realistic content in pg_statistic.
> So it should be possible to set up a planner test DB with very little
> data bulk, which would surely make testing a lot less painful.

Can you suggest a query (or queries) which, together with a schema dump, would give you enough to duplicate my results?

-Kevin
>>> Robert Haas <robertmhaas@gmail.com> wrote:
> FYI, I retested my queries on REL8_3_STABLE and the results were not
> all that different from CVS HEAD. So the problem is apparently
> specific to something your query is doing that mine isn't, rather
> than a general slowdown in planning (or else one of us goofed up the
> testing).

I know you said size doesn't matter, but just for the record, the ten tables I loaded for this test put the database at 56G. I'm pulling information together to share on this, but I was wondering: is there any possibility that the tendency to use index scans in nested loops (given the table sizes and the availability of useful indexes) contributes to the difference?

Other possible factors: Most keys are multi-column and include varchar-based data types. Most columns are defined via domains.

(More info to follow.)

-Kevin
>>> Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Yikes! The impact of the patch is about what I'd expect, but the fact
>> that planning time has nearly tripled is... way poor.
>
> We're going to need to see the test case, because I don't see that in
> some simple tests here.

Built the snapshot with this:

BINDIR = /usr/local/pgsql-8.4devel-20090202/bin
DOCDIR = /usr/local/pgsql-8.4devel-20090202/share/doc
HTMLDIR = /usr/local/pgsql-8.4devel-20090202/share/doc
INCLUDEDIR = /usr/local/pgsql-8.4devel-20090202/include
PKGINCLUDEDIR = /usr/local/pgsql-8.4devel-20090202/include
INCLUDEDIR-SERVER = /usr/local/pgsql-8.4devel-20090202/include/server
LIBDIR = /usr/local/pgsql-8.4devel-20090202/lib
PKGLIBDIR = /usr/local/pgsql-8.4devel-20090202/lib
LOCALEDIR = /usr/local/pgsql-8.4devel-20090202/share/locale
MANDIR = /usr/local/pgsql-8.4devel-20090202/share/man
SHAREDIR = /usr/local/pgsql-8.4devel-20090202/share
SYSCONFDIR = /usr/local/pgsql-8.4devel-20090202/etc
PGXS = /usr/local/pgsql-8.4devel-20090202/lib/pgxs/src/makefiles/pgxs.mk
CONFIGURE = '--prefix=/usr/local/pgsql-8.4devel-20090202' '--enable-integer-datetimes' '--enable-debug' '--disable-nls' '--with-libxml'
CC = gcc
CPPFLAGS = -D_GNU_SOURCE -I/usr/include/libxml2
CFLAGS = -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv -g
CFLAGS_SL = -fpic
LDFLAGS = -Wl,--as-needed -Wl,-rpath,'/usr/local/pgsql-8.4devel-20090202/lib'
LDFLAGS_SL =
LIBS = -lpgport -lxml2 -lz -lreadline -lcrypt -ldl -lm
VERSION = PostgreSQL 8.4devel

Ran it with this:

max_connections = 10
shared_buffers = 250MB
max_prepared_transactions = 0
work_mem = 50MB
maintenance_work_mem = 1GB
checkpoint_segments = 20
checkpoint_timeout = 30min
effective_cache_size = 100GB
geqo = off
from_collapse_limit = 20
join_collapse_limit = 20
datestyle = 'iso, mdy'
lc_messages = 'C'
lc_monetary = 'C'
lc_numeric = 'C'
lc_time = 'C'
default_text_search_config = 'pg_catalog.english'
escape_string_warning = off
sql_inheritance = off
standard_conforming_strings = on

In a new cluster the database was created with this:

create user cirowner;
create user cir;
create user viewer;
create database cir with owner cirowner;
\c cir
revoke create on database cir from public;
revoke create on schema public from public;
grant create on schema public to cirowner;
checkpoint;

Schema dump attached. The ten tables which are used by the query were loaded through pg_dump piped to psql. I then ran VACUUM FREEZE ANALYZE. I used \timing to capture run times.
Here is the query: explain ( SELECT "C"."caseNo", "C"."filingDate", "CY"."countyName", "S"."descr" AS "statusCodeDescr", "P"."nameF", "P"."nameM", "P"."nameL", "P"."suffix", "P"."dob", "C"."caption", "CY"."countyNo", "C"."caseType", "C"."isSeal", "C"."isPartySeal", "LCM"."lastModified", "P"."searchName", CASE WHEN (("C"."filingDate" >= DATE '2008-11-01') AND ("C"."wcisClsCode" = '30709')) OR (("C"."caseType" IN ('CV','FA')) AND ("C"."wcisClsCode" = '30711') AND (EXISTS ( SELECT 1 FROM "CaseHist" "CH" WHERE ( ("C"."countyNo" = "CH"."countyNo") AND ("C"."caseNo" = "CH"."caseNo") AND ("CH"."eventType" IN ('FWBCA','CCTRO'))) ))) THEN TRUE ELSE FALSE END AS "isPetitionerSeal" FROM "Case" "C" JOIN "Party" "P" ON (("C"."caseNo" = "P"."caseNo") AND ("C"."countyNo" = "P"."countyNo") AND TRUE AND ((("P"."partyType" IN ('JV','CH')) AND ("C"."caseType" = 'ZZ')) OR ("P"."partyType" NOT IN ('JV','CH')))) JOIN "StatusCode" "S" ON ("C"."statusCode" = "S"."statusCode") JOIN "County" "CY" ON ("C"."countyNo" = "CY"."countyNo") LEFT OUTER JOIN "CaseDispo" "CD" ON (("CD"."caseNo" = "C"."caseNo") AND ("CD"."countyNo" = "C"."countyNo") AND (NOT (EXISTS ( SELECT 1 FROM "CaseDispo" "CD2" WHERE ( ("CD2"."caseNo" = "CD"."caseNo") AND ("CD2"."countyNo" = "CD"."countyNo") AND ("CD2"."dispoDate" > "CD"."dispoDate")) )))) LEFT OUTER JOIN "LastCaseMod" "LCM" ON (("LCM"."caseNo" = "C"."caseNo") AND ("LCM"."countyNo" = "C"."countyNo") AND ("LCM"."updateTypeId" = 0)) LEFT OUTER JOIN "WccaPermCaseType" "WPCT" ON (("C"."caseType" = "WPCT"."caseType") AND ("C"."countyNo" = "WPCT"."countyNo") AND ("WPCT"."profileName" = 'PUBLIC')) WHERE ( (("WPCT"."profileName" IS NOT NULL) OR (("C"."caseType" IN ('PA','JD')) AND ("C"."isConfidential" = FALSE))) AND ("C"."isExpunge" <> TRUE) AND (NOT (("WPCT"."profileName" = 'PUBLIC') AND ("C"."caseType" = 'FA') AND ("C"."wcisClsCode" = '40501'))) AND (NOT (EXISTS ( SELECT 1 FROM "HiddenCase" "HCA" WHERE ( ("HCA"."caseNo" = "C"."caseNo") AND ("HCA"."countyNo" = "C"."countyNo")) ))) AND ("P"."isSeal" = FALSE) AND (NOT (("P"."partyType" IN ('PE','PL','JP')) AND ((("C"."filingDate" >= DATE '2008-11-01') AND ("C"."wcisClsCode" = '30709')) OR (("C"."caseType" IN ('CV','FA')) AND ("C"."wcisClsCode" = '30711') AND (EXISTS ( SELECT 1 FROM "CaseHist" "CHPET" WHERE ( ("C"."countyNo" = "CHPET"."countyNo") AND ("C"."caseNo" = "CHPET"."caseNo") AND ("CHPET"."eventType" IN ('FWBCA','CCTRO'))) )))))) AND TRUE AND ("P"."searchName" LIKE 'HILL,J%') AND TRUE AND (NOT (("C"."caseType" IN ('CF','CI','CM','CT','FO','TR')) AND ("P"."partyType" <> 'DE'))) AND ((("C"."caseType" IN ('JA','JC','JG','JM','JO','JV','JI','TP')) AND ("P"."partyType" IN ('CH','JV'))) OR (("C"."caseType" NOT IN ('JA','JC','JG','JM','JO','JV','JI','TP')) AND ("P"."partyType" NOT IN ('CH','JV')))) AND TRUE AND TRUE AND TRUE AND TRUE AND TRUE AND TRUE AND TRUE AND TRUE AND TRUE AND TRUE) UNION SELECT "C"."caseNo", "C"."filingDate", "CY"."countyName", "S"."descr" AS "statusCodeDescr", "AL"."nameF", "AL"."nameM", "AL"."nameL", "AL"."suffix", "AL"."dob", "C"."caption", "CY"."countyNo", "C"."caseType", "C"."isSeal", "C"."isPartySeal", "LCM"."lastModified", "AL"."searchName", CASE WHEN (("C"."filingDate" >= DATE '2008-11-01') AND ("C"."wcisClsCode" = '30709')) OR (("C"."caseType" IN ('CV','FA')) AND ("C"."wcisClsCode" = '30711') AND (EXISTS ( SELECT 1 FROM "CaseHist" "CH" WHERE ( ("C"."countyNo" = "CH"."countyNo") AND ("C"."caseNo" = "CH"."caseNo") AND ("CH"."eventType" IN ('FWBCA','CCTRO'))) ))) THEN TRUE ELSE FALSE END AS "isPetitionerSeal" FROM 
"Case" "C" JOIN "Party" "P" ON (("C"."caseNo" = "P"."caseNo") AND ("C"."countyNo" = "P"."countyNo") AND ((("P"."partyType" IN ('JV','CH')) AND ("C"."caseType" = 'ZZ')) OR ("P"."partyType" NOT IN ('JV','CH')))) JOIN "Alias" "AL" ON (("P"."caseNo" = "AL"."caseNo") AND ("P"."countyNo" = "AL"."countyNo") AND ("P"."partyNo" = "AL"."partyNo") AND TRUE) JOIN "StatusCode" "S" ON ("C"."statusCode" = "S"."statusCode") JOIN "County" "CY" ON ("C"."countyNo" = "CY"."countyNo") LEFT OUTER JOIN "CaseDispo" "CD" ON (("CD"."caseNo" = "C"."caseNo") AND ("CD"."countyNo" = "C"."countyNo") AND (NOT (EXISTS ( SELECT 1 FROM "CaseDispo" "CD2" WHERE ( ("CD2"."caseNo" = "CD"."caseNo") AND ("CD2"."countyNo" = "CD"."countyNo") AND ("CD2"."dispoDate" > "CD"."dispoDate")) )))) LEFT OUTER JOIN "LastCaseMod" "LCM" ON (("LCM"."caseNo" = "C"."caseNo") AND ("LCM"."countyNo" = "C"."countyNo") AND ("LCM"."updateTypeId" = 0)) LEFT OUTER JOIN "WccaPermCaseType" "WPCT" ON (("C"."caseType" = "WPCT"."caseType") AND ("C"."countyNo" = "WPCT"."countyNo") AND ("WPCT"."profileName" = 'PUBLIC')) WHERE ( (("WPCT"."profileName" IS NOT NULL) OR (("C"."caseType" IN ('PA','JD')) AND ("C"."isConfidential" = FALSE))) AND ("C"."isExpunge" <> TRUE) AND (NOT (("WPCT"."profileName" = 'PUBLIC') AND ("C"."caseType" = 'FA') AND ("C"."wcisClsCode" = '40501'))) AND (NOT (EXISTS ( SELECT 1 FROM "HiddenCase" "HCA" WHERE ( ("HCA"."caseNo" = "C"."caseNo") AND ("HCA"."countyNo" = "C"."countyNo")) ))) AND ("P"."isSeal" = FALSE) AND (NOT (("P"."partyType" IN ('PE','PL','JP')) AND ((("C"."filingDate" >= DATE '2008-11-01') AND ("C"."wcisClsCode" = '30709')) OR (("C"."caseType" IN ('CV','FA')) AND ("C"."wcisClsCode" = '30711') AND (EXISTS ( SELECT 1 FROM "CaseHist" "CHPET" WHERE ( ("C"."countyNo" = "CHPET"."countyNo") AND ("C"."caseNo" = "CHPET"."caseNo") AND ("CHPET"."eventType" IN ('FWBCA','CCTRO'))) )))))) AND TRUE AND ("AL"."searchName" LIKE 'HILL,J%') AND TRUE AND (NOT (("C"."caseType" IN ('CF','CI','CM','CT','FO','TR')) AND ("P"."partyType" <> 'DE'))) AND ((("C"."caseType" IN ('JA','JC','JG','JM','JO','JV','JI','TP')) AND ("P"."partyType" IN ('CH','JV'))) OR (("C"."caseType" NOT IN ('JA','JC','JG','JM','JO','JV','JI','TP')) AND ("P"."partyType" NOT IN ('CH','JV')))) AND TRUE AND TRUE AND TRUE AND TRUE AND TRUE AND TRUE AND TRUE AND TRUE AND TRUE AND TRUE)) ORDER BY "caseNo" ; Plans from 8.3.5 and 8.4devel attached. If you need something else, let me know. -Kevin
Attachment
>>> I wrote:
> Ran it with this:
> effective_cache_size = 100GB

Actually, the timings shown in the previous post were run with the default for this setting. I updated it after yesterday evening's tests when I noticed I'd missed it, but had to leave before I could rerun the tests. I forgot that until now. Sorry for the confusion.

Surprisingly (to me), this setting affects plan time significantly. The 8.4devel with patch, statistics target 10, runs in an average 85.393 ms instead of the 99.421 ms I saw with the default. The 8.3.5 settings were run with this set. Both of the plans attached to the previous message were run with this set.

-Kevin
On Tue, Feb 3, 2009 at 10:17 AM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
>> We're going to need to see the test case, because I don't see that in
>> some simple tests here.
>
> Plans from 8.3.5 and 8.4devel attached.
>
> If you need something else, let me know.

I had a suspicion we were going to see something like this. You're using several NOT EXISTS clauses and 8.4devel is converting those into Anti Joins. Aside from the longer planning time, the resulting plan appears to have a much higher estimated cost, so I'm suspicious of a bug in the new Semi/Anti join stuff, but I don't have time to track it down right now (and I suspect Tom will figure it out faster than I can anyway).

...Robert
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > [ test case ] It looks to me like the reason for the planning time difference is that this query contains four NOT EXISTS subqueries, which 8.3 was not very smart about but 8.4 has converted into antijoins. That gives it more flexibility to consider different join orders, which means more paths to sort through, so it takes longer. But in principle you are more likely to get a good plan. (You didn't say anything about the actual runtimes --- I'd be interested to know about the runtimes and the quality of the rowcount estimates in both cases.) So as far as the fact that planning is slower is concerned, it's pretty much "nothing to see here, move along". I notice though that the profile shows add_path is eating even more run-time percentage wise than before, because it's getting called more. (It's up from about 14% to 21%, counting subroutines --- see below.) So there might be an argument for installing Robert's optimization or something like it in 8.4 to buy some of that back, rather than waiting for 8.5. regards, tom lane 8.3: 0.00 0.00 3700/2893200 set_rel_pathlist <cycle 5> [327] 0.00 0.00 4500/2893200 create_index_paths <cycle 5> [132] 0.51 0.17 2885000/2893200 add_paths_to_joinrel <cycle5> [14] [16] 14.7 0.51 0.18 2893200 add_path [16] 0.13 0.00 6401100/10760100 compare_pathkeys[29] 0.00 0.02 454600/621400 list_delete_cell [112] 0.01 0.00 453400/4243582 AllocSetFree [48] 0.01 0.00 453400/4242980 pfree [66] 0.00 0.00 85700/512901 lcons [98] 0.00 0.00 208700/1934900 compare_path_costs [196] 8.4: 0.00 0.00 4100/10605500 set_rel_pathlist <cycle 8> [200] 0.00 0.00 4300/10605500 create_index_paths <cycle 8> [207] 2.20 0.57 10597100/10605500 add_paths_to_joinrel<cycle 8> [14] [16] 21.7 2.20 0.57 10605500 add_path [16] 0.45 0.00 30231600/47490100 compare_pathkeys[24] 0.02 0.05 1584000/1909000 list_delete_cell [81] 0.03 0.00 1582800/13590386 AllocSetFree [46] 0.01 0.00 1014900/10462300 compare_path_costs [53] 0.01 0.00 1582800/13589684 pfree [62] 0.00 0.00 169400/833901 lcons [108]
Robert Haas <robertmhaas@gmail.com> writes:
> I had a suspicion we were going to see something like this. You're
> using several NOT EXISTS clauses and 8.4devel is converting those into
> Anti Joins. Aside from the longer planning time, the resulting plan
> appears to have a much higher estimated cost, so I'm suspicious of a
> bug in the new Semi/Anti join stuff,

More likely the other way round --- I think 8.3's rowcounts and hence costs are flights of fancy, and 8.4 is more likely to be in the right ballpark. We need to see EXPLAIN ANALYZE to find out though.

			regards, tom lane
>>> Robert Haas <robertmhaas@gmail.com> wrote:
> I had a suspicion we were going to see something like this. You're
> using several NOT EXISTS clauses and 8.4devel is converting those into
> Anti Joins. Aside from the longer planning time, the resulting plan
> appears to have a much higher estimated cost, so I'm suspicious of a
> bug in the new Semi/Anti join stuff, but I don't have time to track it
> down right now (and I suspect Tom will figure it out faster than I can
> anyway).

The 8.4 plan also doesn't seem to run as fast. Attached is EXPLAIN ANALYZE output from the 8.3.5 database I dumped from for my 8.4devel tests. I tried the same run on 8.4devel and it is still running after 20 minutes. I will let it cook for a while.

-Kevin
Attachment
>>> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote: > Attached is EXPLAIN > ANALYZE output from the 8.3.5 database I dumped from for my 8.4devel > tests. Actually, that one is from the sibling machine which is in production. Attached is the one on standby where I've been running the rest of this. Apparently, differences in the random samples lead to different plans, so maybe having both will be of some use. Apologies, but I'm having to work this in among other work. -Kevin
Attachment
>>> I wrote:
> I tried the same run on 8.4devel and it is still running after
> 20 minutes. I will let it cook for a while.

It's now been an hour and 30 minutes; so, while 8.4 does a much better job of estimating how many rows will be returned, the plan it generates is much slower for this query.

FWIW, I'll attach the results of vmstat 1 below. The machine is a replication target and is receiving a little over 100 database transactions per second, but is otherwise fairly idle, except for this long-running query.

-Kevin
Attachment
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > It's now been an hour and 30 minutes; so, while 8.4 does a much better > job of estimating how many rows will be returned, the plan it > generates is much slower for this query. Can you let it run to completion? Without explain analyze results it's going to be pretty difficult to isolate the problem. regards, tom lane
>>> Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Can you let it run to completion? Without explain analyze results
> it's going to be pretty difficult to isolate the problem.

Barring some currently unforeseen need to switch it into service to back the web site, yes.

-Kevin
>>> I wrote:
>>>> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Can you let it run to completion? Without explain analyze results
>> it's going to be pretty difficult to isolate the problem.
>
> Barring some currently unforeseen need to switch it into service to
> back the web site, yes.

It's been about 23 hours and it's still running. No apparent memory leakage. No significant disk activity. One CPU pegged (of the 16 on the machine).

Sooner or later we're likely to need to swap this box into production, but there's a good chance it could be left this way for days or even a couple weeks. Unfortunately I didn't have the foresight to use nohup when running this, so a network glitch might knock me off, too. (And they've been working on the network due to a move of the server room to a new floor.)

Do I just leave it? Would a stack trace be of any value? If you had row counts or dumps of the statistics tables, could you figure anything out? Let me know.

-Kevin
On Wed, Feb 4, 2009 at 3:07 PM, Kevin Grittner wrote:
> It's been about 23 hours and it's still running.

@snide(Gee, it sure would be nice if we continued with that explain-in-progress patch I had sent in earlier...:)

--
greg
On Wed, Feb 4, 2009 at 10:12 AM, Greg Stark <stark@enterprisedb.com> wrote:
> On Wed, Feb 4, 2009 at 3:07 PM, Kevin Grittner wrote:
>> It's been about 23 hours and it's still running.
>
> @snide(Gee, it sure would be nice if we continued with that
> explain-in-progress patch I had sent in earlier...:)

Agreed.

--
Jonah H. Harris, Senior DBA
myYearbook.com
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > It's been about 23 hours and it's still running. No apparent memory > leakage. No significant disk activity. One CPU pegged (of the 16 on > the machine). Hmmm ... I wonder if it could be stuck in an infinite loop? It's probably just a horribly bad plan choice, but ... > Would a stack trace be of any value? If you can attach to the backend with gdb, try "bt", then "cont", then wait a few seconds, then control-C and "bt" again. Repeat five or ten times and see if there's any consistency to the traces. You should be able to just "quit" and leave the backend running afterwards. regards, tom lane
>>> Tom Lane <tgl@sss.pgh.pa.us> wrote:
> If you can attach to the backend with gdb, try "bt", then "cont",
> then wait a few seconds, then control-C and "bt" again. Repeat
> five or ten times and see if there's any consistency to the traces.

Attached.

-Kevin
Attachment
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Tom Lane <tgl@sss.pgh.pa.us> wrote: >> If you can attach to the backend with gdb, try "bt", then "cont", >> then wait a few seconds, then control-C and "bt" again. Repeat >> five or ten times and see if there's any consistency to the traces. > Attached. Hmm, it seems to be spending all its time in this part of the plan: -> Merge Anti Join (cost=0.00..2069318.25 rows=11628008 width=15) MergeCond: (("CD"."countyNo")::smallint = ("CD2"."countyNo")::smallint) Join Filter: ((("CD2"."dispoDate")::date> ("CD"."dispoDate")::date) AND (("CD2"."caseNo")::text = ("CD"."caseNo")::text)) -> Index Scan using "CaseDispo_pkey" on "CaseDispo" "CD" (cost=0.00..1068306.86 rows=17442012 width=19) -> Index Scan using "CaseDispo_CountyNoDispoDate" on "CaseDispo" "CD2" (cost=0.00..913801.31rows=17442012 width=19) ie SELECT ... FROM "CaseDispo" "CD" WHERE (NOT (EXISTS ( SELECT 1 FROM "CaseDispo" "CD2" WHERE ( ("CD2"."caseNo" = "CD"."caseNo") AND ("CD2"."countyNo" = "CD"."countyNo") AND ("CD2"."dispoDate" > "CD"."dispoDate")) ))) which in fact the planner thought would be pretty expensive; but not 24 hours worth. I'm not sure at the moment if the cost estimate is just badly off, or if there's some sort of thinko in the logic. Can you estimate how many rows would come out of that join? Is the rowcount for the underlying table (17442012 rows in CaseDispo) accurate? regards, tom lane
>>> Tom Lane <tgl@sss.pgh.pa.us> wrote:
> SELECT ... FROM "CaseDispo" "CD"
> WHERE (NOT (EXISTS (
>     SELECT 1
>     FROM "CaseDispo" "CD2"
>     WHERE (("CD2"."caseNo" = "CD"."caseNo")
>        AND ("CD2"."countyNo" = "CD"."countyNo")
>        AND ("CD2"."dispoDate" > "CD"."dispoDate")))))
>
> which in fact the planner thought would be pretty expensive; but not 24
> hours worth. I'm not sure at the moment if the cost estimate is just
> badly off, or if there's some sort of thinko in the logic. Can you
> estimate how many rows would come out of that join?

Well, of the cases which are selected based on the other criteria, there would be about one CaseDispo row each. The main selection criterion is the Party.searchName, with various security limitations added. Where one or more CaseDispo rows exist (it's only included through a LEFT JOIN), we want only the one with the latest dispoDate. Based on the queries which ran under 8.3.5, it's pretty safe to assume that the number of CaseDispo rows matching the join criteria to Case would be between 2,300 and 2,400, with only a handful having multiple matches to the same Case. There can never be more than one related to a specific Case for any one dispoDate.

> Is the rowcount for
> the underlying table (17442012 rows in CaseDispo) accurate?

cir=# select count(*) from "CaseDispo";
  count
----------
 17433234
(1 row)
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Well, of the cases which are selected based on the other criteria, > there would be about one CaseDispo row each. The main selection > criterion is the Party.searchName, with various security limitations > added. Where one or more CaseDispo rows exist (it's only included > through a LEFT JOIN), we want only the one with the latest dispoDate. > Based on the queries which ran under 8.3.5, it's pretty safe to assume > that the number of CaseDispo rows matching the join criteria to Case > would be between 2,300 and 2,400, with only a handful having multiple > matches to the same Case. Hmm ... but that's all irrelevant because this is being done as the innermost join, ie, it's running the EXISTS test for every row of CaseDispo regardless of whether that row will be joined to later. Based on your comments that seems like a really stupid plan choice, so I guess Robert is right: there's some sort of logic bug here someplace. Will look. regards, tom lane
>>> Tom Lane <tgl@sss.pgh.pa.us> wrote:
> there's some sort of logic bug here someplace.

Keep in mind that this is running with the patch that started this thread. I didn't try actually running this query on 8.4devel without the patch. Should I kill this query, revert the software to pre-patch, and try it again to see, or are you confident that that's not the issue?

-Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Tom Lane <tgl@sss.pgh.pa.us> wrote: >> there's some sort of logic bug here someplace. > Keep in mind that this is running with the patch that started this > thread. I didn't try actually running this query on 8.4devel without > the patch. Should I kill this query, revert the software to > pre-patch, and try it again to see, or are you confident that that's > not the issue? Hmm. It would be worth checking whether you get the same plan without the patch (you should, but let's check). If it is the same plan then it's not going to be any faster. I don't see any value in trying to let the query run to conclusion either. regards, tom lane
On Wed, Feb 4, 2009 at 12:15 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
>>>> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> there's some sort of logic bug here someplace.
>
> Keep in mind that this is running with the patch that started this
> thread. I didn't try actually running this query on 8.4devel without
> the patch. Should I kill this query, revert the software to
> pre-patch, and try it again to see, or are you confident that that's
> not the issue?

Oh, dear. If this turns out to be my bug Tom will kick my ass!

...Robert
On Feb 4, 2009, at 9:25 AM, Robert Haas wrote:
> Oh, dear. If this turns out to be my bug Tom will kick my ass!

Can I buy tickets to this event?

Best,

David
>>> Tom Lane <tgl@sss.pgh.pa.us> wrote:
> It would be worth checking whether you get the same plan without
> the patch (you should, but let's check).

Same plan. (Robert can relax, and David can forget about those tickets.)

-Kevin
Robert Haas <robertmhaas@gmail.com> writes:
> Oh, dear. If this turns out to be my bug Tom will kick my ass!

Hmm ... one of the things that struck me as odd was that it was doing a merge join on just the countyNo, which is presumably very far from unique. Testing the query here with Kevin's schema but no data, I get

    ->  Merge Anti Join  (cost=0.00..102.51 rows=233 width=34)
          Merge Cond: ((("CD"."countyNo")::smallint = ("CD2"."countyNo")::smallint) AND (("CD"."caseNo")::text = ("CD2"."caseNo")::text))
          Join Filter: (("CD2"."dispoDate")::date > ("CD"."dispoDate")::date)
          ->  Index Scan using "CaseDispo_pkey" on "CaseDispo" "CD"  (cost=0.00..49.50 rows=350 width=38)
          ->  Index Scan using "CaseDispo_pkey" on "CaseDispo" "CD2"  (cost=0.00..49.50 rows=350 width=38)

ie it's using the first two columns of the pkey, not only the first column, as the merge key (and not arbitrarily using two different indexes to accomplish the same scan, which is another weird thing about that plan). There's no visible reason for it not to have done that in Kevin's test, unless there's something wrong with your patch.

There might be more than one bug here though. The other question is why it wants to do this join first at all, and I'm not convinced that add_path could be at fault for that. I'm suspecting that the logic that considers join order restrictions for antijoins might be overly restrictive.

			regards, tom lane
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Tom Lane <tgl@sss.pgh.pa.us> wrote: >> It would be worth checking whether you get the same plan without >> the patch (you should, but let's check). > Same plan. Yeah, I just managed to reproduce a similar behavior in unpatched HEAD. Now, since I'm running without any stats, it might be that it's estimating similar costs for the one-key and two-key merges; but I don't see why that would happen for you. Off to do some debugging. regards, tom lane
>>> Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Hmm ... one of the things that struck me as odd was that it was doing a
> merge join on just the countyNo, which is presumably very far from
> unique.

There are 72 possible values for any columns in that domain. In most large tables, Milwaukee County (value 40) is used in about 20% of the rows.

> ... using two different indexes to
> accomplish the same scan, which is another weird thing about that plan).
> There's no visible reason for it not to have done that

Well, for a human who understands the data, I certainly would have expected it to use the primary key for both. There are never duplicate case numbers within a county, and there can certainly be a lot of cases disposed on a given date for a county.

It occurred to me that this had been run with the last ANALYZE having run with a default_statistics_target of 10, but that wasn't it -- I got new statistics with a target of 100 and this part of the plan looks the same. Some other aspects of the plan changed, but if this part was the killer, a higher target doesn't help.

-Kevin
I wrote:
> Now, since I'm running without any stats, it might be that it's
> estimating similar costs for the one-key and two-key merges; but I don't
> see why that would happen for you. Off to do some debugging.

... well, actually, it's because I blew off applying any cost correction for this case in cost_mergejoin:

     * For SEMI and ANTI joins, only one inner tuple need be rescanned for
     * each group of same-keyed outer tuples (assuming that all joinquals
     * are merge quals). This makes the effect small enough to ignore,
     * so we just set rescannedtuples = 0.

Obviously that's not going to be good enough for production --- the parenthetical assumption here is just wrong in the case at hand. That doesn't seem to be the only issue in your example, but it's definitely one of 'em.

			regards, tom lane
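(For illustration only: one way the missing correction could be sketched - fall back to a plain-mergejoin-style rescan estimate when the quoted assumption does not hold. The function name, parameters, and formula below are stand-ins invented for this sketch, not the correction that was actually committed.)

    /*
     * Illustrative only: estimate how many inner tuples a merge SEMI/ANTI
     * join will rescan.  If every join clause is usable as a merge qual,
     * the quoted comment's assumption holds and rescans are negligible;
     * otherwise fall back to a crude plain-join-style estimate (total join
     * tuples minus inner rows, clamped at zero).
     */
    double
    estimate_rescanned_tuples(int all_quals_are_mergequals,
                              double mergejointuples,
                              double inner_path_rows)
    {
        double      rescannedtuples;

        if (all_quals_are_mergequals)
            return 0.0;         /* only one inner tuple rescanned per group */

        rescannedtuples = mergejointuples - inner_path_rows;
        if (rescannedtuples < 0)
            rescannedtuples = 0;    /* clamp a possible underestimate */
        return rescannedtuples;
    }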
>>> Tom Lane <tgl@sss.pgh.pa.us> wrote:
> That doesn't seem to be the only issue in your example, but it's
> definitely one of 'em.

FWIW, the pattern causing the problem here is a pretty common one in court business logic: join (often outer join or an exists test) to a set of candidate rows WHERE NOT EXISTS the same set of candidate rows with tests to see if there is a better choice from among the candidates. In the simplest cases (like the current example) this could be rewritten using a correlated subquery with a MAX function; however, the test for a better candidate often is too complex for that technique to work, that technique has (until now) been slower, and programmers were much more prone to writing code with logic errors that way.

I don't know if that matters to you in your current debugging, but I just wanted to point out that we do it in hundreds of places in the code.

-Kevin
Robert Haas wrote:
> On Mon, Feb 2, 2009 at 8:10 PM, Kevin Grittner
> <Kevin.Grittner@wicourts.gov> wrote:
>>>>> Robert Haas <robertmhaas@gmail.com> wrote:
>>> running this 5 times each on several queries,
>>> dropping top and bottom results.
>>
>> Running a complex query (posted in previous threads, runs about
>> 300,000 times per day in a production web application), I got these
>> timings on a production-quality machine (4 quad-core CPU chips, that is 16
>> CPUs like this: Intel(R) Xeon(R) CPU X7350 @ 2.93GHz, 128 GB RAM, big
>> RAID with BBU). I ran explain in each environment 5 times, tossed
>> high and low, and averaged. The 8.4devel was from today's
>> (2009-02-02) snapshot, built the same way we did 8.3.5.
>>
>> 8.3.5, statistics target 10: 36.188 ms
>> 8.4devel without patch, statistics target 100: 109.862 ms
>> 8.4devel with patch, statistics target 100: 104.015 ms
>>
>> After seeing that, I re-analyzed to eliminate the statistics target as
>> the cause of the 8.4 increase.
>>
>> 8.4devel with patch, statistics target 10: 99.421 ms
>
> Yikes! The impact of the patch is about what I'd expect, but the fact
> that planning time has nearly tripled is... way poor. Can you repost
> the query and the EXPLAIN output for 8.3.5 and CVS HEAD?

Where are we on this: the original patch, and Kevin's slow queries?

--
Bruce Momjian <bruce@momjian.us>  http://momjian.us
EnterpriseDB                      http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
On Sat, Feb 7, 2009 at 2:44 PM, Bruce Momjian <bruce@momjian.us> wrote:
>> Yikes! The impact of the patch is about what I'd expect, but the fact
>> that planning time has nearly tripled is... way poor. Can you repost
>> the query and the EXPLAIN output for 8.3.5 and CVS HEAD?
>
> Where are we on this: the original patch, and Kevin's slow queries?

The original patch is still in the queue for 8.5, and I'll probably do more work on it between now and the next CommitFest. If someone decides to commit some portion of the existing patch first, I never complain about getting to cut in front of the line - but I don't ask for it, either.

Tom started a separate thread to discuss Kevin's planner issues:

http://archives.postgresql.org/message-id/21139.1233853501@sss.pgh.pa.us

and has committed a fix for one of the two problems Kevin found in testing my patch:

http://archives.postgresql.org/message-id/20090206234324.11A417559ED@cvs.postgresql.org

...Robert
>>> Bruce Momjian <bruce@momjian.us> wrote:
> Where are we on this: the original patch, and Kevin's slow queries?

Robert's patch is not the cause of the 8.4 problems with my queries, and (as Robert pointed out) a separate thread has been started to discuss those issues.

From my perspective, Robert's patch has improved plan time in every test of a complex query that I've run. I have compared plans for some queries with and without the patch, and when I have done so the EXPLAIN output has been byte-for-byte identical; it just got to that plan faster.

In this post:

http://archives.postgresql.org/pgsql-hackers/2009-02/msg00118.php

Tom points out that the additional optimizations included in 8.4 can increase plan time a bit, "So there might be an argument for installing Robert's optimization or something like it in 8.4 to buy some of that back, rather than waiting for 8.5."

-Kevin
After a lot of distractions, I've finished applying the planner fixes that seem necessary in view of your report about poorer planning in 8.4 than 8.3. When you have a chance, it would be useful to do a thorough test of CVS HEAD on your data and query mix --- are there any other places where we have regressed relative to 8.3? regards, tom lane
[ This patch is on the 2009-First commitfest list, but there was earlier discussion to the effect that it might be a good idea to include in 8.4, so I made some time to look at it. ]

Robert Haas <robertmhaas@gmail.com> writes:
> I've been doing some benchmarking and profiling on the PostgreSQL
> query analyzer, and it seems that (at least for the sorts of queries
> that I typically run) the dominant cost is add_path(). I've been able
> to find two optimizations that seem to help significantly:

I got around to testing this patch a bit today, and I'm afraid it doesn't look very good from here. The test case I used was a psql script with 10000 repetitions of

explain select * from information_schema.table_constraints join information_schema.referential_constraints using (constraint_catalog,constraint_schema,constraint_name);

which isn't terribly creative, but those two views incorporate multiple joins so I figured this was a reasonably middle-of-the-road planner exercise.

I first tried just the compare_fuzzy_path_costs() change and really couldn't measure any reliable difference. oprofile output for CVS HEAD looks like

CPU: P4 / Xeon with 2 hyper-threads, speed 2792.99 MHz (estimated)
Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped) with a unit mask of 0x01 (mandatory) count 100000
samples  %        image name      symbol name
149294    9.7519  postgres        AllocSetAlloc
85437     5.5808  postgres        SearchCatCache
45520     2.9734  postgres        copyObject
39525     2.5818  postgres        add_path
35878     2.3436  postgres        expression_tree_walker
29733     1.9422  libc-2.8.so     memcpy
25560     1.6696  postgres        MemoryContextAllocZeroAligned
20879     1.3638  libc-2.8.so     strlen
20521     1.3404  postgres        add_paths_to_joinrel
20054     1.3099  postgres        AllocSetFree
20023     1.3079  postgres        cost_mergejoin
19363     1.2648  postgres        nocachegetattr
17689     1.1554  libc-2.8.so     vfprintf
17335     1.1323  postgres        generate_join_implied_equalities
16085     1.0507  postgres        MemoryContextAlloc

and with the compare_fuzzy_path_costs() change

CPU: P4 / Xeon with 2 hyper-threads, speed 2792.99 MHz (estimated)
Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped) with a unit mask of 0x01 (mandatory) count 100000
samples  %        image name      symbol name
144882    9.6951  postgres        AllocSetAlloc
84205     5.6348  postgres        SearchCatCache
43956     2.9414  postgres        copyObject
37418     2.5039  postgres        add_path
34865     2.3331  postgres        expression_tree_walker
28718     1.9217  libc-2.8.so     memcpy
23643     1.5821  postgres        MemoryContextAllocZeroAligned
20835     1.3942  libc-2.8.so     strlen
20065     1.3427  postgres        cost_mergejoin
19672     1.3164  postgres        AllocSetFree
18992     1.2709  postgres        add_paths_to_joinrel
18802     1.2582  postgres        nocachegetattr
17109     1.1449  libc-2.8.so     __printf_fp
16901     1.1310  libc-2.8.so     vfprintf
16159     1.0813  postgres        hash_search_with_hash_value
16039     1.0733  postgres        generate_join_implied_equalities
15554     1.0408  postgres        MemoryContextAlloc
14981     1.0025  postgres        expression_tree_mutator

Note that the compiler is inlining compare_fuzzy_path_costs in both cases.
I then added the add_similar_paths change, and got this:

CPU: P4 / Xeon with 2 hyper-threads, speed 2792.99 MHz (estimated)
Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped) with a unit mask of 0x01 (mandatory) count 100000
samples  %        image name    symbol name
142645    9.6668  postgres      AllocSetAlloc
 83161    5.6357  postgres      SearchCatCache
 44660    3.0265  postgres      copyObject
 35762    2.4235  postgres      expression_tree_walker
 28427    1.9264  postgres      add_path
 27970    1.8955  libc-2.8.so   memcpy
 23317    1.5801  postgres      MemoryContextAllocZeroAligned
 20685    1.4018  libc-2.8.so   strlen
 19646    1.3314  postgres      add_paths_to_joinrel
 19129    1.2963  postgres      AllocSetFree
 18065    1.2242  postgres      cost_mergejoin
 17768    1.2041  postgres      nocachegetattr
 17249    1.1689  postgres      generate_join_implied_equalities
 16784    1.1374  libc-2.8.so   vfprintf
 15740    1.0667  libc-2.8.so   __printf_fp
 15642    1.0600  postgres      MemoryContextAlloc
 15194    1.0297  postgres      expression_tree_mutator
 15009    1.0171  postgres      hash_search_with_hash_value
...
  7746    0.5249  postgres      add_similar_paths

where at least you can see a change in add_path's percentage, although overall runtime did not change measurably; and in any case whatever was saved seems to have gone into add_similar_paths.

It gets worse though: add_similar_paths is flat *wrong*. It compares each successive path to only the current cheapest-total candidate, and thus is quite likely to seize on the wrong cheapest-startup path. I tried fixing its loop logic like so:

	for (i = 1; i < npath; ++i)
	{
		if (compare_fuzzy_path_costs(cheapest_total, paths[i], TOTAL_COST) > 0)
			cheapest_total = paths[i];
		if (compare_fuzzy_path_costs(cheapest_startup, paths[i], STARTUP_COST) > 0)
			cheapest_startup = paths[i];
	}

(obviously, this is with compare_fuzzy_path_costs reverted to its form in HEAD) and got this:

CPU: P4 / Xeon with 2 hyper-threads, speed 2792.99 MHz (estimated)
Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped) with a unit mask of 0x01 (mandatory) count 100000
samples  %        image name    symbol name
143675    9.6105  postgres      AllocSetAlloc
 83809    5.6060  postgres      SearchCatCache
 43837    2.9323  postgres      copyObject
 35790    2.3940  postgres      expression_tree_walker
 29870    1.9980  postgres      compare_fuzzy_path_costs
 28521    1.9078  libc-2.8.so   memcpy
 23448    1.5685  postgres      MemoryContextAllocZeroAligned
 21049    1.4080  libc-2.8.so   strlen
 18343    1.2270  postgres      add_path
 18154    1.2143  postgres      AllocSetFree
 18119    1.2120  postgres      cost_mergejoin
 18036    1.2064  postgres      add_paths_to_joinrel
 17791    1.1901  postgres      nocachegetattr
 17277    1.1557  postgres      generate_join_implied_equalities
 16956    1.1342  libc-2.8.so   vfprintf
 16125    1.0786  postgres      hash_search_with_hash_value
 15733    1.0524  libc-2.8.so   __printf_fp
 15387    1.0292  postgres      MemoryContextAlloc
...
  5759    0.3852  postgres      add_similar_paths

Here we have a total of 3.6% spent in code that had used ~2.5% before we started. Not very promising.

Now, maybe if we could force the compiler to inline compare_fuzzy_path_costs we could buy some of that back. Another possibility is to contort add_similar_paths some more so that it avoids double comparisons so long as cheapest_total and cheapest_startup are the same path; I have a feeling that's not going to win much though.

So it's kind of looking like a dead end from here. But in any case the patch isn't acceptable as-is with that fundamental logic error in add_similar_paths, so I'm bouncing it back for rework.

			regards, tom lane
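Spelled out as a complete function, the corrected logic above would look roughly like the following. This is only an illustrative sketch, not the submitted patch: the signature is invented, and it assumes the function lives in src/backend/optimizer/util/pathnode.c, where compare_fuzzy_path_costs(), add_path(), and the CostSelector constants are visible.

	/*
	 * Hedged sketch of a corrected add_similar_paths(): the caller passes
	 * a set of candidate paths that all share the same pathkeys; we pick
	 * the cheapest-total and cheapest-startup members by scanning the
	 * whole array, and hand only the winner(s) to add_path().  Invented
	 * signature, illustrative only; freeing of the losing candidates (as
	 * add_path() does for rejected paths) is omitted.
	 */
	static void
	add_similar_paths(RelOptInfo *parent_rel, Path **paths, int npath)
	{
		Path	   *cheapest_total;
		Path	   *cheapest_startup;
		int			i;

		if (npath <= 0)
			return;

		cheapest_total = cheapest_startup = paths[0];

		for (i = 1; i < npath; ++i)
		{
			/* track both winners independently, per the corrected loop */
			if (compare_fuzzy_path_costs(cheapest_total, paths[i], TOTAL_COST) > 0)
				cheapest_total = paths[i];
			if (compare_fuzzy_path_costs(cheapest_startup, paths[i], STARTUP_COST) > 0)
				cheapest_startup = paths[i];
		}

		/* only these winners can matter for this set of pathkeys */
		add_path(parent_rel, cheapest_total);
		if (cheapest_startup != cheapest_total)
			add_path(parent_rel, cheapest_startup);
	}

Even in this corrected form, each surviving path still goes through the normal add_path() comparisons against the rest of the pathlist, which matches the profiles above: the work largely moves from add_path() into add_similar_paths() rather than disappearing.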
Thanks for taking a look at it.

> I first tried just the compare_fuzzy_path_costs() change and really
> couldn't measure any reliable difference.  oprofile output for CVS HEAD
> looks like

Well, there's obviously something different between your case and mine, because in my query add_path was the #1 CPU hog and it wasn't close. I suspect my query had more joins - I'll take a look at yours.

> It gets worse though: add_similar_paths is flat *wrong*.  It compares
> each successive path to only the current cheapest-total candidate,
> and thus is quite likely to seize on the wrong cheapest-startup path.
> I tried fixing its loop logic like so:

Nuts. That's probably a fatal defect.

> Now, maybe if we could force the compiler to inline
> compare_fuzzy_path_costs we could buy some of that back.  Another
> possibility is to contort add_similar_paths some more so that it avoids
> double comparisons so long as cheapest_total and cheapest_startup are
> the same path; I have a feeling that's not going to win much though.

Agreed.

> So it's kind of looking like a dead end from here.  But in any case the
> patch isn't acceptable as-is with that fundamental logic error in
> add_similar_paths, so I'm bouncing it back for rework.

...Robert
Robert Haas <robertmhaas@gmail.com> writes:
>> I first tried just the compare_fuzzy_path_costs() change and really
>> couldn't measure any reliable difference.  oprofile output for CVS HEAD
>> looks like

> Well, there's obviously something different between your case and
> mine, because in my query add_path was the #1 CPU hog and it wasn't
> close.  I suspect my query had more joins - I'll take a look at yours.

Interesting. I'd like to see your example too, if you can publish it. You had noted that add_path seemed to iterate clear to the end of the list in most cases, which it's designed not to do, so there might be something else going on in your example.

One other thought to roll around in your head: at the time that the current add_path logic was designed, compare_pathkeys was ungodly expensive, which is why the code tries to compare costs first. We've since introduced the "canonical pathkey" representation, which makes compare_pathkeys enough cheaper that it might not be insane to do it in the other order. I don't immediately see an easy win there, because as things are structured we typically want the cost comparison result anyway for list sorting purposes. But maybe there's a way around that.

			regards, tom lane
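To make that reordering concrete, the dominance test inside add_path()'s loop might be restructured along these lines. This is a rough sketch only: compare_pathkeys(), compare_path_costs(), and the PathKeysComparison values are existing planner names, but the wrapper function here is invented for illustration, and the interaction with the cost-ordered pathlist mentioned above is ignored.

	#include "postgres.h"
	#include "optimizer/pathnode.h"		/* compare_path_costs() */
	#include "optimizer/paths.h"		/* compare_pathkeys() */

	/*
	 * Illustrative only: does new_path make old_path redundant?  Check the
	 * pathkey relationship first (cheap with canonical pathkeys, since it
	 * is just a walk over two lists of shared pointers), and pay for the
	 * floating-point cost comparisons only when the pathkeys don't already
	 * rule out dominance.
	 */
	static bool
	new_path_dominates(Path *new_path, Path *old_path)
	{
		PathKeysComparison keyscmp;

		keyscmp = compare_pathkeys(new_path->pathkeys, old_path->pathkeys);
		if (keyscmp == PATHKEYS_DIFFERENT || keyscmp == PATHKEYS_BETTER2)
			return false;		/* old_path has orderings new_path lacks */

		/* pathkeys equal or better; now the cost checks decide */
		return compare_path_costs(new_path, old_path, STARTUP_COST) <= 0 &&
			   compare_path_costs(new_path, old_path, TOTAL_COST) <= 0;
	}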
On Fri, Feb 27, 2009 at 11:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> One other thought to roll around in your head: at the time that the
> current add_path logic was designed, compare_pathkeys was ungodly
> expensive, which is why the code tries to compare costs first.
> We've since introduced the "canonical pathkey" representation, which
> makes compare_pathkeys enough cheaper that it might not be insane
> to do it in the other order.  I don't immediately see an easy win
> there, because as things are structured we typically want the cost
> comparison result anyway for list sorting purposes.  But maybe there's
> a way around that.

I think the best thing to do with the pathkeys comparison is avoid it altogether as frequently as possible.

One thing I've been thinking about is trying to refactor add_paths_to_joinrel() so that it doesn't need to be called twice for every pair of input relations. This saves some cycles in a few different places: you avoid computing the merge clauses and hash clauses twice, and because you now generate all of the unsorted hash-paths in the same piece of code, you can do an add_similar_paths()-type thing where you compare only costs first and then throw just the winners into the main list. I think there are some minor economies in sort_inner_and_outer() as well.

I haven't been able to convince myself that all of this adds up to a measurable win, though. The problem is that sort_inner_and_outer() and hash_inner_and_outer() actually generate only a relatively small number of paths. The latter generates at most two, and the former generates length(mergeclause_list), which at least for the kinds of queries that I usually write is most often one. Maybe two? On the other hand, match_unsorted_outer() generates up to 5 nestloops plus potentially several merge joins PER OUTER PATH. That's a much bigger number. Ergo, if you want to make join planning fast, you need to make match_unsorted_outer() consider fewer paths or consider them more quickly. And, unfortunately, it's not clear how to do that.

I have a nagging feeling that the logic around when to try materializing the inner path could be improved. We skip it when the cheapest total inner path is an index scan, bitmap heap scan, tid scan, material path, function scan, cte scan, or work table scan (which is an irritatingly long list... is there a better way to do this?), but cost_nestloop() only costs the thing differently if the inner side is a material path or hash path. I sort of wonder if cost_nestloop() could be made to decide when materialization of the inner path is warranted. That might help some, but I still think the fact that match_unsorted_outer() generates a number of paths that is linear in the size of the outer pathlist (rather than a constant) will tend to make it the most expensive part of join planning.

...Robert
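As an illustration of one way to shorten that list, the node-tag check could be centralized in a single predicate, roughly as below. The helper name and its placement are hypothetical, not existing backend code; the tag list is simply the one given in the message above.

	#include "postgres.h"
	#include "nodes/nodes.h"

	/*
	 * Hypothetical helper: does a path of this node type already
	 * materialize (or cheaply reproduce) its output, so that an explicit
	 * Material node on top of it would buy nothing for repeated inner
	 * rescans?  Centralizing the list would let the callers ask one
	 * question instead of repeating the switch.
	 */
	static bool
	pathtype_materializes_output(NodeTag pathtype)
	{
		switch (pathtype)
		{
			case T_IndexScan:
			case T_BitmapHeapScan:
			case T_TidScan:
			case T_Material:
			case T_FunctionScan:
			case T_CteScan:
			case T_WorkTableScan:
				return true;	/* rescans are already cheap or cached */
			default:
				return false;	/* a Material node might pay off */
		}
	}

The call site in match_unsorted_outer() would then just test !pathtype_materializes_output(inner_cheapest_total->pathtype) before considering create_material_path(), and the same predicate would remain available if the decision ever moved into cost_nestloop() as suggested above.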
>>> Tom Lane <tgl@sss.pgh.pa.us> wrote:
> After a lot of distractions, I've finished applying the planner fixes
> that seem necessary in view of your report about poorer planning in 8.4
> than 8.3.  When you have a chance, it would be useful to do a thorough
> test of CVS HEAD on your data and query mix --- are there any other
> places where we have regressed relative to 8.3?

You probably already know this, but on the query referenced earlier in the thread, a great plan is now generated! Even when not cached, this executed in just under seven seconds. (I chose these values for testing this query because it had been logged as exceeding 20 seconds under 8.3.) Cached, EXPLAIN ANALYZE runs between 225 and 250 ms. Running it without EXPLAIN, the psql \timing is between 265 and 277 ms. EXPLAIN gives a \timing averaging 80 ms.

I will see what kind of testing I can put together to try to shake out any remaining issues.

-Kevin
Attachment
Did this ever get applied/resolved?

---------------------------------------------------------------------------

Robert Haas wrote:
> I've been doing some benchmarking and profiling on the PostgreSQL
> query analyzer, and it seems that (at least for the sorts of queries
> that I typically run) the dominant cost is add_path().  I've been able
> to find two optimizations that seem to help significantly:
>
> 1. add_path() often calls compare_fuzzy_path_costs() twice on the same
> pair of paths, and when the paths compare equal on one criterion, some
> comparisons are duplicated.  I've refactored this function to return
> the results of both calculations without repeating any floating-point
> arithmetic.
>
> 2. match_unsorted_outer() adds as many as 5 nested loop joins at a
> time with the same set of pathkeys.  In my tests, it tended to be ~3 -
> cheapest inner, cheapest inner materialized, and cheapest inner index.
> Since these all have the same pathkeys, clearly only the one with the
> cheapest total cost is in the running for cheapest total cost for that
> set of pathkeys, and likewise for startup cost (and the two may be the
> same).  Yet we compare all of them against the whole pathlist, one
> after the other, including (for the most part) the rather expensive
> pathkey comparison.  I've added a function add_similar_paths() and
> refactored match_unsorted_outer() to use it.
>
> On a couple of complex (and proprietary) queries with 12+ joins each,
> I measure a planning time improvement of 8-12% with the attached patch
> applied.  It would be interesting to try to replicate this on a
> publicly available data set, but I don't know of a good one to use.
> Suggestions welcome - results of performance testing on your own
> favorite big queries even more welcome.  Simple test harness also
> attached.  I took the approach of dropping caches, starting the
> server, and then running this 5 times each on several queries,
> dropping top and bottom results.
>
> ...Robert

[ Attachment, skipping... ]

[ Attachment, skipping... ]

>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com
  PG East:  http://www.enterprisedb.com/community/nav-pg-east-2010.do
  + If your life is a hard drive, Christ can be your backup. +
On Fri, Feb 26, 2010 at 10:07 PM, Bruce Momjian <bruce@momjian.us> wrote:
> Did this ever get applied/resolved?

No, I never went back and tried to fix the brokenness Tom found.

...Robert