Re: benchmarking the query planner - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: benchmarking the query planner
Date	December 12, 2008 01:13:31
Msg-id	15327.1229047970@sss.pgh.pa.us
In response to	Re: benchmarking the query planner ("Robert Haas" <robertmhaas@gmail.com>)
Responses	Re: benchmarking the query planner
List	pgsql-hackers

"Robert Haas" <robertmhaas@gmail.com> writes:
>> It might be best to stop when the frequency drops below some threshold,
>> rather than taking a fixed number of entries.

> OK, I'll bite.  How do we decide where to put the cutoff?  If we make
> it too high, it will have a negative effect on join selectivity
> estimates; if it's too low, it won't really address the problem we're
> trying to fix.  I randomly propose p = 0.001, which should limit
> eqjoinsel() to about a million equality tests in the worst case.  In
> the synthetic example we were just benchmarking, that causes the
> entire MCV array to be tossed out the window (which feels about
> right).

Yeah.  One idle thought I had was that maybe the cutoff needs to
consider both probabilities: if the high-frequency MCVs on one side
happen to match not-so-high-frequency MCVs on the other, you
would like to know about that.  As long as we keep the lists in
frequency order, it seems easy to implement this: for each MCV
examined by the outer loop, you run the inner loop until the product of
the outer and current inner frequency drops below whatever your
threshold is.  This doesn't immediately suggest what the threshold
ought to be, but it seems like it ought to be possible to determine
that given a desired maximum error in the overall estimate.  I'm also
not very clear on what the "total frequency" computations (matchfreq2
and unmatchfreq2 in the current code) ought to look like if we are using
a variable subset of the inner list.
        regards, tom lane
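
The pruned inner loop described above could be sketched roughly as follows.
This is a minimal Python illustration under stated assumptions, not the
eqjoinsel() code: the function name pruned_mcv_match, the (value, frequency)
tuple layout, and the comparison counter are all hypothetical, and the
threshold is simply passed in, since the message leaves open how to derive it.
Both lists are assumed to be sorted by descending frequency and to contain
distinct values.

```python
def pruned_mcv_match(outer, inner, threshold):
    """Sum frequency products of matching MCVs, skipping negligible pairs.

    outer, inner: lists of (value, frequency) tuples, each sorted by
    descending frequency, with distinct values within a list (assumed).
    threshold: minimum outer_freq * inner_freq worth examining (assumed
    to be supplied by the caller).

    Returns (match_product_sum, comparisons_performed).
    """
    match_product_sum = 0.0
    comparisons = 0

    for outer_value, outer_freq in outer:
        # Outer frequencies only decrease, so once even the most common
        # inner entry falls below the threshold, no later pair can pass.
        if inner and outer_freq * inner[0][1] < threshold:
            break

        for inner_value, inner_freq in inner:
            # Inner frequencies only decrease, so every remaining product
            # for this outer entry would be smaller still.
            if outer_freq * inner_freq < threshold:
                break

            comparisons += 1
            if outer_value == inner_value:
                match_product_sum += outer_freq * inner_freq
                break  # values are distinct, so at most one match

    return match_product_sum, comparisons


if __name__ == "__main__":
    outer = [("a", 0.5), ("b", 0.3), ("c", 0.01)]
    inner = [("c", 0.4), ("a", 0.2), ("b", 0.001)]
    print(pruned_mcv_match(outer, inner, 0.001))
    print(pruned_mcv_match(outer, inner, 0.0))
```

With a threshold of 0 the loop degenerates to the full comparison, so the
difference between the two results shows how much match frequency the cutoff
discards. The sketch does not attempt to answer the open question about the
"total frequency" terms (matchfreq2 and unmatchfreq2) when only a variable
subset of the inner list is examined.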
