Re: benchmarking the query planner - Mailing list pgsql-hackers

From Tom Lane
Subject Re: benchmarking the query planner
Date
Msg-id 8020.1229105929@sss.pgh.pa.us
In response to Re: benchmarking the query planner  (Simon Riggs <simon@2ndQuadrant.com>)
Responses Re: benchmarking the query planner  (Simon Riggs <simon@2ndQuadrant.com>)
Re: benchmarking the query planner  ("Greg Stark" <stark@enterprisedb.com>)
List pgsql-hackers
Simon Riggs <simon@2ndQuadrant.com> writes:
> As I said, we would only increase sample for ndistinct, not for others.

How will you do that?  Keep in mind that one of the things we have to do
to compute ndistinct is to sort the sample.  ISTM that the majority of
the cost of a larger sample is going to get expended anyway ---
certainly we could form the histogram using the more accurate data at
precisely zero extra cost, and I think we have also pretty much done all
the work for MCV collection by the time we finish counting the number of
distinct values.
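
To make the point above concrete, here is a minimal Python sketch of how a single sort of the sample can feed all three statistics. It is an illustration under stated assumptions, not the code PostgreSQL's ANALYZE actually runs: the function name stats_from_sorted_sample and the parameters num_mcv and num_buckets are hypothetical, the histogram does not exclude MCVs, and the distinct count is for the sample only, with no extrapolation to the whole table.

```python
from itertools import groupby


def stats_from_sorted_sample(sample, num_mcv=3, num_buckets=4):
    """Illustrative only: derive three statistics from one sort of the sample."""
    ordered = sorted(sample)  # the dominant cost, paid once
    if not ordered:
        return 0, [], []

    # One pass over the sorted data collapses equal values into (value, count) runs.
    runs = [(value, sum(1 for _ in group)) for value, group in groupby(ordered)]

    # Number of distinct values seen in the sample.
    ndistinct = len(runs)

    # MCV candidates: the longest runs, whose counts were gathered in the same pass.
    mcvs = sorted(runs, key=lambda run: run[1], reverse=True)[:num_mcv]

    # Equi-depth histogram bounds: evenly spaced positions in the sorted sample.
    last = len(ordered) - 1
    bounds = [ordered[(i * last) // num_buckets] for i in range(num_buckets + 1)]

    return ndistinct, mcvs, bounds
```

Once the sort is done, the histogram bounds are plain index lookups and the MCV counts fall out of the same pass that counts distinct values, which is why restricting a larger sample to ndistinct alone would save very little.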

I seem to recall Greg suggesting that there were ways to estimate
ndistinct without sorting, but short of a fundamental algorithm change
there's not going to be a win here.

> Right now we may as well use a random number generator.

Could we skip the hyperbole please?
        regards, tom lane

