Thread: Correlation in cost_index()
You all know this FAQ: "Why does Postgres not use my index?"  Half of
the time this problem can easily be solved by casting a literal to the
type of the respective column; this is not my topic here.  In many
other cases it turns out that the planner over-estimates the cost of an
index scan.  Sometimes this can be worked around by lowering
random_page_cost.  "Of course, that's a hack that is quite unrelated to
the real problem."  I strongly agree ;-)

AFAICS (part of) the real problem is in costsize.c:cost_index() where
IO_cost is calculated from min_IO_cost, pages_fetched,
random_page_cost, and indexCorrelation.  The current implementation
uses indexCorrelation^2 to interpolate between min_IO_cost and
max_IO_cost, which IMHO gives results that are too close to
max_IO_cost.  This conjecture is supported by the fact that actual run
times are often much lower than estimated when seqscans are disabled.

So we have to find a cost function such that

 . min_IO_cost <= cost <= max_IO_cost  for -1 <= indexCorrelation <= 1
 . cost --> min_IO_cost  for indexCorrelation --> +/- 1
 . cost --> max_IO_cost  for indexCorrelation --> 0
 . cost tends more towards min_IO_cost than in the current implementation

After playing around a bit I propose three functions satisfying the
above conditions.  All proposals use absC = abs(indexCorrelation).

Proposal 1:  Use absC for interpolation.

    IO_cost = absC * min_IO_cost + (1 - absC) * max_IO_cost;

Proposal 2:  First calculate estimates for the number of pages and the
cost per page, then multiply the results.

    estPages = absC * minPages + (1 - absC) * maxPages;
    estPCost = absC * 1 + (1 - absC) * random_page_cost;
                 /* ^ sequential_page_cost */
    IO_cost  = estPages * estPCost;

Proposal 3:  Interpolate "geometrically", using absC.

    IO_cost = exp(absC * ln(min_IO_cost) +
                  (1 - absC) * ln(max_IO_cost));

Here are some numbers for

    seq_page_cost    =    1  (constant)
    random_page_cost =    4  (GUC)
    minPages         =   61
    maxPages         = 1440

    corr    current       p1       p2       p3
    0       5760.00  5760.00  5760.00  5760.00
    0.1     5703.01  5190.10  4817.77  3655.22
    0.2     5532.04  4620.20  3958.28  2319.55
    0.3     5247.09  4050.30  3181.53  1471.96
    0.4     4848.16  3480.40  2487.52   934.08
    0.5     4335.25  2910.50  1876.25   592.76
    0.6     3708.36  2340.60  1347.72   376.16
    0.7     2967.49  1770.70   901.93   238.70
    0.8     2112.64  1200.80   538.88   151.48
    0.9     1143.81   630.90   258.57    96.13
    0.95     616.65   345.95   149.44    76.57
    0.99     174.41   117.99    77.03    63.84
    0.995    117.85    89.50    68.91    62.40
    0.999     72.39    66.70    62.57    61.28
    1         61.00    61.00    61.00    61.00

Another example for

    seq_page_cost    =    1  (constant)
    random_page_cost =   10  (GUC)
    minPages         =   20
    maxPages         =  938.58

    corr    current       p1       p2       p3
    0       9385.79  9385.79  9385.79  9385.79
    0.1     9292.14  8449.21  7705.17  5073.72
    0.2     9011.16  7512.64  6189.88  2742.73
    0.3     8542.87  6576.06  4839.94  1482.65
    0.4     7887.27  5639.48  3655.34   801.48
    0.5     7044.35  4702.90  2636.09   433.26
    0.6     6014.11  3766.32  1782.19   234.21
    0.7     4796.56  2829.74  1093.62   126.61
    0.8     3391.69  1893.16   570.40    68.44
    0.9     1799.50   956.58   212.53    37.00
    0.95     933.16   488.29    95.60    27.20
    0.99     206.38   113.66    31.81    21.27
    0.995    113.42    66.83    25.70    20.62
    0.999     38.72    29.37    21.11    20.12
    1         20.00    20.00    20.00    20.00

(If you want to play around with your own numbers, I can send my OOo
spreadsheet privately or to the list.)

The second example shows that especially with proposal 3 we could
afford to set random_page_cost to a *higher* value, which in contrast
to previous recommendations seems to be appropriate; IIRC, benchmark
results posted here showed values of up to 60.
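The figures in the tables above can be reproduced with a short
standalone program.  A minimal sketch follows (illustrative only; the
function names and the driver are made up here, this is not the
proposed patch):

/*
 * Reproduce the cost table above.  min_IO_cost = minPages (because
 * seq_page_cost = 1), max_IO_cost = maxPages * random_page_cost.
 */
#include <math.h>
#include <stdio.h>

static double current_method(double c, double min_cost, double max_cost)
{
    double csquared = c * c;
    return max_cost + csquared * (min_cost - max_cost);
}

static double proposal1(double c, double min_cost, double max_cost)
{
    double absC = fabs(c);
    return absC * min_cost + (1 - absC) * max_cost;
}

static double proposal2(double c, double minPages, double maxPages,
                        double random_page_cost)
{
    double absC = fabs(c);
    double estPages = absC * minPages + (1 - absC) * maxPages;
    double estPCost = absC * 1 + (1 - absC) * random_page_cost;
    return estPages * estPCost;
}

static double proposal3(double c, double min_cost, double max_cost)
{
    double absC = fabs(c);
    return exp(absC * log(min_cost) + (1 - absC) * log(max_cost));
}

int main(void)
{
    double minPages = 61, maxPages = 1440, random_page_cost = 4;
    double min_cost = minPages;
    double max_cost = maxPages * random_page_cost;
    double corr[] = {0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9,
                     0.95, 0.99, 0.995, 0.999, 1};
    int i;

    for (i = 0; i < (int) (sizeof(corr) / sizeof(corr[0])); i++)
        printf("%-6g %9.2f %9.2f %9.2f %9.2f\n", corr[i],
               current_method(corr[i], min_cost, max_cost),
               proposal1(corr[i], min_cost, max_cost),
               proposal2(corr[i], minPages, maxPages, random_page_cost),
               proposal3(corr[i], min_cost, max_cost));
    return 0;
}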
As nobody knows how each of these proposals performs in real life under
different conditions, I suggest to leave the current implementation in,
add all three algorithms, and supply a GUC variable to select a cost
function.

Comments?  Ideas?  Objections?

Servus
 Manfred
On Wed, 2 Oct 2002, Manfred Koizar wrote: > As nobody knows how each of these proposals performs in real life > under different conditions, I suggest to leave the current > implementation in, add all three algorithms, and supply a GUC variable > to select a cost function. I'd certainly be willing to do some testing on my own data with them. Gotta patch? I've found that when the planner misses, sometimes it misses by HUGE amounts on large tables, and I have been running random page cost at 1 lately, as well as running cpu_index_cost at 1/10th the default setting to get good results.
Manfred Koizar <mkoi-pg@aon.at> writes: > AFAICS (part of) the real problem is in costsize.c:cost_index() where > IO_cost is calculated from min_IO_cost, pages_fetched, > random_page_cost, and indexCorrelation. The current implementation > uses indexCorrelation^2 to interpolate between min_IO_cost and > max_IO_cost, which IMHO gives results that are too close to > max_IO_cost. The indexCorrelation^2 algorithm was only a quick hack with no theory behind it :-(. I've wanted to find some better method to put in there, but have not had any time to research the problem. > As nobody knows how each of these proposals performs in real life > under different conditions, I suggest to leave the current > implementation in, add all three algorithms, and supply a GUC variable > to select a cost function. I don't think it's really a good idea to expect users to pick among multiple cost functions that *all* have no guiding theory behind them. I'd prefer to see us find a better cost function and use it. Has anyone trawled the database literature on the subject? regards, tom lane
On Wed, 02 Oct 2002 18:48:49 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>I don't think it's really a good idea to expect users to pick among
>multiple cost functions

The idea is that PG is shipped with a default representing the best of
our knowledge and users are not encouraged to change it.  When a user
sends a "PG does not use my index" or "Why doesn't it scan
sequentially?" message to one of the support lists, we advise her/him
to set index_cost_algorithm to 3 (or whatever we feel appropriate) and
watch the feedback we get.  We don't risk anything, if the default is
the current behaviour.

Servus
 Manfred
On Wed, 2 Oct 2002 14:07:19 -0600 (MDT), "scott.marlowe"
<scott.marlowe@ihs.com> wrote:
>I'd certainly be willing to do some testing on my own data with them.

Great!

>Gotta patch?

Not yet.

> I've found that when the planner misses, sometimes it misses
>by HUGE amounts on large tables, and I have been running random page cost
>at 1 lately, as well as running cpu_index_cost at 1/10th the default
>setting to get good results.

May I ask for more information?  What are your settings for
effective_cache_size and shared_buffers?  And which version are you
running?

Servus
 Manfred
On Wed, 2 Oct 2002 14:07:19 -0600 (MDT), "scott.marlowe"
<scott.marlowe@ihs.com> wrote:
>I'd certainly be willing to do some testing on my own data with them.
>Gotta patch?

Yes, see below.  Disclaimer: Apart from "make; make check" this is
completely untested.  Use at your own risk.  Have fun!

Servus
 Manfred

diff -ruN ../base/src/backend/optimizer/path/costsize.c src/backend/optimizer/path/costsize.c
--- ../base/src/backend/optimizer/path/costsize.c	2002-07-04 18:04:57.000000000 +0200
+++ src/backend/optimizer/path/costsize.c	2002-10-03 09:53:06.000000000 +0200
@@ -72,6 +72,7 @@
 double		cpu_tuple_cost = DEFAULT_CPU_TUPLE_COST;
 double		cpu_index_tuple_cost = DEFAULT_CPU_INDEX_TUPLE_COST;
 double		cpu_operator_cost = DEFAULT_CPU_OPERATOR_COST;
+int		index_cost_algorithm = DEFAULT_INDEX_COST_ALGORITHM;
 
 Cost		disable_cost = 100000000.0;
 
@@ -213,8 +214,8 @@
 	Cost		indexStartupCost;
 	Cost		indexTotalCost;
 	Selectivity indexSelectivity;
-	double		indexCorrelation,
-				csquared;
+	double		indexCorrelation;
+	Cost		IO_cost;
 	Cost		min_IO_cost,
 				max_IO_cost;
 	Cost		cpu_per_tuple;
@@ -329,13 +330,62 @@
 	min_IO_cost = ceil(indexSelectivity * T);
 	max_IO_cost = pages_fetched * random_page_cost;
 
-	/*
-	 * Now interpolate based on estimated index order correlation to get
-	 * total disk I/O cost for main table accesses.
-	 */
-	csquared = indexCorrelation * indexCorrelation;
+	switch (index_cost_algorithm) {
+		case 1: {
+			/*
+			** Use abs(correlation) for linear interpolation
+			*/
+			double absC = fabs(indexCorrelation);
+
+			IO_cost = absC * min_IO_cost + (1 - absC) * max_IO_cost;
+		}
+
+		case 2: {
+			/*
+			** First estimate number of pages and cost per page,
+			** then multiply the results.  min_IO_cost is used for
+			** min_pages, because seq_page_cost = 1.
+			*/
+			double absC = fabs(indexCorrelation);
+
+			double estPages = absC * min_IO_cost + (1 - absC) * pages_fetched;
+			double estPCost = absC * 1 + (1 - absC) * random_page_cost;
+			IO_cost = estPages * estPCost;
+		}
+
+		case 3: {
+			/*
+			** Interpolate based on independence squared, thus forcing the
+			** result to be closer to min_IO_cost
+			*/
+			double independence = 1 - fabs(indexCorrelation);
+			double isquared = independence * independence;
+
+			IO_cost = (1 - isquared) * min_IO_cost + isquared * max_IO_cost;
+		}
+
+		case 4: {
+			/*
+			** Interpolate geometrically
+			*/
+			double absC = fabs(indexCorrelation);
+
+			IO_cost = exp(absC * log(min_IO_cost) +
+						  (1 - absC) * log(max_IO_cost));
+		}
+
+		default: {
+			/*
+			 * Interpolate based on estimated index order correlation
+			 * to get total disk I/O cost for main table accesses.
+			 */
+			double csquared = indexCorrelation * indexCorrelation;
+
+			IO_cost = max_IO_cost + csquared * (min_IO_cost - max_IO_cost);
+		}
+	}
 
-	run_cost += max_IO_cost + csquared * (min_IO_cost - max_IO_cost);
+	run_cost += IO_cost;
 
 	/*
 	 * Estimate CPU costs per tuple.
diff -ruN ../base/src/backend/utils/misc/guc.c src/backend/utils/misc/guc.c
--- ../base/src/backend/utils/misc/guc.c	2002-07-20 17:27:19.000000000 +0200
+++ src/backend/utils/misc/guc.c	2002-10-03 10:03:37.000000000 +0200
@@ -644,6 +644,11 @@
 	},
 
 	{
+		{ "index_cost_algorithm", PGC_USERSET }, &index_cost_algorithm,
+		DEFAULT_INDEX_COST_ALGORITHM, 0, INT_MAX, NULL, NULL
+	},
+
+	{
 		{ NULL, 0 }, NULL, 0, 0, 0, NULL, NULL
 	}
 };

diff -ruN ../base/src/include/optimizer/cost.h src/include/optimizer/cost.h
--- ../base/src/include/optimizer/cost.h	2002-06-21 02:12:29.000000000 +0200
+++ src/include/optimizer/cost.h	2002-10-03 09:56:28.000000000 +0200
@@ -24,6 +24,7 @@
 #define DEFAULT_CPU_TUPLE_COST 0.01
 #define DEFAULT_CPU_INDEX_TUPLE_COST 0.001
 #define DEFAULT_CPU_OPERATOR_COST 0.0025
+#define DEFAULT_INDEX_COST_ALGORITHM 3
 
 /* defaults for function attributes used for expensive function calculations */
 #define BYTE_PCT 100
@@ -43,6 +44,7 @@
 extern double cpu_tuple_cost;
 extern double cpu_index_tuple_cost;
 extern double cpu_operator_cost;
+extern int	index_cost_algorithm;
 extern Cost disable_cost;
 extern bool enable_seqscan;
 extern bool enable_indexscan;
On Thu, 03 Oct 2002 12:40:20 +0200, I wrote:
>>Gotta patch?
>
>Yes, see below.

Oh, did I mention that inserting some break statements after the
switch cases helps a lot?  :-(

Cavus venter non laborat libenter ...

Servus
 Manfred
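With the missing break statements added, the interpolation switch from
the patch boils down to something like the following sketch.  It is a
standalone paraphrase, not a drop-in replacement for cost_index(); only
the inputs needed for the interpolation are passed in:

#include <math.h>

typedef double Cost;

static Cost
interpolate_io_cost(int index_cost_algorithm, double indexCorrelation,
                    Cost min_IO_cost, Cost max_IO_cost,
                    double pages_fetched, double random_page_cost)
{
    double absC = fabs(indexCorrelation);
    Cost   IO_cost;

    switch (index_cost_algorithm)
    {
        case 1:
            /* linear interpolation on abs(correlation) */
            IO_cost = absC * min_IO_cost + (1 - absC) * max_IO_cost;
            break;
        case 2:
            /* estimated pages times estimated cost per page */
            {
                double estPages = absC * min_IO_cost + (1 - absC) * pages_fetched;
                double estPCost = absC * 1 + (1 - absC) * random_page_cost;

                IO_cost = estPages * estPCost;
            }
            break;
        case 3:
            /* linear interpolation on (1 - abs(correlation))^2 */
            {
                double independence = 1 - absC;
                double isquared = independence * independence;

                IO_cost = (1 - isquared) * min_IO_cost + isquared * max_IO_cost;
            }
            break;
        case 4:
            /* geometric interpolation */
            IO_cost = exp(absC * log(min_IO_cost) + (1 - absC) * log(max_IO_cost));
            break;
        default:
            /* current behaviour: interpolation on correlation^2 */
            {
                double csquared = indexCorrelation * indexCorrelation;

                IO_cost = max_IO_cost + csquared * (min_IO_cost - max_IO_cost);
            }
            break;
    }
    return IO_cost;
}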
On Wed, 2 Oct 2002 14:07:19 -0600 (MDT), "scott.marlowe"
<scott.marlowe@ihs.com> wrote:
>I've found that when the planner misses, sometimes it misses
>by HUGE amounts on large tables,

Scott,

yet another question: are multicolumn indices involved in your
estimator problems?

Servus
 Manfred
On Thu, 3 Oct 2002, Manfred Koizar wrote:
> On Wed, 2 Oct 2002 14:07:19 -0600 (MDT), "scott.marlowe"
> <scott.marlowe@ihs.com> wrote:
> >I'd certainly be willing to do some testing on my own data with them.
> 
> Great!
> 
> >Gotta patch?
> 
> Not yet.
> 
> > I've found that when the planner misses, sometimes it misses
> >by HUGE amounts on large tables, and I have been running random page cost
> >at 1 lately, as well as running cpu_index_cost at 1/10th the default
> >setting to get good results.
> 
> May I ask for more information?  What are your settings for
> effective_cache_size and shared_buffers?  And which version are you
> running?

I'm running 7.2.2 in production and 7.3b2 in testing.  Effective cache
size is the default (i.e. commented out); shared buffers are at 4000.
I've found that increasing shared buffers past 4000 (32 megs) to 16384
(128 Megs) has no great effect on my machine's performance, but I've
never really played with effective cache size.

I've got a couple of queries that join a 1M+ row table to itself and to
a 50k row table, and the result sets are usually <100 rows at a time.
Plus some other smaller datasets that return larger amounts (i.e.
sometimes all rows) of data generally.
On Thu, 3 Oct 2002, Manfred Koizar wrote:
> On Wed, 2 Oct 2002 14:07:19 -0600 (MDT), "scott.marlowe"
> <scott.marlowe@ihs.com> wrote:
> >I've found that when the planner misses, sometimes it misses
> >by HUGE amounts on large tables,
> 
> Scott,
> 
> yet another question: are multicolumn indices involved in your
> estimator problems?

No.  Although I use them a fair bit, none of the problems I've
encountered so far have involved them.  But I'd be willing to setup
some test indexes and get some numbers on them.
On Thu, 3 Oct 2002 10:59:54 -0600 (MDT), "scott.marlowe"
<scott.marlowe@ihs.com> wrote:
>>are multicolumn indices involved in your estimator problems?
>
>No.  Although I use them a fair bit, none of the problems I've encountered
>so far have involved them.  But I'd be willing to setup some test indexes
>and get some numbers on them.

Never mind!  I just stumbled over those lines in selfuncs.c where
indexCorrelation is calculated by dividing the correlation of the
first index column by the number of index columns.  I have a use case
here where this clearly is not the right choice, and I was hoping to
find some examples that help me investigate whether my case is
somewhat uncommon ...

Servus
 Manfred
Manfred Koizar <mkoi-pg@aon.at> writes: > Never mind! I just stumbled over those lines in selfuncs.c where > indexCorrelation is calculated by dividing the correlation of the > first index column by the number of index columns. Yeah, I concluded later that that was bogus. I've been thinking of just using the correlation of the first index column and ignoring the rest; that would not be great, but it's probably better than what's there. Have you got a better formula? regards, tom lane
On Thu, 03 Oct 2002 14:50:00 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> indexCorrelation is calculated by dividing the correlation of the
>> first index column by the number of index columns.
>
>Yeah, I concluded later that that was bogus.  I've been thinking of
>just using the correlation of the first index column and ignoring
>the rest; that would not be great, but it's probably better than what's
>there.  Have you got a better formula?

Unfortunately not.  I think such a formula does not exist for the
information we have.  What we'd need is a notion of correlation of the
nth (n > 1) index column for constant values of the first n-1 index
columns; or a combined correlation for the first n index columns
(1 < n <= number of index columns).

I try to understand the problem with the help of use cases.  [ Jump to
the end, if this looks too long-winded. ]

1)  Have a look at invoice items with an index on (fyear, invno,
itemno).  Invoice numbers start at 1 for each financial year, item
numbers start at 1 for each invoice.  In a typical scenario the
correlations for fyear, (fyear, invno), and (fyear, invno, itemno) are
close to 1; invno correlation is expected to be low; and itemno looks
totally chaotic to the analyzer.

When we

    SELECT * FROM item WHERE fyear = 2001 AND invno < 1000

dividing the correlation of the first column by the number of columns
gives 1/3, which has nothing to do with what we want.  (And then the
current implementation of cost_index() squares this and gets 1/9,
which is even farther away from the truth.)  Just using the
correlation of the first index column seems right here.

2)  OTOH consider bookkeeping with entries identified by (fyear,
account, entryno).  Again fyear has a correlation near 1.  For account
we can expect something near 0, and entryno has a distribution
comparable to invno in the first use case, i.e. starting at 1 for each
year.

    SELECT * FROM entry WHERE fyear = 2001 AND account = whatever

Taking the correlation of fyear would imply that the tuples we are
looking for are close to each other, which again can turn out to be
wrong.

So what do we know now?  Even less than before :-(

I have just one idea that might help a bit:  If the correlation of the
first index column is near +/- 1, cost_index() should not use
baserel->pages, but baserel->pages * selectivity of the first column.

Servus
 Manfred
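A minimal sketch of that last idea, with hypothetical names and an
arbitrary threshold (this is not PostgreSQL code, just an illustration
of shrinking the effective heap size when the first column is highly
correlated):

#include <math.h>

/*
 * Hypothetical sketch: if the first index column is almost perfectly
 * correlated, the matching tuples live in a contiguous fraction of the
 * heap, so the page estimate should start from pages * selectivity of
 * the first column rather than from all of baserel->pages.  The 0.95
 * threshold and all names are illustrative assumptions.
 */
static double
effective_heap_pages(double baserel_pages,
                     double first_column_correlation,
                     double first_column_selectivity)
{
    if (fabs(first_column_correlation) > 0.95)
        return ceil(baserel_pages * first_column_selectivity);
    return baserel_pages;
}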
On Thu, 3 Oct 2002 10:45:08 -0600 (MDT), "scott.marlowe"
<scott.marlowe@ihs.com> wrote:
> effective cache size is the default (i.e. commented out)

The default is 1000, meaning ca. 8 MB, which seems to be way too low.
If your server is (almost) exclusively used by Postgres, try setting
it to represent most of your OS cache (as reported by free on Linux).
Otherwise you have to estimate the fraction of the OS cache that gets
used by Postgres.  I'm still trying to get a feeling for how these
settings play together, so I'd be grateful if you report back the
effects this has on your cost estimates.

Caveat: effective_cache_size is supposed to be the number of cache
pages available to one query (or is it one scan?).  So if you have
several concurrent queries (or complex queries with several scans),
you should choose a lower value.  OTOH if most of your queries operate
on the same data, one query could benefit from pages cached by other
queries ...  You have to experiment a little.

Servus
 Manfred
> > AFAICS (part of) the real problem is in costsize.c:cost_index() where
> > IO_cost is calculated from min_IO_cost, pages_fetched,
> > random_page_cost, and indexCorrelation.  The current implementation
> > uses indexCorrelation^2 to interpolate between min_IO_cost and
> > max_IO_cost, which IMHO gives results that are too close to
> > max_IO_cost.
> 
> The indexCorrelation^2 algorithm was only a quick hack with no theory
> behind it :-(.  I've wanted to find some better method to put in there,
> but have not had any time to research the problem.

Could we "quick hack" it to a geometric mean instead, since a mean
seemed to yield better results than indexCorrelation^2?

> > As nobody knows how each of these proposals performs in real life
> > under different conditions, I suggest to leave the current
> > implementation in, add all three algorithms, and supply a GUC variable
> > to select a cost function.
> 
> I don't think it's really a good idea to expect users to pick among
> multiple cost functions that *all* have no guiding theory behind them.
> I'd prefer to see us find a better cost function and use it.  Has anyone
> trawled the database literature on the subject?

Hrm, after an hour of searching and reading, I think one of the better
papers on the subject can be found here:

http://www.cs.ust.hk/faculty/dimitris/PAPERS/TKDE-NNmodels.pdf

Page 13, figure 3-12 is the ticket you were looking for, Tom.  It's an
interesting read with a pretty good analysis and conclusion.  The
author notes that his formula begins to fall apart when the number of
dimensions reaches 10 and suggests the use of a high-dimension random
cost estimate algo, but the use of those comes at great expense to the
CPU (which is in line with a few other papers that I read).  The idea
of precomputing values piqued my interest and I thought it was very
clever, albeit space intensive.  *shrug*

-sc

-- 
Sean Chittenden
Sean Chittenden <sean@chittenden.org> writes: > Hrm, after an hour of searching and reading, I think one of the better > papers on the subject can be found here: > http://www.cs.ust.hk/faculty/dimitris/PAPERS/TKDE-NNmodels.pdf Interesting paper, but I don't see the connection to index order correlation? regards, tom lane
On Thu, 7 Aug 2003 13:44:19 -0700, Sean Chittenden
<sean@chittenden.org> wrote:
>> The indexCorrelation^2 algorithm was only a quick hack with no theory
>> behind it :-(.  I've wanted to find some better method to put in there,
>> but have not had any time to research the problem.
>
>Could we "quick hack" it to a geometric mean instead since a mean
>seemed to yield better results than indexCorrelation^2?

Linear interpolation on (1-indexCorrelation)^2 (algorithm 3 in
http://members.aon.at/pivot/pg/16-correlation-732.diff) is almost as
good as geometric interpolation (algorithm 4 in the patch, proposal 3
in this thread), and its computation is much cheaper because it does
not call exp() and log().  Download
http://members.aon.at/pivot/pg/cost_index.sxc and play around with
your own numbers to get a feeling.

(1-indexCorrelation)^2 suffers from the same lack of theory behind it
as indexCorrelation^2.  But the results look much more plausible.
Well, at least to me ;-)

Servus
 Manfred
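For comparison, the two interpolations discussed above as small
standalone helpers (names are made up here; both return min_IO_cost at
|correlation| = 1 and max_IO_cost at correlation = 0):

#include <math.h>

/* Linear interpolation on (1 - |correlation|)^2: cheap, no exp()/log(). */
static double
independence_squared_cost(double correlation,
                          double min_IO_cost, double max_IO_cost)
{
    double independence = 1.0 - fabs(correlation);
    double isquared = independence * independence;

    return (1.0 - isquared) * min_IO_cost + isquared * max_IO_cost;
}

/* Geometric interpolation: same endpoints, but needs exp() and log(). */
static double
geometric_cost(double correlation,
               double min_IO_cost, double max_IO_cost)
{
    double absC = fabs(correlation);

    return exp(absC * log(min_IO_cost) + (1.0 - absC) * log(max_IO_cost));
}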
> > Hrm, after an hour of searching and reading, I think one of the
> > better papers on the subject can be found here:
> > http://www.cs.ust.hk/faculty/dimitris/PAPERS/TKDE-NNmodels.pdf
> 
> Interesting paper, but I don't see the connection to index order
> correlation?

Nothing that I found was nearly that specific; as close as I could
find was the paper above on calculating the cost of fetching data from
a disk, which I thought was the bigger problem at hand, but I
digress...  In one paper about large dimension index searches, they
did suggest that cost was cumulative for the number of disk reads or
nodes in the tree that weren't held in cache, which was the biggest
hint that I had found on this specific topic.  With that as a guiding
light (or something faintly resembling it), it'd seem as though an avg
depth of nodes in index * tuples_fetched * (random_io_cost *
indexCorrelation) would be closer than where we are now... but I now
also think I/we're barking up the right tree with this thread.

It's very possible that cost_index() is wrong, but it seems as though
after some testing as if PostgreSQL _overly_ _favors_ the use of
indexes:

# SET enable_seqscan = true; SET enable_indexscan = true;
SET
SET
# EXPLAIN ANALYZE SELECT * FROM report_user_cat_count AS rucc WHERE utc_date > '2002-10-01'::TIMESTAMP WITH TIME ZONE;
INFO:  cost_seqscan: run_cost: 21472.687500
       startup_cost: 0.000000

INFO:  cost_index: run_cost: 21154.308116
       startup_cost: 0.000000
       indexCorrelation: 0.999729
                                                            QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using report_user_cat_count_utc_date_id_idx on report_user_cat_count rucc  (cost=0.00..21154.31 rows=705954 width=64) (actual time=91.36..6625.79 rows=704840 loops=1)
   Index Cond: (utc_date > '2002-10-01 00:00:00-07'::timestamp with time zone)
 Total runtime: 11292.68 msec
(3 rows)

# SET enable_seqscan = true; SET enable_indexscan = false;
SET
SET
# EXPLAIN ANALYZE SELECT * FROM report_user_cat_count AS rucc WHERE utc_date > '2002-10-01'::TIMESTAMP WITH TIME ZONE;
INFO:  cost_seqscan: run_cost: 21472.687500
       startup_cost: 0.000000

INFO:  cost_index: run_cost: 21154.308116
       startup_cost: 100000000.000000
       indexCorrelation: 0.999729
                                                            QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on report_user_cat_count rucc  (cost=0.00..21472.69 rows=705954 width=64) (actual time=1091.45..7441.19 rows=704840 loops=1)
   Filter: (utc_date > '2002-10-01 00:00:00-07'::timestamp with time zone)
 Total runtime: 10506.44 msec
(3 rows)

Which I find surprising and humorous given the popular belief is, mine
included, contrary to those results.

I can say with pretty high confidence that the patch to use a
geometric mean isn't correct after having done real world testing, as
its break even point is vastly incorrect and only uses an index when
there are less than 9,000 rows to fetch, a far cry from the 490K break
even I found while testing.  What I did find interesting, however, was
that it does work better at determining the use of multi-column
indexes, but I think that's because the geometric mean pessimizes the
value of indexCorrelation, which gets pretty skewed when using a
multi-column index.
# CREATE INDEX report_user_cat_count_utc_date_user_id_idx ON report_user_cat_count (user_id,utc_date);
# CLUSTER report_user_cat_count_utc_date_user_id_idx ON report_user_cat_count;
# ANALYZE report_user_cat_count;
# SET enable_seqscan = true; SET enable_indexscan = true;
SET
SET
# EXPLAIN ANALYZE SELECT * FROM report_user_cat_count AS rucc WHERE user_id < 1000 AND utc_date > '2002-01-01'::TIMESTAMP WITH TIME ZONE;
INFO:  cost_seqscan: run_cost: 23685.025000
       startup_cost: 0.000000

INFO:  cost_index: run_cost: 366295.018684
       startup_cost: 0.000000
       indexCorrelation: 0.500000
                                                           QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on report_user_cat_count rucc  (cost=0.00..23685.03 rows=133918 width=64) (actual time=0.28..6100.85 rows=129941 loops=1)
   Filter: ((user_id < 1000) AND (utc_date > '2002-01-01 00:00:00-08'::timestamp with time zone))
 Total runtime: 6649.21 msec
(3 rows)

# SET enable_seqscan = false; SET enable_indexscan = true;
SET
SET
# EXPLAIN ANALYZE SELECT * FROM report_user_cat_count AS rucc WHERE user_id < 1000 AND utc_date > '2002-01-01'::TIMESTAMP WITH TIME ZONE;
INFO:  cost_seqscan: run_cost: 23685.025000
       startup_cost: 100000000.000000

INFO:  cost_index: run_cost: 366295.018684
       startup_cost: 0.000000
       indexCorrelation: 0.500000
                                                            QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using report_user_cat_count_utc_date_user_id_idx on report_user_cat_count rucc  (cost=0.00..366295.02 rows=133918 width=64) (actual time=53.91..3110.42 rows=129941 loops=1)
   Index Cond: ((user_id < 1000) AND (utc_date > '2002-01-01 00:00:00-08'::timestamp with time zone))
 Total runtime: 3667.47 msec
(3 rows)

If I manually set the indexCorrelation to 1.0, however, the planner
chooses the right plan on its own, which is in effect what setting a
higher random_page_cost had been compensating for, a poorly determined
indexCorrelation.

# SET enable_seqscan = true; SET enable_indexscan = true;
SET
SET
# EXPLAIN ANALYZE SELECT * FROM report_user_cat_count AS rucc WHERE user_id < 1000 AND utc_date > '2002-01-01'::TIMESTAMP WITH TIME ZONE;
INFO:  cost_seqscan: run_cost: 23685.025000
       startup_cost: 0.000000

INFO:  cost_index: run_cost: 4161.684248
       startup_cost: 0.000000
       indexCorrelation: 1.000000
                                                            QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using report_user_cat_count_utc_date_user_id_idx on report_user_cat_count rucc  (cost=0.00..4161.68 rows=133918 width=64) (actual time=0.67..1176.63 rows=129941 loops=1)
   Index Cond: ((user_id < 1000) AND (utc_date > '2002-01-01 00:00:00-08'::timestamp with time zone))
 Total runtime: 1705.40 msec
(3 rows)

Which suggests to me that line 3964 in
./src/backend/utils/adt/selfuncs.c isn't right for multi-column
indexes, esp for indexes that are clustered.  I don't know how to
address this though...  Tom, any hints?

FWIW, this is an old data/schema from a defunct project that I can
give people access to if they'd like.

-sc

-- 
Sean Chittenden
Sean Chittenden <sean@chittenden.org> writes: > Which suggests to me that line 3964 in > ./src/backend/utils/adt/selfuncs.c isn't right for multi-column > indexes, esp for indexes that are clustered. I don't know how to > address this though... Tom, any hints? Yes, we knew that already. Oliver had suggested simply dropping the division by nKeys, thus pretending that the first-column correlation is close enough. That seems to me to be going too far in the other direction, but clearly dividing by nKeys is far too pessimistic. I'd change this in a moment if someone could point me to a formula with any basis at all ... regards, tom lane
On Fri, 8 Aug 2003 11:06:56 -0700, Sean Chittenden
<sean@chittenden.org> wrote:
>[...] it'd seem as though an avg depth of
>nodes in index * tuples_fetched * (random_io_cost * indexCorrelation)
>would be closer than where we are now...

Index depth does not belong here because we walk down the index only
once per index scan, not once per tuple.  It might be part of the
startup cost.

The rest of your formula doesn't seem right, too, because you get
higher costs for better correlation.  Did you mean
random_io_cost * (1 - abs(indexCorrelation))?

FWIW, for small effective_cache_size max_IO_cost is almost equal to
tuples_fetched * random_page_cost.  So your formula (with the
corrections presumed above) boils down to ignoring
effective_cache_size and linear interpolation between 0 and
max_IO_cost.

>It's very possible that cost_index() is wrong, but it seems as though
>after some testing as if PostgreSQL _overly_ _favors_ the use of
>indexes:

Was this an unpatched backend?  What were the values of
effective_cache_size and random_page_cost?

># SET enable_seqscan = true; SET enable_indexscan = true;
>SET
>SET
># EXPLAIN ANALYZE SELECT * FROM report_user_cat_count AS rucc WHERE utc_date > '2002-10-01'::TIMESTAMP WITH TIME ZONE;
>INFO:  cost_seqscan: run_cost: 21472.687500
>       startup_cost: 0.000000
>
>INFO:  cost_index: run_cost: 21154.308116
>       startup_cost: 0.000000
>       indexCorrelation: 0.999729
>                                      QUERY PLAN
>-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Index Scan using report_user_cat_count_utc_date_id_idx on report_user_cat_count rucc  (cost=0.00..21154.31 rows=705954 width=64) (actual time=91.36..6625.79 rows=704840 loops=1)
>   Index Cond: (utc_date > '2002-10-01 00:00:00-07'::timestamp with time zone)
> Total runtime: 11292.68 msec
>(3 rows)

"actual time=91.36..6625.79" but "Total runtime: 11292.68 msec"!
Where did those 4.7 seconds go?

># SET enable_seqscan = true; SET enable_indexscan = false;
>SET
>SET
># EXPLAIN ANALYZE SELECT * FROM report_user_cat_count AS rucc WHERE utc_date > '2002-10-01'::TIMESTAMP WITH TIME ZONE;
>INFO:  cost_seqscan: run_cost: 21472.687500
>       startup_cost: 0.000000
>
>INFO:  cost_index: run_cost: 21154.308116
>       startup_cost: 100000000.000000
>       indexCorrelation: 0.999729
>                                      QUERY PLAN
>---------------------------------------------------------------------------------------------------------------------------------------
> Seq Scan on report_user_cat_count rucc  (cost=0.00..21472.69 rows=705954 width=64) (actual time=1091.45..7441.19 rows=704840 loops=1)
>   Filter: (utc_date > '2002-10-01 00:00:00-07'::timestamp with time zone)
> Total runtime: 10506.44 msec
>(3 rows)

Same here: "actual time=1091.45..7441.19" but "Total runtime:
10506.44 msec" - more than 3 seconds lost.

When we ignore total runtime and look at actual time we get

               seq     idx
    estimated 21473   21154
    actual     7441    6626

This doesn't look too bad, IMHO.

BTW, I believe that with your example (single-column index, almost
perfect correlation, index cond selects almost all tuples) all
interpolation methods give an index cost estimation that is very close
to seq scan cost, and the actual runtimes show that this is correct.

>Which I find surprising and humorous given the popular belief is, mine
>included, contrary to those results.

How many tuples are in report_user_cat_count?  What are the stats for
report_user_cat_count.utc_date?
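For reference, max_IO_cost above comes from pages_fetched *
random_page_cost, where pages_fetched is the Mackert-Lohman
approximation used in cost_index().  A rough standalone sketch of that
approximation (the helper name is made up here; T = heap pages,
N = heap tuples, s = selectivity, b = effective_cache_size):

#include <math.h>

/*
 * Sketch of the Mackert-Lohman estimate of pages fetched, following the
 * formula documented in costsize.c.  For b much smaller than T the
 * third branch approaches N*s, i.e. one page per fetched tuple, which
 * is why a small effective_cache_size makes max_IO_cost close to
 * tuples_fetched * random_page_cost.
 */
static double
mackert_lohman_pages(double T, double N, double s, double b)
{
    double Ns = N * s;          /* tuples fetched */

    if (T <= b)
        return fmin(2.0 * T * Ns / (2.0 * T + Ns), T);
    else if (Ns <= 2.0 * T * b / (2.0 * T - b))
        return 2.0 * T * Ns / (2.0 * T + Ns);
    else
        return b + (Ns - 2.0 * T * b / (2.0 * T - b)) * (T - b) / T;
}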
> I can say with pretty high
>confidence that the patch to use a geometric mean isn't correct after
>having done real world testing as its break even point is vastly
>incorrect and only uses an index when there are less than 9,000 rows
>to fetch, a far cry from the 490K break even I found while testing.

Could you elaborate, please.  The intention of my patch was to favour
index scans more than the current implementation.  If it does not, you
have found a bug in my patch.  Did you test the other interpolation
methods?

>What I did find interesting, however, was that it does work better at
>determining the use of multi-column indexes,

Yes, because it computes the correlation for a two-column-index as
    correlation_of_first_index_column * 0.95
instead of
    correlation_of_first_index_column / 2

> but I think that's
>because the geometric mean pessimizes the value of indexCorrelation,
>which gets pretty skewed when using a multi-column index.

I don't understand this.

># CREATE INDEX report_user_cat_count_utc_date_user_id_idx ON report_user_cat_count (user_id,utc_date);
># CLUSTER report_user_cat_count_utc_date_user_id_idx ON report_user_cat_count;
># ANALYZE report_user_cat_count;
># SET enable_seqscan = true; SET enable_indexscan = true;
>SET
>SET
># EXPLAIN ANALYZE SELECT * FROM report_user_cat_count AS rucc WHERE user_id < 1000 AND utc_date > '2002-01-01'::TIMESTAMP WITH TIME ZONE;
>INFO:  cost_seqscan: run_cost: 23685.025000
>       startup_cost: 0.000000
>
>INFO:  cost_index: run_cost: 366295.018684
>       startup_cost: 0.000000
>       indexCorrelation: 0.500000
                          ^^^
This is certainly not with my patch.  The current implementation gives
ca. 366000 for pages_fetched = 122000 and random_page_cost = 4, which
looks plausible for 133000 tuples and (too?) small
effective_cache_size.

>                                      QUERY PLAN
>------------------------------------------------------------------------------------------------------------------------------------
> Seq Scan on report_user_cat_count rucc  (cost=0.00..23685.03 rows=133918 width=64) (actual time=0.28..6100.85 rows=129941 loops=1)
>   Filter: ((user_id < 1000) AND (utc_date > '2002-01-01 00:00:00-08'::timestamp with time zone))
> Total runtime: 6649.21 msec
>(3 rows)
>
># SET enable_seqscan = false; SET enable_indexscan = true;
>SET
>SET
># EXPLAIN ANALYZE SELECT * FROM report_user_cat_count AS rucc WHERE user_id < 1000 AND utc_date > '2002-01-01'::TIMESTAMP WITH TIME ZONE;
>INFO:  cost_seqscan: run_cost: 23685.025000
>       startup_cost: 100000000.000000
>
>INFO:  cost_index: run_cost: 366295.018684
>       startup_cost: 0.000000
>       indexCorrelation: 0.500000
>                                      QUERY PLAN
>-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Index Scan using report_user_cat_count_utc_date_user_id_idx on report_user_cat_count rucc  (cost=0.00..366295.02 rows=133918 width=64) (actual time=53.91..3110.42 rows=129941 loops=1)
>   Index Cond: ((user_id < 1000) AND (utc_date > '2002-01-01 00:00:00-08'::timestamp with time zone))
> Total runtime: 3667.47 msec
>(3 rows)

Which shows that Postgres does not "_overly_ _favor_ the use of
indexes".

>If I manually set the indexCorrelation to 1.0, however, the planner
>chooses the right plan on its own

Ok, with indexCorrelation == 1.0 we dont have to discuss interpolation
methods, because they all return min_IO_cost.

>Which suggests to me that line 3964 in
>./src/backend/utils/adt/selfuncs.c isn't right for multi-column
>indexes, esp for indexes that are clustered.
Agreed.

> I don't know how to
>address this though...

I guess there is no chance without index statistics.

>FWIW, this is an old data/schema from a defunct project that I can
>give people access to if they'd like.

Is there a dump available for download?

Servus
 Manfred
> > Which suggests to me that line 3964 in
> > ./src/backend/utils/adt/selfuncs.c isn't right for multi-column
> > indexes, esp for indexes that are clustered.  I don't know how to
> > address this though...  Tom, any hints?
> 
> Yes, we knew that already.  Oliver had suggested simply dropping the
> division by nKeys, thus pretending that the first-column correlation
> is close enough.  That seems to me to be going too far in the other
> direction,

But is it really?

> but clearly dividing by nKeys is far too pessimistic.  I'd change
> this in a moment if someone could point me to a formula with any
> basis at all ...

Got it, alright.  I'd never paid attention to prior discussions as the
planner had generally done the right thing (with a lowered
random_page_cost ::grin::).  In terms of statistics and setting
indexCorrelation correctly, something like Spearman's rho calculation
comes to mind, though I don't know how applicable that is to database
theory.

indexCorrelation is 1.0 for the 1st key in a multi-column index.  The
only thing different about a multi-column index and a single column
index is that the multi-column index takes up more space per key,
resulting in fewer index entries per page and more pages being fetched
than would be in a single column index, but the current cost_index()
function takes the increased number of page fetches into account when
calculating cost.  As things stand, however, if a multi-column key is
used, the indexCorrelation is penalized by the size of the number of
keys found in the multi-column index.  As things stand the qual
user_id = 42, on a CLUSTER'ed multi-column index (user_id,utc_date),
has an indexCorrelation of 0.5, when in fact the correlation is 1.0.
indexCorrelation governs the assumed number of random page fetches
(which could be next to free on a solid state drive), and in this case
the page fetches aren't random, they're perfectly sequential.

If it were 'user_id = 42 AND utc_date = NOW()', the correlation of a
lookup of the user_id would still be 1.0 and the utc_date would be 1.0
because both values are looked up in the index key.  A lookup of just
the utc_date can never use the index and the planner correctly uses a
sequential scan.

Cost != Correlation.  They're proportional, but not the same, and
indexCorrelation is the wrong place to handle cost as that's done by
the Mackert and Lohman formula.  Under what circumstances would it be
correct to pessimize the use of indexCorrelation?  An indexCorrelation
of 0.0 doesn't mean that the index is useless either, just that we
take the full hit of a completely random page read as opposed to some
fraction of a random page cost.
I tossed a different index on my test table to see how well things
fare with a low correlation, and this was a bit disturbing:

# EXPLAIN ANALYZE SELECT * FROM report_user_cat_count AS rucc WHERE rucc.html_bytes > 8000000::BIGINT;
INFO:  cost_seqscan: run_cost: 21472.687500
       startup_cost: 0.000000

INFO:  cost_index: run_cost: 112165.065458
       startup_cost: 0.000000
       indexCorrelation: 0.183380
                                                           QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on report_user_cat_count rucc  (cost=0.00..21472.69 rows=31893 width=64) (actual time=444.25..2489.27 rows=514 loops=1)
   Filter: (html_bytes > 8000000::bigint)
 Total runtime: 2492.36 msec
(3 rows)

# SET enable_seqscan = false;
SET
# EXPLAIN ANALYZE SELECT * FROM report_user_cat_count AS rucc WHERE rucc.html_bytes > 8000000::BIGINT;
INFO:  cost_seqscan: run_cost: 21472.687500
       startup_cost: 100000000.000000

INFO:  cost_index: run_cost: 112165.065458
       startup_cost: 0.000000
       indexCorrelation: 0.183380
                                                              QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using report_user_cat_count_html_bytes_idx on report_user_cat_count rucc  (cost=0.00..112165.07 rows=31893 width=64) (actual time=68.64..85.75 rows=514 loops=1)
   Index Cond: (html_bytes > 8000000::bigint)
 Total runtime: 97.75 msec
(3 rows)

*shrug*  A low indexCorrelation overly pessimizes the cost of an
index, but I'm not sure what to attribute this to.  :-/

-sc

-- 
Sean Chittenden
Sean Chittenden <sean@chittenden.org> writes:
> indexCorrelation is 1.0 for the 1st key in a multi-column index.

... only if it's perfectly correlated.

> As things stand, however, if a multi-column key is
> used, the indexCorrelation is penalized by the size of the number of
> keys found in the multi-column index.  As things stand the qual
> user_id = 42, on a CLUSTER'ed multi-column index (user_id,utc_date)
> has an indexCorrelation of 0.5, when in fact the correlation is 1.0.

Right, in the perfectly-correlated case this calculation is clearly
wrong.  However, what of cases where the first column shows good
correlation with the physical ordering, but the second does not?
The nasty part of this is that the correlation stat that ANALYZE
computed for the second column is of no value to us.  Two examples:

    X  Y        X  Y

    A  A        A  B
    A  B        A  C
    A  C        A  A
    B  A        B  A
    B  B        B  C
    B  C        B  B
    C  A        C  C
    C  B        C  A
    C  C        C  B

In both cases ANALYZE will calculate correlation 1.0 for column X,
and something near zero for column Y.  We would like to come out with
index correlation 1.0 for the left-hand case and something much less
(but, perhaps, not zero) for the right-hand case.  I don't really see
a way to do this without actually examining the multi-column ordering
relationship during ANALYZE.

> I tossed a different index on my test table to see how well things
> fare with a low correlation, and this was a bit disturbing:

Seems like most of the error in that estimate has to do with the poor
rowcount estimation.  There's very little percentage in trying to
analyze the effect of index correlation in examples where we don't
have the first-order stats correct ...

			regards, tom lane
On Fri, 8 Aug 2003 15:10:06 -0700, Sean Chittenden
<sean@chittenden.org> wrote:
>> Yes, we knew that already.  Oliver had suggested simply dropping the
>> division by nKeys, thus pretending that the first-column correlation
>> is close enough.  That seems to me to be going too far in the other
>> direction,
>
>But is it really?

Yes.  I once posted two use cases showing that one side is as wrong as
the other.  Wait a moment ... Here it is:
http://archives.postgresql.org/pgsql-hackers/2002-10/msg00229.php

>indexCorrelation is 1.0 for the 1st key in a multi-column index. [...]

Unfortunately there are cases where the correlation of the first
column doesn't tell us anything about the real index correlation.

> An indexCorrelation
>of 0.0 doesn't mean that the index is useless either, just that we
>take the full hit of a completely random page read as opposed to some
>fraction of a random page cost.

... and that we cannot expect the tuples we are looking for to be on
the same page or on pages near to each other.  So we have to read a
higher number of pages.

>       indexCorrelation: 0.183380

> Seq Scan on report_user_cat_count rucc  (cost=0.00..21472.69 rows=31893 width=64) (actual time=444.25..2489.27 rows=514 loops=1)
>
> Index Scan using report_user_cat_count_html_bytes_idx on report_user_cat_count rucc  (cost=0.00..112165.07 rows=31893 width=64) (actual time=68.64..85.75 rows=514 loops=1)

The main problem here is the bad estimation for the number of rows:
31893 vs. 514.

The seq scan cost depends on the size of the heap relation and is
always the same (~ 21000) for different search conditions.  With low
correlation, index scan cost is roughly proportional to the number of
tuples fetched (ignoring the effects of effective_cache_size for now).
So for a correct guess of the number of rows we'd get

                seq    idx
    estimated 21000   1800
    actual     2500     86
    ratio       8.4   20.4

As estimated and actual are not given in the same units, we look at
the estimated/actual ratio and see that index scan cost is
over-estimated by a factor of 2.5.  index_cost_algorithm 4 (geometric
interpolation) should get this right.

Can you ANALYSE with a higher statistics_target and then test with
different index_cost_algorithms?

Servus
 Manfred
On Fri, 08 Aug 2003 18:25:41 -0400, Tom Lane <tgl@sss.pgh.pa.us>
wrote:
> Two examples: [...]

One more example:

    X  Y

    A  A
    a  B
    A  C
    b  A
    B  B
    b  C
    C  A
    c  B
    C  C

Correlation for column X is something less than 1.0, OTOH correlation
for an index on upper(X) is 1.0.

>I don't really see
>a way to do this without actually examining the multi-column ordering
>relationship during ANALYZE.

So did we reach consensus to add a TODO item?

* Compute index correlation on CREATE INDEX and ANALYZE, use it for
  index scan cost estimation

Servus
 Manfred
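A rough sketch of what such an ANALYZE/CREATE INDEX time computation
could look like, assuming a sample of tuples with their heap positions
and their ranks in full index-key order were available (all names are
made up for illustration; this is not existing PostgreSQL code):

#include <math.h>

typedef struct
{
    double heap_position;   /* e.g. block number of the sampled tuple */
    double key_rank;        /* rank of the tuple in full index key order */
} SampleRow;

/*
 * Pearson correlation between heap position and full-key rank; applied
 * to ranks this is essentially Spearman's rho.  Returns 0 if either
 * quantity has no variance in the sample.
 */
static double
whole_index_correlation(const SampleRow *rows, int n)
{
    double sx = 0, sy = 0, sxx = 0, syy = 0, sxy = 0;
    double cov, varx, vary;
    int    i;

    for (i = 0; i < n; i++)
    {
        double x = rows[i].heap_position;
        double y = rows[i].key_rank;

        sx += x;
        sy += y;
        sxx += x * x;
        syy += y * y;
        sxy += x * y;
    }
    cov = n * sxy - sx * sy;
    varx = n * sxx - sx * sx;
    vary = n * syy - sy * sy;
    if (varx <= 0 || vary <= 0)
        return 0;
    return cov / sqrt(varx * vary);
}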
> >[...] it'd seem as though an avg depth of
> >nodes in index * tuples_fetched * (random_io_cost * indexCorrelation)
> >would be closer than where we are now...
> 
> Index depth does not belong here because we walk down the index only
> once per index scan not once per tuple.  It might be part of the
> startup cost.
> 
> The rest of your formula doesn't seem right, too, because you get
> higher costs for better correlation.  Did you mean
> random_io_cost * (1 - abs(indexCorrelation))?

Yeah... this was just some off the cuff dribble, don't pay it much
attention.

> FWIW, for small effective_cache_size max_IO_cost is almost equal to
> tuples_fetched * random_page_cost.  So your formula (with the
> corrections presumed above) boils down to ignoring
> effective_cache_size and linear interpolation between 0 and
> max_IO_cost.

Interesting...

> >It's very possible that cost_index() is wrong, but it seems as though
> >after some testing as if PostgreSQL _overly_ _favors_ the use of
> >indexes:
> 
> Was this an unpatched backend?  What were the values of
> effective_cache_size and random_page_cost?

For all intents and purposes, yes.

# SHOW effective_cache_size ;
 effective_cache_size
----------------------
 4456
(1 row)

# SHOW random_page_cost ;
 random_page_cost
------------------
 4
(1 row)

As opposed to setting random_page_cost and avoiding these
discussions/touring through the code, I'm trying to play by the
accepted rules...

> ># SET enable_seqscan = true; SET enable_indexscan = true;
> >SET
> >SET
> ># EXPLAIN ANALYZE SELECT * FROM report_user_cat_count AS rucc WHERE utc_date > '2002-10-01'::TIMESTAMP WITH TIME ZONE;
> >INFO:  cost_seqscan: run_cost: 21472.687500
> >       startup_cost: 0.000000
> >
> >INFO:  cost_index: run_cost: 21154.308116
> >       startup_cost: 0.000000
> >       indexCorrelation: 0.999729
> >                                      QUERY PLAN
> >-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> > Index Scan using report_user_cat_count_utc_date_id_idx on report_user_cat_count rucc  (cost=0.00..21154.31 rows=705954 width=64) (actual time=91.36..6625.79 rows=704840 loops=1)
> >   Index Cond: (utc_date > '2002-10-01 00:00:00-07'::timestamp with time zone)
> > Total runtime: 11292.68 msec
> >(3 rows)
> 
> "actual time=91.36..6625.79" but "Total runtime: 11292.68 msec"!
> Where did those 4.7 seconds go?

*shrug* Don't ask me.

> ># SET enable_seqscan = true; SET enable_indexscan = false;
> >SET
> >SET
> ># EXPLAIN ANALYZE SELECT * FROM report_user_cat_count AS rucc WHERE utc_date > '2002-10-01'::TIMESTAMP WITH TIME ZONE;
> >INFO:  cost_seqscan: run_cost: 21472.687500
> >       startup_cost: 0.000000
> >
> >INFO:  cost_index: run_cost: 21154.308116
> >       startup_cost: 100000000.000000
> >       indexCorrelation: 0.999729
> >                                      QUERY PLAN
> >---------------------------------------------------------------------------------------------------------------------------------------
> > Seq Scan on report_user_cat_count rucc  (cost=0.00..21472.69 rows=705954 width=64) (actual time=1091.45..7441.19 rows=704840 loops=1)
> >   Filter: (utc_date > '2002-10-01 00:00:00-07'::timestamp with time zone)
> > Total runtime: 10506.44 msec
> >(3 rows)
> 
> Same here: "actual time=1091.45..7441.19" but "Total runtime:
> 10506.44 msec" - more than 3 seconds lost.
> 
> When we ignore total runtime and look at actual time we get
> 
>                seq     idx
>     estimated 21473   21154
>     actual     7441    6626
> 
> This doesn't look too bad, IMHO.
> 
> BTW, I believe that with your example (single-column index, almost
> perfect correlation, index cond selects almost all tuples) all
> interpolation methods give an index cost estimation that is very close
> to seq scan cost, and the actual runtimes show that this is correct.
> 
> >Which I find surprising and humorous given the popular belief is, mine
> >included, contrary to those results.
> 
> How many tuples are in report_user_cat_count?  What are the stats for
> report_user_cat_count.utc_date?

# SELECT COUNT(*) FROM report_user_cat_count ;
 count  
--------
 884906
(1 row)

The stats are attached && bzip2 compressed.

> >I can say with pretty high confidence that the patch to use a
> >geometric mean isn't correct after having done real world testing
> >as its break even point is vastly incorrect and only uses an index
> >when there are less than 9,000 rows to fetch, a far cry from the
> >490K break even I found while testing.
> 
> Could you elaborate, please.  The intention of my patch was to
> favour index scans more than the current implementation.  If it does
> not, you have found a bug in my patch.  Did you test the other
> interpolation methods?

I didn't, only the geometric mean... the problem with your patch was
that it picked an index less often than the current code when there
was low correlation.  I think you're onto something, I'm just not
convinced it's right.  I actually think that costs should be converted
into estimated msec's of execution and there should be hardware GUCs
for CPU speed, RAM access, and basic HDD stats, but that's something
different.

> >What I did find interesting, however, was that it does work better at
> >determining the use of multi-column indexes,
> 
> Yes, because it computes the correlation for a two-column-index as
>     correlation_of_first_index_column * 0.95
> instead of
>     correlation_of_first_index_column / 2

*nods*

> > but I think that's because the geometric mean pessimizes the value
> >of indexCorrelation, which gets pretty skewed when using a
> >multi-column index.
> 
> I don't understand this.
> 
> ># CREATE INDEX report_user_cat_count_utc_date_user_id_idx ON report_user_cat_count (user_id,utc_date);
> ># CLUSTER report_user_cat_count_utc_date_user_id_idx ON report_user_cat_count;
> ># ANALYZE report_user_cat_count;
> ># SET enable_seqscan = true; SET enable_indexscan = true;
> >SET
> >SET
> ># EXPLAIN ANALYZE SELECT * FROM report_user_cat_count AS rucc WHERE user_id < 1000 AND utc_date > '2002-01-01'::TIMESTAMP WITH TIME ZONE;
> >INFO:  cost_seqscan: run_cost: 23685.025000
> >       startup_cost: 0.000000
> >
> >INFO:  cost_index: run_cost: 366295.018684
> >       startup_cost: 0.000000
> >       indexCorrelation: 0.500000
>                           ^^^
> This is certainly not with my patch.  The current implementation gives
> ca. 366000 for pages_fetched = 122000 and random_page_cost = 4, which
> looks plausible for 133000 tuples and (too?) small
> effective_cache_size.

*nods*  I manually applied bits of it and didn't change the
correlation code in adt/selfuncs.c because it wasn't any more founded
than what's there....  though I do think it'd yield better results in
my case, it isn't very adaptive given it uses arbitrary magic numbers.

> >If I manually set the indexCorrelation to 1.0, however, the planner
> >chooses the right plan on its own
> 
> Ok, with indexCorrelation == 1.0 we dont have to discuss interpolation
> methods, because they all return min_IO_cost.
> >Which suggests to me that line 3964 in
> >./src/backend/utils/adt/selfuncs.c isn't right for multi-column
> >indexes, esp for indexes that are clustered.
>
> Agreed.
>
> > I don't know how to
> >address this though...
>
> I guess there is no chance without index statistics.
>
> >FWIW, this is an old data/schema from a defunct project that I can
> >give people access to if they'd like.
>
> Is there a dump available for download?

Sure, let me post it.

http://people.FreeBSD.org/~seanc/pgsql/rucc.sql.bz2

-sc

-- 
Sean Chittenden
On Fri, 8 Aug 2003 16:53:48 -0700, Sean Chittenden
<sean@chittenden.org> wrote:
># SHOW effective_cache_size ;
> effective_cache_size
>----------------------
> 4456
>(1 row)

Only 35 MB?  Are you testing on such a small machine?

>The stats are attached && bzip2 compressed.

Nothing was attached.  Did you upload it to your web site?

>> >I can say with pretty high confidence that the patch to use a
>> >geometric mean isn't correct
>... the problem with your patch was
>that it picked an index less often than the current code when there
>was low correlation.

In cost_index.sxc I get lower estimates for *all* proposed new
interpolation methods.  Either my C code doesn't implement the same
calculations as the spreadsheet, or ...

>I manually applied bits of it [...]

... could this explain the unexpected behaviour?

I'm currently downloading your dump.  Can you post the query you
mentioned above?

Servus
 Manfred
> ># SHOW effective_cache_size ;
> > effective_cache_size
> >----------------------
> > 4456
> >(1 row)
> 
> Only 35 MB?  Are you testing on such a small machine?

Testing on my laptop right now... can't hack on my production DBs the
same way I can my laptop.

> >The stats are attached && bzip2 compressed.
> 
> Nothing was attached.  Did you upload it to your web site?

Gah, not yet, forgot to send it.

http://people.FreeBSD.org/~seanc/pg_statistic.txt.bz2

> >> >I can say with pretty high confidence that the patch to use a
> >> >geometric mean isn't correct
> 
> >... the problem with your patch was that it picked an index less
> >often than the current code when there was low correlation.
> 
> In cost_index.sxc I get lower estimates for *all* proposed new
> interpolation methods.  Either my C code doesn't implement the same
> calculations as the spreadsheet, or ...
> 
> >I manually applied bits of it [...]
> 
> ... could this explain the unexpected behaviour?

Don't think so... the run_cost was correct, I didn't modify the
indexCorrelation behavior beyond forcing it to 1.0.

> I'm currently downloading your dump.  Can you post the query you
> mentioned above?

SELECT * FROM report_user_cat_count AS rucc WHERE rucc.html_bytes > 20000000::BIGINT;

SELECT * FROM report_user_cat_count AS rucc WHERE user_id = 42 AND utc_date = NOW();

SELECT * FROM report_user_cat_count AS rucc WHERE user_id = 42;

SELECT * FROM report_user_cat_count AS rucc WHERE user_id < 1000 AND utc_date > '2003-01-01'::TIMESTAMP WITH TIME ZONE;

And various timestamps back to 2002-09-19 and user_id's IN(1,42).

-sc

-- 
Sean Chittenden
> In both cases ANALYZE will calculate correlation 1.0 for column X, > and something near zero for column Y. We would like to come out with > index correlation 1.0 for the left-hand case and something much less > (but, perhaps, not zero) for the right-hand case. I don't really see > a way to do this without actually examining the multi-column ordering > relationship during ANALYZE. The only way the second column correlation will be irrelevant is if the first column is already (nearly) unique (enough so, that the second column wont scatter fetches enough to fill the buffer before seeing cache hits). Thus I think when merging correlations you could take nunique into account. corr = corr_1 * (corr_2 * ( 1 - nunique_1 / nrows)) But, I think one (new) correlation metric for the whole index (whole key) and the data pages would actually be sufficient. This metric could imho always be used instead of the per column correlations to calculate index cost. This holds true as long as you walk an index range, and that is what it is all about, no ? ??? Andreas
On Fri, 8 Aug 2003 16:53:48 -0700, Sean Chittenden
<sean@chittenden.org> wrote:
>the problem with your patch was
>that it picked an index less often than the current code when there
>was low correlation.

Maybe bit rot?  What version did you apply the patch against?  Here is
a new version for Postgres 7.3.4:
http://www.pivot.at/pg/16d-correlation_734.diff

The only difference to the previous version is that

    for (nKeys = 1; index->indexkeys[nKeys] != 0; nKeys++)

is now replaced with

    for (nKeys = 1; nKeys < index->ncolumns; nKeys++)

Don't know whether the former just worked by chance when I tested the
7.3.2 version :-(.  Tests with 7.4Beta1 showed that index correlation
comes out too low with the old loop termination condition.  Anyway,
the latter version seems more robust.

In my tests the new index_cost_algorithms (1, 2, 3, 4) gave
consistently lower cost estimates than the old method (set
index_cost_algorithm = 0), except of course for correlations of 1.0 or
0.0, because in these border cases you always get min_IO_cost or
max_IO_cost, respectively.

Care to re-evaluate?

BTW, there's a version of the patch for 7.4Beta1
(http://www.pivot.at/pg/16d-correlation_74b1.diff) which also applies
cleanly against the cvs snapshot from 2003-08-17.

Servus
 Manfred