Thread: Correlation in cost_index()
You all know this FAQ: "Why does Postgres not use my index?"  Half of
the time this problem can easily be solved by casting a literal to the
type of the respective column; this is not my topic here.  In many
other cases it turns out that the planner over-estimates the cost of an
index scan.  Sometimes this can be worked around by lowering
random_page_cost.  "Of course, that's a hack that is quite unrelated to
the real problem."  I strongly agree ;-)

AFAICS (part of) the real problem is in costsize.c:cost_index() where
IO_cost is calculated from min_IO_cost, pages_fetched,
random_page_cost, and indexCorrelation.  The current implementation
uses indexCorrelation^2 to interpolate between min_IO_cost and
max_IO_cost, which IMHO gives results that are too close to
max_IO_cost.  This conjecture is supported by the fact that actual run
times are often much lower than estimated when seqscans are disabled.

So we have to find a cost function such that

 . min_IO_cost <= cost <= max_IO_cost  for -1 <= indexCorrelation <= 1
 . cost --> min_IO_cost  for indexCorrelation --> +/- 1
 . cost --> max_IO_cost  for indexCorrelation --> 0
 . cost tends more towards min_IO_cost than in the current implementation

After playing around a bit I propose three functions satisfying the
above conditions.  All proposals use absC = abs(indexCorrelation).

Proposal 1:  Use absC for interpolation.

    IO_cost = absC * min_IO_cost + (1 - absC) * max_IO_cost;

Proposal 2:  First calculate estimates for the number of pages and the
cost per page, then multiply the results.

    estPages = absC * minPages + (1 - absC) * maxPages;
    estPCost = absC * 1 + (1 - absC) * random_page_cost;
                 /* ^ sequential_page_cost */
    IO_cost  = estPages * estPCost;

Proposal 3:  Interpolate "geometrically", using absC.

    IO_cost = exp(absC * ln(min_IO_cost) +
                  (1 - absC) * ln(max_IO_cost));

Here are some numbers for

    seq_page_cost    =    1  (constant)
    random_page_cost =    4  (GUC)
    minPages         =   61
    maxPages         = 1440

    corr    current       p1       p2       p3
    0       5760.00  5760.00  5760.00  5760.00
    0.1     5703.01  5190.10  4817.77  3655.22
    0.2     5532.04  4620.20  3958.28  2319.55
    0.3     5247.09  4050.30  3181.53  1471.96
    0.4     4848.16  3480.40  2487.52   934.08
    0.5     4335.25  2910.50  1876.25   592.76
    0.6     3708.36  2340.60  1347.72   376.16
    0.7     2967.49  1770.70   901.93   238.70
    0.8     2112.64  1200.80   538.88   151.48
    0.9     1143.81   630.90   258.57    96.13
    0.95     616.65   345.95   149.44    76.57
    0.99     174.41   117.99    77.03    63.84
    0.995    117.85    89.50    68.91    62.40
    0.999     72.39    66.70    62.57    61.28
    1         61.00    61.00    61.00    61.00

Another example for

    seq_page_cost    =    1  (constant)
    random_page_cost =   10  (GUC)
    minPages         =   20
    maxPages         =  938.58

    corr    current       p1       p2       p3
    0       9385.79  9385.79  9385.79  9385.79
    0.1     9292.14  8449.21  7705.17  5073.72
    0.2     9011.16  7512.64  6189.88  2742.73
    0.3     8542.87  6576.06  4839.94  1482.65
    0.4     7887.27  5639.48  3655.34   801.48
    0.5     7044.35  4702.90  2636.09   433.26
    0.6     6014.11  3766.32  1782.19   234.21
    0.7     4796.56  2829.74  1093.62   126.61
    0.8     3391.69  1893.16   570.40    68.44
    0.9     1799.50   956.58   212.53    37.00
    0.95     933.16   488.29    95.60    27.20
    0.99     206.38   113.66    31.81    21.27
    0.995    113.42    66.83    25.70    20.62
    0.999     38.72    29.37    21.11    20.12
    1         20.00    20.00    20.00    20.00

(If you want to play around with your own numbers, I can send my OOo
spreadsheet privately or to the list.)

The second example shows that especially with proposal 3 we could
afford to set random_page_cost to a *higher* value, which in contrast
to previous recommendations seems to be appropriate; IIRC, benchmark
results posted here showed values of up to 60.
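The figures in the tables above can be reproduced with a short
standalone program.  A minimal sketch follows (illustrative only; the
function names and the driver are made up here, this is not the
proposed patch):

/*
 * Reproduce the cost table above.  min_IO_cost = minPages (because
 * seq_page_cost = 1), max_IO_cost = maxPages * random_page_cost.
 */
#include <math.h>
#include <stdio.h>

static double current_method(double c, double min_cost, double max_cost)
{
    double csquared = c * c;
    return max_cost + csquared * (min_cost - max_cost);
}

static double proposal1(double c, double min_cost, double max_cost)
{
    double absC = fabs(c);
    return absC * min_cost + (1 - absC) * max_cost;
}

static double proposal2(double c, double minPages, double maxPages,
                        double random_page_cost)
{
    double absC = fabs(c);
    double estPages = absC * minPages + (1 - absC) * maxPages;
    double estPCost = absC * 1 + (1 - absC) * random_page_cost;
    return estPages * estPCost;
}

static double proposal3(double c, double min_cost, double max_cost)
{
    double absC = fabs(c);
    return exp(absC * log(min_cost) + (1 - absC) * log(max_cost));
}

int main(void)
{
    double minPages = 61, maxPages = 1440, random_page_cost = 4;
    double min_cost = minPages;
    double max_cost = maxPages * random_page_cost;
    double corr[] = {0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9,
                     0.95, 0.99, 0.995, 0.999, 1};
    int i;

    for (i = 0; i < (int) (sizeof(corr) / sizeof(corr[0])); i++)
        printf("%-6g %9.2f %9.2f %9.2f %9.2f\n", corr[i],
               current_method(corr[i], min_cost, max_cost),
               proposal1(corr[i], min_cost, max_cost),
               proposal2(corr[i], minPages, maxPages, random_page_cost),
               proposal3(corr[i], min_cost, max_cost));
    return 0;
}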
As nobody knows how each of these proposals performs in real life under
different conditions, I suggest to leave the current implementation in,
add all three algorithms, and supply a GUC variable to select a cost
function.

Comments?  Ideas?  Objections?

Servus
 Manfred
On Wed, 2 Oct 2002, Manfred Koizar wrote: > As nobody knows how each of these proposals performs in real life > under different conditions, I suggest to leave the current > implementation in, add all three algorithms, and supply a GUC variable > to select a cost function. I'd certainly be willing to do some testing on my own data with them. Gotta patch? I've found that when the planner misses, sometimes it misses by HUGE amounts on large tables, and I have been running random page cost at 1 lately, as well as running cpu_index_cost at 1/10th the default setting to get good results.
Manfred Koizar <mkoi-pg@aon.at> writes: > AFAICS (part of) the real problem is in costsize.c:cost_index() where > IO_cost is calculated from min_IO_cost, pages_fetched, > random_page_cost, and indexCorrelation. The current implementation > uses indexCorrelation^2 to interpolate between min_IO_cost and > max_IO_cost, which IMHO gives results that are too close to > max_IO_cost. The indexCorrelation^2 algorithm was only a quick hack with no theory behind it :-(. I've wanted to find some better method to put in there, but have not had any time to research the problem. > As nobody knows how each of these proposals performs in real life > under different conditions, I suggest to leave the current > implementation in, add all three algorithms, and supply a GUC variable > to select a cost function. I don't think it's really a good idea to expect users to pick among multiple cost functions that *all* have no guiding theory behind them. I'd prefer to see us find a better cost function and use it. Has anyone trawled the database literature on the subject? regards, tom lane
On Wed, 02 Oct 2002 18:48:49 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>I don't think it's really a good idea to expect users to pick among
>multiple cost functions

The idea is that PG is shipped with a default representing the best of
our knowledge and users are not encouraged to change it.  When a user
sends a "PG does not use my index" or "Why doesn't it scan
sequentially?" message to one of the support lists, we advise her/him
to set index_cost_algorithm to 3 (or whatever we feel appropriate) and
watch the feedback we get.  We don't risk anything, if the default is
the current behaviour.

Servus
 Manfred
On Wed, 2 Oct 2002 14:07:19 -0600 (MDT), "scott.marlowe"
<scott.marlowe@ihs.com> wrote:
>I'd certainly be willing to do some testing on my own data with them.

Great!

>Gotta patch?

Not yet.

> I've found that when the planner misses, sometimes it misses
>by HUGE amounts on large tables, and I have been running random page cost
>at 1 lately, as well as running cpu_index_cost at 1/10th the default
>setting to get good results.

May I ask for more information?  What are your settings for
effective_cache_size and shared_buffers?  And which version are you
running?

Servus
 Manfred
On Wed, 2 Oct 2002 14:07:19 -0600 (MDT), "scott.marlowe"
<scott.marlowe@ihs.com> wrote:
>I'd certainly be willing to do some testing on my own data with them.
>Gotta patch?

Yes, see below.  Disclaimer: Apart from "make; make check" this is
completely untested.  Use at your own risk.  Have fun!

Servus
 Manfred

diff -ruN ../base/src/backend/optimizer/path/costsize.c src/backend/optimizer/path/costsize.c
--- ../base/src/backend/optimizer/path/costsize.c	2002-07-04 18:04:57.000000000 +0200
+++ src/backend/optimizer/path/costsize.c	2002-10-03 09:53:06.000000000 +0200
@@ -72,6 +72,7 @@
 double		cpu_tuple_cost = DEFAULT_CPU_TUPLE_COST;
 double		cpu_index_tuple_cost = DEFAULT_CPU_INDEX_TUPLE_COST;
 double		cpu_operator_cost = DEFAULT_CPU_OPERATOR_COST;
+int		index_cost_algorithm = DEFAULT_INDEX_COST_ALGORITHM;
 
 Cost		disable_cost = 100000000.0;
 
@@ -213,8 +214,8 @@
 	Cost		indexStartupCost;
 	Cost		indexTotalCost;
 	Selectivity indexSelectivity;
-	double		indexCorrelation,
-				csquared;
+	double		indexCorrelation;
+	Cost		IO_cost;
 	Cost		min_IO_cost,
 				max_IO_cost;
 	Cost		cpu_per_tuple;
@@ -329,13 +330,62 @@
 	min_IO_cost = ceil(indexSelectivity * T);
 	max_IO_cost = pages_fetched * random_page_cost;
 
-	/*
-	 * Now interpolate based on estimated index order correlation to get
-	 * total disk I/O cost for main table accesses.
-	 */
-	csquared = indexCorrelation * indexCorrelation;
+	switch (index_cost_algorithm) {
+		case 1: {
+			/*
+			** Use abs(correlation) for linear interpolation
+			*/
+			double absC = fabs(indexCorrelation);
+
+			IO_cost = absC * min_IO_cost + (1 - absC) * max_IO_cost;
+		}
+
+		case 2: {
+			/*
+			** First estimate number of pages and cost per page,
+			** then multiply the results.  min_IO_cost is used for
+			** min_pages, because seq_page_cost = 1.
+			*/
+			double absC = fabs(indexCorrelation);
+
+			double estPages = absC * min_IO_cost + (1 - absC) * pages_fetched;
+			double estPCost = absC * 1 + (1 - absC) * random_page_cost;
+			IO_cost = estPages * estPCost;
+		}
+
+		case 3: {
+			/*
+			** Interpolate based on independence squared, thus forcing the
+			** result to be closer to min_IO_cost
+			*/
+			double independence = 1 - fabs(indexCorrelation);
+			double isquared = independence * independence;
+
+			IO_cost = (1 - isquared) * min_IO_cost + isquared * max_IO_cost;
+		}
+
+		case 4: {
+			/*
+			** Interpolate geometrically
+			*/
+			double absC = fabs(indexCorrelation);
+
+			IO_cost = exp(absC * log(min_IO_cost) +
+						  (1 - absC) * log(max_IO_cost));
+		}
+
+		default: {
+			/*
+			 * Interpolate based on estimated index order correlation
+			 * to get total disk I/O cost for main table accesses.
+			 */
+			double csquared = indexCorrelation * indexCorrelation;
+
+			IO_cost = max_IO_cost + csquared * (min_IO_cost - max_IO_cost);
+		}
+	}
 
-	run_cost += max_IO_cost + csquared * (min_IO_cost - max_IO_cost);
+	run_cost += IO_cost;
 
 	/*
 	 * Estimate CPU costs per tuple.
diff -ruN ../base/src/backend/utils/misc/guc.c src/backend/utils/misc/guc.c
--- ../base/src/backend/utils/misc/guc.c	2002-07-20 17:27:19.000000000 +0200
+++ src/backend/utils/misc/guc.c	2002-10-03 10:03:37.000000000 +0200
@@ -644,6 +644,11 @@
 	},
 
 	{
+		{ "index_cost_algorithm", PGC_USERSET }, &index_cost_algorithm,
+		DEFAULT_INDEX_COST_ALGORITHM, 0, INT_MAX, NULL, NULL
+	},
+
+	{
 		{ NULL, 0 }, NULL, 0, 0, 0, NULL, NULL
 	}
 };

diff -ruN ../base/src/include/optimizer/cost.h src/include/optimizer/cost.h
--- ../base/src/include/optimizer/cost.h	2002-06-21 02:12:29.000000000 +0200
+++ src/include/optimizer/cost.h	2002-10-03 09:56:28.000000000 +0200
@@ -24,6 +24,7 @@
 #define DEFAULT_CPU_TUPLE_COST 0.01
 #define DEFAULT_CPU_INDEX_TUPLE_COST 0.001
 #define DEFAULT_CPU_OPERATOR_COST 0.0025
+#define DEFAULT_INDEX_COST_ALGORITHM 3
 
 /* defaults for function attributes used for expensive function calculations */
 #define BYTE_PCT 100
@@ -43,6 +44,7 @@
 extern double cpu_tuple_cost;
 extern double cpu_index_tuple_cost;
 extern double cpu_operator_cost;
+extern int	index_cost_algorithm;
 extern Cost disable_cost;
 extern bool enable_seqscan;
 extern bool enable_indexscan;
On Thu, 03 Oct 2002 12:40:20 +0200, I wrote:
>>Gotta patch?
>
>Yes, see below.

Oh, did I mention that inserting some break statements after the
switch cases helps a lot?  :-(

Cavus venter non laborat libenter ...

Servus
 Manfred
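With the missing break statements added, the interpolation switch from
the patch boils down to something like the following sketch.  It is a
standalone paraphrase, not a drop-in replacement for cost_index(); only
the inputs needed for the interpolation are passed in:

#include <math.h>

typedef double Cost;

static Cost
interpolate_io_cost(int index_cost_algorithm, double indexCorrelation,
                    Cost min_IO_cost, Cost max_IO_cost,
                    double pages_fetched, double random_page_cost)
{
    double absC = fabs(indexCorrelation);
    Cost   IO_cost;

    switch (index_cost_algorithm)
    {
        case 1:
            /* linear interpolation on abs(correlation) */
            IO_cost = absC * min_IO_cost + (1 - absC) * max_IO_cost;
            break;
        case 2:
            /* estimated pages times estimated cost per page */
            {
                double estPages = absC * min_IO_cost + (1 - absC) * pages_fetched;
                double estPCost = absC * 1 + (1 - absC) * random_page_cost;

                IO_cost = estPages * estPCost;
            }
            break;
        case 3:
            /* linear interpolation on (1 - abs(correlation))^2 */
            {
                double independence = 1 - absC;
                double isquared = independence * independence;

                IO_cost = (1 - isquared) * min_IO_cost + isquared * max_IO_cost;
            }
            break;
        case 4:
            /* geometric interpolation */
            IO_cost = exp(absC * log(min_IO_cost) + (1 - absC) * log(max_IO_cost));
            break;
        default:
            /* current behaviour: interpolation on correlation^2 */
            {
                double csquared = indexCorrelation * indexCorrelation;

                IO_cost = max_IO_cost + csquared * (min_IO_cost - max_IO_cost);
            }
            break;
    }
    return IO_cost;
}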
On Wed, 2 Oct 2002 14:07:19 -0600 (MDT), "scott.marlowe"
<scott.marlowe@ihs.com> wrote:
>I've found that when the planner misses, sometimes it misses
>by HUGE amounts on large tables,

Scott,

yet another question: are multicolumn indices involved in your
estimator problems?

Servus
 Manfred
On Thu, 3 Oct 2002, Manfred Koizar wrote:
> On Wed, 2 Oct 2002 14:07:19 -0600 (MDT), "scott.marlowe"
> <scott.marlowe@ihs.com> wrote:
> >I'd certainly be willing to do some testing on my own data with them.
> 
> Great!
> 
> >Gotta patch?
> 
> Not yet.
> 
> > I've found that when the planner misses, sometimes it misses
> >by HUGE amounts on large tables, and I have been running random page cost
> >at 1 lately, as well as running cpu_index_cost at 1/10th the default
> >setting to get good results.
> 
> May I ask for more information?  What are your settings for
> effective_cache_size and shared_buffers?  And which version are you
> running?

I'm running 7.2.2 in production and 7.3b2 in testing.  Effective cache
size is the default (i.e. commented out); shared buffers are at 4000.
I've found that increasing shared buffers past 4000 (32 megs) to 16384
(128 Megs) has no great effect on my machine's performance, but I've
never really played with effective cache size.

I've got a couple of queries that join a 1M+ row table to itself and to
a 50k row table, and the result sets are usually <100 rows at a time.
Plus some other smaller datasets that return larger amounts (i.e.
sometimes all rows) of data generally.
On Thu, 3 Oct 2002, Manfred Koizar wrote:
> On Wed, 2 Oct 2002 14:07:19 -0600 (MDT), "scott.marlowe"
> <scott.marlowe@ihs.com> wrote:
> >I've found that when the planner misses, sometimes it misses
> >by HUGE amounts on large tables,
> 
> Scott,
> 
> yet another question: are multicolumn indices involved in your
> estimator problems?

No.  Although I use them a fair bit, none of the problems I've
encountered so far have involved them.  But I'd be willing to setup
some test indexes and get some numbers on them.
On Thu, 3 Oct 2002 10:59:54 -0600 (MDT), "scott.marlowe"
<scott.marlowe@ihs.com> wrote:
>>are multicolumn indices involved in your estimator problems?
>
>No.  Although I use them a fair bit, none of the problems I've encountered
>so far have involved them.  But I'd be willing to setup some test indexes
>and get some numbers on them.

Never mind!  I just stumbled over those lines in selfuncs.c where
indexCorrelation is calculated by dividing the correlation of the
first index column by the number of index columns.  I have a use case
here where this clearly is not the right choice, and I was hoping to
find some examples that help me investigate whether my case is
somewhat uncommon ...

Servus
 Manfred
Manfred Koizar <mkoi-pg@aon.at> writes: > Never mind! I just stumbled over those lines in selfuncs.c where > indexCorrelation is calculated by dividing the correlation of the > first index column by the number of index columns. Yeah, I concluded later that that was bogus. I've been thinking of just using the correlation of the first index column and ignoring the rest; that would not be great, but it's probably better than what's there. Have you got a better formula? regards, tom lane
On Thu, 03 Oct 2002 14:50:00 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> indexCorrelation is calculated by dividing the correlation of the
>> first index column by the number of index columns.
>
>Yeah, I concluded later that that was bogus.  I've been thinking of
>just using the correlation of the first index column and ignoring
>the rest; that would not be great, but it's probably better than what's
>there.  Have you got a better formula?

Unfortunately not.  I think such a formula does not exist for the
information we have.  What we'd need is a notion of correlation of the
nth (n > 1) index column for constant values of the first n-1 index
columns; or a combined correlation for the first n index columns
(1 < n <= number of index columns).

I try to understand the problem with the help of use cases.  [ Jump to
the end, if this looks too long-winded. ]

1)  Have a look at invoice items with an index on (fyear, invno,
itemno).  Invoice numbers start at 1 for each financial year, item
numbers start at 1 for each invoice.  In a typical scenario the
correlations for fyear, (fyear, invno), and (fyear, invno, itemno) are
close to 1; invno correlation is expected to be low; and itemno looks
totally chaotic to the analyzer.

When we

    SELECT * FROM item WHERE fyear = 2001 AND invno < 1000

dividing the correlation of the first column by the number of columns
gives 1/3, which has nothing to do with what we want.  (And then the
current implementation of cost_index() squares this and gets 1/9,
which is even farther away from the truth.)  Just using the
correlation of the first index column seems right here.

2)  OTOH consider bookkeeping with entries identified by (fyear,
account, entryno).  Again fyear has a correlation near 1.  For account
we can expect something near 0, and entryno has a distribution
comparable to invno in the first use case, i.e. starting at 1 for each
year.

    SELECT * FROM entry WHERE fyear = 2001 AND account = whatever

Taking the correlation of fyear would imply that the tuples we are
looking for are close to each other, which again can turn out to be
wrong.

So what do we know now?  Even less than before :-(

I have just one idea that might help a bit:  If the correlation of the
first index column is near +/- 1, cost_index() should not use
baserel->pages, but baserel->pages * selectivity of the first column.

Servus
 Manfred
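A minimal sketch of that last idea, with hypothetical names and an
arbitrary threshold (this is not PostgreSQL code, just an illustration
of shrinking the effective heap size when the first column is highly
correlated):

#include <math.h>

/*
 * Hypothetical sketch: if the first index column is almost perfectly
 * correlated, the matching tuples live in a contiguous fraction of the
 * heap, so the page estimate should start from pages * selectivity of
 * the first column rather than from all of baserel->pages.  The 0.95
 * threshold and all names are illustrative assumptions.
 */
static double
effective_heap_pages(double baserel_pages,
                     double first_column_correlation,
                     double first_column_selectivity)
{
    if (fabs(first_column_correlation) > 0.95)
        return ceil(baserel_pages * first_column_selectivity);
    return baserel_pages;
}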
On Thu, 3 Oct 2002 10:45:08 -0600 (MDT), "scott.marlowe"
<scott.marlowe@ihs.com> wrote:
> effective cache size is the default (i.e. commented out)

The default is 1000, meaning ca. 8 MB, which seems to be way too low.
If your server is (almost) exclusively used by Postgres, try setting
it to represent most of your OS cache (as reported by free on Linux).
Otherwise you have to estimate the fraction of the OS cache that gets
used by Postgres.  I'm still trying to get a feeling for how these
settings play together, so I'd be grateful if you report back the
effects this has on your cost estimates.

Caveat: effective_cache_size is supposed to be the number of cache
pages available to one query (or is it one scan?).  So if you have
several concurrent queries (or complex queries with several scans),
you should choose a lower value.  OTOH if most of your queries operate
on the same data, one query could benefit from pages cached by other
queries ...  You have to experiment a little.

Servus
 Manfred
> > AFAICS (part of) the real problem is in costsize.c:cost_index() where
> > IO_cost is calculated from min_IO_cost, pages_fetched,
> > random_page_cost, and indexCorrelation.  The current implementation
> > uses indexCorrelation^2 to interpolate between min_IO_cost and
> > max_IO_cost, which IMHO gives results that are too close to
> > max_IO_cost.
> 
> The indexCorrelation^2 algorithm was only a quick hack with no theory
> behind it :-(.  I've wanted to find some better method to put in there,
> but have not had any time to research the problem.

Could we "quick hack" it to a geometric mean instead, since a mean
seemed to yield better results than indexCorrelation^2?

> > As nobody knows how each of these proposals performs in real life
> > under different conditions, I suggest to leave the current
> > implementation in, add all three algorithms, and supply a GUC variable
> > to select a cost function.
> 
> I don't think it's really a good idea to expect users to pick among
> multiple cost functions that *all* have no guiding theory behind them.
> I'd prefer to see us find a better cost function and use it.  Has anyone
> trawled the database literature on the subject?

Hrm, after an hour of searching and reading, I think one of the better
papers on the subject can be found here:

http://www.cs.ust.hk/faculty/dimitris/PAPERS/TKDE-NNmodels.pdf

Page 13, figure 3-12 is the ticket you were looking for, Tom.  It's an
interesting read with a pretty good analysis and conclusion.  The
author notes that his formula begins to fall apart when the number of
dimensions reaches 10 and suggests the use of a high-dimension random
cost estimate algo, but the use of those comes at great expense to the
CPU (which is in line with a few other papers that I read).  The idea
of precomputing values piqued my interest and I thought it was very
clever, albeit space intensive.  *shrug*

-sc

-- 
Sean Chittenden
Sean Chittenden <sean@chittenden.org> writes: > Hrm, after an hour of searching and reading, I think one of the better > papers on the subject can be found here: > http://www.cs.ust.hk/faculty/dimitris/PAPERS/TKDE-NNmodels.pdf Interesting paper, but I don't see the connection to index order correlation? regards, tom lane
On Thu, 7 Aug 2003 13:44:19 -0700, Sean Chittenden
<sean@chittenden.org> wrote:
>> The indexCorrelation^2 algorithm was only a quick hack with no theory
>> behind it :-(.  I've wanted to find some better method to put in there,
>> but have not had any time to research the problem.
>
>Could we "quick hack" it to a geometric mean instead since a mean
>seemed to yield better results than indexCorrelation^2?

Linear interpolation on (1-indexCorrelation)^2 (algorithm 3 in
http://members.aon.at/pivot/pg/16-correlation-732.diff) is almost as
good as geometric interpolation (algorithm 4 in the patch, proposal 3
in this thread), and its computation is much cheaper because it does
not call exp() and log().  Download
http://members.aon.at/pivot/pg/cost_index.sxc and play around with
your own numbers to get a feeling.

(1-indexCorrelation)^2 suffers from the same lack of theory behind it
as indexCorrelation^2.  But the results look much more plausible.
Well, at least to me ;-)

Servus
 Manfred
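For comparison, the two interpolations discussed above as small
standalone helpers (names are made up here; both return min_IO_cost at
|correlation| = 1 and max_IO_cost at correlation = 0):

#include <math.h>

/* Linear interpolation on (1 - |correlation|)^2: cheap, no exp()/log(). */
static double
independence_squared_cost(double correlation,
                          double min_IO_cost, double max_IO_cost)
{
    double independence = 1.0 - fabs(correlation);
    double isquared = independence * independence;

    return (1.0 - isquared) * min_IO_cost + isquared * max_IO_cost;
}

/* Geometric interpolation: same endpoints, but needs exp() and log(). */
static double
geometric_cost(double correlation,
               double min_IO_cost, double max_IO_cost)
{
    double absC = fabs(correlation);

    return exp(absC * log(min_IO_cost) + (1.0 - absC) * log(max_IO_cost));
}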
> > Hrm, after an hour of searching and reading, I think one of the
> > better papers on the subject can be found here:
> > http://www.cs.ust.hk/faculty/dimitris/PAPERS/TKDE-NNmodels.pdf
> 
> Interesting paper, but I don't see the connection to index order
> correlation?

Nothing that I found was nearly that specific; as close as I could
find was the paper above on calculating the cost of fetching data from
a disk, which I thought was the bigger problem at hand, but I
digress...  In one paper about large dimension index searches, they
did suggest that cost was cumulative for the number of disk reads or
nodes in the tree that weren't held in cache, which was the biggest
hint that I had found on this specific topic.  With that as a guiding
light (or something faintly resembling it), it'd seem as though an avg
depth of nodes in index * tuples_fetched * (random_io_cost *
indexCorrelation) would be closer than where we are now... but I now
also think I/we're barking up the right tree with this thread.

It's very possible that cost_index() is wrong, but it seems as though
after some testing as if PostgreSQL _overly_ _favors_ the use of
indexes:

# SET enable_seqscan = true; SET enable_indexscan = true;
SET
SET
# EXPLAIN ANALYZE SELECT * FROM report_user_cat_count AS rucc WHERE utc_date > '2002-10-01'::TIMESTAMP WITH TIME ZONE;
INFO:  cost_seqscan: run_cost: 21472.687500
       startup_cost: 0.000000

INFO:  cost_index: run_cost: 21154.308116
       startup_cost: 0.000000
       indexCorrelation: 0.999729
                                                            QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using report_user_cat_count_utc_date_id_idx on report_user_cat_count rucc  (cost=0.00..21154.31 rows=705954 width=64) (actual time=91.36..6625.79 rows=704840 loops=1)
   Index Cond: (utc_date > '2002-10-01 00:00:00-07'::timestamp with time zone)
 Total runtime: 11292.68 msec
(3 rows)

# SET enable_seqscan = true; SET enable_indexscan = false;
SET
SET
# EXPLAIN ANALYZE SELECT * FROM report_user_cat_count AS rucc WHERE utc_date > '2002-10-01'::TIMESTAMP WITH TIME ZONE;
INFO:  cost_seqscan: run_cost: 21472.687500
       startup_cost: 0.000000

INFO:  cost_index: run_cost: 21154.308116
       startup_cost: 100000000.000000
       indexCorrelation: 0.999729
                                                            QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on report_user_cat_count rucc  (cost=0.00..21472.69 rows=705954 width=64) (actual time=1091.45..7441.19 rows=704840 loops=1)
   Filter: (utc_date > '2002-10-01 00:00:00-07'::timestamp with time zone)
 Total runtime: 10506.44 msec
(3 rows)

Which I find surprising and humorous given the popular belief is, mine
included, contrary to those results.

I can say with pretty high confidence that the patch to use a
geometric mean isn't correct after having done real world testing, as
its break even point is vastly incorrect and only uses an index when
there are less than 9,000 rows to fetch, a far cry from the 490K break
even I found while testing.  What I did find interesting, however, was
that it does work better at determining the use of multi-column
indexes, but I think that's because the geometric mean pessimizes the
value of indexCorrelation, which gets pretty skewed when using a
multi-column index.
# CREATE INDEX report_user_cat_count_utc_date_user_id_idx ON report_user_cat_count (user_id,utc_date);
# CLUSTER report_user_cat_count_utc_date_user_id_idx ON report_user_cat_count;
# ANALYZE report_user_cat_count;
# SET enable_seqscan = true; SET enable_indexscan = true;
SET
SET
# EXPLAIN ANALYZE SELECT * FROM report_user_cat_count AS rucc WHERE user_id < 1000 AND utc_date > '2002-01-01'::TIMESTAMP WITH TIME ZONE;
INFO:  cost_seqscan: run_cost: 23685.025000
       startup_cost: 0.000000

INFO:  cost_index: run_cost: 366295.018684
       startup_cost: 0.000000
       indexCorrelation: 0.500000
                                                           QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on report_user_cat_count rucc  (cost=0.00..23685.03 rows=133918 width=64) (actual time=0.28..6100.85 rows=129941 loops=1)
   Filter: ((user_id < 1000) AND (utc_date > '2002-01-01 00:00:00-08'::timestamp with time zone))
 Total runtime: 6649.21 msec
(3 rows)

# SET enable_seqscan = false; SET enable_indexscan = true;
SET
SET
# EXPLAIN ANALYZE SELECT * FROM report_user_cat_count AS rucc WHERE user_id < 1000 AND utc_date > '2002-01-01'::TIMESTAMP WITH TIME ZONE;
INFO:  cost_seqscan: run_cost: 23685.025000
       startup_cost: 100000000.000000

INFO:  cost_index: run_cost: 366295.018684
       startup_cost: 0.000000
       indexCorrelation: 0.500000
                                                            QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using report_user_cat_count_utc_date_user_id_idx on report_user_cat_count rucc  (cost=0.00..366295.02 rows=133918 width=64) (actual time=53.91..3110.42 rows=129941 loops=1)
   Index Cond: ((user_id < 1000) AND (utc_date > '2002-01-01 00:00:00-08'::timestamp with time zone))
 Total runtime: 3667.47 msec
(3 rows)

If I manually set the indexCorrelation to 1.0, however, the planner
chooses the right plan on its own, which is in effect what setting a
higher random_page_cost had been compensating for, a poorly determined
indexCorrelation.

# SET enable_seqscan = true; SET enable_indexscan = true;
SET
SET
# EXPLAIN ANALYZE SELECT * FROM report_user_cat_count AS rucc WHERE user_id < 1000 AND utc_date > '2002-01-01'::TIMESTAMP WITH TIME ZONE;
INFO:  cost_seqscan: run_cost: 23685.025000
       startup_cost: 0.000000

INFO:  cost_index: run_cost: 4161.684248
       startup_cost: 0.000000
       indexCorrelation: 1.000000
                                                            QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using report_user_cat_count_utc_date_user_id_idx on report_user_cat_count rucc  (cost=0.00..4161.68 rows=133918 width=64) (actual time=0.67..1176.63 rows=129941 loops=1)
   Index Cond: ((user_id < 1000) AND (utc_date > '2002-01-01 00:00:00-08'::timestamp with time zone))
 Total runtime: 1705.40 msec
(3 rows)

Which suggests to me that line 3964 in
./src/backend/utils/adt/selfuncs.c isn't right for multi-column
indexes, esp for indexes that are clustered.  I don't know how to
address this though...  Tom, any hints?

FWIW, this is an old data/schema from a defunct project that I can
give people access to if they'd like.

-sc

-- 
Sean Chittenden
Sean Chittenden <sean@chittenden.org> writes: > Which suggests to me that line 3964 in > ./src/backend/utils/adt/selfuncs.c isn't right for multi-column > indexes, esp for indexes that are clustered. I don't know how to > address this though... Tom, any hints? Yes, we knew that already. Oliver had suggested simply dropping the division by nKeys, thus pretending that the first-column correlation is close enough. That seems to me to be going too far in the other direction, but clearly dividing by nKeys is far too pessimistic. I'd change this in a moment if someone could point me to a formula with any basis at all ... regards, tom lane
On Fri, 8 Aug 2003 11:06:56 -0700, Sean Chittenden
<sean@chittenden.org> wrote:
>[...] it'd seem as though an avg depth of
>nodes in index * tuples_fetched * (random_io_cost * indexCorrelation)
>would be closer than where we are now...

Index depth does not belong here because we walk down the index only
once per index scan, not once per tuple.  It might be part of the
startup cost.

The rest of your formula doesn't seem right, too, because you get
higher costs for better correlation.  Did you mean
random_io_cost * (1 - abs(indexCorrelation))?

FWIW, for small effective_cache_size max_IO_cost is almost equal to
tuples_fetched * random_page_cost.  So your formula (with the
corrections presumed above) boils down to ignoring
effective_cache_size and linear interpolation between 0 and
max_IO_cost.

>It's very possible that cost_index() is wrong, but it seems as though
>after some testing as if PostgreSQL _overly_ _favors_ the use of
>indexes:

Was this an unpatched backend?  What were the values of
effective_cache_size and random_page_cost?

># SET enable_seqscan = true; SET enable_indexscan = true;
>SET
>SET
># EXPLAIN ANALYZE SELECT * FROM report_user_cat_count AS rucc WHERE utc_date > '2002-10-01'::TIMESTAMP WITH TIME ZONE;
>INFO:  cost_seqscan: run_cost: 21472.687500
>       startup_cost: 0.000000
>
>INFO:  cost_index: run_cost: 21154.308116
>       startup_cost: 0.000000
>       indexCorrelation: 0.999729
>                                      QUERY PLAN
>-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Index Scan using report_user_cat_count_utc_date_id_idx on report_user_cat_count rucc  (cost=0.00..21154.31 rows=705954 width=64) (actual time=91.36..6625.79 rows=704840 loops=1)
>   Index Cond: (utc_date > '2002-10-01 00:00:00-07'::timestamp with time zone)
> Total runtime: 11292.68 msec
>(3 rows)

"actual time=91.36..6625.79" but "Total runtime: 11292.68 msec"!
Where did those 4.7 seconds go?

># SET enable_seqscan = true; SET enable_indexscan = false;
>SET
>SET
># EXPLAIN ANALYZE SELECT * FROM report_user_cat_count AS rucc WHERE utc_date > '2002-10-01'::TIMESTAMP WITH TIME ZONE;
>INFO:  cost_seqscan: run_cost: 21472.687500
>       startup_cost: 0.000000
>
>INFO:  cost_index: run_cost: 21154.308116
>       startup_cost: 100000000.000000
>       indexCorrelation: 0.999729
>                                      QUERY PLAN
>---------------------------------------------------------------------------------------------------------------------------------------
> Seq Scan on report_user_cat_count rucc  (cost=0.00..21472.69 rows=705954 width=64) (actual time=1091.45..7441.19 rows=704840 loops=1)
>   Filter: (utc_date > '2002-10-01 00:00:00-07'::timestamp with time zone)
> Total runtime: 10506.44 msec
>(3 rows)

Same here: "actual time=1091.45..7441.19" but "Total runtime:
10506.44 msec" - more than 3 seconds lost.

When we ignore total runtime and look at actual time we get

               seq     idx
    estimated 21473   21154
    actual     7441    6626

This doesn't look too bad, IMHO.

BTW, I believe that with your example (single-column index, almost
perfect correlation, index cond selects almost all tuples) all
interpolation methods give an index cost estimation that is very close
to seq scan cost, and the actual runtimes show that this is correct.

>Which I find surprising and humorous given the popular belief is, mine
>included, contrary to those results.

How many tuples are in report_user_cat_count?  What are the stats for
report_user_cat_count.utc_date?
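For reference, max_IO_cost above comes from pages_fetched *
random_page_cost, where pages_fetched is the Mackert-Lohman
approximation used in cost_index().  A rough standalone sketch of that
approximation (the helper name is made up here; T = heap pages,
N = heap tuples, s = selectivity, b = effective_cache_size):

#include <math.h>

/*
 * Sketch of the Mackert-Lohman estimate of pages fetched, following the
 * formula documented in costsize.c.  For b much smaller than T the
 * third branch approaches N*s, i.e. one page per fetched tuple, which
 * is why a small effective_cache_size makes max_IO_cost close to
 * tuples_fetched * random_page_cost.
 */
static double
mackert_lohman_pages(double T, double N, double s, double b)
{
    double Ns = N * s;          /* tuples fetched */

    if (T <= b)
        return fmin(2.0 * T * Ns / (2.0 * T + Ns), T);
    else if (Ns <= 2.0 * T * b / (2.0 * T - b))
        return 2.0 * T * Ns / (2.0 * T + Ns);
    else
        return b + (Ns - 2.0 * T * b / (2.0 * T - b)) * (T - b) / T;
}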
> I can say with pretty high
>confidence that the patch to use a geometric mean isn't correct after
>having done real world testing as its break even point is vastly
>incorrect and only uses an index when there are less than 9,000 rows
>to fetch, a far cry from the 490K break even I found while testing.

Could you elaborate, please.  The intention of my patch was to favour
index scans more than the current implementation.  If it does not, you
have found a bug in my patch.  Did you test the other interpolation
methods?

>What I did find interesting, however, was that it does work better at
>determining the use of multi-column indexes,

Yes, because it computes the correlation for a two-column-index as
    correlation_of_first_index_column * 0.95
instead of
    correlation_of_first_index_column / 2

> but I think that's
>because the geometric mean pessimizes the value of indexCorrelation,
>which gets pretty skewed when using a multi-column index.

I don't understand this.

># CREATE INDEX report_user_cat_count_utc_date_user_id_idx ON report_user_cat_count (user_id,utc_date);
># CLUSTER report_user_cat_count_utc_date_user_id_idx ON report_user_cat_count;
># ANALYZE report_user_cat_count;
># SET enable_seqscan = true; SET enable_indexscan = true;
>SET
>SET
># EXPLAIN ANALYZE SELECT * FROM report_user_cat_count AS rucc WHERE user_id < 1000 AND utc_date > '2002-01-01'::TIMESTAMP WITH TIME ZONE;
>INFO:  cost_seqscan: run_cost: 23685.025000
>       startup_cost: 0.000000
>
>INFO:  cost_index: run_cost: 366295.018684
>       startup_cost: 0.000000
>       indexCorrelation: 0.500000
                          ^^^
This is certainly not with my patch.  The current implementation gives
ca. 366000 for pages_fetched = 122000 and random_page_cost = 4, which
looks plausible for 133000 tuples and (too?) small
effective_cache_size.

>                                      QUERY PLAN
>------------------------------------------------------------------------------------------------------------------------------------
> Seq Scan on report_user_cat_count rucc  (cost=0.00..23685.03 rows=133918 width=64) (actual time=0.28..6100.85 rows=129941 loops=1)
>   Filter: ((user_id < 1000) AND (utc_date > '2002-01-01 00:00:00-08'::timestamp with time zone))
> Total runtime: 6649.21 msec
>(3 rows)
>
># SET enable_seqscan = false; SET enable_indexscan = true;
>SET
>SET
># EXPLAIN ANALYZE SELECT * FROM report_user_cat_count AS rucc WHERE user_id < 1000 AND utc_date > '2002-01-01'::TIMESTAMP WITH TIME ZONE;
>INFO:  cost_seqscan: run_cost: 23685.025000
>       startup_cost: 100000000.000000
>
>INFO:  cost_index: run_cost: 366295.018684
>       startup_cost: 0.000000
>       indexCorrelation: 0.500000
>                                      QUERY PLAN
>-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Index Scan using report_user_cat_count_utc_date_user_id_idx on report_user_cat_count rucc  (cost=0.00..366295.02 rows=133918 width=64) (actual time=53.91..3110.42 rows=129941 loops=1)
>   Index Cond: ((user_id < 1000) AND (utc_date > '2002-01-01 00:00:00-08'::timestamp with time zone))
> Total runtime: 3667.47 msec
>(3 rows)

Which shows that Postgres does not "_overly_ _favor_ the use of
indexes".

>If I manually set the indexCorrelation to 1.0, however, the planner
>chooses the right plan on its own

Ok, with indexCorrelation == 1.0 we dont have to discuss interpolation
methods, because they all return min_IO_cost.

>Which suggests to me that line 3964 in
>./src/backend/utils/adt/selfuncs.c isn't right for multi-column
>indexes, esp for indexes that are clustered.
Agreed.

> I don't know how to
>address this though...

I guess there is no chance without index statistics.

>FWIW, this is an old data/schema from a defunct project that I can
>give people access to if they'd like.

Is there a dump available for download?

Servus
 Manfred
> > Which suggests to me that line 3964 in
> > ./src/backend/utils/adt/selfuncs.c isn't right for multi-column
> > indexes, esp for indexes that are clustered.  I don't know how to
> > address this though...  Tom, any hints?
> 
> Yes, we knew that already.  Oliver had suggested simply dropping the
> division by nKeys, thus pretending that the first-column correlation
> is close enough.  That seems to me to be going too far in the other
> direction,

But is it really?

> but clearly dividing by nKeys is far too pessimistic.  I'd change
> this in a moment if someone could point me to a formula with any
> basis at all ...

Got it, alright.  I'd never paid attention to prior discussions as the
planner had generally done the right thing (with a lowered
random_page_cost ::grin::).  In terms of statistics and setting
indexCorrelation correctly, something like Spearman's rho calculation
comes to mind, though I don't know how applicable that is to database
theory.

indexCorrelation is 1.0 for the 1st key in a multi-column index.  The
only thing different about a multi-column index and a single column
index is that the multi-column index takes up more space per key,
resulting in fewer index entries per page and more pages being fetched
than would be in a single column index, but the current cost_index()
function takes the increased number of page fetches into account when
calculating cost.  As things stand, however, if a multi-column key is
used, the indexCorrelation is penalized by the size of the number of
keys found in the multi-column index.  As things stand the qual
user_id = 42, on a CLUSTER'ed multi-column index (user_id,utc_date),
has an indexCorrelation of 0.5, when in fact the correlation is 1.0.
indexCorrelation governs the assumed number of random page fetches
(which could be next to free on a solid state drive), and in this case
the page fetches aren't random, they're perfectly sequential.

If it were 'user_id = 42 AND utc_date = NOW()', the correlation of a
lookup of the user_id would still be 1.0 and the utc_date would be 1.0
because both values are looked up in the index key.  A lookup of just
the utc_date can never use the index and the planner correctly uses a
sequential scan.

Cost != Correlation.  They're proportional, but not the same, and
indexCorrelation is the wrong place to handle cost as that's done by
the Mackert and Lohman formula.  Under what circumstances would it be
correct to pessimize the use of indexCorrelation?  An indexCorrelation
of 0.0 doesn't mean that the index is useless either, just that we
take the full hit of a completely random page read as opposed to some
fraction of a random page cost.
I tossed a different index on my test table to see how well things
fare with a low correlation, and this was a bit disturbing:

# EXPLAIN ANALYZE SELECT * FROM report_user_cat_count AS rucc WHERE rucc.html_bytes > 8000000::BIGINT;
INFO:  cost_seqscan: run_cost: 21472.687500
       startup_cost: 0.000000

INFO:  cost_index: run_cost: 112165.065458
       startup_cost: 0.000000
       indexCorrelation: 0.183380
                                                           QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on report_user_cat_count rucc  (cost=0.00..21472.69 rows=31893 width=64) (actual time=444.25..2489.27 rows=514 loops=1)
   Filter: (html_bytes > 8000000::bigint)
 Total runtime: 2492.36 msec
(3 rows)

# SET enable_seqscan = false;
SET
# EXPLAIN ANALYZE SELECT * FROM report_user_cat_count AS rucc WHERE rucc.html_bytes > 8000000::BIGINT;
INFO:  cost_seqscan: run_cost: 21472.687500
       startup_cost: 100000000.000000

INFO:  cost_index: run_cost: 112165.065458
       startup_cost: 0.000000
       indexCorrelation: 0.183380
                                                              QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using report_user_cat_count_html_bytes_idx on report_user_cat_count rucc  (cost=0.00..112165.07 rows=31893 width=64) (actual time=68.64..85.75 rows=514 loops=1)
   Index Cond: (html_bytes > 8000000::bigint)
 Total runtime: 97.75 msec
(3 rows)

*shrug*  A low indexCorrelation overly pessimizes the cost of an
index, but I'm not sure what to attribute this to.  :-/

-sc

-- 
Sean Chittenden
Sean Chittenden <sean@chittenden.org> writes:
> indexCorrelation is 1.0 for the 1st key in a multi-column index.

... only if it's perfectly correlated.

> As things stand, however, if a multi-column key is
> used, the indexCorrelation is penalized by the size of the number of
> keys found in the multi-column index.  As things stand the qual
> user_id = 42, on a CLUSTER'ed multi-column index (user_id,utc_date)
> has an indexCorrelation of 0.5, when in fact the correlation is 1.0.

Right, in the perfectly-correlated case this calculation is clearly
wrong.  However, what of cases where the first column shows good
correlation with the physical ordering, but the second does not?
The nasty part of this is that the correlation stat that ANALYZE
computed for the second column is of no value to us.  Two examples:

    X  Y        X  Y

    A  A        A  B
    A  B        A  C
    A  C        A  A
    B  A        B  A
    B  B        B  C
    B  C        B  B
    C  A        C  C
    C  B        C  A
    C  C        C  B

In both cases ANALYZE will calculate correlation 1.0 for column X,
and something near zero for column Y.  We would like to come out with
index correlation 1.0 for the left-hand case and something much less
(but, perhaps, not zero) for the right-hand case.  I don't really see
a way to do this without actually examining the multi-column ordering
relationship during ANALYZE.

> I tossed a different index on my test table to see how well things
> fare with a low correlation, and this was a bit disturbing:

Seems like most of the error in that estimate has to do with the poor
rowcount estimation.  There's very little percentage in trying to
analyze the effect of index correlation in examples where we don't
have the first-order stats correct ...

			regards, tom lane
On Fri, 8 Aug 2003 15:10:06 -0700, Sean Chittenden
<sean@chittenden.org> wrote:
>> Yes, we knew that already.  Oliver had suggested simply dropping the
>> division by nKeys, thus pretending that the first-column correlation
>> is close enough.  That seems to me to be going too far in the other
>> direction,
>
>But is it really?

Yes.  I once posted two use cases showing that one side is as wrong as
the other.  Wait a moment ... Here it is:
http://archives.postgresql.org/pgsql-hackers/2002-10/msg00229.php

>indexCorrelation is 1.0 for the 1st key in a multi-column index. [...]

Unfortunately there are cases where the correlation of the first
column doesn't tell us anything about the real index correlation.

> An indexCorrelation
>of 0.0 doesn't mean that the index is useless either, just that we
>take the full hit of a completely random page read as opposed to some
>fraction of a random page cost.

... and that we cannot expect the tuples we are looking for to be on
the same page or on pages near to each other.  So we have to read a
higher number of pages.

>       indexCorrelation: 0.183380

> Seq Scan on report_user_cat_count rucc  (cost=0.00..21472.69 rows=31893 width=64) (actual time=444.25..2489.27 rows=514 loops=1)
>
> Index Scan using report_user_cat_count_html_bytes_idx on report_user_cat_count rucc  (cost=0.00..112165.07 rows=31893 width=64) (actual time=68.64..85.75 rows=514 loops=1)

The main problem here is the bad estimation for the number of rows:
31893 vs. 514.

The seq scan cost depends on the size of the heap relation and is
always the same (~ 21000) for different search conditions.  With low
correlation, index scan cost is roughly proportional to the number of
tuples fetched (ignoring the effects of effective_cache_size for now).
So for a correct guess of the number of rows we'd get

                seq    idx
    estimated 21000   1800
    actual     2500     86
    ratio       8.4   20.4

As estimated and actual are not given in the same units, we look at
the estimated/actual ratio and see that index scan cost is
over-estimated by a factor of 2.5.  index_cost_algorithm 4 (geometric
interpolation) should get this right.

Can you ANALYSE with a higher statistics_target and then test with
different index_cost_algorithms?

Servus
 Manfred
On Fri, 08 Aug 2003 18:25:41 -0400, Tom Lane <tgl@sss.pgh.pa.us>
wrote:
> Two examples: [...]

One more example:

    X  Y

    A  A
    a  B
    A  C
    b  A
    B  B
    b  C
    C  A
    c  B
    C  C

Correlation for column X is something less than 1.0, OTOH correlation
for an index on upper(X) is 1.0.

>I don't really see
>a way to do this without actually examining the multi-column ordering
>relationship during ANALYZE.

So did we reach consensus to add a TODO item?

* Compute index correlation on CREATE INDEX and ANALYZE, use it for
  index scan cost estimation

Servus
 Manfred
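A rough sketch of what such an ANALYZE/CREATE INDEX time computation
could look like, assuming a sample of tuples with their heap positions
and their ranks in full index-key order were available (all names are
made up for illustration; this is not existing PostgreSQL code):

#include <math.h>

typedef struct
{
    double heap_position;   /* e.g. block number of the sampled tuple */
    double key_rank;        /* rank of the tuple in full index key order */
} SampleRow;

/*
 * Pearson correlation between heap position and full-key rank; applied
 * to ranks this is essentially Spearman's rho.  Returns 0 if either
 * quantity has no variance in the sample.
 */
static double
whole_index_correlation(const SampleRow *rows, int n)
{
    double sx = 0, sy = 0, sxx = 0, syy = 0, sxy = 0;
    double cov, varx, vary;
    int    i;

    for (i = 0; i < n; i++)
    {
        double x = rows[i].heap_position;
        double y = rows[i].key_rank;

        sx += x;
        sy += y;
        sxx += x * x;
        syy += y * y;
        sxy += x * y;
    }
    cov = n * sxy - sx * sy;
    varx = n * sxx - sx * sx;
    vary = n * syy - sy * sy;
    if (varx <= 0 || vary <= 0)
        return 0;
    return cov / sqrt(varx * vary);
}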
> >[...] it'd seem as though an avg depth of
> >nodes in index * tuples_fetched * (random_io_cost * indexCorrelation)
> >would be closer than where we are now...
> 
> Index depth does not belong here because we walk down the index only
> once per index scan not once per tuple.  It might be part of the
> startup cost.
> 
> The rest of your formula doesn't seem right, too, because you get
> higher costs for better correlation.  Did you mean
> random_io_cost * (1 - abs(indexCorrelation))?

Yeah... this was just some off the cuff dribble, don't pay it much
attention.

> FWIW, for small effective_cache_size max_IO_cost is almost equal to
> tuples_fetched * random_page_cost.  So your formula (with the
> corrections presumed above) boils down to ignoring
> effective_cache_size and linear interpolation between 0 and
> max_IO_cost.

Interesting...

> >It's very possible that cost_index() is wrong, but it seems as though
> >after some testing as if PostgreSQL _overly_ _favors_ the use of
> >indexes:
> 
> Was this an unpatched backend?  What were the values of
> effective_cache_size and random_page_cost?

For all intents and purposes, yes.

# SHOW effective_cache_size ;
 effective_cache_size
----------------------
 4456
(1 row)

# SHOW random_page_cost ;
 random_page_cost
------------------
 4
(1 row)

As opposed to setting random_page_cost and avoiding these
discussions/touring through the code, I'm trying to play by the
accepted rules...

> ># SET enable_seqscan = true; SET enable_indexscan = true;
> >SET
> >SET
> ># EXPLAIN ANALYZE SELECT * FROM report_user_cat_count AS rucc WHERE utc_date > '2002-10-01'::TIMESTAMP WITH TIME ZONE;
> >INFO:  cost_seqscan: run_cost: 21472.687500
> >       startup_cost: 0.000000
> >
> >INFO:  cost_index: run_cost: 21154.308116
> >       startup_cost: 0.000000
> >       indexCorrelation: 0.999729
> >                                      QUERY PLAN
> >-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> > Index Scan using report_user_cat_count_utc_date_id_idx on report_user_cat_count rucc  (cost=0.00..21154.31 rows=705954 width=64) (actual time=91.36..6625.79 rows=704840 loops=1)
> >   Index Cond: (utc_date > '2002-10-01 00:00:00-07'::timestamp with time zone)
> > Total runtime: 11292.68 msec
> >(3 rows)
> 
> "actual time=91.36..6625.79" but "Total runtime: 11292.68 msec"!
> Where did those 4.7 seconds go?

*shrug* Don't ask me.

> ># SET enable_seqscan = true; SET enable_indexscan = false;
> >SET
> >SET
> ># EXPLAIN ANALYZE SELECT * FROM report_user_cat_count AS rucc WHERE utc_date > '2002-10-01'::TIMESTAMP WITH TIME ZONE;
> >INFO:  cost_seqscan: run_cost: 21472.687500
> >       startup_cost: 0.000000
> >
> >INFO:  cost_index: run_cost: 21154.308116
> >       startup_cost: 100000000.000000
> >       indexCorrelation: 0.999729
> >                                      QUERY PLAN
> >---------------------------------------------------------------------------------------------------------------------------------------
> > Seq Scan on report_user_cat_count rucc  (cost=0.00..21472.69 rows=705954 width=64) (actual time=1091.45..7441.19 rows=704840 loops=1)
> >   Filter: (utc_date > '2002-10-01 00:00:00-07'::timestamp with time zone)
> > Total runtime: 10506.44 msec
> >(3 rows)
> 
> Same here: "actual time=1091.45..7441.19" but "Total runtime:
> 10506.44 msec" - more than 3 seconds lost.
> 
> When we ignore total runtime and look at actual time we get
> 
>                seq     idx
>     estimated 21473   21154
>     actual     7441    6626
> 
> This doesn't look too bad, IMHO.
> 
> BTW, I believe that with your example (single-column index, almost
> perfect correlation, index cond selects almost all tuples) all
> interpolation methods give an index cost estimation that is very close
> to seq scan cost, and the actual runtimes show that this is correct.
> 
> >Which I find surprising and humorous given the popular belief is, mine
> >included, contrary to those results.
> 
> How many tuples are in report_user_cat_count?  What are the stats for
> report_user_cat_count.utc_date?

# SELECT COUNT(*) FROM report_user_cat_count ;
 count  
--------
 884906
(1 row)

The stats are attached && bzip2 compressed.

> >I can say with pretty high confidence that the patch to use a
> >geometric mean isn't correct after having done real world testing
> >as its break even point is vastly incorrect and only uses an index
> >when there are less than 9,000 rows to fetch, a far cry from the
> >490K break even I found while testing.
> 
> Could you elaborate, please.  The intention of my patch was to
> favour index scans more than the current implementation.  If it does
> not, you have found a bug in my patch.  Did you test the other
> interpolation methods?

I didn't, only the geometric mean... the problem with your patch was
that it picked an index less often than the current code when there
was low correlation.  I think you're onto something, I'm just not
convinced it's right.  I actually think that costs should be converted
into estimated msec's of execution and there should be hardware GUCs
for CPU speed, RAM access, and basic HDD stats, but that's something
different.

> >What I did find interesting, however, was that it does work better at
> >determining the use of multi-column indexes,
> 
> Yes, because it computes the correlation for a two-column-index as
>     correlation_of_first_index_column * 0.95
> instead of
>     correlation_of_first_index_column / 2

*nods*

> > but I think that's because the geometric mean pessimizes the value
> >of indexCorrelation, which gets pretty skewed when using a
> >multi-column index.
> 
> I don't understand this.
> 
> ># CREATE INDEX report_user_cat_count_utc_date_user_id_idx ON report_user_cat_count (user_id,utc_date);
> ># CLUSTER report_user_cat_count_utc_date_user_id_idx ON report_user_cat_count;
> ># ANALYZE report_user_cat_count;
> ># SET enable_seqscan = true; SET enable_indexscan = true;
> >SET
> >SET
> ># EXPLAIN ANALYZE SELECT * FROM report_user_cat_count AS rucc WHERE user_id < 1000 AND utc_date > '2002-01-01'::TIMESTAMP WITH TIME ZONE;
> >INFO:  cost_seqscan: run_cost: 23685.025000
> >       startup_cost: 0.000000
> >
> >INFO:  cost_index: run_cost: 366295.018684
> >       startup_cost: 0.000000
> >       indexCorrelation: 0.500000
>                           ^^^
> This is certainly not with my patch.  The current implementation gives
> ca. 366000 for pages_fetched = 122000 and random_page_cost = 4, which
> looks plausible for 133000 tuples and (too?) small
> effective_cache_size.

*nods*  I manually applied bits of it and didn't change the
correlation code in adt/selfuncs.c because it wasn't any more founded
than what's there....  though I do think it'd yield better results in
my case, it isn't very adaptive given it uses arbitrary magic numbers.

> >If I manually set the indexCorrelation to 1.0, however, the planner
> >chooses the right plan on its own
> 
> Ok, with indexCorrelation == 1.0 we dont have to discuss interpolation
> methods, because they all return min_IO_cost.
> >Which suggests to me that line 3964 in
> >./src/backend/utils/adt/selfuncs.c isn't right for multi-column
> >indexes, esp for indexes that are clustered.
>
> Agreed.
>
> > I don't know how to
> >address this though...
>
> I guess there is no chance without index statistics.
>
> >FWIW, this is an old data/schema from a defunct project that I can
> >give people access to if they'd like.
>
> Is there a dump available for download?

Sure, let me post it.

http://people.FreeBSD.org/~seanc/pgsql/rucc.sql.bz2

-sc

-- 
Sean Chittenden
On Fri, 8 Aug 2003 16:53:48 -0700, Sean Chittenden
<sean@chittenden.org> wrote:
># SHOW effective_cache_size ;
> effective_cache_size
>----------------------
> 4456
>(1 row)

Only 35 MB?  Are you testing on such a small machine?

>The stats are attached && bzip2 compressed.

Nothing was attached.  Did you upload it to your web site?

>> >I can say with pretty high confidence that the patch to use a
>> >geometric mean isn't correct
>... the problem with your patch was
>that it picked an index less often than the current code when there
>was low correlation.

In cost_index.sxc I get lower estimates for *all* proposed new
interpolation methods.  Either my C code doesn't implement the same
calculations as the spreadsheet, or ...

>I manually applied bits of it [...]

... could this explain the unexpected behaviour?

I'm currently downloading your dump.  Can you post the query you
mentioned above?

Servus
 Manfred
> ># SHOW effective_cache_size ;
> > effective_cache_size
> >----------------------
> > 4456
> >(1 row)
> 
> Only 35 MB?  Are you testing on such a small machine?

Testing on my laptop right now... can't hack on my production DBs the
same way I can my laptop.

> >The stats are attached && bzip2 compressed.
> 
> Nothing was attached.  Did you upload it to your web site?

Gah, not yet, forgot to send it.

http://people.FreeBSD.org/~seanc/pg_statistic.txt.bz2

> >> >I can say with pretty high confidence that the patch to use a
> >> >geometric mean isn't correct
> 
> >... the problem with your patch was that it picked an index less
> >often than the current code when there was low correlation.
> 
> In cost_index.sxc I get lower estimates for *all* proposed new
> interpolation methods.  Either my C code doesn't implement the same
> calculations as the spreadsheet, or ...
> 
> >I manually applied bits of it [...]
> 
> ... could this explain the unexpected behaviour?

Don't think so... the run_cost was correct, I didn't modify the
indexCorrelation behavior beyond forcing it to 1.0.

> I'm currently downloading your dump.  Can you post the query you
> mentioned above?

SELECT * FROM report_user_cat_count AS rucc WHERE rucc.html_bytes > 20000000::BIGINT;

SELECT * FROM report_user_cat_count AS rucc WHERE user_id = 42 AND utc_date = NOW();

SELECT * FROM report_user_cat_count AS rucc WHERE user_id = 42;

SELECT * FROM report_user_cat_count AS rucc WHERE user_id < 1000 AND utc_date > '2003-01-01'::TIMESTAMP WITH TIME ZONE;

And various timestamps back to 2002-09-19 and user_id's IN(1,42).

-sc

-- 
Sean Chittenden
> In both cases ANALYZE will calculate correlation 1.0 for column X, > and something near zero for column Y. We would like to come out with > index correlation 1.0 for the left-hand case and something much less > (but, perhaps, not zero) for the right-hand case. I don't really see > a way to do this without actually examining the multi-column ordering > relationship during ANALYZE. The only way the second column correlation will be irrelevant is if the first column is already (nearly) unique (enough so, that the second column wont scatter fetches enough to fill the buffer before seeing cache hits). Thus I think when merging correlations you could take nunique into account. corr = corr_1 * (corr_2 * ( 1 - nunique_1 / nrows)) But, I think one (new) correlation metric for the whole index (whole key) and the data pages would actually be sufficient. This metric could imho always be used instead of the per column correlations to calculate index cost. This holds true as long as you walk an index range, and that is what it is all about, no ? ??? Andreas
On Fri, 8 Aug 2003 16:53:48 -0700, Sean Chittenden
<sean@chittenden.org> wrote:
>the problem with your patch was
>that it picked an index less often than the current code when there
>was low correlation.

Maybe bit rot?  What version did you apply the patch against?  Here is
a new version for Postgres 7.3.4:
http://www.pivot.at/pg/16d-correlation_734.diff

The only difference to the previous version is that

    for (nKeys = 1; index->indexkeys[nKeys] != 0; nKeys++)

is now replaced with

    for (nKeys = 1; nKeys < index->ncolumns; nKeys++)

Don't know whether the former just worked by chance when I tested the
7.3.2 version :-(.  Tests with 7.4Beta1 showed that index correlation
comes out too low with the old loop termination condition.  Anyway,
the latter version seems more robust.

In my tests the new index_cost_algorithms (1, 2, 3, 4) gave
consistently lower cost estimates than the old method (set
index_cost_algorithm = 0), except of course for correlations of 1.0 or
0.0, because in these border cases you always get min_IO_cost or
max_IO_cost, respectively.

Care to re-evaluate?

BTW, there's a version of the patch for 7.4Beta1
(http://www.pivot.at/pg/16d-correlation_74b1.diff) which also applies
cleanly against the cvs snapshot from 2003-08-17.

Servus
 Manfred