Thread: bitmaps and correlation
It's a new year and I'm getting reflective, so resuming a portion of a
conversation we had here:
https://www.postgresql.org/message-id/CAMkU%3D1yVbwEAugaCmKWxjaX15ZduWee45%2B_DqCw--d_3N_O_%3DQ%40mail.gmail.com

Find attached a patch which implements use of the correlation statistic in
costing for bitmap scans.

An open question in my mind is how to combine the correlation statistic with
the existing cost_per_page:
. sqrt(a^2+b^2)
. MIN()
. multiply existing computation by new correlation component

On Wed, Dec 20, 2017 at 09:55:40PM -0800, Jeff Janes wrote:
> On Tue, Dec 19, 2017 at 7:25 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
> > I started playing with this weeks ago (probably during Vitaliy's problem
> > report).  Is there any reason cost_bitmap_heap_scan shouldn't interpolate
> > based on correlation from seq_page_cost to rand_page_cost, same as
> > cost_index ?
>
> I think that doing something like that is a good idea in general, but someone
> has to implement the code, and so far no one seems enthused to do so.  You
> seem pretty interested in the topic, so....

I tested the patch using the CDR data which was causing issues for us years
ago:
https://www.postgresql.org/message-id/20160524173914.GA11880%40telsasoft.com

Note: since that original problem report:
. the SAN is backed by SSD rather than rotational storage;
. we're using relkind=p partitioned tables;
. PG12 uses pread() rather than lseek()+read(), so the overhead of the extra
  seek is avoided (but it probably wasn't a substantial component of the
  problem);

Unpatched, I'm unable to get a bitmap scan without disabling indexscan and
seqscan:

| Bitmap Heap Scan on cdrs_huawei_pgwrecord_2018_12_25  (cost=55764.07..1974230.46 rows=2295379 width=1375)
|   Recheck Cond: ((recordopeningtime >= '2018-12-25 05:00:00-06'::timestamp with time zone) AND (recordopeningtime <= '2018-12-25 10:00:00-06'::timestamp with time zone))
|   ->  Bitmap Index Scan on cdrs_huawei_pgwrecord_2018_12_25_recordopeningtime_idx  (cost=0.00..55190.22 rows=2295379 width=0)
|         Index Cond: ((recordopeningtime >= '2018-12-25 05:00:00-06'::timestamp with time zone) AND (recordopeningtime <= '2018-12-25 10:00:00-06'::timestamp with time zone))

Patched, I get a bitmap scan once random_page_cost is reduced enough that the
startup cost (the index scan component) is not prohibitive.  But for
simplicity, this just forces a bitmap scan by setting enable_indexscan=off:

| Bitmap Heap Scan on cdrs_huawei_pgwrecord_2018_12_25  (cost=55764.07..527057.45 rows=2295379 width=1375)
|   Recheck Cond: ((recordopeningtime >= '2018-12-25 05:00:00-06'::timestamp with time zone) AND (recordopeningtime <= '2018-12-25 10:00:00-06'::timestamp with time zone))
|   ->  Bitmap Index Scan on cdrs_huawei_pgwrecord_2018_12_25_recordopeningtime_idx  (cost=0.00..55190.22 rows=2295379 width=0)
|         Index Cond: ((recordopeningtime >= '2018-12-25 05:00:00-06'::timestamp with time zone) AND (recordopeningtime <= '2018-12-25 10:00:00-06'::timestamp with time zone))

That addresses the issue we had, in part.  A remaining problem is that
correlation fails to distinguish between a "fresh" index and a fragmented
index, so heap access through a correlated index may look cheaper than it is.
Which is why I still have to lower random_page_cost to get a bitmap scan.

Justin
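For concreteness, the three candidate combinations would look something like
this (a sketch only; cost_per_page is the existing sqrt-based interpolation,
cost_per_page_corr is a hypothetical correlation-based per-page cost, and the
normalization in option 3 is one guess among several -- none of these names
are from committed code):

	/* 1. combine the two components like orthogonal distances */
	cost_per_page = sqrt(cost_per_page * cost_per_page +
						 cost_per_page_corr * cost_per_page_corr);

	/* 2. take whichever component is cheaper */
	cost_per_page = Min(cost_per_page, cost_per_page_corr);

	/* 3. scale the existing computation by the correlation component,
	 * normalized so an uncorrelated index leaves it unchanged */
	cost_per_page *= cost_per_page_corr / spc_random_page_cost;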
Attached for real.
Attached is a fixed and rebasified patch for cfbot.
Included inline for conceptual review.

From f3055a5696924427dda280da702c41d2d2796a24 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Tue, 1 Jan 2019 16:17:28 -0600
Subject: [PATCH v2] Use correlation statistic in costing bitmap scans..

Same as for an index scan, an uncorrelated bitmap (like modulus) which
accesses a certain number of pages across the entire length of a table
should have a cost estimate heavily weighted by random access, compared to a
bitmap scan which accesses the same number of pages across a small portion
of the table.

Note, Tom points out that there are cases where a column could be
tightly-clumped without being highly-ordered.  Since we have correlation
already, we use that for now, and if someone creates a statistic for
clumpiness, we'll re-evaluate at some later date.
---
 src/backend/optimizer/path/costsize.c | 84 ++++++++++++++++++++++++++++-------
 src/backend/optimizer/path/indxpath.c |  8 ++--
 src/include/nodes/pathnodes.h         |  3 ++
 src/include/optimizer/cost.h          |  2 +-
 4 files changed, 77 insertions(+), 20 deletions(-)

diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index c5f6593..aaac29a 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -549,11 +549,12 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,

 	/*
 	 * Save amcostestimate's results for possible use in bitmap scan planning.
-	 * We don't bother to save indexStartupCost or indexCorrelation, because a
-	 * bitmap scan doesn't care about either.
+	 * We don't bother to save indexStartupCost, because a
+	 * bitmap scan doesn't care.
 	 */
 	path->indextotalcost = indexTotalCost;
 	path->indexselectivity = indexSelectivity;
+	path->indexCorrelation = indexCorrelation;

 	/* all costs for touching index itself included here */
 	startup_cost += indexStartupCost;
@@ -986,12 +987,33 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
 	 * appropriate to charge spc_seq_page_cost apiece.  The effect is
 	 * nonlinear, too.  For lack of a better idea, interpolate like this to
 	 * determine the cost per page.
+	 * Note this works at PAGE granularity, so even if we read 1% of a
+	 * table's tuples, if we have to read nearly every page, it should be
+	 * considered sequential.
 	 */
-	if (pages_fetched >= 2.0)
+	if (pages_fetched >= 2.0) {
+		double correlation, cost_per_page2;
 		cost_per_page = spc_random_page_cost -
 			(spc_random_page_cost - spc_seq_page_cost)
 			* sqrt(pages_fetched / T);
-	else
+
+		// XXX: interpolate based on correlation from seq_page_cost to rand_page_cost?
+		// a highly correlated bitmap scan 1) likely reads fewer pages; and,
+		// 2) at higher "density" (more sequential).
+		// double correlation = get_indexpath_correlation(root, bitmapqual);
+		correlation = ((IndexPath *)bitmapqual)->indexCorrelation;
+		cost_per_page2 = spc_seq_page_cost +
+			(1-correlation*correlation)*(spc_random_page_cost - spc_seq_page_cost); // XXX: *sqrt(pages_fetched/T) ?
+		// There are two variables: fraction of pages(T) and correlation.
+		// If T is high, giving sequential reads, we want low cost_per_page regardless of correlation.
+		// If correlation is high, we (probably) want low cost per page.
+		// ...the exception is if someone does an =ANY() query on a list of non-consecutive values.
+		// Something like start_time=ANY('2017-01-01', '2017-02-01',...)
+		// which reads a small number of rows from pages all across the length of the table.
+		// But index scan doesn't seem to address that at all, so leave it alone for now.
+		cost_per_page=Min(cost_per_page, cost_per_page2);
+		// cost_per_page=sqrt(cost_per_page*cost_per_page + cost_per_page2*cost_per_page2);
+	} else
 		cost_per_page = spc_random_page_cost;

 	run_cost += pages_fetched * cost_per_page;
@@ -1035,15 +1057,16 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,

 /*
  * cost_bitmap_tree_node
- *		Extract cost and selectivity from a bitmap tree node (index/and/or)
+ *		Extract cost, selectivity, and correlation from a bitmap tree node (index/and/or)
  */
 void
-cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec)
+cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec, double *correlation)
 {
 	if (IsA(path, IndexPath))
 	{
 		*cost = ((IndexPath *) path)->indextotalcost;
 		*selec = ((IndexPath *) path)->indexselectivity;
+		if (correlation) *correlation = ((IndexPath *) path)->indexCorrelation;

 		/*
 		 * Charge a small amount per retrieved tuple to reflect the costs of
@@ -1057,11 +1080,13 @@ cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec)
 	{
 		*cost = path->total_cost;
 		*selec = ((BitmapAndPath *) path)->bitmapselectivity;
+		if (correlation) *correlation = ((BitmapAndPath *) path)->bitmapcorrelation;
 	}
 	else if (IsA(path, BitmapOrPath))
 	{
 		*cost = path->total_cost;
 		*selec = ((BitmapOrPath *) path)->bitmapselectivity;
+		if (correlation) *correlation = ((BitmapOrPath *) path)->bitmapcorrelation;
 	}
 	else
 	{
@@ -1084,8 +1109,9 @@ void
 cost_bitmap_and_node(BitmapAndPath *path, PlannerInfo *root)
 {
 	Cost		totalCost;
-	Selectivity selec;
+	Selectivity selec, minsubselec;
 	ListCell   *l;
+	double		correlation;

 	/*
 	 * We estimate AND selectivity on the assumption that the inputs are
@@ -1097,22 +1123,32 @@ cost_bitmap_and_node(BitmapAndPath *path, PlannerInfo *root)
 	 * definitely too simplistic?
 	 */
 	totalCost = 0.0;
-	selec = 1.0;
+	minsubselec = selec = 1.0;
+	correlation = 0;
 	foreach(l, path->bitmapquals)
 	{
 		Path	   *subpath = (Path *) lfirst(l);
 		Cost		subCost;
 		Selectivity subselec;
+		double		subcorrelation;

-		cost_bitmap_tree_node(subpath, &subCost, &subselec);
+		cost_bitmap_tree_node(subpath, &subCost, &subselec, &subcorrelation);

 		selec *= subselec;

+		/* For an AND node, use the correlation of its most-selective subpath */
+		if (subselec<=minsubselec) {
+			correlation = subcorrelation;
+			minsubselec = subselec;
+		}
+
 		totalCost += subCost;
 		if (l != list_head(path->bitmapquals))
+			// ??? XXX && !IsA(subpath, IndexPath))
 			totalCost += 100.0 * cpu_operator_cost;
 	}
 	path->bitmapselectivity = selec;
+	path->bitmapcorrelation = correlation;
 	path->path.rows = 0;		/* per above, not used */
 	path->path.startup_cost = totalCost;
 	path->path.total_cost = totalCost;
@@ -1128,8 +1164,9 @@ void
 cost_bitmap_or_node(BitmapOrPath *path, PlannerInfo *root)
 {
 	Cost		totalCost;
-	Selectivity selec;
+	Selectivity selec, maxsubselec;
 	ListCell   *l;
+	double		correlation;

 	/*
 	 * We estimate OR selectivity on the assumption that the inputs are
@@ -1142,23 +1179,32 @@ cost_bitmap_or_node(BitmapOrPath *path, PlannerInfo *root)
 	 * optimized out when the inputs are BitmapIndexScans.
 	 */
 	totalCost = 0.0;
-	selec = 0.0;
+	maxsubselec = selec = 0.0;
+	correlation = 0;
 	foreach(l, path->bitmapquals)
 	{
 		Path	   *subpath = (Path *) lfirst(l);
 		Cost		subCost;
 		Selectivity subselec;
+		double		subcorrelation;

-		cost_bitmap_tree_node(subpath, &subCost, &subselec);
+		cost_bitmap_tree_node(subpath, &subCost, &subselec, &subcorrelation);

 		selec += subselec;

+		/* For an OR node, use the correlation of its least-selective subpath */
+		if (subselec>=maxsubselec) {
+			correlation = subcorrelation;
+			maxsubselec = subselec;
+		}
+
 		totalCost += subCost;
 		if (l != list_head(path->bitmapquals) &&
 			!IsA(subpath, IndexPath))
 			totalCost += 100.0 * cpu_operator_cost;
 	}
 	path->bitmapselectivity = Min(selec, 1.0);
+	path->bitmapcorrelation = correlation;
 	path->path.rows = 0;		/* per above, not used */
 	path->path.startup_cost = totalCost;
 	path->path.total_cost = totalCost;
@@ -5510,8 +5556,11 @@ compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel, Path *bitmapqual,
 {
 	Cost		indexTotalCost;
 	Selectivity indexSelectivity;
+	double		indexCorrelation;
 	double		T;
-	double		pages_fetched;
+	double		pages_fetched,
+				pages_fetchedMIN,
+				pages_fetchedMAX;
 	double		tuples_fetched;
 	double		heap_pages;
 	long		maxentries;
@@ -5520,7 +5569,7 @@ compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel, Path *bitmapqual,
 	 * Fetch total cost of obtaining the bitmap, as well as its total
 	 * selectivity.
 	 */
-	cost_bitmap_tree_node(bitmapqual, &indexTotalCost, &indexSelectivity);
+	cost_bitmap_tree_node(bitmapqual, &indexTotalCost, &indexSelectivity, &indexCorrelation);

 	/*
 	 * Estimate number of main-table pages fetched.
@@ -5534,7 +5583,12 @@ compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel, Path *bitmapqual,
 	 * the same as the Mackert and Lohman formula for the case T <= b (ie, no
 	 * re-reads needed).
 	 */
-	pages_fetched = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
+	pages_fetchedMAX = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
+
+	/* pages_fetchedMIN is for the perfectly correlated case (csquared=1) */
+	pages_fetchedMIN = ceil(indexSelectivity * (double) baserel->pages);
+
+	pages_fetched = pages_fetchedMAX + indexCorrelation*indexCorrelation*(pages_fetchedMIN - pages_fetchedMAX);

 	/*
 	 * Calculate the number of pages fetched from the heap.  Then based on
diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index 37b257c..2a3db34 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -1467,8 +1467,8 @@ choose_bitmap_and(PlannerInfo *root, RelOptInfo *rel, List *paths)
 				Selectivity nselec;
 				Selectivity oselec;

-				cost_bitmap_tree_node(pathinfo->path, &ncost, &nselec);
-				cost_bitmap_tree_node(pathinfoarray[i]->path, &ocost, &oselec);
+				cost_bitmap_tree_node(pathinfo->path, &ncost, &nselec, NULL);
+				cost_bitmap_tree_node(pathinfoarray[i]->path, &ocost, &oselec, NULL);
 				if (ncost < ocost)
 					pathinfoarray[i] = pathinfo;
 			}
@@ -1580,8 +1580,8 @@ path_usage_comparator(const void *a, const void *b)
 	Selectivity aselec;
 	Selectivity bselec;

-	cost_bitmap_tree_node(pa->path, &acost, &aselec);
-	cost_bitmap_tree_node(pb->path, &bcost, &bselec);
+	cost_bitmap_tree_node(pa->path, &acost, &aselec, NULL);
+	cost_bitmap_tree_node(pb->path, &bcost, &bselec, NULL);

 	/*
 	 * If costs are the same, sort by selectivity.
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 23a06d7..beaac03 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -1181,6 +1181,7 @@ typedef struct IndexPath
 	ScanDirection indexscandir;
 	Cost		indextotalcost;
 	Selectivity indexselectivity;
+	double		indexCorrelation;
 } IndexPath;

 /*
@@ -1261,6 +1262,7 @@ typedef struct BitmapAndPath
 	Path		path;
 	List	   *bitmapquals;	/* IndexPaths and BitmapOrPaths */
 	Selectivity bitmapselectivity;
+	double		bitmapcorrelation;
 } BitmapAndPath;

 /*
@@ -1274,6 +1276,7 @@ typedef struct BitmapOrPath
 	Path		path;
 	List	   *bitmapquals;	/* IndexPaths and BitmapAndPaths */
 	Selectivity bitmapselectivity;
+	double		bitmapcorrelation;
 } BitmapOrPath;

 /*
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index b3d0b4f..9a28665 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -79,7 +79,7 @@ extern void cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
 							   Path *bitmapqual, double loop_count);
 extern void cost_bitmap_and_node(BitmapAndPath *path, PlannerInfo *root);
 extern void cost_bitmap_or_node(BitmapOrPath *path, PlannerInfo *root);
-extern void cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec);
+extern void cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec, double *correlation);
 extern void cost_tidscan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
 						 List *tidquals, ParamPathInfo *param_info);
 extern void cost_subqueryscan(SubqueryScanPath *path, PlannerInfo *root,
--
2.7.4
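To see what the compute_bitmap_pages() change does, here is a worked example
with made-up round numbers (not measured from the CDR tables above):

	T = 10000 heap pages, tuples_fetched = 10000, indexSelectivity = 0.01

	pages_fetchedMAX = (2 * 10000 * 10000) / (2 * 10000 + 10000) = ~6667  (uncorrelated)
	pages_fetchedMIN = ceil(0.01 * 10000)                        = 100    (perfectly correlated)

	indexCorrelation = 0.995, so csquared = ~0.99:

	pages_fetched = 6667 + 0.99 * (100 - 6667) = ~166

So a strongly correlated index brings the page estimate down near the
perfectly-correlated minimum, while an uncorrelated one keeps the full
Mackert-Lohman estimate.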
On Sat, Nov 02, 2019 at 03:26:17PM -0500, Justin Pryzby wrote:
> Attached is a fixed and rebasified patch for cfbot.
> Included inline for conceptual review.

Your patch still causes two regression tests to fail per Mr Robot's report:
join and select.  Could you look at those problems?  I have moved the patch
to the next CF, waiting on author.
--
Michael
On Sun, Dec 01, 2019 at 12:34:37PM +0900, Michael Paquier wrote:
> On Sat, Nov 02, 2019 at 03:26:17PM -0500, Justin Pryzby wrote:
> > Attached is a fixed and rebasified patch for cfbot.
> > Included inline for conceptual review.
>
> Your patch still causes two regression tests to fail per Mr Robot's
> report: join and select.  Could you look at those problems?  I have
> moved the patch to the next CF, waiting on author.

The regression failures seem to be due to deliberate, expected plan changes.

I think the patch still needs some discussion, but find attached a rebasified
patch including the regression diffs.

Justin
Find attached a cleaned-up patch.
For now, I updated the regress/expected/ output, but I think the test may
have to be updated to do what it was written to do.
On Tue, Jan 7, 2020 at 1:29 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> Find attached a cleaned-up patch.
> For now, I updated the regress/expected/ output, but I think the test may
> have to be updated to do what it was written to do.

I have noticed that in "cost_index" we have used the indexCorrelation for
computing the run_cost, not the number of pages, whereas in your patch you
have used it for computing the number of pages.  Any reason for the same?

cost_index
{
..
	/*
	 * Now interpolate based on estimated index order correlation to get total
	 * disk I/O cost for main table accesses.
	 */
	csquared = indexCorrelation * indexCorrelation;

	run_cost += max_IO_cost + csquared * (min_IO_cost - max_IO_cost);
}

Patch
-	pages_fetched = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
+	pages_fetchedMAX = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
+
+	/* pages_fetchedMIN is for the perfectly correlated case (csquared=1) */
+	pages_fetchedMIN = ceil(indexSelectivity * (double) baserel->pages);
+
+	pages_fetched = pages_fetchedMAX + indexCorrelation*indexCorrelation*(pages_fetchedMIN - pages_fetchedMAX);

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Tue, Jan 07, 2020 at 09:21:03AM +0530, Dilip Kumar wrote:
> On Tue, Jan 7, 2020 at 1:29 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
> >
> > Find attached a cleaned-up patch.
> > For now, I updated the regress/expected/ output, but I think the test may
> > have to be updated to do what it was written to do.
>
> I have noticed that in "cost_index" we have used the indexCorrelation for
> computing the run_cost, not the number of pages, whereas in your patch you
> have used it for computing the number of pages.  Any reason for the same?

As Jeff has pointed out, high correlation has two effects in cost_index():
1) the number of pages read will be less;
2) the pages will be read more sequentially.

cost_index reuses the pages_fetched variable, so (1) isn't particularly
clear, but it does essentially this:

	/* max_IO_cost is for the perfectly uncorrelated case (csquared=0) */
	pages_fetched(MAX) = index_pages_fetched(tuples_fetched, baserel->pages,
											 (double) index->pages, root);
	max_IO_cost = pages_fetchedMAX * spc_random_page_cost;

	/* min_IO_cost is for the perfectly correlated case (csquared=1) */
	pages_fetched(MIN) = ceil(indexSelectivity * (double) baserel->pages);
	min_IO_cost = (pages_fetchedMIN - 1) * spc_seq_page_cost;

My patch 1) changes compute_bitmap_pages() to interpolate pages_fetched using
the correlation; pages_fetchedMIN is new:

> Patch
> -	pages_fetched = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
> +	pages_fetchedMAX = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
> +
> +	/* pages_fetchedMIN is for the perfectly correlated case (csquared=1) */
> +	pages_fetchedMIN = ceil(indexSelectivity * (double) baserel->pages);
> +
> +	pages_fetched = pages_fetchedMAX + indexCorrelation*indexCorrelation*(pages_fetchedMIN - pages_fetchedMAX);

And, 2) it also computes cost_per_page by interpolating between seq_page_cost
and random_page_cost:

+	cost_per_page_corr = spc_random_page_cost -
+		(spc_random_page_cost - spc_seq_page_cost)
+		* (1-correlation*correlation);

Thanks for looking.  I'll update the names of pages_fetchedMIN/MAX in my
patch for consistency with cost_index.

Justin
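Put side by side (paraphrasing both; the second block is the patch's formula
rewritten with csquared factored out, to match cost_index's spelling):

	/* cost_index(): correlation interpolates the total I/O cost between
	 * the fully-random and fully-sequential estimates */
	csquared = indexCorrelation * indexCorrelation;
	run_cost += max_IO_cost + csquared * (min_IO_cost - max_IO_cost);

	/* patched compute_bitmap_pages(): the same csquared interpolates the
	 * estimated page count between the uncorrelated and perfectly
	 * correlated cases */
	csquared = indexCorrelation * indexCorrelation;
	pages_fetched = pages_fetchedMAX + csquared * (pages_fetchedMIN - pages_fetchedMAX);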
On Mon, Jan 06, 2020 at 11:26:06PM -0600, Justin Pryzby wrote:
> As Jeff has pointed out, high correlation has two effects in cost_index():
> 1) the number of pages read will be less;
> 2) the pages will be read more sequentially.
>
> cost_index reuses the pages_fetched variable, so (1) isn't particularly clear,

I tried to make this more clear in 0001.

> +	cost_per_page_corr = spc_random_page_cost -
> +		(spc_random_page_cost - spc_seq_page_cost)
> +		* (1-correlation*correlation);

And fixed a bug: this should be c*c, not 1-c*c.
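The endpoints show why c*c is the right factor:

	c = 0 (uncorrelated):
		cost_per_page_corr = spc_random_page_cost - (spc_random_page_cost - spc_seq_page_cost) * 0
		                   = spc_random_page_cost

	c = 1 (perfectly correlated):
		cost_per_page_corr = spc_random_page_cost - (spc_random_page_cost - spc_seq_page_cost) * 1
		                   = spc_seq_page_cost

With the (1-c*c) factor those come out reversed, charging the fully random
cost exactly when the index is perfectly correlated.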
There were no comments last month, so rebased, fixed tests, and kicked to next CF. -- Justin
Status update for a commitfest entry.

According to cfbot, the patch fails to apply.  Could you please send a
rebased version?

I wonder why this patch has hung around so long without a review.  Maybe it
would help to move the discussion forward if you provided more examples of
queries that can benefit from this improvement?

The first patch is simply a refactoring and I don't see any possible
objections against it.  The second patch also looks fine to me.  The logic is
understandable and the code is neat.

It wouldn't hurt to add a comment for this computation, though.

+	pages_fetched = pages_fetchedMAX + indexCorrelation*indexCorrelation*(pages_fetchedMIN - pages_fetchedMAX);

The new status of this patch is: Waiting on Author
On Fri, Nov 06, 2020 at 01:51:26PM +0000, Anastasia Lubennikova wrote:
> I wonder why this patch has hung around so long without a review.  Maybe it
> would help to move the discussion forward if you provided more examples of
> queries that can benefit from this improvement?

Thanks for looking.

The explanation is that the planner currently gives index scans a cost
"discount" for correlation.  Jeff Janes has pointed out that there are two
discounts: 1) fewer pages are read; and, 2) lower cost-per-page.  This patch
aims to give bitmap scans the same benefits.  A "dense" bitmap will read
fewer pages, more sequentially.

Tom pointed out that "correlation" isn't a perfect metric for this, since the
index might be "clumpy" without being well-ordered, which doesn't matter for
bitmap scans, which access in physical order anyway.  In those cases, the
correlation logic would fail to reduce the estimated cost of a bitmap scan,
even though the actual cost is less (same as now).  This is true, but my goal
is to give bitmap scans at least the same benefit as index scans, even if
there are additional "discounts" which aren't yet being considered.

This was an issue for me in the past when the planner chose to scan an index,
but it was slower than projected (for reasons unrelated to this patch).
Making the bitmap cost account for high correlation was one step towards
addressing that.  Since then, we switched to brin indexes, which force bitmap
scans.
https://www.postgresql.org/message-id/flat/20160524173914.GA11880%40telsasoft.com

Here's an example.

CREATE TABLE t AS SELECT a,b FROM generate_series(1,999) a, generate_series(1,999) b ORDER BY a+b/9.9;
CREATE INDEX ON t(a);

postgres=# SELECT attname, correlation FROM pg_stats WHERE tablename = 't';
 a | 0.9951819
 b | 0.10415093

postgres=# explain analyze SELECT * FROM t WHERE a BETWEEN 55 AND 77;
 Index Scan using t_a_idx on t  (cost=0.42..810.89 rows=22683 width=8) (actual time=0.292..66.657 rows=22977 loops=1)

vs (without my patch, with SET enable_indexscan=off):
 Bitmap Heap Scan on t  (cost=316.93..5073.17 rows=22683 width=8) (actual time=10.810..26.633 rows=22977 loops=1)

vs (with my patch, with SET enable_indexscan=off):
postgres=# explain analyze SELECT * FROM t WHERE a BETWEEN 55 AND 77;
 Bitmap Heap Scan on t  (cost=316.93..823.84 rows=22683 width=8) (actual time=10.742..33.279 rows=22977 loops=1)

So the bitmap scan is actually cheaper, but its cost estimate is a lot
higher.  My patch improves but doesn't completely "fix" that - the bitmap
scan is still costed as more expensive, though it happens to be faster.  This
is probably not even a particularly good example, as it's a small table
cached in RAM.  There are always going to be cases like this, certainly near
the costs where the plan changes "shape".  I think a cost difference of 10
here is very reasonable (cpu_operator_cost, probably), but a cost difference
of 5x is not.

There aren't many regression test changes.  Probably partially because bitmap
scans have an overhead (the heap scan cannot start until after the index scan
finishes), and we avoid large tests.

If there's no interest in the patch, I guess we should just close it rather
than letting it rot.

> The first patch is simply a refactoring and I don't see any possible
> objections against it.  The second patch also looks fine to me.  The logic
> is understandable and the code is neat.
>
> It wouldn't hurt to add a comment for this computation, though.
> +	pages_fetched = pages_fetchedMAX + indexCorrelation*indexCorrelation*(pages_fetchedMIN - pages_fetchedMAX);

You're right.
It's like this:

// interpolate between c==0: pages_fetched=max and c==1: pages_fetched=min
pages_fetched = min + (max-min)*(1-c**2)
pages_fetched = min + max*(1-c**2) - min*(1-c**2)
pages_fetched = max*(1-c**2) + min - min*(1-c**2)
pages_fetched = max*(1-c**2) + min*(c**2)
pages_fetched = max - max*c**2 + min*(c**2)
pages_fetched = max + min*(c**2) - max*c**2
pages_fetched = max + c**2 * (min-max)

I'm not sure if there's a reason why it's written like that, but (min-max)
looks odd, so I wrote it like:

pages_fetched = max - c**2 * (max-min)

> The new status of this patch is: Waiting on Author
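As a standalone sanity check of that rewritten form (an illustrative helper
only, not code from the patch):

#include <assert.h>

/*
 * Interpolate the estimated page count between the uncorrelated (max)
 * and perfectly-correlated (min) cases.  Illustrative helper; the patch
 * open-codes this in compute_bitmap_pages().
 */
static double
interpolate_pages_fetched(double min, double max, double correlation)
{
	double		csquared = correlation * correlation;

	/* algebraically identical to max + csquared * (min - max) */
	return max - csquared * (max - min);
}

static void
check_endpoints(void)
{
	/* c == 0: fully uncorrelated, expect the Mackert-Lohman estimate */
	assert(interpolate_pages_fetched(100.0, 6667.0, 0.0) == 6667.0);
	/* c == 1: perfectly correlated, expect ceil(selectivity * pages) */
	assert(interpolate_pages_fetched(100.0, 6667.0, 1.0) == 100.0);
}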
On 06/11/2020 19:57, Justin Pryzby wrote:
> On Fri, Nov 06, 2020 at 01:51:26PM +0000, Anastasia Lubennikova wrote:
>> The first patch is simply a refactoring and I don't see any possible
>> objections against it.  The second patch also looks fine to me.  The
>> logic is understandable and the code is neat.

+1

>> It wouldn't hurt to add a comment for this computation, though.
>> +	pages_fetched = pages_fetchedMAX + indexCorrelation*indexCorrelation*(pages_fetchedMIN - pages_fetchedMAX);
>
> You're right.  It's like this:
> // interpolate between c==0: pages_fetched=max and c==1: pages_fetched=min
> pages_fetched = min + (max-min)*(1-c**2)
> pages_fetched = min + max*(1-c**2) - min*(1-c**2)
> pages_fetched = max*(1-c**2) + min - min*(1-c**2)
> pages_fetched = max*(1-c**2) + min*(c**2)
> pages_fetched = max - max*c**2 + min*(c**2)
> pages_fetched = max + min*(c**2) - max*c**2
> pages_fetched = max + c**2 * (min-max)
>
> I'm not sure if there's a reason why it's written like that, but (min-max)
> looks odd, so I wrote it like:
> pages_fetched = max - c**2 * (max-min)

I agree min-max looks odd.  max - c**2 * (max-min) looks a bit odd too,
though.  Whatever we do here, though, I'd suggest that we keep it consistent
with cost_index().

Other than that, and a quick pgindent run, this seems ready to me.  I'll
mark it as Ready for Committer.

- Heikki
Heikki Linnakangas <hlinnaka@iki.fi> writes:
> Other than that, and a quick pgindent run, this seems ready to me.  I'll
> mark it as Ready for Committer.

I dunno, this seems largely misguided to me.

It's already the case that index correlation is just not the right stat for
this purpose, since it doesn't give you much of a toehold on whether a
particular scan is going to be accessing tightly-clumped data.  For specific
kinds of index conditions, such as a range query on a btree index, maybe you
could draw that conclusion ... but this patch isn't paying any attention to
the index condition in use.

And then the rules for bitmap AND and OR correlations, if not just plucked
out of the air, still seem *far* too optimistic.  As an example, even if my
individual indexes are perfectly correlated and so a probe would touch only
one page, OR'ing ten such probes together is likely going to touch ten
different pages.  But unless I'm misreading the patch, it's going to report
back an OR correlation that corresponds to touching one page.

Even if we assume that the correlation is nonetheless predictive of how big
a part of the table we'll be examining, I don't see a lot of basis for
deciding that the equations the patch adds to cost_bitmap_heap_scan are the
right ones.

I'd have expected this thread to focus a whole lot more on actual examples
than it has done, so that we could have some confidence that these equations
have something to do with reality.

			regards, tom lane
On Sat, Nov 28, 2020 at 5:49 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Heikki Linnakangas <hlinnaka@iki.fi> writes:
> > Other than that, and a quick pgindent run, this seems ready to me.  I'll
> > mark it as Ready for Committer.
>
> I dunno, this seems largely misguided to me.
>
> It's already the case that index correlation is just not the right
> stat for this purpose, since it doesn't give you much of a toehold
> on whether a particular scan is going to be accessing tightly-clumped
> data.  For specific kinds of index conditions, such as a range query
> on a btree index, maybe you could draw that conclusion ... but this
> patch isn't paying any attention to the index condition in use.
>
> And then the rules for bitmap AND and OR correlations, if not just
> plucked out of the air, still seem *far* too optimistic.  As an
> example, even if my individual indexes are perfectly correlated and
> so a probe would touch only one page, OR'ing ten such probes together
> is likely going to touch ten different pages.  But unless I'm
> misreading the patch, it's going to report back an OR correlation
> that corresponds to touching one page.
>
> Even if we assume that the correlation is nonetheless predictive of
> how big a part of the table we'll be examining, I don't see a lot
> of basis for deciding that the equations the patch adds to
> cost_bitmap_heap_scan are the right ones.
>
> I'd have expected this thread to focus a whole lot more on actual
> examples than it has done, so that we could have some confidence
> that these equations have something to do with reality.

Status update for a commitfest entry.

The discussion has been inactive since the end of the last CF.  It seems to
me that we need some discussion on the point Tom mentioned.  It looks like
either "Needs Review" or "Ready for Committer" status, but Justin set it to
"Waiting on Author" on 2020-12-03 by himself.  Are you working on this,
Justin?

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Jan 28, 2021 at 9:51 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Sat, Nov 28, 2020 at 5:49 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > Heikki Linnakangas <hlinnaka@iki.fi> writes:
> > > Other than that, and a quick pgindent run, this seems ready to me.  I'll
> > > mark it as Ready for Committer.
> >
> > I dunno, this seems largely misguided to me.
> >
> > It's already the case that index correlation is just not the right
> > stat for this purpose, since it doesn't give you much of a toehold
> > on whether a particular scan is going to be accessing tightly-clumped
> > data.  For specific kinds of index conditions, such as a range query
> > on a btree index, maybe you could draw that conclusion ... but this
> > patch isn't paying any attention to the index condition in use.
> >
> > And then the rules for bitmap AND and OR correlations, if not just
> > plucked out of the air, still seem *far* too optimistic.  As an
> > example, even if my individual indexes are perfectly correlated and
> > so a probe would touch only one page, OR'ing ten such probes together
> > is likely going to touch ten different pages.  But unless I'm
> > misreading the patch, it's going to report back an OR correlation
> > that corresponds to touching one page.
> >
> > Even if we assume that the correlation is nonetheless predictive of
> > how big a part of the table we'll be examining, I don't see a lot
> > of basis for deciding that the equations the patch adds to
> > cost_bitmap_heap_scan are the right ones.
> >
> > I'd have expected this thread to focus a whole lot more on actual
> > examples than it has done, so that we could have some confidence
> > that these equations have something to do with reality.
>
> Status update for a commitfest entry.
>
> The discussion has been inactive since the end of the last CF.  It seems to
> me that we need some discussion on the point Tom mentioned.  It looks like
> either "Needs Review" or "Ready for Committer" status, but Justin set it to
> "Waiting on Author" on 2020-12-03 by himself.  Are you working on this,
> Justin?

Status update for a commitfest entry.

This patch, which you submitted to this CommitFest, has been awaiting your
attention for more than one month.  As such, we have moved it to "Returned
with Feedback" and removed it from the reviewing queue.  Depending on timing,
this may be reversible, so let us know if there are extenuating
circumstances.  In any case, you are welcome to address the feedback you have
received and resubmit the patch to the next CommitFest.  Thank you for
contributing to PostgreSQL.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/