Re: Bitmap table scan cost per page formula - Mailing list pgsql-hackers

From:           Justin Pryzby
Subject:        Re: Bitmap table scan cost per page formula
Date:
Msg-id:         20171220032459.GJ18184@telsasoft.com
In response to: Bitmap table scan cost per page formula (Haisheng Yuan <hyuan@pivotal.io>)
Responses:      Re: Bitmap table scan cost per page formula
                Re: Bitmap table scan cost per page formula
List:           pgsql-hackers
On Tue, Dec 19, 2017 at 07:55:32PM +0000, Haisheng Yuan wrote:
> Hi hackers,
>
> This is Haisheng Yuan from Greenplum Database.
>
> We had a query in production showing that the planner favors seqscan over
> bitmapscan, and the execution of seqscan is 5x slower than using
> bitmapscan, but the cost of bitmapscan is 2x the cost of seqscan. The
> statistics were updated and quite accurate.
>
> Bitmap table scan uses a formula to interpolate between random_page_cost
> and seq_page_cost to determine the cost per page. In Greenplum Database,
> the default value of random_page_cost is 100, and the default value of
> seq_page_cost is 1. With the original cost formula, random_page_cost
> dominates the final cost result, even though the formula is declared to
> be non-linear. In practice it is still close to linear, so it can't
> reflect the real cost per page when a majority of pages are fetched.
> Therefore, the cost formula is updated to a truly non-linear function
> that reflects both random_page_cost and seq_page_cost for different
> percentages of pages fetched.
>
> Even though PostgreSQL has a much smaller default value of
> random_page_cost (4), the same problem exists there if we change the
> default value.
>
> Old cost formula:
> cost_per_page = random_page_cost - (random_page_cost - seq_page_cost) *
>     sqrt(pages_fetched / T);
> [image: plot of the old cost formula]
>
> New cost formula:
> cost_per_page = seq_page_cost * pow(random_page_cost / seq_page_cost,
>     1 - sqrt(pages_fetched / T));
> [image: plot of the new cost formula]
>
> Below is the graph (credit to Heikki) that plots the total estimated
> cost of a bitmap heap scan, where the table size is 10000 pages, and
> random_page_cost=10 and seq_page_cost=1. The X axis is the number of
> pages fetched, i.e. on the left no pages are fetched, and on the right
> end, at 10000, all pages are fetched. The original formula is in black,
> the new formula in red, and the horizontal blue line shows the cost of
> a seq scan.
> [image: plot comparing total cost under the old and new formulas]

Thanks for caring about bitmap scans ;)
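For concreteness, the two interpolations are easy to compare numerically;
below is a minimal standalone sketch (mine, not code from either tree) that
evaluates both formulas with the same constants as your graph:

    /* Standalone sketch (not from either patch): compare the old and new
     * bitmap heap scan per-page cost interpolations. */
    #include <stdio.h>
    #include <math.h>

    int
    main(void)
    {
        const double random_page_cost = 10.0;
        const double seq_page_cost = 1.0;
        const double T = 10000.0;   /* pages in table */

        for (double pages_fetched = 0; pages_fetched <= T;
             pages_fetched += T / 10)
        {
            double frac = sqrt(pages_fetched / T);

            /* old: arithmetic interpolation on sqrt of fraction fetched */
            double old_cpp = random_page_cost -
                (random_page_cost - seq_page_cost) * frac;

            /* new: geometric interpolation on the same fraction */
            double new_cpp = seq_page_cost *
                pow(random_page_cost / seq_page_cost, 1.0 - frac);

            printf("%6.0f pages: old %6.3f  new %6.3f\n",
                   pages_fetched, old_cpp, new_cpp);
        }
        return 0;
    }

Both formulas agree at the endpoints (random_page_cost when no pages are
fetched, seq_page_cost when all of them are), but the geometric form falls
toward seq_page_cost much sooner, which is what the red curve shows.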
There are a couple of earlier/ongoing discussions on this:

In this old thread:
https://www.postgresql.org/message-id/CAGTBQpZ%2BauG%2BKhcLghvTecm4-cGGgL8vZb5uA3%3D47K7kf9RgJw%40mail.gmail.com
..Claudio Freire <klaussfreire(at)gmail(dot)com> wrote:
> Correct me if I'm wrong, but this looks like the planner not
> accounting for correlation when using bitmap heap scans.
>
> Checking the source, it really doesn't.

..which I think is basically right: the formula does distinguish between the
cases of fetching a small or a large fraction of a table's pages, but doesn't
use correlation. Our issue in that case seems to be mostly a failure of
cost_index to account for fine-scale deviations from large-scale correlation;
but if cost_bitmap accounted for our high correlation metric (>0.99), it
might've helped our case. Note this comment in costsize.c:

 * Save amcostestimate's results for possible use in bitmap scan planning.
 * We don't bother to save indexStartupCost or indexCorrelation, because a
 * BITMAP SCAN DOESN'T CARE ABOUT EITHER.

See more at this recent/ongoing discussion (involving several issues, only
one of which is bitmap cost vs index cost):
https://www.postgresql.org/message-id/flat/20171206214652.GA13889%40telsasoft.com#20171206214652.GA13889@telsasoft.com

Consider the bitmap scans in the two different cases:

1) In Vitaliy's case, bitmap was being chosen in preference to index scan
(due in part to random_page_cost>1), but performed poorly, partially because
the bitmap component must complete before the heap reads can begin, and also
because the heap reads for the test case involving modular division would've
read pages across the entire length of the table, incurring maximum lseek
costs.

2) In my case from ~16 months ago, index scan was being chosen in preference
to bitmap, but the index scan was incurring high seek costs. We would've been
happy if the planner had done a bitmap scan on a weekly-partitioned child
table (with 7 days of data) while querying one day's data (1/7th of the
table), 99% of which would have been strictly sequential page reads incurring
low lseek costs (plus some component of random cost for the remaining 1%
"long tail").

It seems clear to me that sequentially reading 1/7th of the tightly clustered
pages in a table ought to be costed differently than reading 1/7th of the
pages evenly distributed across its entire length.

I started playing with this weeks ago (probably during Vitaliy's problem
report). Is there any reason cost_bitmap_heap_scan shouldn't interpolate
based on correlation from seq_page_cost to random_page_cost, the same as
cost_index does?
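To sketch what I mean (hypothetical code, not anything in PostgreSQL today;
the function name is made up, and since cost_bitmap_heap_scan doesn't
currently see indexCorrelation, per the costsize.c comment above, the value
would first have to be saved and passed through):

    #include <math.h>

    /*
     * Hypothetical sketch: blend the bitmap heap scan's per-page cost
     * toward seq_page_cost as correlation approaches 1, mirroring the
     * csquared interpolation that cost_index already does between its
     * max_IO_cost and min_IO_cost.
     */
    double
    bitmap_cost_per_page(double pages_fetched, double T,
                         double seq_page_cost, double random_page_cost,
                         double indexCorrelation)
    {
        /* uncorrelated case: the existing interpolation on fraction fetched */
        double max_cost_per_page = random_page_cost -
            (random_page_cost - seq_page_cost) * sqrt(pages_fetched / T);

        /* perfectly correlated case: fetched pages are read sequentially */
        double min_cost_per_page = seq_page_cost;

        double csquared = indexCorrelation * indexCorrelation;

        return max_cost_per_page +
            csquared * (min_cost_per_page - max_cost_per_page);
    }

With correlation near 1 that would have costed my case 2) above as nearly
all-sequential reads, rather than charging random_page_cost across the board.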
Justin