Re: [HACKERS] Proposal: Improve bitmap costing for lossy pages - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: [HACKERS] Proposal: Improve bitmap costing for lossy pages
Date
Msg-id CAFiTN-vreUZ9oALKjH76-UNBBqk-ahEg43iAGQTomKBt51aPNg@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Proposal: Improve bitmap costing for lossy pages  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [HACKERS] Proposal: Improve bitmap costing for lossy pages
Re: [HACKERS] Proposal: Improve bitmap costing for lossy pages
List pgsql-hackers
On Thu, Oct 5, 2017 at 8:15 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sun, Sep 17, 2017 at 7:04 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>> I used  lossy_pages = max(0, total_pages - maxentries / 2). as
>> suggesed by Alexander.
>
> Does that formula accurately estimate the number of lossy pages?

I have printed the total_pages, exact_pages and lossy_pages during
planning time, and for testing purpose, I tweak the code a bit so that
it doesn't consider lossy_pages in cost calculation (same as base
code).

I have tested TPCH scale factor 20. at different work_mem(4MB, 20MB,
64MB) and noted down the estimated pages vs actual pages.

Analysis: The estimated value of the lossy_pages is way higher than
its actual value and reason is that the total_pages calculated by the
"Mackert and Lohman formula" is not correct.

work_mem=4 MB

query:4
estimated: total_pages=552472.000000 exact_pages=32768.000000
lossy_pages=519704.000000
actual:    exact=18548 lossy=146141

query:6
estimated: total_pages=1541449.000000 exact_pages=32768.000000
lossy_pages=1508681.000000
actual:    exact=13417 lossy=430385

query:8
estimated:  total_pages=552472.000000 exact_pages=32768.000000
lossy_pages=519704.000000
actual:     exact=56869 lossy=495603

query:14
estimated:  total_pages=1149603.000000 exact_pages=32768.000000
lossy_pages=1116835.000000
actual:     exact=17115 lossy=280949

work_mem: 20 MB
query:4
estimated:  total_pages=552472.000000 exact_pages=163840.000000
lossy_pages=388632.000000
actual:     exact=109856 lossy=57761

query:6
estimated:   total_pages=1541449.000000 exact_pages=163840.000000
lossy_pages=1377609.000000
actual:      exact=59771 lossy=397956

query:8
estimated:  total_pages=552472.000000 exact_pages=163840.000000
lossy_pages=388632.000000
actual:     Heap Blocks: exact=221777 lossy=330695

query:14
estimated:  total_pages=1149603.000000 exact_pages=163840.000000
lossy_pages=985763.000000
actual:     exact=63381 lossy=235513

work_mem:64 MB
query:4
estimated:  total_pages=552472.000000 exact_pages=552472.000000
lossy_pages=0.000000
actual:     exact=166005 lossy=0

query:6
estimated:  total_pages=1541449.000000 exact_pages=524288.000000
lossy_pages=1017161.000000
actual:     exact=277717 lossy=185919

query:8
estimated: total_pages=552472.000000 exact_pages=552472.000000
lossy_pages=0.000000
actual:    exact=552472 lossy=0

query:14
estimated:  total_pages=1149603.000000 exact_pages=524288.000000
lossy_pages=625315.000000
actual:     exact=309091 lossy=0


>
> The performance results look good, but that's a slightly different
> thing from whether the estimate is accurate.
>
> +    nbuckets = tbm_calculate_entires(maxbytes);
>
> entires?

changed to
+ tbm->maxentries = (int) tbm_calculate_entires(maxbytes);


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

pgsql-hackers by date:

Previous
From: Noah Misch
Date:
Subject: Re: [HACKERS] Still another race condition in recovery TAP tests
Next
From: Amit Khandekar
Date:
Subject: Re: [HACKERS] Parallel Append implementation