Re: Show lossy heap block info in EXPLAIN ANALYZE for bitmap heap scan - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Show lossy heap block info in EXPLAIN ANALYZE for bitmap heap scan
Msg-id CA+TgmobxcnD6x_rUa7h680qw3ChpF=qXx_wnQiKpuT3+45n5zw@mail.gmail.com
In response to Show lossy heap block info in EXPLAIN ANALYZE for bitmap heap scan  ("Etsuro Fujita" <fujita.etsuro@lab.ntt.co.jp>)
List pgsql-hackers
On Fri, Dec 27, 2013 at 1:47 AM, Etsuro Fujita
<fujita.etsuro@lab.ntt.co.jp> wrote:
>> I wrote:
>> > Robert Haas wrote:
>> > > I'd be wary of showing a desired value unless it's highly likely to
>> > > be accurate.
>
>> > The desired value is accurately estimated based on (a) the total
>> > number of exact/lossy pages stored in the TIDBitmap and (b) the
>> > following equation in tbm_create(), except for the GIN case where
>> > lossy pages are added to the TIDBitmap by tbm_add_page().
>
> I've found there is another risk of overestimating the desired memory space
> for a BitmapAnded TIDBitmap.  I'm inclined to get rid of the estimation
> functionality from the patch completely, and leave it for future work.
> Attached is a new version of the patch, which shows only fetch block
> information and memory usage information.  I'll add this to the upcoming CF.

I spent some time looking at this tonight.  I don't think the value
that is displayed for the bitmap memory tracking will be accurate in
complex cases.  The bitmap heap scan may sit on top of one or more
bitmap-and or bitmap-or nodes.  When a bitmap-and operation happens,
one of the two bitmaps being combined will be thrown out and the
number of entries in the other bitmap will, perhaps, be decreased.  The
peak memory usage for the surviving bitmap will be reflected in the
number displayed for the bitmap heap scan, but the peak memory usage
for the discarded bitmap will not.  This is wholly arbitrary because
both bitmaps existed at the same time, side by side, and which one we
keep and which one we throw out is essentially random.
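
As an illustration of the asymmetry described above, here is a toy
sketch.  It is not PostgreSQL code: ToyBitmap, peak_entries,
bitmap_and() and reported_peak() are hypothetical names invented for
this example, and "entries" simply stands in for memory usage.  It
assumes only what the paragraph says: both inputs exist side by side,
one survives the AND, and only the survivor's peak is visible to the
node above.

```python
from dataclasses import dataclass


@dataclass
class ToyBitmap:
    """Hypothetical stand-in for a TIDBitmap: a set of heap page numbers."""
    pages: set
    peak_entries: int = 0

    def __post_init__(self):
        # Record the largest size this bitmap ever reached.
        self.peak_entries = len(self.pages)


def bitmap_and(survivor, discarded):
    """Intersect in place; the second bitmap (and its peak) is thrown away."""
    survivor.pages &= discarded.pages
    return survivor


def reported_peak(keep_small):
    """Return (peak seen by the heap scan, peak of both bitmaps together)."""
    small = ToyBitmap(set(range(100)))
    large = ToyBitmap(set(range(50, 10_050)))
    # Both bitmaps are alive at the same time just before the AND.
    combined_peak = small.peak_entries + large.peak_entries
    if keep_small:
        result = bitmap_and(small, large)
    else:
        result = bitmap_and(large, small)
    return result.peak_entries, combined_peak


if __name__ == "__main__":
    print(reported_peak(keep_small=True))   # (100, 10100)
    print(reported_peak(keep_small=False))  # (10000, 10100)
```

The same pair of inputs yields a reported peak of either 100 or 10000
entries depending only on which bitmap is kept, while the combined
footprint was 10100 in both cases.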

I think we could report the results in a more principled way if we
reported the value for each bitmap *index* scan node rather than each
bitmap *heap* scan node.  However, I'm not sure it's really worth it.
I think what people really care about is knowing whether the bitmap
lossified or not, and generally how much got lossified.  The counts of
exact and lossy pages are sufficient for that, without anything
additional - so I'm inclined to think that the best course of action
might be to remove from the patch everything that's concerned with
trying to measure memory usage and just keep the exact/lossy page
counts.
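
A minimal sketch of why the two counters are enough to answer "did it
lossify, and by how much".  This is illustrative only and is not taken
from the patch: summarize_heap_blocks() and its input format (one
boolean per fetched heap page, True when the page came back as lossy)
are assumptions made for the example.

```python
def summarize_heap_blocks(page_is_lossy):
    """Count exact and lossy pages and return the lossy fraction.

    page_is_lossy: iterable of booleans, one per heap page fetched,
    True when the bitmap held only page-level (lossy) information.
    """
    exact = 0
    lossy = 0
    for is_lossy in page_is_lossy:
        if is_lossy:
            lossy += 1
        else:
            exact += 1
    total = exact + lossy
    # Guard against an empty scan so we never divide by zero.
    lossy_fraction = lossy / total if total else 0.0
    return exact, lossy, lossy_fraction


if __name__ == "__main__":
    print(summarize_heap_blocks([False, False, True, False]))  # (3, 1, 0.25)
```

A nonzero lossy count says the bitmap lossified at all; the fraction
says roughly how much, with no memory accounting involved.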

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


