"Qingqing Zhou" <zhouqq@cs.toronto.edu> wrote
>
>> Feasibility: Our bufmgr lock rewrite already makes this possible. But to
>> enable it, we may need more work: (w1) make bufferpool relation-wise,
>> which makes our estimation of data page residence more easy and reliable.
>> (w2) add aggressive pre-read on buffer pool level. Also, another benefit
>> of w1 will make our query planner can estimate query cost more precisely.
>
> "w1" is doable by introducing a shared-memory bitmap indicating which
> pages of a relation are in buffer pool (We may want to add a hash to
> manage the relations). Theoretically, O(shared_buffer) bits are enough. So
> this will not use a lot of space.
>
> When we maintain the SharedBufHash, we maintain this bitmap. When we do
> query cost estimation or preread, we just need a rough number, so this can
> be done by scanning the bitmap without lock. Thus there is also almost no
> extra cost.

After some research, I have come to the conclusion that the bitmap idea is
a bad one - I hope I am wrong :-(.

The benefit of adding a bitmap is that it lets us know the current buffer
residence: (b1) Plan stage: it gives a more accurate cost estimate for
sequential scans; (b2) Execution stage: it gives sequential scans and
bitmap scans another way to identify the pages that need pre-reading.
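
To make b1 and b2 concrete, here is a minimal sketch in C of what an
uncompressed per-relation residence bitmap might look like. This is not
PostgreSQL code: ResidenceBitmap, resid_mark, resid_count, resid_missing
and the fixed REL_PAGES capacity are hypothetical names invented for
illustration, and the sketch ignores shared memory and concurrency.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define REL_PAGES 1024          /* hypothetical fixed capacity, in pages */

/* One bit per page of a relation: 1 = page is in the buffer pool. */
typedef struct ResidenceBitmap {
    uint32_t npages;            /* pages actually in the relation */
    uint8_t  bits[REL_PAGES / 8];
} ResidenceBitmap;

/* Called when a page enters (resident = true) or leaves the pool. */
static void resid_mark(ResidenceBitmap *bm, uint32_t page, bool resident)
{
    assert(page < bm->npages && bm->npages <= REL_PAGES);
    if (resident)
        bm->bits[page >> 3] |= (uint8_t)(1u << (page & 7));
    else
        bm->bits[page >> 3] &= (uint8_t)~(1u << (page & 7));
}

static bool resid_test(const ResidenceBitmap *bm, uint32_t page)
{
    return page < bm->npages &&
           (bm->bits[page >> 3] & (1u << (page & 7))) != 0;
}

/* b1: rough count of resident pages, usable as a planner input. */
static uint32_t resid_count(const ResidenceBitmap *bm)
{
    uint32_t n = 0;
    for (uint32_t i = 0; i < (bm->npages + 7) / 8; i++) {
        /* Clear the lowest set bit until the byte is empty. */
        for (uint8_t b = bm->bits[i]; b != 0; b &= (uint8_t)(b - 1))
            n++;
    }
    return n;
}

/* b2: collect up to max_out non-resident pages in [first, end). */
static uint32_t resid_missing(const ResidenceBitmap *bm, uint32_t first,
                              uint32_t end, uint32_t *out, uint32_t max_out)
{
    uint32_t n = 0;
    if (end > bm->npages)
        end = bm->npages;
    for (uint32_t p = first; p < end && n < max_out; p++) {
        if (!resid_test(bm, p))
            out[n++] = p;
    }
    return n;
}
```

With a fixed-size bitmap like this, a reader that races with resid_mark
sees at worst a slightly stale bit, which is why a lock-free scan looked
acceptable for a rough estimate.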

For b1, it actually doesn't matter much. With the bitmap we can definitely
give better EXPLAIN numbers, for seqscans only, but even without the bitmap
we seldom make the wrong choice about whether to use a sequential scan.
Can any other cost estimate benefit? I am afraid not, since before
execution we simply don't know what we will read. For b2, the bitmap does
provide another way to find the buffers we should preread without
contending for the BufMappingLock, but since contention on BufMappingLock
is not intense, the benefit is marginal.

My previous estimate of the trouble/cost of maintaining this bitmap was too
optimistic. For one thing, we need to compress the bitmaps, since many of
them are sparse. Unlike with an uncompressed bitmap, reading a compressed
one without a lock can cause a core dump or a totally wrong result instead
of just a slightly lossy one. Thus to visit a bitmap we have to grab at
least two locks, as far as I can see: one for the relation mapping hash,
the other to protect the bitmap content.
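
A small sketch may show why the compressed case is different. It assumes a
simple run-length encoding, which is only one possible compression scheme;
RleBitmap, rle_init, rle_test, rle_set and RLE_MAX_RUNS are hypothetical
names, not anything in the PostgreSQL source. The point is that setting a
single bit changes the run count and shifts the run array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define RLE_MAX_RUNS 64         /* hypothetical capacity for the sketch */

/*
 * Run-length encoded residence bitmap.  runs[] holds alternating run
 * lengths, starting with a run of non-resident pages: even-indexed runs
 * are "not resident", odd-indexed runs are "resident".  Zero-length runs
 * are allowed so that alternation is always preserved.
 */
typedef struct RleBitmap {
    uint32_t nruns;
    uint32_t runs[RLE_MAX_RUNS];
} RleBitmap;

static void rle_init(RleBitmap *bm, uint32_t npages)
{
    assert(npages > 0);
    bm->nruns = 1;
    bm->runs[0] = npages;       /* everything starts out non-resident */
}

/* Reader: must see nruns and runs[] in a mutually consistent state. */
static bool rle_test(const RleBitmap *bm, uint32_t page)
{
    uint32_t start = 0;
    for (uint32_t i = 0; i < bm->nruns; i++) {
        if (page < start + bm->runs[i])
            return (i & 1) != 0;
        start += bm->runs[i];
    }
    return false;               /* page is past the end of the relation */
}

/*
 * Writer: mark one page resident by splitting its run into three
 * (before, the page itself, after).  This shifts the tail of runs[] and
 * bumps nruns in separate steps, so an unlocked reader could observe a
 * half-updated array; with dynamically sized storage it could also chase
 * a stale pointer.  Returns false if out of range or out of capacity.
 */
static bool rle_set(RleBitmap *bm, uint32_t page)
{
    uint32_t start = 0;
    for (uint32_t i = 0; i < bm->nruns; i++) {
        uint32_t len = bm->runs[i];
        if (page < start + len) {
            if (i & 1)
                return true;    /* already resident */
            if (bm->nruns + 2 > RLE_MAX_RUNS)
                return false;
            uint32_t before = page - start;
            uint32_t after = len - before - 1;
            memmove(&bm->runs[i + 3], &bm->runs[i + 1],
                    (bm->nruns - i - 1) * sizeof(uint32_t));
            bm->runs[i] = before;
            bm->runs[i + 1] = 1;
            bm->runs[i + 2] = after;
            bm->nruns += 2;
            return true;
        }
        start += len;
    }
    return false;
}
```

In a real implementation both rle_test and rle_set would have to run under
a content lock, in addition to whatever lock protects the lookup of the
bitmap in the relation hash.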

If there are no more benefits to expect, I don't think adding a bitmap is
a good idea. Are there any other benefits that you can foresee?

Regards,
Qingqing