Re: the big picture for index-only scans - Mailing list pgsql-hackers

From Cédric Villemain
Subject Re: the big picture for index-only scans
Date
Msg-id BANLkTi=sowGsRYDynDbqtY1fGc3SehKMDQ@mail.gmail.com
Whole thread Raw
In response to Re: the big picture for index-only scans  (Bruce Momjian <bruce@momjian.us>)
Responses Re: the big picture for index-only scans
List pgsql-hackers
2011/5/11 Bruce Momjian <bruce@momjian.us>:
> Cédric Villemain wrote:
>> 2011/5/10 Kevin Grittner <Kevin.Grittner@wicourts.gov>:
>> > Simon Riggs <simon@2ndQuadrant.com> wrote:
>> >> The typical speed up for non-covered indexes will come when we
>> >> access a very large table (not in cache) via an index scan that is
>> >> smaller than a bitmapindex scan. Will we be able to gauge
>> >> selectivities sufficiently accurately to be able to pinpoint that
>> >> during optimization? How will we know that the table is not in
>> >> cache? Or is this an optimisation in the executor for a bitmapheap
>> >> scan?
>> >
>> > I would continue to object to using current cache contents for plan
>> > choice because of plan instability and the fact that an odd initial
>> > cache load could skew plans in a bad direction indefinitely. ?I do
>> > agree (and have already posted) that I think the hardest part of
>> > this might be developing a good cost model. ?I doubt that's an
>> > insoluble problem, especially since it is something we can refine
>> > over time as we gain experience with the edge cases.
>>
>> you will have the same possible instability in planning with the
>> index(-only?) scan because we may need to access heap anyway and this
>> needs is based on estimation, or I miss something ? I understood the
>> idea was just to bypass the heap access *if* we can for *this*
>> heap-page.
>>
>> In reality, I am not really scared by plan instability because of a
>> possible PG/OS cache estimation. The percentages remain stable in my
>> observations ... I don't know yet how it will go for vis map.
>>
>> And, we already have plan instability currently, which is *good* : at
>> some point a seq scan is better than an bitmap heap scan. Because the
>> relation size change and because ANALYZE re-estimate the distribution
>> of the data. I will be very happy to issue ANALYZE CACHE as I have to
>> ANALYZE temp table for large query if it allows the planner to provide
>> me the best plan in some scenario...but this is another topic, sorry
>> for the digression..
>
> Good point --- we would be making plan decisions based on the visibility
> map coverage.  The big question is whether visibility map changes are
> more dynamic than the values we already plan against, like rows in the
> table, table size, and value distributions.  I don't know the answer.
>

Robert, I though of Covered-Index as just a usage of the vis map:
don't take the heap block if not needed. This look easier to do and
better in the long term (because read-only DB may quickly turn into a
no-heap access DB for example). Thus this is not real covered-index.
Did you want to implement real covered-index and did you have ideas on
how to do that ? Or just an optimization of the current
planner/executor on index usage ?
___

I don't know VM internals:
* do we have a counter of ALL_VISIBLE flag set on a relation ? (this
should be very good for planner)* do we need a pg_class.rel_vmvisible ?! (I have hands up, don't
shoot pleeaase)* is it ok to parse VM for planning (I believe it is not) ?

Ideas ?

--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support


pgsql-hackers by date:

Previous
From: Jeff Shanab
Date:
Subject: Libpq enhancement
Next
From: David Fetter
Date:
Subject: Re: [COMMITTERS] pgsql: Make external_pid_file world readable