Re: Setting Statistics on Functional Indexes - Mailing list pgsql-performance

From: Tom Lane
Subject: Re: Setting Statistics on Functional Indexes
Date:
Msg-id: 3989.1351288898@sss.pgh.pa.us
In response to: Re: Setting Statistics on Functional Indexes  (Claudio Freire <klaussfreire@gmail.com>)
Responses: Re: Setting Statistics on Functional Indexes  (Claudio Freire <klaussfreire@gmail.com>)
List: pgsql-performance

Claudio Freire <klaussfreire@gmail.com> writes:
> Because once you've accessed that last index page, it would be rather
> trivial finding out how many duplicate tids are in that page and, with
> a small CPU cost (no disk access if you don't query other index pages)
> you could verify the assumption of near-uniqueness.

I thought about that too, but I'm not sure how promising the idea is.
In the first place, it's not clear when to stop counting duplicates, and
in the second, I'm not sure we could get away with not visiting the heap
to check for tuple liveness.  There might be a lot of apparent
duplicates in the index that just represent unreaped old versions of a
frequently-updated endpoint tuple.  (The existing code is capable of
returning a "wrong" answer if the endpoint tuple is dead, but I don't
think it matters much in most cases.  I'm less sure such an argument
could be made for dup-counting.)

            regards, tom lane
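
The liveness concern above can be illustrated with a toy model. This is only a sketch under stated assumptions, not PostgreSQL's btree or planner code: IndexEntry, apparent_duplicates, live_duplicates, and the set of live tids are all hypothetical names made up for the illustration, and the "heap visit" is reduced to a set lookup. It models one endpoint row that was updated four times (with new index entries each time) and not yet vacuumed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    key: int  # indexed value
    tid: int  # hypothetical heap tuple identifier


def apparent_duplicates(page, key):
    """Count entries for `key` using only the index page (no heap access)."""
    return sum(1 for e in page if e.key == key)


def live_duplicates(page, key, live_tids):
    """Count entries for `key` whose heap tuple is live (models a heap visit)."""
    return sum(1 for e in page if e.key == key and e.tid in live_tids)


# Last index page: two ordinary rows, plus five entries for key 100 that all
# belong to successive versions of the same frequently-updated endpoint row.
last_page = [IndexEntry(98, 10), IndexEntry(99, 11)]
last_page += [IndexEntry(100, tid) for tid in range(20, 25)]

# Assumption: only the newest version (tid 24) of the endpoint row is live.
live = {10, 11, 24}

endpoint = max(e.key for e in last_page)
apparent = apparent_duplicates(last_page, endpoint)
actual = live_duplicates(last_page, endpoint, live)

print(f"endpoint={endpoint} apparent={apparent} live={actual}")
```

In this model the index page alone suggests five duplicates of the endpoint value, while only one of them is a live tuple, so a count taken without checking the heap would wrongly reject the near-uniqueness assumption.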

