On Wed, Jul 12, 2017 at 1:10 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>> Yes, I also think the same idea can be used; in fact, I mentioned
>>> it [1] as soon as you committed that patch. Do we want to do
>>> anything at this stage for PG-10? I don't think we should attempt
>>> something this late unless people feel this is a show-stopper issue
>>> for usage of hash indexes. If required, a separate function could
>>> be provided to allow users to perform the squeeze operation.
>>
>> Sorry, I have no idea how critical this squeeze thing is for the
>> newfangled hash indexes, so I cannot comment on that. Does this make
>> the indexes unusable in some way under some circumstances?
>
> It seems so. Basically, with a large number of duplicates we can hit
> the maximum number of overflow pages. There is a theoretical
> possibility of hitting it legitimately, but it could also be that we
> are not freeing existing unused overflow pages, so the count keeps
> growing until it hits the limit. I have requested upthread that
> someone verify whether that is happening in this case, and I am still
> waiting for a reply. The squeeze operation does free such unused
> overflow pages after cleaning them. As this is a costly operation and
> needs a cleanup lock, we currently perform it only during vacuum and
> during the next split of a bucket that may have redundant overflow
> pages.
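
To put numbers on the limit Amit describes, the arithmetic goes
roughly like this (a back-of-the-envelope sketch only, not
authoritative; the real definitions live in src/include/access/hash.h,
and the exact bitmap size depends on BLCKSZ and page-header overhead):

/*
 * Rough ceiling on hash-index overflow pages, assuming the default
 * 8K block size.  Each bitmap page tracks one bit per overflow page,
 * and the usable bitmap space is rounded down to a power of 2, which
 * for an 8K page works out to 4096 bytes = 32768 bits.
 */
#include <stdio.h>

#define BLCKSZ            8192  /* default PostgreSQL page size */
#define HASH_MAX_BITMAPS  128   /* current cap on bitmap pages */

int
main(void)
{
    long long   bits_per_bitmap_page = 4096LL * 8;
    long long   max_overflow_pages = (long long) HASH_MAX_BITMAPS *
                                     bits_per_bitmap_page;

    /* 128 * 32768 = 4194304 overflow pages, i.e. ~32GB at 8K/page */
    printf("max overflow pages: %lld (~%lld GB of overflow space)\n",
           max_overflow_pages,
           max_overflow_pages * BLCKSZ / (1024LL * 1024 * 1024));
    return 0;
}

That is finite headroom when long duplicate chains keep allocating
overflow pages that never get squeezed.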
Oops. It was rather short-sighted of us not to increase
HASH_MAX_BITMAPS when we bumped HASH_VERSION. Actually removing that
limit is hard, but we could have easily bumped it from 128 to, say,
1024 without (I think) causing any problem, which would have given us
quite a bit of headroom here. I suppose we could still try to jam that
change in before beta3 (bumping HASH_VERSION again), but that might be
asking for trouble.
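
Concretely, the change under discussion would be just a couple of
lines in src/include/access/hash.h, along these lines (an untested
sketch; the new version number shown is hypothetical):

/* src/include/access/hash.h -- sketch of the discussed change */

#define HASH_VERSION      5     /* hypothetical next bump; pre-bump
                                 * indexes would need a REINDEX */

/* was 128; 1024 allows 8x as many addressable overflow pages */
#define HASH_MAX_BITMAPS  1024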
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company