Re: [HACKERS] pgsql 10: hash indexes testing - Mailing list pgsql-hackers
| From | Amit Kapila |
|---|---|
| Subject | Re: [HACKERS] pgsql 10: hash indexes testing |
| Date | |
| Msg-id | CAA4eK1+nywfLyrfQV4LBSe2DMTEdD2wCuKMY-wzSGQROsn5eWg@mail.gmail.com |
| In response to | Re: [HACKERS] pgsql 10: hash indexes testing (Amit Kapila <amit.kapila16@gmail.com>) |
| Responses | Re: [HACKERS] pgsql 10: hash indexes testing; Re: [HACKERS] pgsql 10: hash indexes testing |
| List | pgsql-hackers |
On Fri, Aug 4, 2017 at 8:21 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Wed, Aug 2, 2017 at 9:04 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Wed, Jul 12, 2017 at 1:10 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>>>> Yes, I also think the same idea can be used; in fact, I mentioned
>>>>> it [1] as soon as you committed that patch. Do we want to do
>>>>> anything at this stage for PG-10? I don't think we should attempt
>>>>> something this late unless people feel this is a show-stopper issue
>>>>> for usage of hash indexes. If required, I think a separate function
>>>>> can be provided to allow users to perform the squeeze operation.
>>>>
>>>> Sorry, I have no idea how critical this squeeze thing is for the
>>>> newfangled hash indexes, so I cannot comment on that. Does this make
>>>> the indexes unusable in some way under some circumstances?
>>>
>>> It seems so. Basically, in the case of a large number of duplicates,
>>> we hit the maximum number of overflow pages. There is a theoretical
>>> possibility of hitting it, but it could also happen that we are not
>>> freeing the existing unused overflow pages, due to which the count keeps
>>> growing and hits the limit. I have requested up thread to verify whether
>>> that is happening in this case and I am still waiting for the same. The
>>> squeeze operation does free such unused overflow pages after cleaning
>>> them. As this is a costly operation and needs a cleanup lock, we
>>> currently perform it only during vacuum and during the next split from
>>> a bucket which can have redundant overflow pages.
>>
>> Oops. It was rather short-sighted of us not to increase
>> HASH_MAX_BITMAPS when we bumped HASH_VERSION. Actually removing that
>> limit is hard, but we could have easily bumped it from 128 to, say, 1024
>> without (I think) causing any problem, which would have given us quite
>> a bit of headroom here.
>
> Yes, that sounds sensible, but I think it will just delay the problem.
> I think the actual problem here is that we are not able to
> perform the squeeze operation often enough to free the overflow
> pages. Currently, we try to perform the squeeze only at the start of
> the next split of the bucket or during vacuum. The reason for doing it
> that way was that the squeeze operation needs a cleanup lock, and we
> already have one at the start of a split and during vacuum. Now, to solve
> it I have already speculated about a few ways above [1], and among those
> it is feasible to do this at the end of a split, which can have
> performance implications in some workloads but will work fine for the
> case reported in this thread.
>

I have implemented the patch with this approach, as the other approaches
require quite extensive changes, which I am not sure is the right thing
to do at this stage.

>
> I think we can fix it in one of the above ways and increase the value of
> HASH_MAX_BITMAPS as well.
>

I have increased the number of hash bitmap pages as a separate patch.
I am not completely sure it is a good idea to directly increase it
to 1024, as that will increase the size of HashMetaPageData from 960
bytes to 4544 bytes. Shall we increase it to 512?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
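[Editor's note] To put rough numbers on the HASH_MAX_BITMAPS values discussed above, here is a small, self-contained back-of-envelope sketch; it is not PostgreSQL source. The 960-byte metapage baseline and the 4544-byte figure for 1024 bitmap pages come from this thread, and the 4-byte slot per bitmap page corresponds to sizeof(BlockNumber); the ~32768 usable bits per 8K bitmap page is an assumption made only for illustrating the overflow-page headroom.

```c
/*
 * Back-of-envelope sketch (not PostgreSQL source) of how the hash metapage
 * grows with HASH_MAX_BITMAPS and roughly how many overflow pages each
 * setting can address.  Baseline sizes are taken from the figures quoted
 * in this thread; BITS_PER_BITMAP_PAGE is an assumed value for 8K blocks.
 */
#include <stdio.h>

#define BASE_METAPAGE_BYTES   960     /* metapage size with HASH_MAX_BITMAPS = 128, per the thread */
#define BASELINE_BITMAPS      128     /* current HASH_MAX_BITMAPS */
#define BLOCKNUMBER_BYTES     4       /* one BlockNumber slot per bitmap page */
#define BITS_PER_BITMAP_PAGE  32768   /* assumed usable bits on an 8K bitmap page */

static void report(int max_bitmaps)
{
    long meta_bytes = BASE_METAPAGE_BYTES
        + (long) (max_bitmaps - BASELINE_BITMAPS) * BLOCKNUMBER_BYTES;
    long max_overflow_pages = (long) max_bitmaps * BITS_PER_BITMAP_PAGE;

    printf("HASH_MAX_BITMAPS = %4d: metapage ~%ld bytes, up to ~%ld overflow pages\n",
           max_bitmaps, meta_bytes, max_overflow_pages);
}

int main(void)
{
    report(128);    /* current limit discussed in the thread */
    report(512);    /* Amit's suggested compromise */
    report(1024);   /* Robert's suggestion */
    return 0;
}
```

Under these assumptions, 512 keeps the metapage at roughly 2.5 KB while quadrupling the addressable overflow pages, whereas 1024 gives the 4544-byte metapage mentioned above, which is the trade-off behind the closing question.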