On Sat, Mar 25, 2017 at 1:24 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:

> On Fri, Mar 24, 2017 at 11:49 PM, Pavan Deolasee
> <pavan.deolasee@gmail.com> wrote:
>>
>> On Fri, Mar 24, 2017 at 6:46 PM, Amit Kapila <amit.kapila16@gmail.com>
>> wrote:
>>>
>>
>> While looking at this problem, it occurred to me that the assumptions made
>> for hash indexes are also wrong :-( Hash index has the same problem as
>> expression indexes have. A change in heap value may not necessarily cause a
>> change in the hash key. If we don't detect that, we will end up having two
>> identical hash keys with the same TID pointer. This will cause the
>> duplicate key scans problem since hashrecheck will return true for both the
>> hash entries.
>
> Isn't it possible to detect duplicate keys in hashrecheck if we compare
> both hashkey and tid stored in index tuple with the corresponding values
> from heap tuple?
Hmm.. I thought that won't work. For example, say we have a tuple (X, Y, Z) in the heap with a btree index on X and a hash index on Y. Suppose that is updated to (X, Y', Z) with a WARM update, and we insert a new entry in the hash index. Now if Y and Y' both generate the same hashkey, we will have two identical <hashkey, TID> tuples in the hash index, leading to duplicate key scans. Comparing hashkey and TID in hashrecheck cannot tell them apart, because both fields are the same in both entries.
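To make the collision case concrete, here is a toy model in Python (not the actual hash AM code). Everything in it is an assumption for illustration: `toy_hash`, `warm_insert_naive` and `scan` are hypothetical names, the index is a plain list of (hashkey, root TID) pairs, and the hash function is deliberately weak so that Y = 3 and Y' = 11 collide.

```python
def toy_hash(value):
    # Hypothetical stand-in for the hash opclass function.
    # The tiny range (0..7) makes collisions easy to produce.
    return value % 8


def warm_insert_naive(index, new_value, root_tid):
    # Models a WARM update that always adds an index entry for the new
    # value, pointing at the root TID of the chain.
    index.append((toy_hash(new_value), root_tid))


def scan(index, search_value):
    # Returns the TID of every entry whose stored hash key matches.
    # A recheck based only on <hashkey, TID> cannot distinguish the
    # entries, since both fields are equal.
    key = toy_hash(search_value)
    return [tid for (hashkey, tid) in index if hashkey == key]


root_tid = (0, 1)                        # (block, offset) of the root tuple
index = [(toy_hash(3), root_tid)]        # original tuple, Y = 3
warm_insert_naive(index, 11, root_tid)   # Y' = 11 hashes to the same key
duplicates = scan(index, 11)             # the same TID comes back twice
```

In this model the scan for Y' returns the root TID twice, which is the duplicate key scan problem described above.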
I think one way to solve this is to pass both old and new heap values to amwarminsert and expect each AM to detect duplicates and avoid creating a WARM pointer if the index keys are exactly the same (we can do that since there already exists another index tuple with the same keys pointing to the same root TID).
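A rough sketch of that idea, again as a toy Python model rather than real AM code. The function name `warm_insert_checked`, its signature, and the list-based index are all hypothetical; the only thing taken from the proposal is that the AM sees both the old and the new heap value and skips the insert when the resulting index keys match.

```python
def toy_hash(value):
    # Hypothetical stand-in for the hash opclass function.
    return value % 8


def warm_insert_checked(index, old_value, new_value, root_tid):
    # The AM receives both heap values and compares the index keys
    # they produce. If the keys are identical, an entry with the same
    # key and the same root TID already exists, so nothing is inserted.
    old_key = toy_hash(old_value)
    new_key = toy_hash(new_value)
    if old_key == new_key:
        return False          # no WARM pointer created
    index.append((new_key, root_tid))
    return True               # WARM pointer created


root_tid = (0, 1)
index = [(toy_hash(3), root_tid)]                         # original, Y = 3
collided = warm_insert_checked(index, 3, 11, root_tid)    # same hash key
distinct = warm_insert_checked(index, 3, 4, root_tid)     # different key
```

With the check in place, the colliding update leaves the index unchanged, while the update that really changes the hash key still gets its new entry.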