On Thu, Aug 11, 2016 at 11:07 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> On 8/10/16 12:48 PM, Claudio Freire wrote:
>>
>> On Tue, Aug 9, 2016 at 11:39 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
>>>
>>> On 8/9/16 6:44 PM, Claudio Freire wrote:
>>>>
>>>>
>>>> Since we can lookup all occurrences of k1=a index=0 and k2=a index=0,
>>>> and in fact we probably did so already as part of the update logic
>>>
>>>
>>>
>>> That's a change from what currently happens, right?
>>>
>>> The reason I think that's important is that dropping the assumption that
>>> we can't safely re-find index entries from the heap opens up other
>>> optimizations, ones that should be significantly simpler to implement.
>>> The most obvious example being getting rid of full index scans in vacuum.
>>> While that won't help with write amplification, it would reduce the cost
>>> of vacuum enormously. Orders of magnitude wouldn't surprise me in the
>>> least.
>>>
>>> If that's indeed a prerequisite to WARM it would be great to get that
>>> groundwork laid early so others could work on other optimizations it
>>> would enable.
>>
>>
>> I can do that. I've been prospecting the code to see what changes it
>> would entail already.
>>
>> But it's still specific to btree, I'm not sure the same optimizations
>> can be applied to GIN (maybe, if the posting list is sorted) or GIST
>> (probably, since it's like a btree, but I don't know the code well
>> enough).
>>
>> Certainly hash indexes won't support it.
>
>
> Why not? If this is all predicated on re-finding index keys based on heap
> data then this is just another index lookup, no?
A lookup on a hash index cannot be made to work for both key-only
lookups and key-ctid lookups, it's a limitation of the data structure.
A key-only lookup can potentially return too many results that don't
match the ctid so a walk of all equal-key item pointers is out of the
question.
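
To make the difference concrete, here's a toy model in Python. It is
not PostgreSQL code: the names (btree_find, hash_find, NBUCKETS) are
made up for illustration, the "btree" is just a list sorted by
(key, ctid), and the "hash index" is a fixed array of unsorted buckets
addressed by the hash of the key alone. Under those assumptions, the
sorted structure can binary-search straight to one (key, ctid) entry,
while the hash structure has to walk every equal-key entry in the
bucket.

```python
import bisect

NBUCKETS = 8  # hypothetical fixed bucket count for the toy hash index


def btree_find(entries, key, ctid):
    """Binary-search a list sorted by (key, ctid) for one exact entry.

    Returns the position of the entry, or None if it is absent.
    """
    i = bisect.bisect_left(entries, (key, ctid))
    if i < len(entries) and entries[i] == (key, ctid):
        return i
    return None


def hash_find(buckets, key, ctid):
    """Scan the bucket chosen by hash(key) for one exact (key, ctid).

    Returns (found, number_of_entries_examined). The ctid plays no part
    in choosing the bucket, so duplicates of the key must be walked.
    """
    examined = 0
    for k, c in buckets[hash(key) % NBUCKETS]:
        examined += 1
        if k == key and c == ctid:
            return True, examined
    return False, examined


# 1000 heap tuples share key 1; three more have key 2.
# A ctid is modelled as a (block, offset) pair.
rows = [(1, (block, 1)) for block in range(1000)]
rows += [(2, (block, 2)) for block in range(3)]

btree = sorted(rows)

buckets = [[] for _ in range(NBUCKETS)]
for key, ctid in rows:
    buckets[hash(key) % NBUCKETS].append((key, ctid))

target_key, target_ctid = 1, (999, 1)
print(btree_find(btree, target_key, target_ctid))
print(hash_find(buckets, target_key, target_ctid))
```

In this model the sorted lookup costs a logarithmic number of
comparisons, while the hash lookup examines all 1000 equal-key entries
before it reaches the one with the matching ctid. Hashing on
(key, ctid) instead would fix that case, but then a key-only lookup
could no longer find the bucket at all.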