Re: Patch: Write Amplification Reduction Method (WARM) - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Patch: Write Amplification Reduction Method (WARM)
Date
Msg-id CAA4eK1J455T2BBH16rj2ez57koVaUrSWfaMecXzNPfu6eEXg7A@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: Patch: Write Amplification Reduction Method (WARM)  (Pavan Deolasee <pavan.deolasee@gmail.com>)
List pgsql-hackers
On Mon, Mar 27, 2017 at 2:19 PM, Pavan Deolasee
<pavan.deolasee@gmail.com> wrote:
>
>
> On Fri, Mar 24, 2017 at 11:49 PM, Pavan Deolasee <pavan.deolasee@gmail.com>
> wrote:
>>
>>
>>
>> On Fri, Mar 24, 2017 at 6:46 PM, Amit Kapila <amit.kapila16@gmail.com>
>> wrote:
>>>
>>>
>>>
>>> I was worried for the case if the index is created non-default
>>> collation, will the datumIsEqual() suffice the need.  Now again
>>> thinking about it, I think it will because in the index tuple we are
>>> storing the value as in heap tuple.  However today it occurred to me
>>> how will this work for toasted index values (index value >
>>> TOAST_INDEX_TARGET).  It is mentioned on top of datumIsEqual() that it
>>> probably won't work for toasted values.  Have you considered that
>>> point?
>>>
>>
>> No, I haven't and thanks for bringing that up. And now that I think more
>> about it and see the code, I think the naive way of just comparing index
>> attribute value against heap values is probably wrong. The example of
>> TOAST_INDEX_TARGET is one such case, but I wonder if there are other varlena
>> attributes that we might store differently in heap and index. Like
>> index_form_tuple() ->heap_fill_tuple seem to some churning for varlena. It's
>> not clear to me if index_get_attr will return the values which are binary
>> comparable to heap values.. I wonder if calling index_form_tuple on the heap
>> values, fetching attributes via index_get_attr on both index tuples and then
>> doing a binary compare is a more robust idea. Or may be that's just
>> duplicating efforts.
>>
>> While looking at this problem, it occurred to me that the assumptions made
>> for hash indexes are also wrong :-( Hash index has the same problem as
>> expression indexes have. A change in heap value may not necessarily cause a
>> change in the hash key. If we don't detect that, we will end up having two
>> hash identical hash keys with the same TID pointer. This will cause the
>> duplicate key scans problem since hashrecheck will return true for both the
>> hash entries. That's a bummer as far as supporting WARM for hash indexes is
>> concerned, unless we find a way to avoid duplicate index entries.
>>
>
> Revised patches are attached.
>

Noted few cosmetic issues in 0005_warm_updates_v21:

1.
pruneheap.c(939): warning C4098: 'heap_get_root_tuples' : 'void'
function returning a value

2.
+ *  HCWC_WARM_UPDATED_TUPLE - a tuple with HEAP_WARM_UPDATED is found somewhere
+ *    in the chain. Note that when a tuple is WARM
+ *    updated, both old and new versions are marked
+ *    with this flag/
+ *
+ *  HCWC_WARM_TUPLE  - a tuple with HEAP_WARM_TUPLE is found somewhere in
+ *  the chain.
+ *
+ *  HCWC_CLEAR_TUPLE - a tuple without HEAP_WARM_TUPLE is found somewhere in
+ *   the chain.

Description of all three flags is same.

3.
+ *  HCWC_WARM_UPDATED_TUPLE - a tuple with HEAP_WARM_UPDATED is found somewhere
+ *    in the chain. Note that when a tuple is WARM
+ *    updated, both old and new versions are marked
+ *    with this flag/

Spurious '/' at end of line.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Amit Langote
Date:
Subject: Re: Documentation improvements for partitioning
Next
From: Amit Langote
Date:
Subject: Re: pg_dump emits ALTER TABLE ONLY partitioned_table