Re: Performance degradation of REFRESH MATERIALIZED VIEW - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Performance degradation of REFRESH MATERIALIZED VIEW
Date
Msg-id 68a1b3e1-078c-fca1-a53b-54a97aa6a3ad@enterprisedb.com
Whole thread Raw
In response to Re: Performance degradation of REFRESH MATERIALIZED VIEW  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
On 4/27/21 5:44 PM, Masahiko Sawada wrote:
> On Wed, Apr 28, 2021 at 12:26 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>
>> On Tue, Apr 27, 2021 at 10:43 PM Tomas Vondra
>> <tomas.vondra@enterprisedb.com> wrote:
>>>
>>>
>>>
>>> On 4/27/21 7:34 AM, Masahiko Sawada wrote:
>>>> On Tue, Apr 27, 2021 at 8:07 AM Andres Freund <andres@anarazel.de> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> On 2021-04-26 23:59:17 +0200, Tomas Vondra wrote:
>>>>>> On 4/26/21 9:27 PM, Andres Freund wrote:
>>>>>>> On 2021-04-26 15:31:02 +0200, Tomas Vondra wrote:
>>>>>>>> I'm not sure what to do about this :-( I don't have any ideas about how to
>>>>>>>> eliminate this overhead, so the only option I see is reverting the changes
>>>>>>>> in heap_insert. Unfortunately, that'd mean inserts into TOAST tables won't
>>>>>>>> be frozen ...
>>>>>>>
>>>>>>> ISTM that the fundamental issue here is not that we acquire pins that we
>>>>>>> shouldn't, but that we do so at a much higher frequency than needed.
>>>>>>>
>>>>>>> It's probably too invasive for 14, but I think it might be worth exploring
>>>>>>> passing down a BulkInsertState in nodeModifyTable.c's table_tuple_insert() iff
>>>>>>> the input will be more than one row.
>>>>>>>
>>>>>>> And then add the vm buffer of the target page to BulkInsertState, so that
>>>>>>> hio.c can avoid re-pinning the buffer.
>>>>>>>
>>>>>>
>>>>>> Yeah. The question still is what to do about 14, though. Shall we leave the
>>>>>> code as it is now, or should we change it somehow? It seem a bit unfortunate
>>>>>> that a COPY FREEZE optimization should negatively influence other (more)
>>>>>> common use cases, so I guess we can't just keep the current code ...
>>>>>
>>>>> I'd suggest prototyping the use of BulkInsertState in nodeModifyTable.c
>>>>> and see whether that fixes the regression.
>>>>
>>>> Is this idea to have RelationGetBufferForTuple() skip re-pinning
>>>> vmbuffer? If so, is this essentially the same as the one in the v3
>>>> patch?
>>>>
>>>
>>> I don't think it is the same approach - it's a bit hard to follow what
>>> exactly happens in RelationGetBufferForTuple, but AFAICS it always
>>> starts with vmbuffer = InvalidBuffer, so it may pin the vmbuffer quite
>>> often, no?
>>
>> With that patch, we pin the vmbuffer only when inserting a frozen
>> tuple into an empty page. That is, when inserting a frozen tuple into
>> an empty page, we pin the vmbuffer and heap_insert() will mark the
>> page all-visible and set all-frozen bit on vm. And from the next
>> insertion (into the same page) until the page gets full, since the
>> page is already all-visible, we skip pinning the vmbuffer. IOW, if the
>> target page is not empty but all-visible, we skip pinning the
>> vmbuffer. We pin the vmbuffer only once per heap page used during
>> insertion.
>>
>>>
>>> What Andres is suggesting (I think) is to modify ExecInsert() to pass a
>>> valid bistate to table_tuple_insert, instead of just NULL, and store the
>>> vmbuffer in it.
>>
>> Understood. This approach keeps using the same vmbuffer until we need
>> another vm page corresponding to the target heap page, which seems
>> better.
> 
> But how is ExecInsert() related to REFRESH MATERIALIZED VIEW?
> 

TBH I haven't looked into the details, but Andres talked about 
nodeModifyTable and table_tuple_insert, and ExecInsert is the only place 
calling it. But maybe I'm just confused and Andres meant something else?


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: SQL-standard function body
Next
From: Alvaro Herrera
Date:
Subject: Re: ALTER TABLE .. DETACH PARTITION CONCURRENTLY