Re: Performance degradation of REFRESH MATERIALIZED VIEW - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Performance degradation of REFRESH MATERIALIZED VIEW
Date
Msg-id CAD21AoDLfZfxTnS48DsRK_DRq3ke-Z2gh_wh8aV2K7C1M2+jYw@mail.gmail.com
Whole thread Raw
In response to Re: Performance degradation of REFRESH MATERIALIZED VIEW  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: Performance degradation of REFRESH MATERIALIZED VIEW  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Re: Performance degradation of REFRESH MATERIALIZED VIEW  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Wed, Apr 28, 2021 at 12:26 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Apr 27, 2021 at 10:43 PM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
> >
> >
> >
> > On 4/27/21 7:34 AM, Masahiko Sawada wrote:
> > > On Tue, Apr 27, 2021 at 8:07 AM Andres Freund <andres@anarazel.de> wrote:
> > >>
> > >> Hi,
> > >>
> > >> On 2021-04-26 23:59:17 +0200, Tomas Vondra wrote:
> > >>> On 4/26/21 9:27 PM, Andres Freund wrote:
> > >>>> On 2021-04-26 15:31:02 +0200, Tomas Vondra wrote:
> > >>>>> I'm not sure what to do about this :-( I don't have any ideas about how to
> > >>>>> eliminate this overhead, so the only option I see is reverting the changes
> > >>>>> in heap_insert. Unfortunately, that'd mean inserts into TOAST tables won't
> > >>>>> be frozen ...
> > >>>>
> > >>>> ISTM that the fundamental issue here is not that we acquire pins that we
> > >>>> shouldn't, but that we do so at a much higher frequency than needed.
> > >>>>
> > >>>> It's probably too invasive for 14, but I think it might be worth exploring
> > >>>> passing down a BulkInsertState in nodeModifyTable.c's table_tuple_insert() iff
> > >>>> the input will be more than one row.
> > >>>>
> > >>>> And then add the vm buffer of the target page to BulkInsertState, so that
> > >>>> hio.c can avoid re-pinning the buffer.
> > >>>>
> > >>>
> > >>> Yeah. The question still is what to do about 14, though. Shall we leave the
> > >>> code as it is now, or should we change it somehow? It seem a bit unfortunate
> > >>> that a COPY FREEZE optimization should negatively influence other (more)
> > >>> common use cases, so I guess we can't just keep the current code ...
> > >>
> > >> I'd suggest prototyping the use of BulkInsertState in nodeModifyTable.c
> > >> and see whether that fixes the regression.
> > >
> > > Is this idea to have RelationGetBufferForTuple() skip re-pinning
> > > vmbuffer? If so, is this essentially the same as the one in the v3
> > > patch?
> > >
> >
> > I don't think it is the same approach - it's a bit hard to follow what
> > exactly happens in RelationGetBufferForTuple, but AFAICS it always
> > starts with vmbuffer = InvalidBuffer, so it may pin the vmbuffer quite
> > often, no?
>
> With that patch, we pin the vmbuffer only when inserting a frozen
> tuple into an empty page. That is, when inserting a frozen tuple into
> an empty page, we pin the vmbuffer and heap_insert() will mark the
> page all-visible and set all-frozen bit on vm. And from the next
> insertion (into the same page) until the page gets full, since the
> page is already all-visible, we skip pinning the vmbuffer. IOW, if the
> target page is not empty but all-visible, we skip pinning the
> vmbuffer. We pin the vmbuffer only once per heap page used during
> insertion.
>
> >
> > What Andres is suggesting (I think) is to modify ExecInsert() to pass a
> > valid bistate to table_tuple_insert, instead of just NULL, and store the
> > vmbuffer in it.
>
> Understood. This approach keeps using the same vmbuffer until we need
> another vm page corresponding to the target heap page, which seems
> better.

But how is ExecInsert() related to REFRESH MATERIALIZED VIEW?

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



pgsql-hackers by date:

Previous
From: Masahiko Sawada
Date:
Subject: Re: Performance degradation of REFRESH MATERIALIZED VIEW
Next
From: Alvaro Herrera
Date:
Subject: Re: ALTER TABLE .. DETACH PARTITION CONCURRENTLY