Re: Performance degradation of REFRESH MATERIALIZED VIEW - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Performance degradation of REFRESH MATERIALIZED VIEW
Date
Msg-id 20210427182236.hmos7xbvxun73cjc@alap3.anarazel.de
Whole thread Raw
In response to Re: Performance degradation of REFRESH MATERIALIZED VIEW  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: Performance degradation of REFRESH MATERIALIZED VIEW  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
Hi,

On 2021-04-28 00:44:47 +0900, Masahiko Sawada wrote:
> On Wed, Apr 28, 2021 at 12:26 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > What Andres is suggesting (I think) is to modify ExecInsert() to pass a
> > > valid bistate to table_tuple_insert, instead of just NULL, and store the
> > > vmbuffer in it.
> >
> > Understood. This approach keeps using the same vmbuffer until we need
> > another vm page corresponding to the target heap page, which seems
> > better.
> 
> But how is ExecInsert() related to REFRESH MATERIALIZED VIEW?

I was thinking of the CONCURRENTLY path for REFRESH MATERIALIZED VIEW I
think. Or something.

That actually makes it easier - we already pass in a bistate in the relevant
paths. So if we add a current_vmbuf to BulkInsertStateData, we can avoid
needing to pin so often. It seems that'd end up with a good bit cleaner and
less risky code than the skip_vmbuffer_for_frozen_tuple_insertion_v3.patch
approach.

The current RelationGetBufferForTuple() interface / how it's used in heapam.c
doesn't make this quite as trivial as it could be... Attached is a quick hack
implementing this. For me it reduces the overhead noticably:

REFRESH MATERIALIZED VIEW mv;
before:
Time: 26542.333 ms (00:26.542)
after:
Time: 23105.047 ms (00:23.105)

Greetings,

Andres Freund

Attachment

pgsql-hackers by date:

Previous
From: Stephen Frost
Date:
Subject: Re: Addition of authenticated ID to pg_stat_activity
Next
From: Andres Freund
Date:
Subject: Re: Addition of authenticated ID to pg_stat_activity