Re: Performance degradation of REFRESH MATERIALIZED VIEW - Mailing list pgsql-hackers

From Justin Pryzby
Subject Re: Performance degradation of REFRESH MATERIALIZED VIEW
Date
Msg-id 20210427140347.GZ7256@telsasoft.com
Whole thread Raw
In response to Re: Performance degradation of REFRESH MATERIALIZED VIEW  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
On Tue, Apr 27, 2021 at 03:43:07PM +0200, Tomas Vondra wrote:
> On 4/27/21 7:34 AM, Masahiko Sawada wrote:
> > On Tue, Apr 27, 2021 at 8:07 AM Andres Freund <andres@anarazel.de> wrote:
> > > On 2021-04-26 23:59:17 +0200, Tomas Vondra wrote:
> > > > On 4/26/21 9:27 PM, Andres Freund wrote:
> > > > > On 2021-04-26 15:31:02 +0200, Tomas Vondra wrote:
> > > > > > I'm not sure what to do about this :-( I don't have any ideas about how to
> > > > > > eliminate this overhead, so the only option I see is reverting the changes
> > > > > > in heap_insert. Unfortunately, that'd mean inserts into TOAST tables won't
> > > > > > be frozen ...
> > > > > 
> > > > > ISTM that the fundamental issue here is not that we acquire pins that we
> > > > > shouldn't, but that we do so at a much higher frequency than needed.
> > > > > 
> > > > > It's probably too invasive for 14, but I think it might be worth exploring
> > > > > passing down a BulkInsertState in nodeModifyTable.c's table_tuple_insert() iff
> > > > > the input will be more than one row.
> > > > > 
> > > > > And then add the vm buffer of the target page to BulkInsertState, so that
> > > > > hio.c can avoid re-pinning the buffer.
> > > > > 
> > > > 
> > > > Yeah. The question still is what to do about 14, though. Shall we leave the
> > > > code as it is now, or should we change it somehow? It seem a bit unfortunate
> > > > that a COPY FREEZE optimization should negatively influence other (more)
> > > > common use cases, so I guess we can't just keep the current code ...
> > > 
> > > I'd suggest prototyping the use of BulkInsertState in nodeModifyTable.c
> > > and see whether that fixes the regression.
> 
> What Andres is suggesting (I think) is to modify ExecInsert() to pass a
> valid bistate to table_tuple_insert, instead of just NULL, and store the
> vmbuffer in it. Not sure how to identify when inserting more than just a
> single row, though ...

Maybe this is relevant.
https://commitfest.postgresql.org/33/2553/
| INSERT SELECT: BulkInsertState and table_multi_insert

The biistate part is small - Simon requested to also use table_multi_insert,
which makes the patch much bigger, and there's probably lots of restrictions I
haven't even thought to check.

This uses a GUC threshold for bulk insert, but I'm still not sure it's really
problematic to use a biistate for a single row.

        /* Use bulk insert after a threshold number of tuples */
        // XXX: maybe this should only be done if it's not a partitioned table or
        // if the partitions don't support miinfo, which uses its own bistates
        mtstate->ntuples++;
        if (mtstate->bistate == NULL &&
                        mtstate->ntuples > bulk_insert_ntuples &&
                        bulk_insert_ntuples >= 0)
        {
                elog(DEBUG1, "enabling bulk insert");
                mtstate->bistate = GetBulkInsertState();

-- 
Justin



pgsql-hackers by date:

Previous
From: Bharath Rupireddy
Date:
Subject: Re: Performance degradation of REFRESH MATERIALIZED VIEW
Next
From: Amit Langote
Date:
Subject: Re: ALTER TABLE .. DETACH PARTITION CONCURRENTLY