Re: Performance degradation of REFRESH MATERIALIZED VIEW - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: Performance degradation of REFRESH MATERIALIZED VIEW
Date
Msg-id CALj2ACVMWdvv3sq11ww0Fb-K0knxCCe93oDAbPqEXQWPvpFQMQ@mail.gmail.com
Whole thread Raw
In response to Re: Performance degradation of REFRESH MATERIALIZED VIEW  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: Performance degradation of REFRESH MATERIALIZED VIEW  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
On Tue, Apr 20, 2021 at 11:20 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Apr 20, 2021 at 11:04 AM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > On Mon, Apr 19, 2021 at 7:21 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > I’ve updated the patch including the above comment.
> >
> > Thanks for the patch.
> >
> > I was trying to understand below statements:
> > +                 * we check without a buffer lock if the page is empty but the
> > +                 * caller doesn't need to recheck that since we assume that in
> > +                 * HEAP_INSERT_FROZEN case, only one process is inserting a
> > +                 * frozen tuple into this relation.
> > +                 *
> >
> > And earlier comments from upthread:
> >
> > >> AFAICS the page is always empty when RelationGetBufferForTuple
> > >> returned a valid vmbuffer.  So the "if" should be an "assert" instead.
> >
> > > There is a chance that RelationGetBufferForTuple() returns a valid
> > > vmbuffer but the page is not empty, since RelationGetBufferForTuple()
> > > checks without a lock if the page is empty. But when it comes to
> > > HEAP_INSERT_FROZEN cases it actually doesn’t happen at least for now
> > > since only one process inserts tuples into the relation. Will fix."
> >
> > I'm not sure whether it is safe to assume "at least for now since only
> > one process inserts tuples into the relation". What if we allow
> > parallel inserts for HEAP_INSERT_FROZEN cases, I don't know whether we
> > can do that or not. Correct me if I'm wrong.
>
> I think if my assumption is wrong or we allow parallel insert for
> HEAP_INSERT_FROZEN cases in the future, we need to deal with the case
> where frozen tuples are concurrently inserted into the same page. For
> example, we can release vmbuffer when we see the page is no longer
> empty, or we can return a valid buffer but require the caller to
> re-check if the page is still empty. The previous version patch took
> the former approach. More concretely, heap_insert() rechecked if the
> page is still empty in HEAP_INSERT_FROZEN case and set all_frozen_set
> if so. But AFAICS concurrently inserting frozen tuples into the same
> page doesn’t happen for now (COPY FREEZE and REFRESH MATERIALIZED VIEW
> are users of HEAP_INSERT_FROZEN), also pointed out by Horiguchi-san.
> So I added comments and assertions rather than addressing the case
> that never happens with the current code. If concurrently inserting
> frozen tuples into the same page happens, we should get the assertion
> failure that I added in RelationGetBufferForTuple().

Upon thinking further, concurrent insertions into the same page are
not possible while we are in heap_insert in between
RelationGetBufferForTuple and UnlockReleaseBuffer(buffer);.
RelationGetBufferForTuple will lock the buffer in exclusive mode, see
LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE); and comment " *    Returns
pinned and exclusive-locked buffer of a page in given relation". Even
if parallel insertions are allowed in HEAP_INSERT_FROZEN cases, then
each worker will separately acquire pages, insert into them and they
skip getting visibility map page pin if the page is set all-visible by
another worker.

Some more comments on v3 patch:
1) Isn't it good to specify here that what we gain by avoiding pinning
visibility map page something like: gain a few seconds/avoid extra
function calls/or some other better wording?
+                 * If the page already is non-empty and all-visible, we skip to
+                 * pin on a visibility map buffer since we never clear and set
+                 * all-frozen bit on visibility map during inserting a frozen
+                 * tuple.
+                 */

2) Isn't it good to put PageIsAllVisible(BufferGetPage(buffer))) in
the if clause instead of else if clause, because this is going to be
hitting most of the time, we could avoid page empty check every time?
+                if (PageGetMaxOffsetNumber(BufferGetPage(buffer)) == 0)
+                    visibilitymap_pin(relation, targetBlock, vmbuffer);
+                else if (PageIsAllVisible(BufferGetPage(buffer)))
+                    skip_vmbuffer = true;

3) I found another typo in v3 - it is "will set" not "will sets":
+ *    In HEAP_INSERT_FROZEN cases, we handle the possibility that the
caller will
+ *    sets all-frozen bit on the visibility map page.  We pin on the visibility

4) I think a commit message can be added to the upcoming patch.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Justin Pryzby
Date:
Subject: Re: terminate called after throwing an instance of 'std::bad_alloc'
Next
From: Michael Paquier
Date:
Subject: Re: Table refer leak in logical replication