Re: postgres materialized view refresh performance - Mailing list pgsql-general

From Philip Semanchuk
Subject Re: postgres materialized view refresh performance
Date
Msg-id C3EA7141-A27D-44C4-80BE-DB7968B09617@americanefficient.com
In response to Re: postgres materialized view refresh performance  (Ayub M <hiayub@gmail.com>)
List pgsql-general

> On Oct 26, 2020, at 10:45 AM, Ayub M <hiayub@gmail.com> wrote:
>
> It's a simple sequential scan plan of one line, just reading the base table sequentially.

Well, unless I have misunderstood you, the materialized view is basically just "select * from some_other_table", the
number of records in the source table is ~6m and doesn’t change much, there are no locking delays and no resource
shortages, but sometimes the refresh takes minutes, and sometimes hours. There’s something missing from the story here.

Some things to try or check on:
 - activity (CPU, disk, memory) during the period when the mat view is refreshing
 - each time after you refresh the mat view, vacuum it
 - even better, if you can afford a brief lock on reads, run a vacuum full instead of just regular vacuum
 - if possible, at the same time as you create the problematic mat view, run a similar process that writes to a
   different mat view (tmp_throwaway_mat_view) without the CONCURRENTLY keyword and see if it behaves similarly.
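
The vacuum and throwaway-view suggestions above could look roughly like the sketch below. It is only an illustration under assumptions: my_mat_view is a made-up name for the problematic mat view, the existing refresh is assumed to use CONCURRENTLY, and some_other_table is the placeholder name used earlier in this message.

```sql
-- Hypothetical name: my_mat_view stands in for the problematic mat view.
REFRESH MATERIALIZED VIEW CONCURRENTLY my_mat_view;

-- Plain VACUUM after each refresh: reclaims dead row versions left behind
-- by the refresh and updates planner statistics. Does not block reads.
VACUUM (VERBOSE, ANALYZE) my_mat_view;

-- Alternative if a brief lock on reads is acceptable: VACUUM FULL rewrites
-- the whole relation and holds an ACCESS EXCLUSIVE lock while it runs.
-- VACUUM FULL my_mat_view;

-- Control experiment: same source query, but refreshed without CONCURRENTLY.
CREATE MATERIALIZED VIEW tmp_throwaway_mat_view AS
    SELECT * FROM some_other_table;

REFRESH MATERIALIZED VIEW tmp_throwaway_mat_view;

-- Clean up once the timings have been compared.
DROP MATERIALIZED VIEW tmp_throwaway_mat_view;
```

Running these from psql with \timing on gives a per-statement duration to compare between the two views. Note that VACUUM cannot run inside a transaction block.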



>
> On Mon, Oct 26, 2020, 9:21 AM Philip Semanchuk <philip@americanefficient.com> wrote:
>
>
> > On Oct 25, 2020, at 10:52 PM, Ayub M <hiayub@gmail.com> wrote:
> >
> > Thank you both.
> >
> > As for the mview refresh taking long --
> >   • The mview gets refreshed in a couple of mins sometimes and sometimes it takes hours. When it runs for longer,
> >     there are no locks and no resource shortage, the number of recs in the base table is 6m (7.5gb) which is not huge so why
> >     does it take so long to refresh the mview?
> >
> > Does the run time correlate with the number of changes being made?
> >
> > -- Almost the same number of records are present in the base table (6 million records). The base table gets
> > truncated and reloaded every time with almost the same number of records.
> >
> > And the mview is a simple select from this one base table.
> >
> > The mview has around 10 indexes, 1 unique and 9 non-unique indexes.
> >
> > Population of the base table takes about 2 mins, using "insert into select from table", but when the mview is
> > created for the first time it takes 16 minutes. Even when I remove all but one unique index it takes about 7 minutes.
> > Any clue as to why it is taking longer than the creation of the base table (which is 2 mins)?
>
> Do you know if it’s executing a different plan when it takes a long time? auto_explain can help with that.
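
For the auto_explain suggestion just above, a session-level setup might look like the sketch below. The 60-second threshold and the name my_mat_view are assumptions for illustration, not values from this thread. Loading the module and changing its settings normally requires superuser privileges, and the plans are written to the server log rather than returned to the client.

```sql
-- Load auto_explain for this session only.
LOAD 'auto_explain';

-- Log the plan of any statement that runs longer than this threshold
-- (the 60s value is an arbitrary assumption; tune it to the slow case).
SET auto_explain.log_min_duration = '60s';

-- Include actual row counts and timings, as in EXPLAIN ANALYZE.
-- This adds measurement overhead to every statement in the session.
SET auto_explain.log_analyze = on;

-- Also log statements executed inside other statements; this may be needed
-- to see the internal queries issued by a CONCURRENTLY refresh.
SET auto_explain.log_nested_statements = on;

-- Hypothetical name: my_mat_view stands in for the problematic mat view.
REFRESH MATERIALIZED VIEW CONCURRENTLY my_mat_view;
```

Comparing the logged plan from a fast run with the one from a slow run should show whether the plan itself is changing.
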
>
>
>
> >
> > On Fri, Oct 23, 2020 at 10:53 AM Philip Semanchuk <philip@americanefficient.com> wrote:
> >
> >
> > > On Oct 23, 2020, at 9:52 AM, Ravi Krishna <sravikrishna@mail.com> wrote:
> > >
> > >> My understanding is that when CONCURRENTLY is specified, Postgres implements the refresh as a series of INSERT, UPDATE,
> > >> and DELETE statements on the existing view. So the answer to your question is no, Postgres doesn’t create another table and
> > >> then swap it.
> > >
> > > The INSERT/UPDATE/DELETE happens only for the difference.  PG first creates a new temp table and then compares
> > > it with the MV and detects the difference.  That is why for CONCURRENTLY, a unique index is required on the MV.
> >
> > Yes, thank you, that’s what I understand too but I expressed it very poorly.
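
The unique index requirement described in this exchange can be sketched as follows. The names my_mat_view, my_mat_view_id_idx and the column id are hypothetical; any column set that uniquely identifies each row of the mat view would do.

```sql
-- REFRESH ... CONCURRENTLY needs at least one unique index on the mat view
-- that uses only column names (no expressions) and covers all rows
-- (no WHERE clause), so that old and new rows can be matched up.
CREATE UNIQUE INDEX my_mat_view_id_idx ON my_mat_view (id);

-- With the index in place, the refresh builds the new contents separately,
-- diffs them against the existing mat view, and applies only the differences.
-- Readers are not blocked while this runs.
REFRESH MATERIALIZED VIEW CONCURRENTLY my_mat_view;
```

Without such an index the CONCURRENTLY form is rejected with an error, while a plain REFRESH MATERIALIZED VIEW still works but blocks reads on the mat view for its duration.
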
> >
> >
> >
> > --
> > Regards,
> > Ayub
>



