Docs for refresh materialized view concurrently - Mailing list pgsql-hackers

From Jeremy Finzel
Subject Docs for refresh materialized view concurrently
Date
Msg-id CAMa1XUi2nbHVVa6PrFTygpwPY1b_FYsJc6pywj3qPHoLdZ03_w@mail.gmail.com
Whole thread Raw
Responses Re: Docs for refresh materialized view concurrently
List pgsql-hackers
Speaking with Robert today at pgcon, I happily discovered that REFRESH MATERIALIZED VIEW CONCURRENTLY actually only updates rows that have changed since the last refresh, rather than rewriting every row.  In my curiosity, I went to the docs, and found that this detail is not mentioned anywhere.

This is a great feature that is being undersold, and it should be made clear in the docs.

In my experience, there can be tons of WAL generated from large materialized views and the normal REFRESH (without CONCURRENTLY).  I had assumed the only benefit of CONCURRENTLY was to allow concurrent access to the table.  But actually the incremental refresh is a much bigger win for us in reducing WAL overhead drastically.

I've not submitted a patch before, and have a few suggestions I'd like feedback on before I write one (for the docs only).

1.

First, even this summary looks untrue:

REFRESH MATERIALIZED VIEW — replace the contents of a materialized view.

"replace" is not really accurate with the CONCURRENTLY option, because in fact it only updates changed rows.

Perhaps instead of "replace":
  • "replace or incrementally update the contents of a materialized view".
Also, the Description part has the same inaccuracy:

"completely replaces the contents of a materialized view.....The old contents are discarded."

That is not true with CONCURRENTLY, correct?  Only the old contents *which have changed* are discarded.

2.

Lastly, I would suggest adding something like the following to the first paragraph under CONCURRENTLY:
  • With this option, only actual changed rows are updated in the materialized view, which can significantly reduce the amount of write churn and WAL traffic from a refresh if only a small number of rows will change with each refresh.  It is recommended to have a unique index on the materialized view if possible, which will improve the performance of a concurrent refresh.
Please correct me if my understanding of this is not right.

3.

On a different note, none of the documentation on materialized views notes that they can only be LOGGED.  This should be noted, or at least it should be noted that one cannot create an UNLOGGED materialized view in the same place it says that one cannot create a temporary one (under Description in CREATE MATERIALIZED VIEW).


Thanks!
Jeremy Finzel

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: compiling PL/pgSQL plugin with C++
Next
From: Floris Van Nee
Date:
Subject: Re: Index Skip Scan