Re: refresh materialized view concurrently - Mailing list pgsql-hackers

From Kevin Grittner
Subject Re: refresh materialized view concurrently
Date
Msg-id 1371473850.55293.YahooMailNeo@web162904.mail.bf1.yahoo.com
Whole thread Raw
In response to Re: refresh materialized view concurrently  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List pgsql-hackers
Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
> On 14.06.2013 19:05, Kevin Grittner wrote:
>> Attached is a patch for REFRESH MATERIALIZED VIEW CONCURRENTLY
>> for 9.4 CF1.  The goal of this patch is to allow a refresh
>> without interfering with concurrent reads, using transactional
>> semantics.
>>
>> It is my hope to get this committed during this CF to allow me
>> to focus on incremental maintenance for the rest of the release
>> cycle.
>
> I must say this seems a bit pointless on its own.

I completely disagree.  When I read what people were posting about
the materialized view creation that went into 9.3, there were many
comments by people that they can't use it until the materialized
views can be refreshed without blocking readers.  There is a clear
need for this.  It doesn't do much to advance incremental
maintenance, but it is a much smaller patch which will make
matviews usable by a lot of people who can't use the initial
feature set.

> I didn't understand this error message:
>
> +     if (!foundUniqueIndex)
> +         ereport(ERROR,
> +                 (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> +                  errmsg("concurrent refresh requires a unique index on just columns for all rows of the
materializedview"))); 
> +
>
> What does that mean?

It means that the REFRESH MATERIALIZED VIEW CONCURRENTLY command
cannot be used on a materialized view unless it has at least one
UNIQUE index which is not partial (i.e., there is no WHERE clause)
and is not indexing on an expression (i.e., the index is entirely
on bare column names).  Set logic to do the "diff" is hard to get
right if the tables are not proper sets (i.e., they contain
duplicate rows).  I can see at least three ways it *could* be done,
but all of them are much more complex and significantly slower.
With a UNIQUE index on some set of columns in all rows the correct
guarantees exist to use fast set logic.  It isn't that it's needed
for access; it is needed to provide a guarantee that there is no
row without NULLs that exactly duplicates another row.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Alexander Korotkov
Date:
Subject: Re: GIN improvements part 3: ordering in index
Next
From: Heikki Linnakangas
Date:
Subject: Re: GIN improvements part2: fast scan