Thread: REFRESH MATERIALIZED VIEW locklevel

REFRESH MATERIALIZED VIEW locklevel

From
Andres Freund
Date:
Hi,

if I understand things correctly REFRESH MATERIALIZED VIEW locks the
materialized view with an AcessExclusiveLock even if the view already
contains data.
I am pretty sure that will - understandably - confuse users, so I vote
for at least including a note about that in the docs.

This fact imo reduces the usability of the matviews features as it
stands atm considerably. I think we should be very careful not to
advocate its existance much and document very clearly that its work in
progress.
Working incrementally is a sensible thing to do, don't get me wrong...

Making the refresh work concurrently doesn't seem to be too hard if its
already initialized:

1) lock relation exlusively in session mode (or only ShareUpdateExlusive?)
2) build new data into a separate relfilenode
3) switch relfilenode
4) wait till all potential users of the old relfilenode are gone  (VirtualXactLock games, just as in CREATE INDEX
CONCURRENTLY)
5) drop old relfilenode

The only problem I see right now is that we might forget to delete the
old relation if we crash during 4). Even if we WAL log it, due to
checkpoints causing that action not to be replayed.
But that seems to be nothing new, I think the same problem exists during
normal table rewrites as well, just the other way round (i.e. we forget
about the new relfilenode).

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: REFRESH MATERIALIZED VIEW locklevel

From
Peter Geoghegan
Date:
On 7 March 2013 09:30, Andres Freund <andres@2ndquadrant.com> wrote:
> This fact imo reduces the usability of the matviews features as it
> stands atm considerably. I think we should be very careful not to
> advocate its existance much and document very clearly that its work in
> progress.

+1

-- 
Regards,
Peter Geoghegan



Re: REFRESH MATERIALIZED VIEW locklevel

From
Kevin Grittner
Date:
Andres Freund <andres@2ndquadrant.com> wrote:

> if I understand things correctly REFRESH MATERIALIZED VIEW locks
> the materialized view with an AcessExclusiveLock even if the view
> already contains data.

Yeah.  At the time I had to make a decision on that, REINDEX
CONCURRENTLY did not seem reliable with a weaker lock, and REFRESH
MATERIALIZED VIEW has to rebuild indexes (among other things).  If
we have all the issues sorted out with REINDEX CONCURRENTLY then
the same techniques will probably apply to RMV without too much
difficulty.  It's a bit late to think about that for 9.3, though.

> I am pretty sure that will - understandably - confuse users, so I
> vote for at least including a note about that in the docs.

Will see about working that in.

-Kevin

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: REFRESH MATERIALIZED VIEW locklevel

From
Andres Freund
Date:
On 2013-03-07 09:55:39 -0800, Kevin Grittner wrote:
> Andres Freund <andres@2ndquadrant.com> wrote:
>
> > if I understand things correctly REFRESH MATERIALIZED VIEW locks
> > the materialized view with an AcessExclusiveLock even if the view
> > already contains data.
>
> Yeah.  At the time I had to make a decision on that, REINDEX
> CONCURRENTLY did not seem reliable with a weaker lock, and REFRESH
> MATERIALIZED VIEW has to rebuild indexes (among other things).  If
> we have all the issues sorted out with REINDEX CONCURRENTLY then
> the same techniques will probably apply to RMV without too much
> difficulty.  It's a bit late to think about that for 9.3, though.

I don't think that REFRESH MATERIALIZED VIEW has to deal with the same
problems that REINDEX CONCURRENTLY has - after all, there won't be any
DML coming in while its running. That should get rid of the REINDEX
CONCURRENTLY problems.
There doesn't seem to be any need to use the far more expensive REINDEX
CONCURRENLTY style computation of indexes which has to scan the heap
multiple times et al. They just should be built ontop of new matview
relation which is essentially read only.
> > I am pretty sure that will - understandably - confuse users, so I
> > vote for at least including a note about that in the docs.
>
> Will see about working that in.

Cool.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: REFRESH MATERIALIZED VIEW locklevel

From
"anarazel@anarazel.de"
Date:

Kevin Grittner <kgrittn@ymail.com> schrieb:

>Andres Freund <andres@2ndquadrant.com> wrote:
>
>> if I understand things correctly REFRESH MATERIALIZED VIEW locks
>> the materialized view with an AcessExclusiveLock even if the view
>> already contains data.
>
>Yeah.  At the time I had to make a decision on that, REINDEX
>CONCURRENTLY did not seem reliable with a weaker lock, and REFRESH
>MATERIALIZED VIEW has to rebuild indexes (among other things).  If
>we have all the issues sorted out with REINDEX CONCURRENTLY then
>the same techniques will probably apply to RMV without too much
>difficulty.  It's a bit late to think about that for 9.3, though.
>
>> I am pretty sure that will - understandably - confuse users, so I
>> vote for at least including a note about that in the docs.
>
>Will see about working that in.

In the ride home I realized that unless - not that unlikely - you thought about something I didtn't  REFRESH will
behavesimilar to TRUNCATE for repeatable read+ transactions that only access it after REFRESH finished. That is, they
willappear empty. If that's the case, it needs to be documented prominently as well. 

It would be possible to get different behaviour by immediately freezing all tuples, but that would also result in
violationsof visibility by showing  tuples that are not yet visible. 

Andres

---
Please excuse brevity and formatting - I am writing this on my mobile phone.



Re: REFRESH MATERIALIZED VIEW locklevel

From
Kevin Grittner
Date:
"anarazel@anarazel.de" <andres@anarazel.de> wrote:

> In the ride home I realized that unless - not that unlikely - you
> thought about something I didtn't  REFRESH will behave similar to
> TRUNCATE for repeatable read+ transactions that only access it
> after REFRESH finished. That is, they will appear empty.

In an early version of the patch someone found that in testing and
pointed it out.

> It would be possible to get different behaviour by immediately
> freezing all tuples

Which is what I did.

> but that would also result in violations of visibility by showing
> tuples that are not yet visible.

Which is the case, and should be documented.  (I had not remembered
to do so yet; I'll tuck away your email as a reminder.)  Since the
MV is already not guaranteed to be in sync with other data, that
didn't seem like a fatal flaw.  It is, however, the one case where
the MV could appear to be *ahead* of the supporting data rather
than *behind* it.  In a way this is similar to how READ COMMITTED
transactions can see data from more than one snapshot on write
conflicts, so I see it as a bigger issue for more strict isolation
levels -- but those are unlikely to be all that useful with MVs in
this release anyway.  This is something that I think deserves some
work in a subsequent release.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: REFRESH MATERIALIZED VIEW locklevel

From
Andres Freund
Date:
On 2013-03-07 11:50:11 -0800, Kevin Grittner wrote:
> "anarazel@anarazel.de" <andres@anarazel.de> wrote:
>
> > In the ride home I realized that unless - not that unlikely - you
> > thought about something I didtn't  REFRESH will behave similar to
> > TRUNCATE for repeatable read+ transactions that only access it
> > after REFRESH finished. That is, they will appear empty.
>
> In an early version of the patch someone found that in testing and
> pointed it out.

Cool ;)
> > It would be possible to get different behaviour by immediately
> > freezing all tuples
>
> Which is what I did.

Ok.

> > but that would also result in violations of visibility by showing
> > tuples that are not yet visible.
>
> Which is the case, and should be documented.  (I had not remembered
> to do so yet; I'll tuck away your email as a reminder.)  Since the
> MV is already not guaranteed to be in sync with other data, that
> didn't seem like a fatal flaw.  It is, however, the one case where
> the MV could appear to be *ahead* of the supporting data rather
> than *behind* it.  In a way this is similar to how READ COMMITTED
> transactions can see data from more than one snapshot on write
> conflicts, so I see it as a bigger issue for more strict isolation
> levels -- but those are unlikely to be all that useful with MVs in
> this release anyway.  This is something that I think deserves some
> work in a subsequent release.

I am not that convinced that this is unproblematic. I don't see any
problem with READ COMMITTED but in higher levels Even if you expect the
view to be out-of-date it may very well be surprising to see it
referring to rows in another table that do not yet exists although rows
in that table never get deleted.

Especially in the initial population I don't see any way to get rid of
the problem - I don't think there exists a valid way to compute valid
xmin/xmax values for the resulting tuples of all queries. So unless we
get catalog accesses that completly objeys repeatable read semantics
there's not much we can do about that. And while I think getting rid of
SnapshotNow is realistic, I don't think fully versioned catalog access
is (i.e. it working in a way that you could access a table in the old
and new version after a ALTER TABLE ...).

I wonder if we should add something like indexcheckxmin to matviews
which specifies after which value its valid. Only that it errors out if
you haven't reached it.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: REFRESH MATERIALIZED VIEW locklevel

From
Josh Berkus
Date:
Andres,

> if I understand things correctly REFRESH MATERIALIZED VIEW locks the
> materialized view with an AcessExclusiveLock even if the view already
> contains data.
> I am pretty sure that will - understandably - confuse users, so I vote
> for at least including a note about that in the docs.

+1 for mentioning it in the docs.  We could stand to document what
locklevels various commands take more in general.

> This fact imo reduces the usability of the matviews features as it
> stands atm considerably. I think we should be very careful not to
> advocate its existance much and document very clearly that its work in
> progress.
> Working incrementally is a sensible thing to do, don't get me wrong...

-1 from me.

Postgres is currently full of fairly innocent-looking commands which
take an unexpected ACCESS EXCLUSIVE lock.  For example, DROP CONSTRAINT
takes an accessexclusive lock, but it hasn't stopped people from using
constraints, and isn't particularly high up on our todo list to fix it.

This limitation is in no way crippling for this feature, or even a major
detraction.  I still intend to promote the heck out of this feature.

Now, I agree that having a REFRESH ... CONCURRENTLY would be a wonderful
feature for 9.4.  But the fact that we don't have it yet is not a big
deal, and I would put several other matview-related features ahead of
concurrent in priority.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: REFRESH MATERIALIZED VIEW locklevel

From
Andres Freund
Date:
Hi,

On 2013-03-07 15:21:35 -0800, Josh Berkus wrote:
> > This fact imo reduces the usability of the matviews features as it
> > stands atm considerably. I think we should be very careful not to
> > advocate its existance much and document very clearly that its work in
> > progress.
> > Working incrementally is a sensible thing to do, don't get me wrong...
> 
> -1 from me.
> 
> Postgres is currently full of fairly innocent-looking commands which
> take an unexpected ACCESS EXCLUSIVE lock.  For example, DROP CONSTRAINT
> takes an accessexclusive lock, but it hasn't stopped people from using
> constraints, and isn't particularly high up on our todo list to fix
> it.

Thats a pretty unconvincing comparison. There isn't any expectation that
ALTER TABLE works without taking exlusive locks from common
implementations and DROP CONSTRAINT only takes a very short time while
refreshing a materialized view possibly take rather long.

> This limitation is in no way crippling for this feature, or even a major
> detraction.  I still intend to promote the heck out of this feature.

Thats scaring me. Because the current state of the feature isn't
something that people expect under the term "materialized views" and I
am pretty damn sure people will then remember postgres as trying to
provide a tick-box item without it being really usable in the real
world.
And thats not something I want postgres to be known for.

Note that I *really* think working incrementally on such things is the
way to go and I think its good that this got committed in 9.3. But if
this now gets used prominently in promotion in its current state I think
the conclusion is that working incrementally in postgres isn't the way
to go and that will make it *much* harder to do so in future
releases. Which will slow postgres down.

Greetings,

Andres Freund



Re: REFRESH MATERIALIZED VIEW locklevel

From
Josh Berkus
Date:
>> Postgres is currently full of fairly innocent-looking commands which
>> take an unexpected ACCESS EXCLUSIVE lock.  For example, DROP CONSTRAINT
>> takes an accessexclusive lock, but it hasn't stopped people from using
>> constraints, and isn't particularly high up on our todo list to fix
>> it.
> 
> Thats a pretty unconvincing comparison. There isn't any expectation that
> ALTER TABLE works without taking exlusive locks from common

Not exclusive (which is expected), but AccessExclusive (which catches
many of our users by surprise).

How about the fact that dropping an FK constraint takes an
AccessExclusiveLock on the *referenced* table?

> implementations and DROP CONSTRAINT only takes a very short time while
> refreshing a materialized view possibly take rather long.

Right now there's no expectations at all about our new Matview feature.I think putting the locking information in the
docsis the right way to go.
 

> Thats scaring me. Because the current state of the feature isn't
> something that people expect under the term "materialized views" and I
> am pretty damn sure people will then remember postgres as trying to
> provide a tick-box item without it being really usable in the real
> world.
> And thats not something I want postgres to be known for.

We promoted the heck out of binary replication when it was barely
usable.  We've gotten huge interest in our JSON support, even when it's
a work-in-progress.  I don't see why I should change an approach to
advocacy which is clearly working.  What our project considers an
incomplete feature other OSS DBMSes call a version 2.0.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: REFRESH MATERIALIZED VIEW locklevel

From
Andres Freund
Date:
On 2013-03-07 15:54:32 -0800, Josh Berkus wrote:
> 
> >> Postgres is currently full of fairly innocent-looking commands which
> >> take an unexpected ACCESS EXCLUSIVE lock.  For example, DROP CONSTRAINT
> >> takes an accessexclusive lock, but it hasn't stopped people from using
> >> constraints, and isn't particularly high up on our todo list to fix
> >> it.
> > 
> > Thats a pretty unconvincing comparison. There isn't any expectation that
> > ALTER TABLE works without taking exlusive locks from common
> 
> Not exclusive (which is expected), but AccessExclusive (which catches
> many of our users by surprise).
> 
> How about the fact that dropping an FK constraint takes an
> AccessExclusiveLock on the *referenced* table?

All of that is DDL.

> > implementations and DROP CONSTRAINT only takes a very short time while
> > refreshing a materialized view possibly take rather long.
> 
> Right now there's no expectations at all about our new Matview feature.
>  I think putting the locking information in the docs is the right way to go.

That should definitely be done.

The point is that
a) refreshing is the only way to update materialized views. There's no  incremental support.
b) refreshing will take a long time (otherwise you wouldn't have  create a materialized view) and you can't use the
viewduring that.
 

Which means for anyone wanting to use matviews in a busy environment you
will need to build the new matview separately and then move it into
place via renames. With all the issues that brings like needing to
recreate dependent views and such.

Sorry, but thats not very useful expect (and there very much so) as a
stepping stone for further work.

> > Thats scaring me. Because the current state of the feature isn't
> > something that people expect under the term "materialized views" and I
> > am pretty damn sure people will then remember postgres as trying to
> > provide a tick-box item without it being really usable in the real
> > world.
> > And thats not something I want postgres to be known for.
> 
> We promoted the heck out of binary replication when it was barely
> usable.  We've gotten huge interest in our JSON support, even when it's
> a work-in-progress.  I don't see why I should change an approach to
> advocacy which is clearly working.  What our project considers an
> incomplete feature other OSS DBMSes call a version 2.0.

I heard some people grumble about binary replication in 9.0 but there
were loads of realword scenarios it could be used. I heard quite some
people being annoyed about the level of json support even though it
provided some usefulness with row_to_json (or whatever its called). And
it's a feature that can be extended by extensions. And lots of the
defficiencies of binary replication could be solved by outside tooling.

Thats not possible with matviews as is. Which, again, is *totally fine*
in itself.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: REFRESH MATERIALIZED VIEW locklevel

From
Nicolas Barbier
Date:
2013/3/8 Andres Freund <andres@anarazel.de>:

> On 2013-03-07 15:21:35 -0800, Josh Berkus wrote:
>
>> This limitation is in no way crippling for this feature, or even a major
>> detraction.  I still intend to promote the heck out of this feature.
>
> Thats scaring me. Because the current state of the feature isn't
> something that people expect under the term "materialized views" and I
> am pretty damn sure people will then remember postgres as trying to
> provide a tick-box item without it being really usable in the real
> world.
> And thats not something I want postgres to be known for.

+1. It seems wise to wait for the feature to ripen some more. That
way, the impact of any promotion will be stronger; Most people
understand “materialized views” to mean something more that what is
currently there.

Of course, a drawback of waiting would be that you might lose the
momentum of the expression “materialized views.” OTOH, any questions
along the lines of “I thought PG supported materialized views since
9.3? Why are they making such a fuss about it now (i.e., > 9.3)?”
would lead to people discussing even more, which might enhance the
effect of the promotion.

Nicolas

--
A. Because it breaks the logical sequence of discussion.
Q. Why is top posting bad?



Re: REFRESH MATERIALIZED VIEW locklevel

From
Josh Berkus
Date:
Andres,

Crossing this over to pgsql-advocacy, because this is really an Advocacy
discussion.

> The point is that
> a) refreshing is the only way to update materialized views. There's no
>    incremental support.
> b) refreshing will take a long time (otherwise you wouldn't have
>    create a materialized view) and you can't use the view during that.
>
> Which means for anyone wanting to use matviews in a busy environment you
> will need to build the new matview separately and then move it into
> place via renames. With all the issues that brings like needing to
> recreate dependent views and such.

There's a lot of shops which currently have matviews which are referesed
daily, during low-activity periods.  I consult for several.  While
concurrent refresh will make MVs much more useful for shops with a
tighter refresh cycle, what Kevin has developed is useful *to me*
immediately.  It allows me to cut dozens to hundreds of lines of
application code and replace it with a simple declaration and a Refresh
cron job.

> Sorry, but thats not very useful expect (and there very much so) as a
> stepping stone for further work.

What you're saying is "MVs aren't useful *to me* in their current state,
therefore they aren't useful, therefore they're a non-feature."  Well,
the 9.3 version is useful to *me*, and I expect that they will be useful
to a large number of other people, even if they don't help *you*.

As a parallel feature, 9.2's cascading replication is completely useless
to me and my clients because streaming-only remastering isn't supported.
 I expressed the opinion that maybe we shouldn't promote cascade rep as
a major feature until it was; I was outvoted, because it turns out that
9.2 cascade rep *is* useful to a large number of people who are willing
to work around its current limitations.

Further, we get pretty much one and only one chance to promote a new
major feature, which is when that feature is first introduced.
Improving the feature in the next version of Postgres is not news, so we
can't successfully promote it.  If we soft-pedal MVs in the 9.3
announcement, we will not be able to get people excited about them in
9.4; they will be "yesterday's news".

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


Re: REFRESH MATERIALIZED VIEW locklevel

From
Merlin Moncure
Date:
On Fri, Mar 8, 2013 at 11:59 AM, Josh Berkus <josh@agliodbs.com> wrote:
> Andres,
>
> Crossing this over to pgsql-advocacy, because this is really an Advocacy
> discussion.
>
>> The point is that
>> a) refreshing is the only way to update materialized views. There's no
>>    incremental support.
>> b) refreshing will take a long time (otherwise you wouldn't have
>>    create a materialized view) and you can't use the view during that.
>>
>> Which means for anyone wanting to use matviews in a busy environment you
>> will need to build the new matview separately and then move it into
>> place via renames. With all the issues that brings like needing to
>> recreate dependent views and such.
>
> There's a lot of shops which currently have matviews which are referesed
> daily, during low-activity periods.  I consult for several.  While
> concurrent refresh will make MVs much more useful for shops with a
> tighter refresh cycle, what Kevin has developed is useful *to me*
> immediately.  It allows me to cut dozens to hundreds of lines of
> application code and replace it with a simple declaration and a Refresh
> cron job.
>
>> Sorry, but thats not very useful expect (and there very much so) as a
>> stepping stone for further work.
>
> What you're saying is "MVs aren't useful *to me* in their current state,
> therefore they aren't useful, therefore they're a non-feature."  Well,
> the 9.3 version is useful to *me*, and I expect that they will be useful
> to a large number of other people, even if they don't help *you*.
>
> As a parallel feature, 9.2's cascading replication is completely useless
> to me and my clients because streaming-only remastering isn't supported.
>  I expressed the opinion that maybe we shouldn't promote cascade rep as
> a major feature until it was; I was outvoted, because it turns out that
> 9.2 cascade rep *is* useful to a large number of people who are willing
> to work around its current limitations.
>
> Further, we get pretty much one and only one chance to promote a new
> major feature, which is when that feature is first introduced.
> Improving the feature in the next version of Postgres is not news, so we
> can't successfully promote it.  If we soft-pedal MVs in the 9.3
> announcement, we will not be able to get people excited about them in
> 9.4; they will be "yesterday's news".

+1 on this.  they are useful to me as immediately and I work in busy
environments.  the formal matview feature is a drop in replace for my
ad hoc implementation of 'drop cache table, replace from view'.  I
already have to work around the locking issue anyways -- sure, it
would be great if I didn't have to do that either but I'll take the
huge syntactical convenience alone.

merlin



Re: REFRESH MATERIALIZED VIEW locklevel

From
"Joshua D. Drake"
Date:
On 03/08/2013 10:09 AM, Merlin Moncure wrote:
>
> On Fri, Mar 8, 2013 at 11:59 AM, Josh Berkus <josh@agliodbs.com> wrote:
>> Andres,

>> Further, we get pretty much one and only one chance to promote a new
>> major feature, which is when that feature is first introduced.
>> Improving the feature in the next version of Postgres is not news, so we
>> can't successfully promote it.  If we soft-pedal MVs in the 9.3
>> announcement, we will not be able to get people excited about them in
>> 9.4; they will be "yesterday's news".
>
> +1 on this.  they are useful to me as immediately and I work in busy
> environments.  the formal matview feature is a drop in replace for my
> ad hoc implementation of 'drop cache table, replace from view'.  I
> already have to work around the locking issue anyways -- sure, it
> would be great if I didn't have to do that either but I'll take the
> huge syntactical convenience alone.

Just to throw my +1 into the ring. Well written Josh.

JD


>
> merlin
>
>


--
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC
@cmdpromptinc - 509-416-6579