Re: refresh materialized view concurrently - Mailing list pgsql-hackers

From Hitoshi Harada
Subject Re: refresh materialized view concurrently
Date
Msg-id CAP7QgmkjVhTS-3BzUgeRyy0VR9ASvj-cVCsq_WE1ywHEA=uXNg@mail.gmail.com
Whole thread Raw
In response to Re: refresh materialized view concurrently  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: refresh materialized view concurrently
List pgsql-hackers



On Tue, Jun 25, 2013 at 9:07 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, Jun 21, 2013 at 5:20 AM, Hitoshi Harada <umi.tanuki@gmail.com> wrote:
> If I don't miss something, the requirement for the CONCURRENTLY option is to
> allow simple SELECT reader to read the matview concurrently while the view
> is populating the new data, and INSERT/UPDATE/DELETE and SELECT FOR
> UPDATE/SHARE are still blocked.  So, I wonder why it is not possible just to
> acquire ExclusiveLock on the matview while populating the data and swap the
> relfile by taking small AccessExclusiveLock.  This lock escalation is no
> dead lock hazard, I suppose, because concurrent operation would block the
> other at the point ExclusiveLock is acquired, and ExclusiveLock conflicts
> AccessExclusiveLock.  Then you don't need the complicated SPI logic or
> unique key index dependency.

This is no good.  One, all lock upgrades are deadlock hazards.  In
this case, that plays out as follows: suppose that the session running
REFRESH MATERIALIZED VIEW CONCURRENTLY also holds a lock on something
else.  Some other process takes an AccessShareLock on the materialized
view and then tries to take a conflicting lock on the other object.
Kaboom, deadlock.  Granted, the chances of that happening in practice
are small, but it IS the reason why we typically try to having
long-running operations perform lock upgrades.  Users get really
annoyed when their DDL runs for an hour and then rolls back.


OK, that' not safe.  What I was thinking was something similar to compare-and-swap, where the whole operation is atomic under an AccessExclusiveLock.  What if we release ExclusiveLock once a new matview was created and re-acquire AccessExclusiveLock before trying swap?  Note matview is a little different from index which I know people are talking about in REINDEX CONCURRENTLY thread, in that the content of matview does not change incrementally (at least at this point), but only does change fully in swapping operation by the same REFRESH MATERIALIZED VIEW command.  The only race condition is between releasing Exclusive lock and re-acquire AccessExclusiveLock someone else can go ahead with the same operation and could create another one.  If it happens, let's abort us, because I guess that's the way our transaction system is working anyway;  in case of unique key index insertion for example, if I find another guy is inserting the same value in the index, I wait for the other guy to finish his work and if his transaction commits I give up, otherwise I go ahead.  Maybe it's annoying if an hour operation finally gets aborted, but my purpose is actually achieved by the other guy.  If the primary goal of this feature is let reader reads the matview concurrently it should be ok?

Hmm, but in such cases the first guy is always win and the second guy who may come an hour later loses so we cannot get the result from the latest command... I still wonder there should be some way.
 
Two, until we get MVCC catalog scans, it's not safe to update any
system catalog tuple without an AccessExclusiveLock on some locktag
that will prevent concurrent catalog scans for that tuple.  Under
SnapshotNow semantics, concurrent readers can fail to see that the
object is present at all, leading to mysterious failures - especially
if some of the object's catalog scans are seen and others are missed.


So what I'm saying above is take AccessExclusiveLock on swapping relfile in catalog.  This doesn't violate your statement, I suppose.  I'm actually still skeptical about MVCC catalog, because even if you can make catalog lookup MVCC, relfile on the filesystem is not MVCC.  If session 1 changes relfilenode in pg_class and commit transaction, delete the old relfile from the filesystem, but another concurrent session 2 that just took a snapshot before 1 made such change keeps running and tries to open this relation, grabbing the old relfile and open it from filesystem -- ERROR: relfile not found.  So everyone actually needs to see up-to-date information that synchronizes with what filesystem says and that's SnapshotNow.   In my experimental thought above about compare-and-swap way, in compare phase he needs to see the most recent valid information, otherwise he never thinks someone did something new.  Since I haven't read the whole thread, maybe we have already discussed about it, but it would help if you clarify this concern.


Thanks,
--
Hitoshi Harada

pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: pg_filedump 9.3: checksums (and a few other fixes)
Next
From: Hitoshi Harada
Date:
Subject: Re: refresh materialized view concurrently