Re: Materialized views WIP patch - Mailing list pgsql-hackers

From Noah Misch
Subject Re: Materialized views WIP patch
Date
Msg-id 20130124180754.GB2448@tornado.leadboat.com
Whole thread Raw
In response to Re: Materialized views WIP patch  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Materialized views WIP patch
List pgsql-hackers
On Thu, Jan 17, 2013 at 07:54:55AM -0500, Robert Haas wrote:
> On Wed, Jan 16, 2013 at 1:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Where I really need someone to hit me upside the head with a
> >> clue-stick is the code I added to the bottom of RelationBuildDesc()
> >> in relcache.c. The idea is that on first access to an unlogged MV,
> >> to detect that the heap has been replaced by the init fork, set
> >> relisvalid to false, and make the heap look normal again.
> >
> > Hmm.  I agree that relcache.c has absolutely no business doing that,
> > but not sure what else to do instead.  Seems like it might be better
> > done at first touch of the MV in the parser, rewriter, or planner ---
> > but the fact that I can't immediately decide which of those is right
> > makes me feel that it's still too squishy.
> 
> I think we shouldn't be doing that at all.  The whole business of
> transferring the relation-is-invalid information from the relation to
> a pg_class flag seems like a Rube Goldberg device to me.  I'm still
> not convinced that we should have a relation-is-invalid flag at all,
> but can we at least not have two?
> 
> It seems perfectly adequate to detect that the MV is invalid when we
> actually try to execute a plan - that is, when we first access the
> heap or one of its indexes.  So the bit can just live in the
> file-on-disk, and there's no need to have a second copy of it in
> pg_class.

Like Kevin, I want a way to distinguish unpopulated MVs from MVs that
genuinely yielded the empty set at last refresh.  I agree that there's no
particular need to store that fact in pg_class, and I would much prefer only
storing it one way in any case.  A user-visible disadvantage of the current
implementation is that relisvalid remains stale until something opens the rel.
That's fine for the system itself, but it can deceive user-initiated catalog
queries.  Imagine a check_postgres action that looks for invalid MVs to
complain about.  It couldn't just scan pg_class; it would need to first do
something that opens every MV.

I suggest the following:

1. Let an invalid MV have a zero-length heap.  Distinguish a valid, empty MV  by giving it a page with no tuples.  This
entailsVACUUM[1] not truncating  MVs below one page and the refresh operation, where necessary, extending  the relation
fromzero pages to one.
 
2. Remove pg_class.relisvalid.
3. Add a bool field to RelationData.  The word "valid" is used in that context  to refer to the validity of the
structureitself, so perhaps call the new  field rd_scannable.  RelationIsFlaggedAsValid() can become a macro;  consider
changingits name for consistency with the field name.
 
4. During relcache build, set the field to "RelationGetNumberBlocks(rel) != 0"  for MVs, fixed "true" for everyone
else. Any operation that changes the  field must, and probably would anyway, instigate a relcache invalidation.
 
5. Expose a database function, say pg_relation_scannable(), for querying the  current state of a relation.  This
supportsuser-level monitoring.
 

Does that seem reasonable?  One semantic difference to keep in mind is that
unlogged MVs will be considered invalid on the standby while valid on the
master.  That's essentially an accurate report, so I won't mind it.

For the benefit of the archives, I note that we almost need not truncate an
unlogged materialized view during crash recovery.  MVs are refreshed in a
VACUUM FULL-like manner: fill a new relfilenode, fsync it, and point the MV's
pg_class to that relfilenode.  When a crash occurs with no refresh in flight,
the latest refresh had been safely synced.  When a crash cuts short a refresh,
the pg_class update will not stick, and the durability of the old heap is not
in doubt.  However, non-btree index builds don't have the same property; we
would need to force an immediate sync of the indexes to be safe here.  It
would remain necessary to truncate unlogged MVs when recovering a base backup,
which may contain a partially-written refresh that did eventually commit.
Future MV variants that modify the MV in place would also need the usual
truncate on crash.

I'm going to follow this with a review covering a broader range of topics.

Thanks,
nm

[1] For the time being, it's unfortunate to VACUUM materialized views at all;
they only ever bear frozen tuples.



pgsql-hackers by date:

Previous
From: David Fetter
Date:
Subject: LATERAL, UNNEST and spec compliance
Next
From: Noah Misch
Date:
Subject: Re: Materialized views WIP patch