matview incremental maintenance

From
Kevin Grittner
Date:
Since there seems to be interest in discussing incremental
maintenance of materialized views *now*, I'm starting this thread
to try to avoid polluting unrelated threads with the discussion.  I
don't intend to spend a lot of time on it until the CF in progress
completes, but at that point the work will start in earnest.  So
I'll say where I'm at, and welcome anyone who has time to spare
outside of the CF to comment or contribute ideas.

The paper at the core of the discussion can be found by searching
for "maintaining views incrementally gupta mumick subrahmanian" --
it's on both the ACM and CiteSeerX websites.  Of course, one
doesn't need to understand that paper to discuss techniques for
capturing the base deltas, but I'm hoping that's not what takes up
most of the discussion.  I expect the most important discussions to
be around how best to handle the "count(t)" (or count_t) column,
what form should be used for intermediate results, how to modify or
add execution nodes which know how to deal with the count, how to
generate set operations to use those nodes, and how to modify the
planner to choose the best plan for these operations.  Whether to
pull the deltas off the WAL stream or stuff them into a tuplestore
as they are written seems to me to be a relatively minor point.  If
properly abstracted, the performance and complexity of alternatives
can be compared.
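The core bookkeeping of the counting algorithm from that paper can be sketched independently of PostgreSQL: each view row carries a count of its derivations (the count_t column), deltas adjust those counts, and a row disappears when its count reaches zero. A minimal illustrative sketch in plain Python (not PostgreSQL code; all names are invented for illustration):

```python
from collections import Counter

def apply_delta(view_counts, delta):
    """Apply a multiset delta to a view maintained with per-row
    derivation counts (the count_t column of the counting algorithm).

    view_counts: Counter mapping row tuple -> number of derivations
    delta: iterable of (row, n) pairs; n > 0 adds derivations,
           n < 0 removes them.
    """
    for row, n in delta:
        view_counts[row] += n
        if view_counts[row] == 0:
            del view_counts[row]        # row is no longer derivable
        elif view_counts[row] < 0:
            raise ValueError("delta removed more derivations than exist")
    return view_counts

# For a projection view SELECT a FROM t, duplicate source rows show up
# as count_t > 1 rather than as duplicate view rows.
view = Counter({("x",): 2, ("y",): 1})
apply_delta(view, [(("y",), -1), (("z",), 1)])
```

Deleting the last derivation of ("y",) removes the row, while ("x",) survives with its count intact.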

The one thing that seems somewhat clear to me at the moment is that
the complex set algebra needed to use the counting algorithm for
incremental maintenance is not going to be something I want to
handle by dynamically building up execution nodes.  That way lies
madness. SPI or something very similar to it should be used,
probably with a layer or two above it to simplify working with the
algebra separately from diddling around with strings for the query
fragments.
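To make the "layer or two above it" concrete, here is a toy sketch (invented names, illustrative column handling only) of a helper that expresses one operator of the counting algebra while hiding the string-diddling: the union of two delta relations, where equal rows merge, their count_t values sum, and rows with a net count of zero drop out.

```python
def delta_union(left_sql, right_sql, cols):
    """Compose the counting-algebra union of two delta queries.

    left_sql, right_sql: SQL text producing (cols..., count_t) rows.
    cols: the non-count columns on which rows are matched.
    Returns SQL text; a caller would hand this to SPI for execution.
    """
    col_list = ", ".join(cols)
    return (
        f"SELECT {col_list}, sum(count_t) AS count_t "
        f"FROM (({left_sql}) UNION ALL ({right_sql})) AS d "
        f"GROUP BY {col_list} "
        f"HAVING sum(count_t) <> 0"
    )

sql = delta_union("SELECT a, count_t FROM delta_t1",
                  "SELECT a, count_t FROM delta_t2",
                  cols=["a"])
```

The point is only the separation of concerns: the caller thinks in algebra operators, and the query-fragment strings stay inside the layer.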

At the developer meeting last month, we talked about the special
new count column for a bit, and everyone seemed to agree that
adding such an animal, and creating execution nodes which were
aware of it, would best be done on top of the patch Álvaro has been
working on to replace attnum with three columns: a logical ID
number for each column, the physical order of the attribute within
the tuple image, and the display order (for SELECT *, INSERT
without a column list, and similar cases).  We seemed to have
consensus that the count_t column would not display by default, but
could be explicitly called out by a query, similar to the current
handling of system columns.  Nobody wanted to have a negative
column number for the count or add it to the tuple header
structure.  Unfortunately I have heard from Álvaro that the patch
is not complete and is not on his list of things to work on in the
near future.

Long term, timings for incremental maintenance that people would
like to see (from most eager to least eager) are:

 - as part of completing each statement, so that the effect on the
matview is immediately visible to the transaction which modifies a
supporting table, and becomes visible at commit to other
transactions

 - at transaction commit time, so that other transactions see the
changes to the base tables and the referencing matviews at the same
point in time

 - from a FIFO queue which is processed by a background process
whenever data is present (possibly with pacing)

 - from a FIFO queue based on a schedule, so that matviews are
stable between applications and/or to avoid burdening the machine
during peak periods

 - incremental update, or even full refresh, on an attempt to query
a "stale" matview

 - explicit request to apply incremental updates or refresh

Incremental maintenance of a materialized view is a heuristic, to
refresh contents more quickly than might happen by re-running the
query which defines the matview.  There will always be cases where
the changes are so extensive that applying the delta will be slower
than a refresh.  At some point we should have a cost-based way to
recognize when we have crossed that threshold, and fall back to the
refresh technique.  That's not for this release, though.

In previous discussion there seemed to be a consensus that before
incremental maintenance for a materialized view could be turned on,
the matview would need to be populated and all referenced tables
would need to be flagged as generating delta information, through a
new ALTER TABLE option.

While I have yet to look in detail at the mechanism for capturing
the initial delta on the base tables, the two fairly obvious
candidates are to stuff the before and after images into a
tuplestore or temp table as base table changes are written,
somewhere around the point that triggers would be fired, or to use
the WAL stream in some way.  The advantages of the former are that
it would be hard to find a lower overhead way to capture the data,
or a more certain way to get exactly the right data.  The latter,
which Simon has been arguing is better than using triggers, would
have the advantage of not directly slowing down a process writing
to base tables, although for more eager modes transactions would
need to block waiting for the data to flow through the walsender,
be filtered and assembled as data of interest, and communicated
back to the transaction somehow before it could proceed.  Assuming
that it can provide the changeset prior to the commit, and that it
can include "before" images, it could work, but the timing sure
seems dubious for the more eager modes.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: matview incremental maintenance

From
Stefan Drees
Date:
On 2013-06-17 16:41 +02:00, Kevin Grittner wrote:
> Since there seems to be interest in discussing incremental
> maintenance of materialized views *now*, I'm starting this thread
> to try to avoid polluting unrelated threads with the discussion.  I
> don't intend to spend a lot of time on it until the CF in progress
> completes, but at that point the work will start in earnest.  So
> I'll say where I'm at, and welcome anyone who has time to spare
> outside of the CF to comment or contribute ideas.
>
> The paper at the core of the discussion can be found by searching
> for "maintaining views incrementally gupta mumick subrahmanian" --
> it's on both the ACM and CiteSeerX websites.  Of course, one

i.e.
Ashish Gupta, Inderpal Singh Mumick, and V. S. Subrahmanian. 1993. 
Maintaining views incrementally. In Proceedings of the 1993 ACM SIGMOD 
international conference on Management of data (SIGMOD '93), Peter 
Buneman and Sushil Jajodia (Eds.). ACM, New York, NY, USA, 157-166. 
DOI=10.1145/170035.170066 http://doi.acm.org/10.1145/170035.170066

just in case a direct reference might come in handy :-)

> ...

All the best,
Stefan.



Re: matview incremental maintenance

From
Simon Riggs
Date:
On 17 June 2013 15:41, Kevin Grittner <kgrittn@ymail.com> wrote:

> Since there seems to be interest in discussing incremental
> maintenance of materialized views *now*

Since your earlier complaint, I specifically said I was happy to wait
to discuss that. Why have you raised this now?

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: matview incremental maintenance

From
Simon Riggs
Date:
On 17 June 2013 15:41, Kevin Grittner <kgrittn@ymail.com> wrote:

> While I have yet to look in detail at the mechanism for capturing
> the initial delta on the base tables, the two fairly obvious
> candidates are to stuff the before and after images into a
> tuplestore or temp table as base table changes are written,
> somewhere around the point that triggers would be fired, or to use
> the WAL stream in some way.  The advantages of the former are that
> it would be hard to find a lower overhead way to capture the data,
> or a more certain way to get exactly the right data.  The latter,
> which Simon has been arguing is better than using triggers, would
> have the advantage of not directly slowing down a process writing
> to base tables, although for more eager modes transactions would
> need to block waiting for the data to flow through the walsender,
> be filtered and assembled as data of interest, and communicated
> back to the transaction somehow before it could proceed.  Assuming
> that it can provide the changeset prior to the commit, and that it
> can include "before" images, it could work, but the timing sure
> seems dubious for the more eager modes.

It isn't unconditionally true to say "it would be hard to
find a lower overhead way to capture the data", since there is strong
experimental evidence from work on replication showing that using
the WAL is a very effective mechanism for changeset extraction.

There is nothing to say the changeset must occur through the
WalSender. That is just where it currently occurs, but it could easily
occur elsewhere, if the requirement existed. Similarly, changeset
extraction doesn't currently allow access to uncommitted rows, but it
could do so, if required. Before images of change could be provided by
direct access to prior versions via their tid, just as they are with
triggers.

There are other advantages to using WAL that you don't mention, such
as the avoidance of the need for the trigger queue to spill to disk,
avoidance of memory overhead for large transactions and avoidance of
random I/O.

ISTM that using WAL has to be properly considered as a viable option
which is why open discussion makes sense.

The timing of that discussion doesn't need to be immediate but
certainly it should happen before any options are precluded because of
the progress of other events. Let me me know when that's appropriate,
so we can discuss.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: matview incremental maintenance

From
Nico Williams
Date:
On Mon, Jun 17, 2013 at 07:41:15AM -0700, Kevin Grittner wrote:
> Since there seems to be interest in discussing incremental
> maintenance of materialized views *now*, I'm starting this thread
> to try to avoid polluting unrelated threads with the discussion.  I
> don't intend to spend a lot of time on it until the CF in progress
> completes, but at that point the work will start in earnest.  So
> I'll say where I'm at, and welcome anyone who has time to spare
> outside of the CF to comment or contribute ideas.

I have an implementation that supports updates, but it doesn't implement
automatic updates as described in the paper you cited.

> The paper at the core of the discussion can be found by searching
> for "maintaining views incrementally gupta mumick subrahmanian" --
> it's on both the ACM and CiteSeerX websites.  Of course, one
> doesn't need to understand that paper to discuss techniques for
> capturing the base deltas, but I'm hoping that's not what takes up
> most of the discussion.  I expect the most important discussions to
> be around how best to handle the "count(t)" (or count_t) column,
> what form should be used for intermediate results, how to modify or
> add execution nodes which know how to deal with the count, how to
> generate set operations to use those nodes, and how to modify the
> planner to choose the best plan for these operations.  Whether to
> pull the deltas off the WAL stream or stuff them into a tuplestore
> as they are written seems to me to be a relatively minor point.  If
> properly abstracted, the performance and complexity of alternatives
> can be compared.

Sure.  Automatically converting INSERTs/UPDATEs/DELETEs of MATERIALIZED
VIEW table sources is not trivial.  I've had some luck with a
particular multi-join query by manually adding constraints from the
OLD.* and NEW.* rows, though unfortunately the PG query planner was
unable to find the obvious fast query plan (more on that in another
thread) and I ended up having to manually optimize the query using
WITH...

So at least for some queries it's easy enough to automatically propagate
constraints into the VIEW's query.  For some recursive queries it may
also be easy to propagate OLD.*/NEW.* into a seed.  Anyways, this is a
bit far afield though, as what I have managed so far is very useful even
without automatically producing updates based only on the VIEW's query.

Nor am I certain that automatically updating a materialized view is
always the right thing to do.

A case in point for me is an authorization system (which generally means
there's something of a transitive closure involved, which means
recursive queries).  In this system adding grants is cheap enough, but
massive revocation (e.g., deleting all of a user's entitlements, perhaps
by deleting the user and cascading the deletion) is not: it can be
faster to just refresh the view old-style than to update it dynamically!
(You noted this problem.)

The queries I use for dynamically updating the materialized views are
hand-optimized as mentioned above.  They are not too unlike what an
automatic system would have generated.

Granted, the problem partly is that an ORM is involved, which adds
obnoxious overhead: one statement per grant deletion versus one
statement for deleting all of them.  But it's still possible that at
some point it's best to refresh, even if an ORM were not involved.

> At the developer meeting last month, we talked about the special
> new count column for a bit, and everyone seemed to agree that
> adding such an animal, ...

What's that?

> Long term, timings for incremental maintenance that people would
> like to see (from most eager to least eager) are:
>
> - as part of completing each statement, so that the effect on the
> matview is immediately visible to the transaction which modifies a
> supporting table, and becomes visible at commit to other
> transactions

That can be done now with hand-coded triggers, as I do in my use case.
Though I do leave some cases to a deferred refresh as mentioned above.

> - at transaction commit time, so that other transactions see the
> changes to the base tables and the referencing matviews at the same
> point in time

See above.

> - from a FIFO queue which is processed by a background process
> whenever data is present (possibly with pacing)

I do this.  I use NOTIFY/LISTEN and timeouts to drive refreshes as
needed.
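Stripped of the database plumbing, that background-worker pattern is a queue-draining loop: wake on queued "matview dirty" events (stand-ins here for NOTIFY payloads), coalesce duplicate notifications, and pace the refreshes. A self-contained sketch with invented names, simulating the queue in-process:

```python
import queue
import time

def drain_refresh_queue(events, refresh, min_interval=0.0):
    """Drain pending 'matview dirty' events, refreshing each named
    view at most once per drain and sleeping between refreshes to
    pace the load.  Returns the views refreshed, in order.
    """
    processed = []
    seen = set()
    while True:
        try:
            name = events.get(timeout=0.01)   # wake on next event
        except queue.Empty:
            break                             # queue drained; go back to sleep
        if name in seen:                      # coalesce duplicate notifications
            continue
        seen.add(name)
        refresh(name)
        processed.append(name)
        if min_interval:
            time.sleep(min_interval)          # pacing between refreshes
    return processed

q = queue.Queue()
for name in ["mv_sales", "mv_sales", "mv_users"]:
    q.put(name)
refreshed = drain_refresh_queue(q, refresh=lambda name: None)
```

In a real daemon the queue would be fed by LISTEN, and `refresh` would issue the refresh statement for the named view.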

> - from a FIFO queue based on a schedule, so that matviews are
> stable between applications and/or to avoid burdening the machine
> during peak periods

Ditto.

> - incremental update, or even full refresh, on an attempt to query
> a "stale" matview

I record and store (in a per-view history table) differences computed
during a refresh.

> - explicit request to apply incremental updates or refresh

I make this decision by using one trigger function or another.  One set
of trigger functions generates updates properly (by updating the
materialization table), and another merely marks the view as needing a
refresh and NOTIFYies a waiting daemon.

> Incremental maintenance of a materialized view is a heuristic, to
> refresh contents more quickly than might happen by re-running the
> query which defines the matview.  There will always be cases where
> the changes are so extensive that applying the delta will be slower
> than a refresh.  At some point we should have a cost-based way to
> recognize when we have crossed that threshold, and fall back to the
> refresh technique.  That's not for this release, though.

Yes.

One way to do this is to keep track of refresh times and interrupt
dynamic updates that take longer, then run a refresh instead.
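A cooperative version of that policy can be sketched as follows (names invented; a real implementation would need to handle interruption and rollback of the partial delta, which this sketch elides): apply the delta row by row under a deadline derived from the last measured refresh time, and fall back to a full refresh on overrun.

```python
import time

def maintain_with_deadline(delta, apply_row, refresh, last_refresh_secs):
    """Apply a delta row-by-row, but abandon it and fall back to a
    full refresh if the apply runs longer than the last measured
    refresh took.  The deadline is checked between rows.
    """
    deadline = time.monotonic() + last_refresh_secs
    applied = []
    for row in delta:
        if time.monotonic() > deadline:
            refresh()               # give up; recompute from scratch
            return "refresh", applied
        apply_row(row)
        applied.append(row)
    return "incremental", applied

# A tiny delta finishes well inside a generous deadline:
mode, rows = maintain_with_deadline([1, 2, 3],
                                    apply_row=lambda r: None,
                                    refresh=lambda: None,
                                    last_refresh_secs=10.0)
```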

However, I do feel that users will want some measure of control over
this at the schema level -- after all, I do!  The solution I use lets me
have this control.

> In previous discussion there seemed to be a consensus that before
> incremental maintenance for a materialized view could be turned on,
> the matview would need to be populated and all referenced tables
> would need to be flagged as generating delta information, through a
> new ALTER TABLE option.

That seems strange to me.  I'd rather express this through
CREATE/ALTER MATERIALIZED VIEW.  Enabling updates for changes of a
particular table source should cause PG to create internal trigger
functions and triggers on the given table source.

Using triggers for this (even hidden, internal ones) does create a
problem in that we might need a way to control order of execution, so
that user-defined triggers run first so that they can make any
additional changes that might also trigger updates of the view, but this
can be addressed later.

In any case, even now, three years later, isn't it the case that the
code needed to perform query algebra to produce the necessary "triggers"
is missing in action?  Wouldn't it be better to do something in the
interim?  Here's my attempt:

https://github.com/twosigma/postgresql-contrib

Summary:
- PlPgSQL-coded materialized views (no syntax, just functions)
- only concurrent refresh available
- deltas computed on refresh are stored in a history table, with "old"
  and "new" columns of the record type of the materialization table
- triggers on the materialization table automatically record updates of
  the materialization table in the history table
- last refresh timestamp and "refresh needed" state kept for each
  materialized view
- NOTIFY is used to broadcast the need to refresh (for cases where it
  would be faster to refresh) to a waiting daemon (not included, but it
  is scripted around the pqasyncnotifier tool included in this repo)
- hand-coded triggers (not included here, naturally) can update the
  materialized views, which, as explained above, automatically updates
  the history table
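The refresh-time delta computation at the heart of this can be sketched as a diff, keyed by primary key, between the existing materialization and freshly computed query results, with each difference recorded as an (old, new) pair as in the history table described above. An illustrative Python sketch (dictionaries standing in for the tables; names invented):

```python
def refresh_with_history(old_rows, new_rows):
    """Diff the existing materialization against fresh query results
    and record each change as an (old, new) pair, mirroring a history
    table with "old" and "new" columns of the row type.

    old_rows, new_rows: dicts mapping primary key -> row tuple.
    """
    history = []
    for key in old_rows.keys() - new_rows.keys():
        history.append((old_rows[key], None))               # deleted row
    for key in new_rows.keys() - old_rows.keys():
        history.append((None, new_rows[key]))               # inserted row
    for key in old_rows.keys() & new_rows.keys():
        if old_rows[key] != new_rows[key]:
            history.append((old_rows[key], new_rows[key]))  # updated row
    return history

old = {1: ("alice", 3), 2: ("bob", 5)}
new = {2: ("bob", 6), 3: ("carol", 1)}
changes = refresh_with_history(old, new)
```

Consumers of the history table (such as the external non-SQL system mentioned below) then replay these pairs as a change stream.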

This works very well.  I use the history table to generate deltas for an
external non-SQL system.  So I get something close to real-time updates,
and if NOTIFY/LISTEN were safer (e.g., indicated who NOTIFYied, and
encoded proper row values), then I could use NOTIFY/LISTEN to distribute
updates in something closer to real-time.

I would really like to see something like this integrated into PG.  I
can do some of the work for that.  I believe this will not only not
conflict with, but may even enable, further work on automatic matview
updates.

Sorry for the length of this reply,

Nico
--