Re: record identical operator - Mailing list pgsql-hackers

From: Stephen Frost
Subject: Re: record identical operator
Msg-id: 20130923190833.GI2706@tamriel.snowman.net
In response to: Re: record identical operator (Kevin Grittner <kgrittn@ymail.com>)
List: pgsql-hackers

* Kevin Grittner (kgrittn@ymail.com) wrote:
> Unless we can tell whether there are any differences between two
> versions of a row, we can't accurately generate the delta to drive
> the incremental maintenance.

This is predicated on the assumption that you simply generate the new
view and then try to figure out what to go update.  I'm trying to
explain that using that methodology is what landed us in this situation
to begin with.
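
To make the distinction concrete, here is a minimal Python sketch (not
PostgreSQL code; every function name is hypothetical and invented for
illustration).  The first function regenerates and compares, so its
result depends on the comparison it is handed: plain equality treats
Decimal("1.0") and Decimal("1.00") as the same row, while a stricter
"identical" test does not.  The second derives the rows to touch from
changes recorded against the base data and needs no row comparison at
all.  It assumes each view row maps to exactly one base row, which is a
simplification.

```python
from decimal import Decimal


def diff_by_regeneration(old_rows, new_rows, same):
    """Regenerate-and-compare: return the keys whose rows must be rewritten.

    old_rows / new_rows map a key to a row tuple; `same` decides whether
    two versions of a row count as unchanged.
    """
    changed = [key for key, new in new_rows.items()
               if key not in old_rows or not same(old_rows[key], new)]
    # Rows that vanished from the regenerated result must be removed too.
    changed.extend(key for key in old_rows if key not in new_rows)
    return sorted(changed)


def equal(a, b):
    """Ordinary equality: Decimal("1.0") == Decimal("1.00") is True."""
    return a == b


def identical(a, b):
    """Stricter test: same types and same printed representation."""
    return len(a) == len(b) and all(
        type(x) is type(y) and repr(x) == repr(y) for x, y in zip(a, b))


def delta_from_base_changes(changes):
    """Change tracking: `changes` is a list of (key, new_row) pairs recorded
    as the base data is modified; no comparison of row versions is needed."""
    return sorted({key for key, _ in changes})


old = {1: (Decimal("1.0"),)}
new = {1: (Decimal("1.00"),)}

missed = diff_by_regeneration(old, new, equal)      # equality sees no change
caught = diff_by_regeneration(old, new, identical)  # identical test sees one
tracked = delta_from_base_changes([(1, (Decimal("1.00"),))])
```

With equality the regenerate-and-compare path reports nothing to update
for key 1, while the other two paths both flag it.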

>   The initial thread discussing how
> incremental maintenance would be done is here:

My apologies for not paying quite as close attention to that thread as I
should have.

> I think it is fairly obvious that REFRESH should REgenerate a FRESH
> copy of the data, versus incremental maintenance -- which attempts
> to keep the matview up-to-date without regenerating the full set of
> data.

Having 'REFRESH' regenerate a fresh copy of the data makes sense to me,
and is what we have now, no?  The only issue there is that it takes out
a big lock, which I appreciate that you're trying to get rid of.

>   Whenever there is logical replication (and materialized
> views are, conceptually, one form of that -- within the database) I
> feel it is important to be able to correct any possible "drift".
> With matviews, I see the way to do that as the REFRESH command, and
> I feel that it is important to be able to do that in a way that can
> run concurrently with readers of the matview -- without blocking
> them or being blocked by them.

Of course.

> Discussion of incremental maintenance really belongs on a different
> thread. 

I'm really getting tired of everyone saying "this is the only way to do
it" (or perhaps "well, this is already committed, therefore it must be
what we're gonna do") when a) we're already planning to rip this out
and change it, or so I thought, and b) we're trying to make promises we
can't keep with this approach.

> Since I have gone to the trouble to read a lot of papers
> on the topic, and select one that I think is a good basis for our
> implementation, I hope everyone will frame discussion in terms of
> either:
>   -  how best to implement the techniques from that paper, or
>   -  why some other paper presents a better technique.

As I recall from the hackers meeting, I'm simply trying to paraphrase
what you said was in the paper with regard to keeping track of which
rows are changed underneath and using that as the basis for applying the
necessary changes to the view.  Does the paper you're referring to
describe rerunning the whole query and then trying to figure out what's
been changed..?  That's the approach I'm having trouble understanding
why anyone would want to implement.  I'll try and find time to hunt down
the threads and papers on it, but I really could have sworn this was
gone over at the hackers meeting, and it made a lot of sense to me then.

> I really didn't expect to have to burn so much time
> and energy arguing over whether a REFRESH should leave the matview
> accurately containing the results of the matview's query.

I appreciate you bringing me up to speed on where things actually are
here.  Again, sorry for not realizing earlier the direction this was
going in; it really didn't even occur to me that it would have gone
down this road.  I also didn't expect to spend so much time on this.

> >> We can argue about how it should be named

Really, I'm back to trying to figure out why we want to go down this
road at all.

> >> and whether it should be documented
>
> I thought we had a consensus to document both the existing record
> comparison operators and these new ones, and I'm fine with that.

If it gets added, it certainly should be documented, and heavily
caveated.

Thanks,
    Stephen
