Re: record identical operator - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: record identical operator
Msg-id 20130924160035.GT2706@tamriel.snowman.net
In response to Re: record identical operator  (Kevin Grittner <kgrittn@ymail.com>)
Responses Re: record identical operator
List pgsql-hackers

* Kevin Grittner (kgrittn@ymail.com) wrote:
> Stephen Frost <sfrost@snowman.net> wrote:
> > The promise that we'll always return the binary representation of the
> > data that we saw last.  When greatest(x,y) comes back 'false' for a
> > MAX(), we then have to go check "well, does the type consider them
> > equal?", because, if the type considers them equal, we then have to
> > decide if we should replace x with y anyway, because it's different
> > at a binary level.  That's what we're saying we'll always do now.
>
> I'm having a little trouble following that. 

You're looking at it from the perspective of what's committed today, I
think.

> The query runs as it
> always has, with all the old definitions of comparisons.  After it
> is done, we check whether the rows are the same.  The operation of
> MAX() will not be affected in any way.  If MAX() returns a value
> which is not the same as what the matview has, the matview will be
> modified to match what MAX() returned.

I'm referring to a case where we're doing incremental maintenance of the
matview (in the far distant future..) and all we've got is the current
MAX value and the value from the row being updated.  We're going to
update that MAX on cases where the values are
"equal-but-binary-different".
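
To make that case concrete, here is a minimal sketch in Python rather than
PostgreSQL code. It assumes Python's Decimal as a stand-in for a type such as
numeric, where 1.0 and 1.00 compare equal but are stored differently; the
function name maintain_max and the action strings are hypothetical, made up
for illustration only.

```python
from decimal import Decimal


def maintain_max(current, incoming):
    """One incremental step of MAX(): return (new_value, action).

    Hypothetical helper: str() stands in for "binary representation".
    """
    if incoming > current:
        return incoming, "replace: strictly greater"
    if incoming == current and str(incoming) != str(current):
        # The type says "equal", but the stored form differs, so under the
        # "always return what we saw last" promise we still have to replace.
        return incoming, "replace: equal but not identical"
    return current, "keep"


current = Decimal("1.0")
incoming = Decimal("1.00")
new_value, action = maintain_max(current, incoming)
print(new_value, "-", action)  # 1.00 - replace: equal but not identical
```

The second branch is the extra check I'm talking about: a plain MAX() only
needs the first comparison.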

> > We're also saying that we'll replace things based on plan differences
> > rather than based on if the rows underneath actually changed at all.
>
> Only if the user uses a query which does not produce deterministic
> results.

Which is apt to happen..
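
As a rough illustration of how easily that can happen (again a Python sketch
under assumptions, not PostgreSQL behavior): a DISTINCT-like step that keeps
the first value seen in each group of equal values will return a different
representation depending purely on input order, which is what a plan change
can alter. The function name distinct_first_seen is hypothetical.

```python
from decimal import Decimal


def distinct_first_seen(values):
    """Keep one representative per group of equal values: the first seen."""
    seen = {}
    for v in values:
        # Decimal("1.0") and Decimal("1.00") hash and compare as equal,
        # so whichever arrives first is the one that survives.
        seen.setdefault(v, v)
    return list(seen.values())


rows = [Decimal("1.0"), Decimal("1.00")]
plan_a = distinct_first_seen(rows)            # scans rows in stored order
plan_b = distinct_first_seen(reversed(rows))  # scans rows in reverse order

print(plan_a == plan_b)                                    # True
print([str(v) for v in plan_a], [str(v) for v in plan_b])  # ['1.0'] ['1.00']
```

Both results are equal as far as the type is concerned, yet a binary
comparison would treat the row as changed.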

> > We could end up with material differences in the result of matviews
> > updated through incremental REFRESH and matviews updated through
> > actual incremental maintenance, and people may *care* about those
> > because we've told them (or they discover) they can depend on these
> > types of changes to be reflected in the result.
>
> Perhaps we should document the recommendation that people not
> create materialized views with non-deterministic results, but I
> don't think that should be a hard restriction. 

I wasn't suggesting a hard restriction, but I was postulating about how
the behavior may change in the future (or we may need to do very hacky
things to prevent it from changing) and how these decisions may impact
that.

> > What I was trying to get at is really that the delete/insert
> > approach would be good enough in very many cases and it wouldn't
> > have what look, to me anyway, like some pretty ugly warts around
> > these cases.
>
> I guess we could add a "DELETE everything and INSERT the new
> version of everything" option for REFRESH in addition to what is
> there now, but I would be very reluctant to use it as a
> replacement.

It wouldn't address my concerns anyway, which are around these binary
operators and the update-in-place approach being risky and setting us up
for problems down the road.

Thanks,
    Stephen
