Re: Collations and Replication; Next Steps - Mailing list pgsql-hackers

From Greg Stark
Subject Re: Collations and Replication; Next Steps
Date
Msg-id CAM-w4HPaBXFE6NF4MvqkhrPs_FM8G5Zd+VeJTT1VqL3GYxzcwg@mail.gmail.com
In response to Re: Collations and Replication; Next Steps  (Peter Geoghegan <pg@heroku.com>)
Responses Re: Collations and Replication; Next Steps
Re: Collations and Replication; Next Steps
List pgsql-hackers
On Tue, Sep 16, 2014 at 11:41 PM, Peter Geoghegan <pg@heroku.com> wrote:
> The timezone case you highlight here seems quite distinct from what
> Matthew is talking about, because in point of fact the on-disk
> representation is merely *interpreted* with reference to the timezone
> database. So, you could have an inconsistency between standbys
> concerning what the time was in a particular timezone at a particular
> timestamp value as reported by the timestamptz output function, but
> both standbys would be correct on their own terms, which isn't too
> bad.

You could have a problem if you have an expression index on (timestamp
AT TIME ZONE '...'). I may have the expression slightly wrong but I
believe it is possible to write an immutable expression that depends
on the tzdata data as long as it doesn't depend on the user's
current time zone (which would be stable but not immutable). The
actual likelihood of that situation might be much lower and the
ability to avoid it higher, but in theory I think Peter's right that
it's the same class of problem.

Generally speaking we try to protect against most environment
dependencies that lead to corrupt databases by encoding them in the
control file. Obviously we can't encode an entire collation in the
control file though. We could conceivably have a corpus of
representative strings that we sort and then checksum in the
controlfile. It wouldn't be foolproof but if we collect interesting
examples as we find them it might be a worthwhile safety check.
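
To make the corpus idea a bit more concrete, here is a rough Python
sketch. Nothing like this exists in Postgres as far as this message
goes; the corpus contents, the names collation_fingerprint and
collation_matches, and the choice of SHA-256 are all assumptions made
for illustration. A real check would use the database's own collation
rather than Python's locale module.

```python
import hashlib
import locale

# Hypothetical corpus; the idea is that it grows as interesting cases are found.
CORPUS = ["a", "B", "ab", "a b", "a-b", "Z", "z", "10", "9"]


def collation_fingerprint(strings, key=locale.strxfrm):
    """Sort the corpus with the given collation key and hash the resulting order."""
    ordered = sorted(strings, key=key)
    digest = hashlib.sha256()
    for s in ordered:
        data = s.encode("utf-8")
        # Length-prefix each string so the hashed byte stream is unambiguous.
        digest.update(len(data).to_bytes(4, "big"))
        digest.update(data)
    return digest.hexdigest()


def collation_matches(stored_fingerprint, strings=CORPUS, key=locale.strxfrm):
    """Compare a stored fingerprint (e.g. from the primary) with the local sort order."""
    return collation_fingerprint(strings, key) == stored_fingerprint
```

The primary would store the fingerprint once, and a standby would call
collation_matches() at startup. As noted above, this only catches
differences that happen to affect strings in the corpus.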

Just brainstorming... I wonder if it would be possible to include any
collation comparisons made in handling an index insert in the xlog
record and have the standby verify those comparisons are valid on the
standby. I guess that would be pretty hard to arrange code-wise, since
the comparisons could be coming from anywhere, to say nothing of the
WAL bloat.
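
As a toy illustration of the verify-on-standby idea, the following
Python sketch records the outcome of each comparison and replays it
against a second comparison function. The record layout and the names
record_comparisons and verify_comparisons are hypothetical; this says
nothing about how it would actually be wired into xlog records.

```python
from typing import Callable, Iterable, List, Tuple

# (left, right, sign of the comparison result as seen on the primary)
Comparison = Tuple[str, str, int]
Comparator = Callable[[str, str], int]


def sign(n: int) -> int:
    """Normalize a comparator result to -1, 0, or 1."""
    return (n > 0) - (n < 0)


def record_comparisons(pairs: Iterable[Tuple[str, str]], cmp: Comparator) -> List[Comparison]:
    """On the primary: capture the outcome of each comparison that was made."""
    return [(a, b, sign(cmp(a, b))) for a, b in pairs]


def verify_comparisons(recorded: Iterable[Comparison], cmp: Comparator) -> List[Comparison]:
    """On the standby: return the recorded comparisons the local collation disagrees with."""
    return [(a, b, r) for a, b, r in recorded if sign(cmp(a, b)) != r]
```

An empty list from verify_comparisons() means the standby agrees with
every comparison the primary made for that insert; anything else would
be grounds for refusing to apply the record.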

Peter G, could you go into more detail about collation versioning? What
would the implications be for Postgres?

-- 
greg


