Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation - Mailing list pgsql-hackers

From Noah Misch
Subject Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation
Date
Msg-id 20240906232539.47.nmisch@google.com
Whole thread Raw
In response to Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation
List pgsql-hackers
On Fri, Sep 06, 2024 at 06:36:53PM -0400, Tom Lane wrote:
> Noah Misch <noah@leadboat.com> writes:
> > Yes, that's one way to make it work.  If we do it that way, it would be
> > critical to make the ALTER EXTENSION UPDATE run before anything uses the
> > index.  Otherwise, we'll run new v18 "signed char" code on a v17 "unsigned
> > char" data file.  Running ALTER EXTENSION UPDATE early enough should be
> > feasible, so that's fine.  Some other options:
> 
> > - If v18 "pg_dump -b" decides to emit CREATE INDEX ... gin_trgm_ops_unsigned,
> >   then make it also emit the statements to create the opclass.
> 
> > - Ship pg_trgm--1.6--1.7.sql in back branches.  If the upgrade wants to use
> >   gin_trgm_ops_unsigned, require the user to ALTER EXTENSION UPDATE first.
> >   (In back branches, the C code behind gin_trgm_ops_unsigned could just raise
> >   an error if called.)
> 
> I feel like all of these are leaning heavily on users to get it right,

What do you have in mind?  I see things for the pg_upgrade implementation to
get right, but I don't see things for pg_upgrade users to get right.

The last option is disruptive for users of unsigned char platforms, since it
requires a two-step upgrade if upgrading from v11 or earlier.  The other two
don't have that problem.

> and therefore have a significant chance of breaking use-cases that
> were perfectly fine before.
> 
> Remind me of why we can't do something like this:
> 
> * Add APIs that allow opclasses to read/write some space in the GIN
> metapage.  (We could get ambitious and add such APIs for other AMs
> too, but doing it just for GIN is probably a prudent start.)  Ensure
> that this space is initially zeroed.
> 
> * In gin_trgm_ops, read a byte of this space and interpret it as
>     0 = unset
>     1 = signed chars
>     2 = unsigned chars
> If the value is zero, set the byte on the basis of the native
> char-signedness of the current platform.  (Obviously, a
> secondary server couldn't write the byte, and would have to
> wait for the primary to update the value.  In the meantime,
> it's no more broken than today.)
> 
> * Subsequently, use either signed or unsigned comparisons
> based on that value.
> 
> This would automatically do the right thing in all cases that
> work today, and it'd provide the ability for cross-platform
> replication to work in future.  You can envision cases where
> transferring a pre-existing index to a platform of the other
> stripe would misbehave, but they're the same cases that fail
> today, and the fix remains the same: reindex.

No reason you couldn't do it that way.  Works for me.




pgsql-hackers by date:

Previous
From: Jelte Fennema-Nio
Date:
Subject: Re: PostgreSQL 17 release announcement draft
Next
From: Tom Lane
Date:
Subject: Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation