Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation - Mailing list pgsql-hackers
From | Noah Misch |
---|---|
Subject | Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation |
Date | |
Msg-id | 20240518214546.e8@rfd.leadboat.com Whole thread Raw |
In response to | Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation (Masahiko Sawada <sawada.mshk@gmail.com>) |
Responses |
Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation
|
List | pgsql-hackers |
On Wed, May 15, 2024 at 02:56:54PM +0900, Masahiko Sawada wrote: > On Sat, May 4, 2024 at 7:36 AM Joe Conway <mail@joeconway.com> wrote: > > On 5/3/24 11:44, Peter Eisentraut wrote: > > > On 03.05.24 16:13, Tom Lane wrote: > > >> Peter Eisentraut <peter@eisentraut.org> writes: > > >>> On 30.04.24 19:29, Tom Lane wrote: > > >>>> Also, the bigger picture here is the seeming assumption that "if > > >>>> we change pg_trgm then it will be safe to replicate from x86 to > > >>>> arm". I don't believe that that's a good idea and I'm unwilling > > >>>> to promise that it will work, regardless of what we do about > > >>>> char signedness. That being the case, I don't want to invest a > > >>>> lot of effort in the signedness issue. Option (1) is clearly > > >>>> a small change with little if any risk of future breakage. > > >> > > >>> But note that option 1 would prevent some replication that is currently > > >>> working. > > >> > > >> The point of this thread though is that it's working only for small > > >> values of "work". People are rightfully unhappy if it seems to work > > >> and then later they get bitten by compatibility problems. > > >> > > >> Treating char signedness as a machine property in pg_control would > > >> signal that we don't intend to make it work, and would ensure that > > >> even the most minimal testing would find out that it doesn't work. > > >> > > >> If we do not do that, it seems to me we have to buy into making > > >> it work. That would mean dealing with the consequences of an > > >> incompatible change in pg_trgm indexes, and then going through > > >> the same dance again the next time(s) similar problems are found. > > > > > > Yes, that is understood. But anecdotally, replicating between x86-64 arm64 is > > > occasionally used for upgrades or migrations. In practice, this appears to have > > > mostly worked. If we now discover that it won't work with certain index > > > extension modules, it's usable for most users. Even if we say, you have to > > > reindex everything afterwards, it's probably still useful for these scenarios. Like you, I would not introduce a ControlFileData block for sign-of-char, given the signs of breakage in extension indexing only. That would lose much useful migration capability. I'm sympathetic to Tom's point, which I'll attempt to summarize as: a ControlFileData block is a promise we know how to keep, whereas we should expect further bug discoveries without it. At the same time, I put more weight on the apparently-wide span of functionality that works fine. (I wonder whether any static analyzer does or practically could detect char sign dependence with a decent false positive rate.) > +1 > > How about extending amcheck to support GIN and GIst indexes so that it > can detect potential data incompatibility due to changing 'char' to > 'unsigned char'? I think these new tests would be useful also for > users to check if they really need to reindex indexes due to such > changes. Also we fix pg_trgm so that it uses 'unsigned char' in PG18. > Users who upgraded to PG18 can run the new amcheck tests on the > primary as well as the standby. If I were standardizing pg_trgm on one or the other notion of "char", I would choose signed char, since I think it's still the majority. More broadly, I see these options to fix pg_trgm: 1. Change to signed char. Every arm64 system needs to scan pg_trgm indexes. 2. Change to unsigned char. Every x86 system needs to scan pg_trgm indexes. 3. Offer both, as an upgrade path. For example, pg_trgm could have separate operator classes gin_trgm_ops and gin_trgm_ops_unsigned. Running pg_upgrade on an unsigned-char system would automatically map v17 gin_trgm_ops to v18 gin_trgm_ops_unsigned. This avoids penalizing any architecture with upgrade-time scans. Independently, having amcheck for GIN and/or GiST would be great.
pgsql-hackers by date: