Re: btree_gist macaddr valgrind woes - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: btree_gist macaddr valgrind woes
Date
Msg-id 5377CAAF.4030703@vmware.com
Whole thread Raw
In response to Re: btree_gist macaddr valgrind woes  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: btree_gist macaddr valgrind woes
Re: btree_gist macaddr valgrind woes
List pgsql-hackers
On 05/17/2014 11:12 PM, Tom Lane wrote:
> I wrote:
>> BTW, the *real* problem with all this stuff is that the gbtreekeyNN types
>> are declared as having int alignment, even though some of the opclasses
>> store double-aligned types in them.  I imagine it's possible to provoke
>> bus errors on machines that are picky about alignment.  The first column
>> of an index is safe enough because index tuples will be double-aligned
>> anyway, but it seems like there's a hazard for lower-order columns.
>
> And indeed there is:
>
> regression=# create extension btree_gist ;
> CREATE EXTENSION
> regression=# create table tt (f1 int2, f2 float8);
> CREATE TABLE
> regression=# create index on tt using gist(f1,f2);
> CREATE INDEX
> regression=# insert into tt values(1,2);
> INSERT 0 1
> regression=# insert into tt values(2,4);
> INSERT 0 1
> regression=# set enable_seqscan TO 0;
> SET
> regression=# explain select * from tt where f1=2::int2 and f2=4;
>                                 QUERY PLAN
> ------------------------------------------------------------------------
>   Index Scan using tt_f1_f2_idx on tt  (cost=0.14..8.16 rows=1 width=10)
>     Index Cond: ((f1 = 2::smallint) AND (f2 = 4::double precision))
>   Planning time: 1.043 ms
> (3 rows)
>
> regression=# select * from tt where f1=2::int2 and f2=4;
> << bus error >>
>
>> This is something we cannot fix compatibly :-(
>
> AFAICS, what we have to do is mark the wider gbtreekeyNN types as
> requiring double alignment.  This will break pg_upgrade'ing any index in
> which they're used as non-first columns, unless perhaps all the preceding
> columns have at least double size/alignment.  I guess pg_upgrade can
> check for that, but it'll be kind of a pain.
>
> Another issue is what the heck btree_gist's extension upgrade script ought
> to do about this.  It can't just go and modify the type declarations.
>
> Actually, on further thought, this isn't an issue for pg_upgrade at all,
> just for the extension upgrade script.  Maybe we just have to make it
> poke through the catalogs looking for at-risk indexes, and refuse to
> complete the extension upgrade if there are any?

I think it would be best to just not allow pg_upgrade if there are any 
indexes using the ill-defined types. The upgrade script could then 
simply drop the types and re-create them. The DBA would need to drop the 
affected indexes before upgrade and re-create them afterwards, but I 
think that would be acceptable. I doubt there are many people using 
btree_gist on int8 or float8 columns.

Another way to attack this would be to change the code to memcpy() the 
values before accessing them. That would be ugly, but it would be 
back-patchable. In HEAD, I'd rather bite the bullet and get the catalogs 
fixed, though.

- Heikki



pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: buildfarm animals and 'snapshot too old'
Next
From: Tomas Vondra
Date:
Subject: Re: buildfarm: strange OOM failures on markhor (running CLOBBER_CACHE_RECURSIVELY)