Re: GiST support for UUIDs - Mailing list pgsql-hackers

From Tom Lane
Subject Re: GiST support for UUIDs
Date
Msg-id 30332.1480370699@sss.pgh.pa.us
Whole thread Raw
In response to Re: GiST support for UUIDs  (Adam Brusselback <adambrusselback@gmail.com>)
Responses Re: GiST support for UUIDs  (Chris Bandy <bandy.chris@gmail.com>)
List pgsql-hackers
Adam Brusselback <adambrusselback@gmail.com> writes:
> [ btree_gist_uuid_7.patch ]

I spent awhile looking at this.  I have exactly no faith that it won't
crash on alignment-picky hardware, because this declaration:

union pg_uuid_t
{unsigned char data[UUID_LEN];uint64 v64[2];
};

is not the same as the existing pg_uuid_t; it instructs the compiler
that union pg_uuid_t must be double-aligned.  The existing type is
only byte-aligned, and is declared that way in pg_type, which means
that values on-disk are quite likely not to meet the alignment
expectation.

I spent a little bit of time trying to get the patch to crash on a PPC
machine, without success, but the compiler I have on that (gcc 4.0.1) is
very old and is not aggressive about things like optimizing memcpy calls
on supposedly-aligned arguments.  I think a modern gcc version on such
hardware would probably generate code that fails.  (Also, PPC is
big-endian, so some of the at-risk code isn't used; a picky little-endian
machine would have more scope for crashing.  Not real sure, but I think
ARM might be like that.)

What I would suggest is that you forget the union hack and just use
memcmp in all the comparison functions.  It's not likely to be worth
the trouble to try to get those calls to be safely aligned.  The
only place where you need go to any trouble is in uuid_parts_distance,
which could be redesigned to memcpy a not-necessarily-aligned source
into a local uint64[2] array and then byte-swap if needed.

(BTW, considering that we're going to return a float distance that has
only got twenty-something significant bits, do we *really* need to deal
with do-it-yourself int128 arithmetic in uuid_parts_distance?  Color
me skeptical.)

Also, I can't say that I think it's good style to be duplicating the
declaration of pg_uuid_t out of uuid.c.  (And it absolutely, positively,
is vile style to have two different declarations of the same struct in
different files, quite aside from whether they're ABI-compatible or not.)

I don't entirely see the point of making pg_uuid_t an opaque struct when
the only interesting thing about it, namely UUID_LEN, is exposed anyway.
So my inclination would be to just do

typedef struct pg_uuid_t
{unsigned char data[UUID_LEN];
} pg_uuid_t;

in uuid.h and forget the idea of it being opaque.

Also, the patch could be made a good deal smaller (and easier to review)
in the wake of commit 40b449ae8: you no longer need to convert
btree_gist--1.2.sql into btree_gist--1.3.sql, just leave it alone and add
btree_gist--1.2--1.3.sql.  That way we don't have to worry about whether
the upgrade script matches the change in the base script.
        regards, tom lane



pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: PSQL commands: \quit_if, \quit_unless
Next
From: Artur Zakirov
Date:
Subject: Re: [BUG?] pg_event_trigger_ddl_commands() error with ALTER TEXT SEARCH CONFIGURATION