Re: Review: GiST support for UUIDs - Mailing list pgsql-hackers

From Teodor Sigaev
Subject Re: Review: GiST support for UUIDs
Date
Msg-id 55F7B494.7060603@sigaev.ru
Whole thread Raw
In response to Re: Review: GiST support for UUIDs  (Paul Jungwirth <pj@illuminatedcomputing.com>)
Responses Re: Review: GiST support for UUIDs  (Michael Paquier <michael.paquier@gmail.com>)
Re: Review: GiST support for UUIDs  (Ildus Kurbangaliev <i.kurbangaliev@postgrespro.ru>)
List pgsql-hackers

Paul Jungwirth wrote:
>> Or something like this in pseudocode:
>>
>> numeric = int8_numeric(*(uint64 *)(&i->data[0])) *
>> int8_numeric(MAX_INT64) + int8_numeric(*(uint64 *)(&i->data[8]))
>
> This is more like what I was hoping for, rather than converting to a
> string and back. Would you mind confirming for me: int8_numeric turns an
> int64 into an arbitrary-precision numeric Datum? So there is no overflow
> risk here?
Sure, no risk. Numeric precision is limited 1000 digits with magnitude 1000


>
> But it looks like int8_numeric takes a *signed* integer. Isn't that a
> problem? I suppose I could get it working though by jumping through some
> hoops.
signed vs unsigned problem does not exist actually, because of precision 
of numeric is much better than we need and presence of numeric_abs.


> Yes, but that seems like an unrealistic concern. Even "only" 2^64 is
> 18446744073709551616 records. Or another way of thinking about it, if 64
:) "only"

> bits (or 32) is a good enough penalty input for all the other data
> types, why not for UUIDs? Keep in mind type 3-5 UUIDs should be evenly
> distributed. Perhaps we could use the bottom half (instead of the top)
> to ensure even distribution for type 1 and 2 too.
it must be. But UUID could be taken for unknown source and we can't 
predict distribution. I believe pg generates them correctly, but other 
generators could be not so good.

> It seems to me that using only the top half should be okay, but if you
> believe it's not I'll go with the int8_numeric approach (in three chunks
> instead of two to work around signed-vs-unsigned).
Yes, I believe. It is not good case when we can ruin index performance 
with special set of value.

Some difficulty  which I see is how to transform numeric penalty to 
double as it requires by GiST. May be, penalty/(INT_MAX64*INT_MAX64 + 
INT_MAX64)? Keep in mind, that penalty is how range will be enlarged 
after range union, so minimal penalty is 0 and maximum is 
0xffffffffffffffffffffffffffffffff

-- 
Teodor Sigaev                      E-mail: teodor@sigaev.ru                                      WWW:
http://www.sigaev.ru/



pgsql-hackers by date:

Previous
From: Charles Clavadetscher
Date:
Subject: Re: Attach comments to functions' parameters and return value
Next
From: Teodor Sigaev
Date:
Subject: Re: [PROPOSAL] Covering + unique indexes.