Thread: Using indices with long unique IDs.

Using indices with long unique IDs.

From
"Sergey Olefir (edit e-mail to reply)"
Date:
Hello!

I am planning to use unique IDs in the little system I am building. Now
being more than a little paranoid (and having no idea about expected loads),
I am wary of using int4 as a basis for uids (for the fear of ever running
out of them).

So the logical choice would be int8, right? Unfortunately quite wrong.
Statement of the form:
"SELECT * FROM table WHERE id=1"
will never use index for id (assumming id is int8) since '1' is of type
int4. This is confirmed both by documentation and SQL EXPLAIN (after set
enable_seqscan TO 'off').

There are two suggested work-arounds:
"SELECT * FROM table WHERE id='1'"
"SELECT * FROM table WHERE id=1::int8"

Unfortunately neither of them seem to be portable (for example, the one with
single quotes fails if I create PreparedStatement in Java:
con.prepareStatement("SELECT * FROM table WHERE id='?'"); apparently Java
doesn't parse question mark inside quotes).

I cannot have non-portable SQL as I am not convinced that PostgreSQL is the
right choice for my system. Oh, and by the way, I am using Java + JDBC for
my DB needs.


So what other options do I have? One seems to be NUMERIC( 13, 0 ) [or
something to that effect]. But what impact does it have performance-wise?
(code-wise it makes no difference through JDBC whether I use int4, int8, or
NUMERIC; I still represent that as 'long' in Java)

One thing I am definitely noticing is that NUMERIC( 13, 0 ) does not seem to
ever use sequential scan for queries (verified with EXPLAIN), it always goes
for indices. It that bad (performance wise)? Are there any other issues I
need to be aware of before settling on an uid type?

Thanks in advance,
-------------
Sergey Olefir
Exigen Latvia, system analyst

Honesty is a virtue.
That is if you manage to survive.


Re: Using indices with long unique IDs.

From
Mike Mascari
Date:
Sergey Olefir (edit e-mail to reply) wrote:

>Hello!
>
>I am planning to use unique IDs in the little system I am building. Now
>being more than a little paranoid (and having no idea about expected loads),
>I am wary of using int4 as a basis for uids (for the fear of ever running
>out of them).
>
>So the logical choice would be int8, right? Unfortunately quite wrong.
>Statement of the form:
>"SELECT * FROM table WHERE id=1"
>will never use index for id (assumming id is int8) since '1' is of type
>int4. This is confirmed both by documentation and SQL EXPLAIN (after set
>enable_seqscan TO 'off').
>
>There are two suggested work-arounds:
>"SELECT * FROM table WHERE id='1'"
>"SELECT * FROM table WHERE id=1::int8"
>
>
1. I believe 7.4 contains modifications that would allow for index scans
WHERE id=1
2. You could always start the sequence at 5 billion, in which case a
pre-7.4 version will also use the index:

[estore@lexus] explain select * from employees where employee = 5000000000;
                                   QUERY
PLAN
--------------------------------------------------------------------------------
 Index Scan using i_employees1 on employees  (cost=0.00..4.68 rows=1
width=264)
   Index Cond: (employee = 5000000000::bigint)
(2 rows)

3. If you want *globally* unique ids then you must use a different type.
I believe there's a UUID project on gborg, IIRC.

Mike Mascari