Thread: SERIAL or INT8 / Unique BLOB's

SERIAL or INT8 / Unique BLOB's

From
"Richard Teviotdale"
Date:
Using SERIAL or INT8
 
I have been using SERIAL a great deal when creating table primary keys.
The type of integer that this creates is an INT4, but I noticed that there is a larger INT8 available.
 
Would it be advisable to create a sequence manually and use the INT8 if the table size might be expected to grow beyond the 2 billion limit imposed by a INT4 (INTEGER) field type?
 
Unique BLOB's
 
I want to allow users to put blob's into a database. But I would like the database to recognise when a user is tying to insert a file that is identical to an existing file, as opposed to duplicating the blob. Either way I have to record the oid of the blob in a table so I can "locate" it later. I call this table blob_oid.
 
My plan is:
 
Add a field ( blob_md5 VARCHAR(32) ) to the blob_oid table and make a UNIQUE INDEX on this field to prevent duplicate values.
Before adding any blobs do a md5sum of the file and try inserting this value into blob_md5. If the file already exists we should get an error because it would have the same md5sum, otherwise upload the blob and update the blob_oid record created in the previous step.
 
I think this would work, but perhaps there is a better way?

Re: SERIAL or INT8 / Unique BLOB's

From
Dave Trombley
Date:
Richard Teviotdale wrote:

> Using SERIAL or INT8
>
>
>
> I have been using SERIAL a great deal when creating table primary keys.
>
> The type of integer that this creates is an INT4, but I noticed that
> there is a larger INT8 available.
>
>
>
> Would it be advisable to create a sequence manually and use the INT8
> if the table size might be expected to grow beyond the 2 billion limit
> imposed by a INT4 (INTEGER) field type?
>
>
>
       Currently, I don't think sequences can return values larger than
the max int4. You could of course emulate a sequence, but if you don't
have a large number of rows at any given time you may want to take
advantage of the fact that sequences can cycle around, optionally, since
they do cache values for performance gain.


(http://www2.us.postgresql.org/users-lounge/docs/7.1/reference/sql-createsequence.html
)


       Also, FYI, 7.2 looks-like-it-will/does have a
'bigserial'/'serial8' type.

    -dj trombley
     <dtrom@bumba.net>




Re: SERIAL or INT8 / Unique BLOB's

From
"Jeffrey W. Baker"
Date:

On Wed, 2 Jan 2002, Dave Trombley wrote:

>        Also, FYI, 7.2 looks-like-it-will/does have a
> 'bigserial'/'serial8' type.

yabbut, has anyone seen yet if 7.2 can use bigint in an index (on a 32-bit
machine)?

-jwb


Re: SERIAL or INT8 / Unique BLOB's

From
Dave Trombley
Date:
Jeffrey W. Baker wrote:

>
>yabbut, has anyone seen yet if 7.2 can use bigint in an index (on a 32-bit
>machine)?
>

I don't see why it wouldn't!  gcc has supported 8-byte longs for a
(*sigh*) long, long, time now. =)
This is on my i686-class linux box, build of yesterday's CVS snapshot:

test=# create table ttab(a int, b serial8);
NOTICE:  CREATE TABLE will create implicit sequence 'ttab_b_seq' for
SERIAL column 'ttab.b'
NOTICE:  CREATE TABLE / UNIQUE will create implicit index 'ttab_b_key'
for table 'ttab'
CREATE
test=# CREATE INDEX ttab_b_idx ON ttab (a);
CREATE
test=# CREATE FUNCTION populate(int) RETURNS bool AS '
test'#  BEGIN
test'#   FOR i IN 1..$1 LOOP
test'#    INSERT INTO ttab(a) VALUES((i*i)%133);
test'#   END LOOP;
test'#   RETURN true;
test'#  END; ' language 'plpgsql';
CREATE
test=# select populate(90000);
test=# EXPLAIN SELECT * FROM ttab WHERE a = 30 LIMIT 18;
NOTICE:  QUERY PLAN:

Limit  (cost=0.00..17.07 rows=5 width=12)
  ->  Index Scan using ttab_b_idx on ttab  (cost=0.00..17.07 rows=5
width=12)

EXPLAIN
test=# select * from ttab where a = 30 limit 10;
 a  |   b
----+--------
 30 | 136329
 30 | 136315
 30 | 136253
 30 | 136239
 30 | 136196
 30 | 136182
 30 | 136120
 30 | 136106
 30 | 136063
 30 | 136049
(10 rows)

-dj





Re: SERIAL or INT8 / Unique BLOB's

From
Tom Lane
Date:
"Jeffrey W. Baker" <jwbaker@acm.org> writes:
> yabbut, has anyone seen yet if 7.2 can use bigint in an index (on a 32-bit
> machine)?

bigint index support has been there for quite awhile.  Possibly you
are missing the need to cast the constant to bigint:

    WHERE bigintcol = 42::bigint
or
    WHERE bigintcol = '42'
but not just
    WHERE bigintcol = 42
since int8-eq-int4 is not an indexable operator for an int8 index.

int2, float4, and numeric columns suffer from variants of this issue,
as do OID and some other types.

There have been various proposals to fix this class of annoyance,
but none have got past the assembled complainants yet ;-).  See the
pghackers archives, eg mid-May 2000, for past arguments.

            regards, tom lane

Re: SERIAL or INT8 / Unique BLOB's

From
"Jeffrey W. Baker"
Date:

On Wed, 2 Jan 2002, Tom Lane wrote:

> "Jeffrey W. Baker" <jwbaker@acm.org> writes:
> > yabbut, has anyone seen yet if 7.2 can use bigint in an index (on a 32-bit
> > machine)?
>
> bigint index support has been there for quite awhile.  Possibly you
> are missing the need to cast the constant to bigint:

Of course.  I was already told that once but apparently I forget it
already.  -jwb


Re: SERIAL or INT8 / Unique BLOB's

From
"Jeffrey W. Baker"
Date:

On Wed, 2 Jan 2002, Dave Trombley wrote:

> Jeffrey W. Baker wrote:
>
> >
> >yabbut, has anyone seen yet if 7.2 can use bigint in an index (on a 32-bit
> >machine)?
> >
>
> I don't see why it wouldn't!  gcc has supported 8-byte longs for a
> (*sigh*) long, long, time now. =)
> This is on my i686-class linux box, build of yesterday's CVS snapshot:

I only asked because 7.1 won't use a bigint index under any circumstances.
I discovered this *after* creating some tremendously large tables :)

-jwb