Re: Creating large database of MD5 hash values - Mailing list pgsql-performance

From Alvaro Herrera
Subject Re: Creating large database of MD5 hash values
Date
Msg-id 20080411142543.GA6442@alvh.no-ip.org
In response to Creating large database of MD5 hash values  ("Jon Stewart" <jonathan.l.stewart@gmail.com>)
Responses Re: Creating large database of MD5 hash values
List pgsql-performance
Jon Stewart wrote:
> Hello,
>
> I am creating a large database of MD5 hash values. I am a relative
> newb with PostgreSQL (or any database for that matter). The schema and
> operation will be quite simple -- only a few tables, probably no
> stored procedures -- but I may easily end up with several hundred
> million rows of hash values, possibly even getting into the billions. The
> hash values will be organized into logical sets, with a many-many
> relationship. I have some questions before I set out on this endeavor,
> however, and would appreciate any and all feedback, including SWAGs,
> WAGs, and outright lies. :-) I am trying to batch up operations as
> much as possible, so I will largely be doing comparisons of whole
> sets, with bulk COPY importing. I hope to avoid single hash value
> lookup as much as possible.

If MD5 values will be your primary data and you'll be storing millions
of them, it would be wise to create your own datatype and operators with
the most compact and efficient representation possible.
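
To put a rough number on "compact": an MD5 digest is 128 bits, so it fits
in 16 raw bytes, whereas the usual hexadecimal text form takes 32
characters. The Python sketch below only illustrates that size difference
and the round trip between the two forms. The helper names are
hypothetical, and it says nothing about how a custom PostgreSQL datatype
would actually be implemented (that would normally be done in C).

```python
import hashlib

MD5_RAW_LEN = 16  # 128 bits


def md5_raw(data: bytes) -> bytes:
    """Return the MD5 digest of data as 16 raw bytes."""
    return hashlib.md5(data).digest()


def hex_to_raw(hex_digest: str) -> bytes:
    """Convert a 32-character hex MD5 string to its 16-byte form."""
    raw = bytes.fromhex(hex_digest)
    if len(raw) != MD5_RAW_LEN:
        raise ValueError("expected a 128-bit MD5 digest")
    return raw


def raw_to_hex(raw: bytes) -> str:
    """Convert a 16-byte MD5 digest back to lowercase hex text."""
    if len(raw) != MD5_RAW_LEN:
        raise ValueError("expected a 128-bit MD5 digest")
    return raw.hex()


if __name__ == "__main__":
    digest = md5_raw(b"example")
    text = raw_to_hex(digest)
    # The raw form is half the size of the hex text form.
    print(len(digest), len(text))
```

At hundreds of millions of rows, halving the per-value payload adds up in
both the table and its indexes, before counting any per-row overhead.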

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
