Re: store A LOT of 3-tuples for comparisons - Mailing list pgsql-performance

From Matthew
Subject Re: store A LOT of 3-tuples for comparisons
Msg-id Pine.LNX.4.64.0802221546370.20402@aragorn.flymine.org
In response to store A LOT of 3-tuples for comparisons  (Moritz Onken <onken@houseofdesign.de>)
Responses Re: store A LOT of 3-tuples for comparisons
List pgsql-performance
On Fri, 22 Feb 2008, Moritz Onken wrote:
> I need to store a lot of 3-tuples of words (e.g. "he", "can", "drink"), order
> matters!
> The source is about 4 GB of these 3-tuples.
> I need to store them in a table and check whether one of them is already
> stored, and if that's the case to increment a column named "count" (or
> something).

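My suggestion would be to use three varchar columns, one word per column,
which also preserves the word order. You should then create a single
multicolumn B-tree index covering all three columns, so that checking whether
a given tuple already exists is one index lookup.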
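
A minimal sketch of that layout, which is not from the original post. The
table and column names (triples, w1, w2, w3, cnt) and the helper lookup() are
hypothetical, and Python's built-in sqlite3 module is used only as a stand-in
so the example runs without a server. The CREATE TABLE and CREATE INDEX
statements are plain SQL that PostgreSQL should also accept, where the index
would be a B-tree by default.

```python
import sqlite3

# In-memory SQLite database as a stand-in for a PostgreSQL server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE triples (
    w1  VARCHAR NOT NULL,            -- first word of the tuple
    w2  VARCHAR NOT NULL,            -- second word
    w3  VARCHAR NOT NULL,            -- third word
    cnt INTEGER NOT NULL DEFAULT 1   -- hypothetical name for the "count" column
);
-- One index over all three columns together, in tuple order.
CREATE INDEX triples_words_idx ON triples (w1, w2, w3);
""")


def lookup(conn, w1, w2, w3):
    """Return the stored count for an ordered tuple, or None if it is absent."""
    row = conn.execute(
        "SELECT cnt FROM triples WHERE w1 = ? AND w2 = ? AND w3 = ?",
        (w1, w2, w3),
    ).fetchone()
    return None if row is None else row[0]


conn.execute(
    "INSERT INTO triples (w1, w2, w3, cnt) VALUES (?, ?, ?, ?)",
    ("he", "can", "drink", 3),
)
found = lookup(conn, "he", "can", "drink")    # tuple is stored
missing = lookup(conn, "drink", "can", "he")  # same words, different order
```

Because each word has its own column, the reversed tuple is a different row,
which matches the "order matters" requirement.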

> I thought of doing all the inserts without having an index and without doing
> the check whether the row is already there. After that I'd do a "group by"
> and count(*) on that table. Is this a good idea?

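That sounds like the fastest way to do it, certainly. Loading without an
index avoids both the per-row existence check and the index maintenance on
every insert, and the index can be built once after the "group by".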
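
The load-then-aggregate approach could look roughly like the sketch below.
It is an illustration under assumptions, not the poster's actual schema: the
names raw_triples, triples, cnt and build_counts() are hypothetical, and
sqlite3 again stands in for PostgreSQL. On PostgreSQL the raw rows would
more likely be loaded with COPY than with individual INSERT statements.

```python
import sqlite3


def build_counts(rows):
    """Bulk-load raw tuples without an index, then aggregate and index once."""
    conn = sqlite3.connect(":memory:")  # stand-in for a PostgreSQL connection

    # Step 1: staging table with no index and no duplicate check.
    conn.execute("CREATE TABLE raw_triples (w1 VARCHAR, w2 VARCHAR, w3 VARCHAR)")
    conn.executemany("INSERT INTO raw_triples VALUES (?, ?, ?)", rows)

    # Step 2: collapse duplicates into one row per tuple with its count.
    conn.execute("""
        CREATE TABLE triples AS
        SELECT w1, w2, w3, COUNT(*) AS cnt
        FROM raw_triples
        GROUP BY w1, w2, w3
    """)

    # Step 3: build the three-column index once, after the data is in place.
    conn.execute("CREATE INDEX triples_words_idx ON triples (w1, w2, w3)")
    conn.execute("DROP TABLE raw_triples")
    conn.commit()
    return conn


sample = [
    ("he", "can", "drink"),
    ("he", "can", "drink"),
    ("drink", "can", "he"),
    ("she", "can", "drink"),
]
conn = build_counts(sample)
counts = {
    (w1, w2, w3): cnt
    for w1, w2, w3, cnt in conn.execute("SELECT w1, w2, w3, cnt FROM triples")
}
```

The staging table is dropped at the end, so only the aggregated, indexed
table remains for later comparisons.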

Matthew

--
"We have always been quite clear that Win95 and Win98 are not the systems to
use if you are in a hostile security environment." "We absolutely do recognize
that the Internet is a hostile environment." Paul Leach <paulle@microsoft.com>
