Re: Adding a suffix array index - Mailing list pgsql-hackers

From Troels Arvin
Subject Re: Adding a suffix array index
Date
Msg-id pan.2004.11.19.12.55.48.92702@arvin.dk
In response to Adding a suffix array index  (Troels Arvin <troels@arvin.dk>)
List pgsql-hackers
On Fri, 19 Nov 2004 14:38:20 +0200, Hannu Krosing wrote:

>> Part of my current code concerns packing DNA characters: As the alphabet 
>> of DNA strings is very small (four characters), it seems like a 
>> straightforward optimization to store each character in two bits.
> 
> My advice would be to get it to work first, optimize later.

Valid point. However, I needed something fairly basic to work on, both to
get to know C and to get to know PostgreSQL in the context of user-defined
types. But if packing proves to be a problem when implementing the
interesting stuff, then yes, thanks: packing should be an afterthought.
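
To make the two-bit idea concrete, here is a minimal sketch in C of what such packing could look like. It is not the code from my tree: the function names (dna_code, dna_packed_len, dna_pack, dna_unpack), the A=0, C=1, G=2, T=3 mapping and the bit order within a byte are all assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mapping: A=0, C=1, G=2, T=3; -1 for any other character. */
static int dna_code(char c)
{
    switch (c) {
    case 'A': return 0;
    case 'C': return 1;
    case 'G': return 2;
    case 'T': return 3;
    default:  return -1;
    }
}

/* Bytes needed to hold n packed bases (four bases per byte). */
static size_t dna_packed_len(size_t n)
{
    return (n + 3) / 4;
}

/*
 * Pack n bases from seq into out, which must hold dna_packed_len(n) bytes.
 * Returns 0 on success, -1 if seq contains a character outside ACGT.
 */
static int dna_pack(const char *seq, size_t n, uint8_t *out)
{
    size_t i;

    assert(seq != NULL && out != NULL);
    memset(out, 0, dna_packed_len(n));
    for (i = 0; i < n; i++) {
        int code = dna_code(seq[i]);

        if (code < 0)
            return -1;
        /* The first base of each group of four lands in the top two bits. */
        out[i / 4] |= (uint8_t) (code << (6 - 2 * (i % 4)));
    }
    return 0;
}

/* Unpack n bases from in into seq, which must have room for n + 1 chars. */
static void dna_unpack(const uint8_t *in, size_t n, char *seq)
{
    static const char alphabet[] = "ACGT";
    size_t i;

    assert(in != NULL && seq != NULL);
    for (i = 0; i < n; i++)
        seq[i] = alphabet[(in[i / 4] >> (6 - 2 * (i % 4))) & 3];
    seq[n] = '\0';
}
```

Note that the number of bases has to be stored next to the packed bytes: the padding bits in the last byte are zero, which is indistinguishable from trailing 'A' characters.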

>> My first and most immediate goal is to support efficient answering of a
>> question like "which rows contain the sequence TTGACCACTTG in column foo?".
> 
> If you store your sequences as strings, you may try to use trigrams (or
> modify them to 4-, 5-, 6- or 7-grams ;) to get some feel for how that works.
> 
> The trigram module is in contrib/pg_trgm.

(/me Printing readme.) Thanks.
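
As I understand the general k-gram idea (this is a sketch of the principle, not of how contrib/pg_trgm is implemented): every k-gram of the search pattern must occur in a row for that row to match, so the k-grams serve as a cheap filter, and the surviving candidates are then checked with an exact substring test. The helper names below (contains_gram, kgram_filter, kgram_search) are hypothetical, and the linear scan in contains_gram merely stands in for what would be an index lookup.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if the k characters starting at gram occur somewhere in text. */
static int contains_gram(const char *text, const char *gram, size_t k)
{
    size_t n = strlen(text);
    size_t i;

    for (i = 0; i + k <= n; i++)
        if (memcmp(text + i, gram, k) == 0)
            return 1;
    return 0;
}

/*
 * Necessary (but not sufficient) condition for a match:
 * every k-gram of pattern occurs somewhere in text.
 */
static int kgram_filter(const char *text, const char *pattern, size_t k)
{
    size_t m = strlen(pattern);
    size_t i;

    if (m < k)
        return 1;           /* no k-grams to test, so nothing is ruled out */
    for (i = 0; i + k <= m; i++)
        if (!contains_gram(text, pattern + i, k))
            return 0;
    return 1;
}

/* Filter first, then verify the surviving candidate exactly. */
static int kgram_search(const char *text, const char *pattern, size_t k)
{
    return kgram_filter(text, pattern, k) && strstr(text, pattern) != NULL;
}
```

The verification step matters: "CGTACG" contains both 3-grams of "ACGT" (ACG and CGT) but not "ACGT" itself, so the filter alone would report a false positive.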

-- 
Greetings from Troels Arvin, Copenhagen, Denmark



