Re: [PERFORM] EXTERNAL storage and substring on long strings - Mailing list pgsql-sql

From Matt Clark
Subject Re: [PERFORM] EXTERNAL storage and substring on long strings
Msg-id OAEAKHEHCMLBLIDGAFELEEDDDGAA.matt@ymogen.net
In response to Re: [PERFORM] EXTERNAL storage and substring on long strings  (Scott Cain <cain@cshl.org>)
List pgsql-sql
> > 2. If you want to search for a sequence you'll need to deal with the case
> > where it starts in one chunk and ends in another.
>
> I forgot about searching--I suspect that application is why I faced
> opposition for shredding in my schema development group.  Maybe I should
> push that off to the file system and use grep (or BLAST).  Otherwise, I
> could write a function that would search the chunks first, then after
> failing to find the substring in those, I could start sewing the chunks
> together to look for the query string.  That could get ugly (and
> slow--but if the user knows that and expects it to be slow, I'm ok with
> that).

If you know the max length of the sequences being searched for, and this
is much less than the chunk size, then you could simply have the chunks
overlap by that much, thus guaranteeing that every substring will be found
in its entirety in at least one chunk.
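
To make the overlap idea concrete, here is a minimal sketch in Python (not
SQL, and not anything from Scott's schema). The names overlapping_chunks,
find_all, chunk_size and max_query_len are made up for illustration, and it
assumes the whole sequence is available as one string when the chunks are
built.

```python
def overlapping_chunks(seq, chunk_size, max_query_len):
    """Split seq into (offset, text) chunks that overlap by max_query_len.

    Hypothetical helper: each chunk starts chunk_size - max_query_len
    characters after the previous one, so consecutive chunks share
    max_query_len characters.
    """
    if not 0 < max_query_len < chunk_size:
        raise ValueError("max_query_len must be positive and below chunk_size")
    step = chunk_size - max_query_len
    chunks = []
    start = 0
    while True:
        chunks.append((start, seq[start:start + chunk_size]))
        if start + chunk_size >= len(seq):
            break  # this chunk already reaches the end of the sequence
        start += step
    return chunks


def find_all(chunks, query):
    """Return sorted start positions of query, searching chunk by chunk.

    Only valid for queries no longer than the overlap used to build chunks.
    A set is used because a match inside an overlap is seen in two chunks.
    """
    positions = set()
    for offset, text in chunks:
        i = text.find(query)
        while i != -1:
            positions.add(offset + i)
            i = text.find(query, i + 1)
    return sorted(positions)


if __name__ == "__main__":
    seq = "ACGT" * 10
    chunks = overlapping_chunks(seq, chunk_size=10, max_query_len=4)
    print(find_all(chunks, "GTAC"))
```

The cost is the extra storage for the repeated overlap in every chunk, which
stays small as long as the max query length is much less than the chunk
size. A query longer than the overlap can still straddle two chunks and be
missed, so it would need the slower sew-the-chunks-together path.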


