Re: TEXT column > 1Gb - Mailing list pgsql-general

From Mark Dilger
Subject Re: TEXT column > 1Gb
Msg-id E18E02B1-39DE-4464-9C8C-F0C06BBF771B@enterprisedb.com
In response to Re: TEXT column > 1Gb  (Joe Carlson <jwcarlson@lbl.gov>)
List pgsql-general

> On Apr 12, 2023, at 7:59 AM, Joe Carlson <jwcarlson@lbl.gov> wrote:
>
> The use case is genomics. Extracting substrings is common. So going to chunked storage makes sense.

Are you storing nucleotide sequences as text strings?  If using the simple 4-character (A,C,G,T) alphabet, you can
store four bases per byte (2 bits each).  If using a 16-character nucleotide code alphabet, you can still get two
bases per byte (4 bits each).  A 20-character amino acid alphabet can be stored at 8 residues per 5 bytes (5 bits
each), and so forth.  Such a representation might allow you to store sequences two or four times longer than the
limit you currently hit, but then you are still at an impasse.  Would a factor of 2x or 4x be enough for your needs?
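
To make the four-bases-per-byte idea concrete, here is a minimal Python sketch of 2-bit packing. The names pack_2bit, unpack_2bit and substring_2bit are hypothetical helpers for illustration only; they are not PostgreSQL functions. The sketch assumes the sequence contains only A, C, G and T, and that the sequence length is stored separately, since the last byte may be padded.

```python
# Hypothetical helpers for illustration; not part of PostgreSQL.
# Assumes a strict A/C/G/T alphabet (no N or other ambiguity codes).
CODES = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack_2bit(seq: str) -> bytes:
    """Pack an A/C/G/T string into bytes, four bases per byte."""
    out = bytearray((len(seq) + 3) // 4)  # round up to whole bytes
    for i, base in enumerate(seq):
        # Base i lives in byte i // 4, most significant bit pair first.
        out[i // 4] |= CODES[base] << (6 - 2 * (i % 4))
    return bytes(out)


def substring_2bit(data: bytes, start: int, count: int) -> str:
    """Decode `count` bases beginning at base offset `start`."""
    return "".join(
        BASES[(data[i // 4] >> (6 - 2 * (i % 4))) & 0b11]
        for i in range(start, start + count)
    )


def unpack_2bit(data: bytes, length: int) -> str:
    """Recover the first `length` bases; the caller supplies the true length."""
    return substring_2bit(data, 0, length)


packed = pack_2bit("GATTACA")
print(len(packed))                   # 7 bases fit in 2 bytes
print(unpack_2bit(packed, 7))        # round-trips to the original string
print(substring_2bit(packed, 2, 3))  # bases 2..4 without decoding the rest
```

Substring extraction only touches the bytes covering the requested range, so a packed value could in principle be stored in a bytea column and sliced by byte offset before decoding. That is an assumption about how one might use it, not something tested here.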

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





