Re: [ADMIN] bytea size limit? - Mailing list pgsql-general

From Kenneth Marshall
Subject Re: [ADMIN] bytea size limit?
Date
Msg-id 20090121141748.GH14570@it.is.rice.edu
Whole thread Raw
In response to Re: [ADMIN] bytea size limit?  (Grzegorz Jaśkiewicz <gryzman@gmail.com>)
Responses Re: [ADMIN] bytea size limit?
List pgsql-general
On Wed, Jan 21, 2009 at 02:09:01PM +0000, Grzegorz Ja??kiewicz wrote:
> On Wed, Jan 21, 2009 at 2:06 PM, Kenneth Marshall <ktm@rice.edu> wrote:
>
> > The TOAST implementation however only allows 30-bits for the
> > size of the TOAST entry which caps the size at 2^30 or 1GB. I
> > agree that he could very well be limited also by the memory on
> > his system.
>
> i wasn't aware of that, and also - it doesn't say anything about it in docs.
> As for limitations, that also depends on db drivers he is using, etc,
> etc. I use bytea to store 100-200MB objects in many dbs, but I
> wouldn't go as far as 1.5GB ...
>
The reference is in:
    http://www.postgresql.org/docs/8.3/static/storage-toast.html

Here is the pertinent excerpt:

Only certain data types support TOAST -- there is no need to impose the
overhead on data types that cannot produce large field values. To support
TOAST, a data type must have a variable-length (varlena) representation,
in which the first 32-bit word of any stored value contains the total
length of the value in bytes (including itself). TOAST does not constrain
the rest of the representation. All the C-level functions supporting a
TOAST-able data type must be careful to handle TOASTed input values.
(This is normally done by invoking PG_DETOAST_DATUM before doing anything
with an input value, but in some cases more efficient approaches are possible.)

TOAST usurps two bits of the varlena length word (the high-order bits on
big-endian machines, the low-order bits on little-endian machines),
thereby limiting the logical size of any value of a TOAST-able data type
to 1 GB (230 - 1 bytes). When both bits are zero, the value is an ordinary
un-TOASTed value of the data type, and the remaining bits of the length
word give the total datum size (including length word) in bytes. When the
highest-order or lowest-order bit is set, the value has only a single-byte
header instead of the normal four-byte header, and the remaining bits give
the total datum size (including length byte) in bytes. As a special case,
if the remaining bits are all zero (which would be impossible for a
self-inclusive length), the value is a pointer to out-of-line data stored
in a separate TOAST table. (The size of a TOAST pointer is given in the
second byte of the datum.) Values with single-byte headers aren't aligned
on any particular boundary, either. Lastly, when the highest-order or
lowest-order bit is clear but the adjacent bit is set, the content of the
datum has been compressed and must be decompressed before use. In this
case the remaining bits of the length word give the total size of the
compressed datum, not the original data. Note that compression is also
possible for out-of-line data but the varlena header does not tell whether
it has occurred -- the content of the TOAST pointer tells that, instead.

Cheers,
Ken

pgsql-general by date:

Previous
From: Grzegorz Jaśkiewicz
Date:
Subject: Re: [ADMIN] bytea size limit?
Next
From: Grzegorz Jaśkiewicz
Date:
Subject: Re: [ADMIN] bytea size limit?