Re: Support allocating memory for large strings - Mailing list pgsql-hackers

From Jim Mlodgenski
Subject Re: Support allocating memory for large strings
Date
Msg-id CAB_5SRfupaoWbEU3aZ80_3Nz69+RfYTW3kcq2nNs953Pm9wLnw@mail.gmail.com
In response to Re: Support allocating memory for large strings  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Support allocating memory for large strings
List pgsql-hackers
On Mon, Nov 10, 2025 at 4:37 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Nathan Bossart <nathandbossart@gmail.com> writes:
> > FWIW something I am hearing about more often these days, and what I believe
> > Maxim's patch is actually after, is the 1GB limit on row size.  Even if
> > each field doesn't exceed 1GB (which is what artifacts.md seems to
> > demonstrate), heap_form_tuple() and friends can fail to construct the whole
> > tuple.  This doesn't seem to be covered in the existing documentation about
> > limits [0].
>
> Yeah.  I think our hopes of relaxing the 1GB limit on individual
> field values are about zero, but maybe there is some chance of
> allowing tuples that are wider than that.  The notion that it's
> a one-line fix is still ludicrous though :-(
>
> One big problem with a scheme like that is "what happens when
> I try to make a bigger-than-1GB tuple into a composite datum?".
>
> Another issue is what happens when a wider-than-1GB tuple needs
> to be sent to or from clients.  I think there are assumptions
> in the wire protocol about message lengths fitting in an int,
> for example.  Even if the protocol were okay with it, I wouldn't
> count on client libraries not to fall over.
>
> On the whole, it's a nasty can of worms, and I stand by the
> opinion that the cost-benefit ratio of removing the limit is
> pretty awful.
>

This is a case I see with users on a somewhat recurring basis, typically in
applications that were migrated from other databases. It's not super common,
but when it happens, they run into a fair amount of pain.

Consider the following contrived example. A user inserts a row and then
updates it so that the total row size exceeds 1GB, even though each individual
field is only about 100MB. That row effectively becomes unreadable without
breaking it up into smaller pieces, and tools like pg_dump can't export it.

CREATE TABLE wide_row (
  id int, a varchar, b varchar, c varchar, d varchar, e varchar, f varchar,
  g varchar, h varchar, i varchar, j varchar, k varchar, l varchar, m varchar,
  n varchar, o varchar);

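-- Populate nine columns with ~100MB each: every individual field stays
-- well under the 1GB field limit.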
INSERT INTO wide_row (id, a, b, c, d, e, f, g, h, i)
VALUES (1, repeat('x', (10^8)::int), repeat('x', (10^8)::int),
        repeat('x', (10^8)::int), repeat('x', (10^8)::int),
        repeat('x', (10^8)::int), repeat('x', (10^8)::int),
        repeat('x', (10^8)::int), repeat('x', (10^8)::int),
        repeat('x', (10^8)::int));

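-- Six more ~100MB values push the total row width past 1GB. The UPDATE
-- itself still succeeds.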
UPDATE wide_row
SET j = repeat('x', (10^8)::int), k = repeat('x', (10^8)::int),
    l = repeat('x', (10^8)::int), m = repeat('x', (10^8)::int),
    n = repeat('x', (10^8)::int), o = repeat('x', (10^8)::int)
WHERE id = 1;

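-- Reading the whole row back now fails, since the result exceeds 1GB: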
SELECT * FROM wide_row WHERE id = 1;
ERROR:  string buffer exceeds maximum allowed length (1073741823 bytes)
DETAIL:  Cannot enlarge string buffer containing 1000000051 bytes by 100000000 more bytes.
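
To be clear, the data itself is still reachable by "breaking it up into
smaller pieces", i.e. fetching the columns in batches that each stay well
under 1GB, for example (the grouping here is arbitrary):

SELECT id, a, b, c, d, e FROM wide_row WHERE id = 1;
SELECT id, f, g, h, i, j FROM wide_row WHERE id = 1;
SELECT id, k, l, m, n, o FROM wide_row WHERE id = 1;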

Putting some information on the limits doc page would help some users avoid
the issue, and a better error message could help others fix it faster when it
occurs. Ideally, we should fix things far enough that we can dump out any
valid row in the table; perhaps less common things like composites could just
continue to throw an error? If we don't want to open that can of worms, we
should at least make it clear that the effective row size limit is 1GB.
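
In the meantime, a rough way to spot rows that are getting close to that
effective limit is to sum the per-column widths, since each column can still
be measured individually (query written for the example table above):

SELECT id,
       coalesce(octet_length(a), 0) + coalesce(octet_length(b), 0) +
       coalesce(octet_length(c), 0) + coalesce(octet_length(d), 0) +
       coalesce(octet_length(e), 0) + coalesce(octet_length(f), 0) +
       coalesce(octet_length(g), 0) + coalesce(octet_length(h), 0) +
       coalesce(octet_length(i), 0) + coalesce(octet_length(j), 0) +
       coalesce(octet_length(k), 0) + coalesce(octet_length(l), 0) +
       coalesce(octet_length(m), 0) + coalesce(octet_length(n), 0) +
       coalesce(octet_length(o), 0) AS approx_row_bytes
FROM wide_row;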


