Home > mailing lists

Re: Support for 8-byte TOAST values (aka the TOAST infinite loop problem) - Mailing list pgsql-hackers

From	Michael Paquier
Subject	Re: Support for 8-byte TOAST values (aka the TOAST infinite loop problem)
Date	July 9 02:20:31
Msg-id	aG2nv0Iw9Te7c6en@paquier.xyz Whole thread Raw
In response to	Re: Support for 8-byte TOAST values (aka the TOAST infinite loop problem) ("Burd, Greg" <greg@burd.me>)
List	pgsql-hackers

Tree view

On Tue, Jul 08, 2025 at 12:58:26PM -0400, Burd, Greg wrote:
> All that aside, I think you're right to tackle this one step at a
> time and try not to boil too much of the ocean at once (the patch
> set is already large enough).  With that in mind I've read once or
> twice over your changes and have a few basic comments/questions.
>
>
> v2-0001 Refactor some TOAST value ID code to use uint64 instead of Oid
>
> This set of changes make sense and as you say are mechanical in
> nature, no real comments other than I think that using uint64 rather
> than Oid is the right call and addresses #2 on Tom's list.
>
>
> v2-0002 Minimize footprint of TOAST_MAX_CHUNK_SIZE in heap TOAST code
>
> I like this as well, clarifies the code and reduces repetition.

Thanks.  These are independent pieces if you want to link the code
less to TOAST, assuming that an area of 8 bytes would be good enough
for any TOAST "value" concept.  TIDs were mentioned as well on a
different part of the thread, ItemPointerData is 6 bytes.

> v2-0003 Refactor external TOAST pointer code for better pluggability
>
> + * For now there are only two types, all defined in this file. For now this
> + * is the maximum value of vartag_external, which is a historical choice.
>
> This provides a bridge for compatibility, but doesn't open the door
> to a truly pluggable API.  I'm guessing the goal is incremental
> change rather than wholesale rewrite.

Nope, it does not introduce a pluggable thing, but it does untangle
the fact that one needs to change 15-ish code paths when they want to
add a new type of external TOAST pointer, without showing an actual
impact AFAIK when we insert a TOAST value or fetch it, as long as we
know that we're dealing with an on-disk thing that requires an
external lookup.

> + * The different kinds of on-disk external TOAST pointers. divided by
> + * vartag_external.
>
> Extra '.' in "TOAST pointers. divided" I'm guessing.

Indeed, thanks.

> v2-0007 Introduce global 64-bit TOAST ID counter in control file
>
> Do you have any concern that this might become a bottleneck when
> there are many relations and many backends all contending for a new
> id?  I'd imagine that this would show up in a flame graph, but I
> think your test focused on the read side detoast_attr_slice() rather
> than insert/update and contention on the shared counter.  Would this
> be even worse on NUMA systems?

That may be possible, see below.

> Thanks for the flame graphs examining a heavy detoast_attr_slice()
> workload.  I agree that there is little or no difference between
> them which is nice.

Cool.  Yes.  I was wondering why detoast_attr_slice() does not show up
in the profile on HEAD, perhaps it just got optimized away (I was
under -O2 for these profiles).

> I think the only call out Tom made [4] that isn't addressed was the
> ask for localized ID selection.  That may make sense at some point,
> especially if there is contention on GetNewToastId().  I think that
> case is worth a separate performance test, something with a large
> number of relations and backends all performing a lot of updates
> generating a lot of new IDs.  What do you think?

Yeah, I need to do more benchmark for the int8 part, I was holding on
such evaluations because this part of the patch does not fly if we
don't do the refactoring pieces first.  Anyway, I cannot get excited
about the extra workload that this would require in the catalogs,
because we would need one TOAST sequence tracked in there, linked to
the TOAST relation so it would not be free.  Or we invent a new
facility just for this purpose, meaning that we get far away even more
from being able to resolve the original problem with the values and
compression IDs.  We're talking about two instructions.  Well, I guess
that we could optimize it more some atomics or even cache a range of
values to save in ToastIdGenLock acquisitions in a single backend.  I
suspect that the bottleneck is going to be the insertion of the TOAST
entries in toast_save_datum() anyway with the check for conflicting
values, even if your relation is unlogged or running-on-scissors in
memory.

> [2] "Yes, the idea is to put the tid pointer directly in the varlena
> external header and have a tid array in the toast table as an extra
> column. If all of the TOAST fits in the single record, this will be
> empty, else it will have an array of tids for all the pages for this
> toasted field." - Hannu Krosing in an email to me after
> PGConf.dev/2025

Sure, you could do that as well, but I suspect that we'll need the
steps of at least up to 0003 to be able to handle more easily multiple
external TOAST pointer types, or the code will be messier than it
currently is.  :D
--
Michael

Attachment

signature.asc

pgsql-hackers by date:

From: Michael Paquier
Date: 09 July, 01:57:39
Subject: Re: Support for 8-byte TOAST values (aka the TOAST infinite loop problem)

From: Michael Paquier
Date: 09 July, 02:28:00
Subject: Re: mention unused_oids script in pg_proc.dat

Re: Support for 8-byte TOAST values (aka the TOAST infinite loop problem) - Mailing list pgsql-hackers

Attachment

Previous

Next