Re: Support for 8-byte TOAST values (aka the TOAST infinite loop problem) - Mailing list pgsql-hackers

From Nikita Malakhov
Subject Re: Support for 8-byte TOAST values (aka the TOAST infinite loop problem)
Msg-id CAN-LCVOh9DRWNqoDUx+Q1ZDM_O3VyX6ctRuEX0mzA+_JTkUKvg@mail.gmail.com
In response to Re: Support for 8-byte TOAST values (aka the TOAST infinite loop problem)  (Hannu Krosing <hannuk@google.com>)
List pgsql-hackers
Hi!

Michael and Hannu, here's a POC patch implementing direct-TID TOAST.
It is the simplest possible implementation: a value is stored as a
chain of TIDs, where each chunk stores the TID of the next chunk to
be fetched. The patch applies on top of
commit 998b0b51d5ea763be081804434f177082ba6772b (origin/toast_64bit_v2)
Author: Michael Paquier <michael@paquier.xyz>
Date:   Thu Jun 19 13:09:11 2025 +0900
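
To make the chaining idea concrete, here is a minimal sketch of the read
path in Python. It is only an illustration, not code from the patch: the
Chunk type, the dict standing in for the TOAST heap, and the use of None
as the end-of-chain marker are all assumptions made for the example.

```python
from typing import Dict, NamedTuple, Optional, Tuple

# A TID is (block number, offset number), mirroring PostgreSQL's ItemPointerData.
Tid = Tuple[int, int]


class Chunk(NamedTuple):
    """Hypothetical chunk record: payload plus the TID of the next chunk."""
    data: bytes
    next_tid: Optional[Tid]  # None marks the last chunk (assumed convention)


def fetch_value(heap: Dict[Tid, Chunk], first_tid: Tid) -> bytes:
    """Reassemble a value by following the chain that starts at first_tid."""
    parts = []
    seen = set()
    tid: Optional[Tid] = first_tid
    while tid is not None:
        if tid in seen:
            # Guard against a corrupted chain that loops back on itself.
            raise ValueError(f"cycle in TID chain at {tid}")
        seen.add(tid)
        chunk = heap[tid]
        parts.append(chunk.data)
        tid = chunk.next_tid
    return b"".join(parts)


# A dict stands in for the TOAST heap in this toy example.
heap = {
    (0, 1): Chunk(b"hello ", (0, 2)),
    (0, 2): Chunk(b"toast ", (5, 1)),
    (5, 1): Chunk(b"world", None),
}
value = fetch_value(heap, (0, 1))
```

Each fetch depends on the previous one, so chunks are read strictly one
after another, with no index lookup in between.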

While it is very fast on small data, I see some disadvantages:
- first of all, VACUUM would have to be revised to work with such tables;
- batch insertion is problematic because the TID chain has to be stored.
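
One way to see the insertion problem: a chunk can only record the TID of
its successor once that successor has a TID. The sketch below is a toy
model, not the approach taken in the patch. It writes chunks back to front
so every chunk knows its next TID at insert time. ToyHeap, store_value and
read_value are hypothetical names invented for the example.

```python
from typing import Dict, Optional, Tuple

Tid = Tuple[int, int]


class ToyHeap:
    """Toy append-only store that hands out TIDs; not PostgreSQL's heap API."""

    def __init__(self, slots_per_block: int = 4) -> None:
        self.slots_per_block = slots_per_block
        self.rows: Dict[Tid, Tuple[bytes, Optional[Tid]]] = {}
        self._count = 0

    def insert(self, data: bytes, next_tid: Optional[Tid]) -> Tid:
        block, slot = divmod(self._count, self.slots_per_block)
        tid = (block, slot + 1)  # offset numbers are 1-based, as in PostgreSQL
        self.rows[tid] = (data, next_tid)
        self._count += 1
        return tid


def store_value(heap: ToyHeap, value: bytes, chunk_size: int) -> Tid:
    """Store value as a chain of chunks and return the TID of the first chunk."""
    chunks = [value[i:i + chunk_size]
              for i in range(0, len(value), chunk_size)] or [b""]
    next_tid: Optional[Tid] = None
    # Insert the last chunk first so each earlier chunk can embed its successor.
    for data in reversed(chunks):
        next_tid = heap.insert(data, next_tid)
    return next_tid


def read_value(heap: ToyHeap, first_tid: Tid) -> bytes:
    """Follow the chain from first_tid and concatenate the chunk payloads."""
    parts = []
    tid: Optional[Tid] = first_tid
    while tid is not None:
        data, tid = heap.rows[tid]
        parts.append(data)
    return b"".join(parts)


heap = ToyHeap()
head = store_value(heap, b"abcdefghij", chunk_size=4)
```

In this model the whole value must be chunked before the first insert, and
the chunks land in reverse physical order, which hints at why batching is
awkward with a forward-linked chain.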

It is just a POC implementation, so please don't blame me for
questionable decisions.

Any opinions and feedback are welcome!

PS: Hannu, I just saw your latest message and will check it out now.

On Mon, Jul 21, 2025 at 3:15 AM Hannu Krosing <hannuk@google.com> wrote:
I have been working out the details of the Direct TOAST design at
https://wiki.postgresql.org/wiki/DirectTOAST

The top-level goals are:

* 8-byte TOAST pointer - just (header:1, tag:1 and TID:6)
* all other info moved from toast pointer to actual toast record(s),
so heap rows are smaller and faster.
* all extra fields are bytea with an internal encoding (maybe I will create
full new types for these, or maybe introspection functions alone are
enough).
  The reasons for this are:
  - PostgreSQL arrays add 20 bytes of overhead
  - bytea gives more freedom in encoding for minimal space usage

There is no solution yet for va_toastrelid, but the hope is
- to use some kind of mapping and find one or two free bits somewhere
(tid has one free),
- or to add a 12-byte toast pointer just for this,
- or to make sure that CLUSTER and VACUUM FULL can be done without
needing va_toastrelid. I assume it is there for clustering the TOAST
table, which will not be possible separately from the main heap with
direct toast tid pointers anyway.

Please take a look and poke holes in it!




--
Regards,
Nikita Malakhov
Postgres Professional
The Russian Postgres Company