Re: Fwd: [PATCH] Add zstd compression for TOAST using extended header format - Mailing list pgsql-hackers

From Dharin Shah
Subject Re: Fwd: [PATCH] Add zstd compression for TOAST using extended header format
Msg-id CAOj6k6f2B3hNxDcnB5AgHX4kaTW8XTAfMAjRx4upDBOugxqF4w@mail.gmail.com
In response to Re: Fwd: [PATCH] Add zstd compression for TOAST using extended header format  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
Hello,

Following up on my earlier patch submission, I've reworked the zstd TOAST compression implementation based on our discussion here. The new patch now avoids the 20-byte extended header.

Current Approach
- New `VARTAG_ONDISK_ZSTD` (value 19) for ZSTD external storage
- Maintains existing 16-byte varatt_external structure
- ZSTD external-only (no inline compression)

Note: Using a dedicated VARTAG_ONDISK_ZSTD keeps the on-disk TOAST pointer payload at 16 bytes, but it is not a general extensible metadata carrier. If PostgreSQL later adopts a more general extensible TOAST framework, this change should not block it; VARTAG_ONDISK_ZSTD would remain as a supported legacy encoding, while new toasted values could be written using the newer framework and old values rewritten via normal table rewrites.
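
As a rough illustration of the layout described above, here is a small Python model of the 16-byte pointer payload and the tag-based dispatch. The field order follows `varatt_external` (va_rawsize, va_extinfo, va_valueid, va_toastrelid) and 18 is the existing `VARTAG_ONDISK`; the value 19 is the one proposed in this patch. The function names and the little-endian byte order are assumptions made only for this sketch; the real code is C.

```python
import struct

# Existing tag for on-disk TOAST pointers, and the tag proposed by this patch.
VARTAG_ONDISK = 18
VARTAG_ONDISK_ZSTD = 19  # proposed value, per the patch description

# varatt_external: int32 va_rawsize, uint32 va_extinfo, Oid va_valueid,
# Oid va_toastrelid. Little-endian is assumed here for illustration.
VARATT_EXTERNAL = struct.Struct("<iIII")


def pack_pointer(tag, rawsize, extinfo, valueid, toastrelid):
    """Return (tag, payload); the payload is 16 bytes for either tag."""
    if tag not in (VARTAG_ONDISK, VARTAG_ONDISK_ZSTD):
        raise ValueError(f"unsupported vartag {tag}")
    return tag, VARATT_EXTERNAL.pack(rawsize, extinfo, valueid, toastrelid)


def external_method(tag):
    """Hypothetical helper: the tag alone identifies zstd-compressed values."""
    if tag == VARTAG_ONDISK_ZSTD:
        return "zstd"
    if tag == VARTAG_ONDISK:
        # pglz/lz4 keep using the existing method bits in va_extinfo
        return "see va_extinfo"
    raise ValueError(f"unsupported vartag {tag}")
```

The point of the sketch is that the discriminator moves into the tag byte, so the payload size and field meanings stay the same for old and new pointers.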

Storage (170 MB uncompressed):
    ZSTD: 22 MB (7.60x) - 38.7% space savings vs LZ4
    PGLZ: 36 MB (4.76x)
    LZ4:  36 MB (4.66x)
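
For reference, the 38.7% figure follows from the compression ratios rather than from the rounded MB values (which would give roughly 38.9%). A minimal sketch of the arithmetic, with the helper name chosen here for illustration:

```python
def savings_vs(ratio_a, ratio_b):
    """Fraction of space method A saves relative to method B.

    Compressed size is proportional to 1/ratio for the same input,
    so size_a / size_b == ratio_b / ratio_a.
    """
    return 1.0 - ratio_b / ratio_a


# Ratios as reported in the table above.
RATIOS = {"zstd": 7.60, "pglz": 4.76, "lz4": 4.66}

print(f"zstd vs lz4:  {savings_vs(RATIOS['zstd'], RATIOS['lz4']):.1%}")   # 38.7%
print(f"zstd vs pglz: {savings_vs(RATIOS['zstd'], RATIOS['pglz']):.1%}")  # 37.4%
```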

Key findings:
- Large values (>50KB): ZSTD 33% better compression than PGLZ (~30% better than LZ4)
- Low-redundancy data (e.g. MD5 hashes): ZSTD's entropy coding still compresses what the pure-LZ77 methods (PGLZ, LZ4) cannot
- Small values: ZSTD pays external overhead vs inline PGLZ/LZ4

Although ZSTD uses less space overall, external-only storage adds a TOAST fetch for small values that PGLZ/LZ4 would keep inline, which can hurt read performance.

Backwards Compatibility Tests
- Mixed compression: Rows with PGLZ, LZ4, and ZSTD coexist and decompress correctly
- Lazy recompression: ALTER COLUMN ... SET COMPRESSION zstd affects new data; existing data is lazily recompressed upon UPDATE or VACUUM FULL.
- Inline vs external: Small values remain inline; large values use appropriate external compression.
- Data integrity: All data decompresses correctly across all methods.

Trade-offs and Design Considerations

- External-only avoids consuming cmid=3 and extended header complexity

- Slice access: no ZSTD-specific optimization (follow-up area)

- Hybrid inline/external for small values: not in this patch (feedback welcome)
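
To make the cmid point in the first trade-off concrete: for external values, the compression method id shares `va_extinfo` with the 30-bit external size, which leaves only two bits (four ids) for methods. The sketch below models that packing in Python. The constant names mirror the PostgreSQL ones (ids 0 and 1 are pglz and lz4 today); the pack/unpack helpers are hypothetical and exist only for this illustration.

```python
VARLENA_EXTSIZE_BITS = 30
VARLENA_EXTSIZE_MASK = (1 << VARLENA_EXTSIZE_BITS) - 1

TOAST_PGLZ_COMPRESSION_ID = 0
TOAST_LZ4_COMPRESSION_ID = 1


def pack_extinfo(extsize, cmid):
    """Combine the external size (low 30 bits) and method id (top 2 bits)."""
    if not 0 <= extsize <= VARLENA_EXTSIZE_MASK:
        raise ValueError("extsize does not fit in 30 bits")
    if not 0 <= cmid <= 3:
        raise ValueError("cmid does not fit in 2 bits")
    return (cmid << VARLENA_EXTSIZE_BITS) | extsize


def unpack_extinfo(extinfo):
    """Return (extsize, cmid) from a 32-bit va_extinfo value."""
    return extinfo & VARLENA_EXTSIZE_MASK, extinfo >> VARLENA_EXTSIZE_BITS
```

With a dedicated vartag the method is known before `va_extinfo` is read, so neither of the two remaining ids has to be spent on zstd.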

Reviewer Questions
- Is vartag-based external-only acceptable?
- Should compression level (currently 3) be configurable?
- Is the external storage overhead for small values acceptable, or is hybrid inline/external behavior needed?

Thanks,
Dharin


On Thu, Dec 18, 2025 at 11:44 PM Michael Paquier <michael@paquier.xyz> wrote:
On Thu, Dec 18, 2025 at 10:44:22PM +0100, Dharin Shah wrote:
> I want to make sure I understand your main point: you're OK with a new
> `vartag_external`, but prefer we avoid increasing the heap TOAST pointer
> from 16 -> 20 bytes since every zstd-toasted value would pay +4 bytes in
> the main heap tuple.

That would be my choice, yes.  Not sure about the opinion of others on
this matter.

> I also realize the "compatibility" of the extended header doesn't buy us
> much — we'll need to support the existing 16-byte varatt_external forever
> for backward compatibility. Adding a 20-byte structure just means two
> formats to maintain indefinitely.

Yes.  Patches have to maintain on-disk compatibility.

> A couple clarifying questions if we go with new vartag (e.g.,
> `VARTAG_ONDISK_ZSTD`), same 16-byte `varatt_external` payload, vartag as
> discriminator
> 1. How should we handle future methods beyond zstd? One tag per method, or
> store a method id elsewhere (e.g., in TOAST chunk header)?

My suspicion would be that we could use a new set of vartags in
the future for each compression method.  When it comes to zstd there
is something that comes into play: we could set some bits related to
dictionaries at tuple level.  Not sure if this is the best design or
if using an attribute-level option is better suited (for example a
JSONB blob could be applied as an attribute with common keys in a
dictionary saving a lot of on-disk space even before compression),
but keeping some bits free in the 16-byte header leaves this option
open with a new vartag_external.  Saying that, zstd is good enough
that I strongly suspect that we would not regret it for quite a few
years.  One issue that has pushed towards the addition of lz4 as an
option for toast compression is that pglz was worse in terms of CPU
cost.  zlib is also more expensive than lz4 or zstd, especially at
very high compression level for usually little compression gains.
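
The dictionary idea can be illustrated outside PostgreSQL with a preset dictionary. The sketch below uses Python's zlib as a stand-in, since zstd bindings are not assumed to be available; it is not zstd and not what any patch in this thread does. The sample dictionary and document are made up for the example.

```python
import zlib

# Hypothetical shared dictionary: a representative document with the common keys.
SHARED_DICT = (
    b'{"user_id": 0, "created_at": "2025-01-01T00:00:00", '
    b'"status": "active", "tags": []}'
)


def compress(payload, zdict=None):
    """Deflate payload, optionally primed with a preset dictionary."""
    if zdict is None:
        comp = zlib.compressobj(level=6)
    else:
        comp = zlib.compressobj(level=6, zdict=zdict)
    return comp.compress(payload) + comp.flush()


def decompress(blob, zdict=None):
    """Inverse of compress(); the same dictionary must be supplied."""
    if zdict is None:
        decomp = zlib.decompressobj()
    else:
        decomp = zlib.decompressobj(zdict=zdict)
    return decomp.decompress(blob) + decomp.flush()


doc = (
    b'{"user_id": 1001, "created_at": "2025-12-18T22:44:22", '
    b'"status": "active", "tags": []}'
)
plain = compress(doc)
primed = compress(doc, SHARED_DICT)
print(len(doc), len(plain), len(primed))
```

For a small document like this one, the primed output is shorter because the keys are encoded as references into the dictionary. The same property means the dictionary has to stay available for as long as any value compressed with it exists.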

> 2. And re: "as long as the TOAST value is 32 bits" — are you referring to
> the 30-bit extsize field in va_extinfo (i.e., avoid stealing bits from
> extsize for method encoding)?

I mean extending the TOAST value to 8 bytes, as per the following
issues:
https://www.postgresql.org/message-id/764273.1669674269%40sss.pgh.pa.us
https://commitfest.postgresql.org/patch/5830/

> *Key findings (I guess well known at this point):*
> - ZSTD excels for repetitive/pattern-heavy data (6.7x better than PGLZ)
> - For low-redundancy data (MD5 hashes), ZSTD still achieves ~2x better
> - The T4 result showing zstd as "worse" is not about compression quality -
> it's about missing inline storage support. ZSTD actually compresses better,
> but pays unnecessary TOAST overhead.
>
> I'll share the detailed benchmark script with the next patch revision. But
> also a potential path forward could be that we could just fully replace
> pglz (can bring it up later in different thread)

I don't think that we will ever be able to remove pglz.  It would be
nice, as a final result of course, but I also expect that not being able
to decompress pglz data is going to lead to a lot of user pain.  That
would also be very expensive to check at upgrade for large instances.

> *On Testing and Patch Structure*
> Agreed on both points:
> - I'll use `compression_zstd.sql` following the `compression_lz4.sql`
> pattern (removing the test_toast_ext module)

Okay.

> - I'll split the GUC refactoring into a separate preparatory patch

This refactoring, if done nicely, is worth an independent piece.  It's
something that I have actually done for the sake of the other thread,
though the result was not really much liked by others.  Perhaps I'm
just lacking imagination with this abstraction, and I'd surely welcome
different ideas.
--
Michael