I'd like to submit a patch that implements zstd compression for TOAST data using a 20-byte TOAST pointer format, directly addressing the concerns raised in prior discussions [1][2][3].
A bit of background: in the 2022 thread [3], the overall suggestion was to make the TOAST header extensible,
i.e. something like:

    00 = PGLZ
    01 = LZ4
    10 = reserved for future emergencies
    11 = extended header with additional type byte
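To make the two-bit scheme concrete, here is a minimal standalone sketch of packing and unpacking such a field. It assumes the two method bits sit in the top of a 32-bit word, with the low 30 bits holding the external size. All sketch_* names are hypothetical and are not taken from the patch.

```c
#include <stdint.h>

/* Hypothetical names for the four 2-bit codes listed above. */
enum sketch_cmid
{
    SKETCH_CMID_PGLZ = 0,       /* 00 */
    SKETCH_CMID_LZ4 = 1,        /* 01 */
    SKETCH_CMID_RESERVED = 2,   /* 10, reserved */
    SKETCH_CMID_EXTENDED = 3    /* 11, extended header follows */
};

/* Assumption: low 30 bits = external size, top 2 bits = method code. */
#define SKETCH_EXTSIZE_BITS 30
#define SKETCH_EXTSIZE_MASK ((UINT32_C(1) << SKETCH_EXTSIZE_BITS) - 1)

/* Extract the 2-bit method code from the packed word. */
static inline enum sketch_cmid
sketch_get_cmid(uint32_t va_extinfo)
{
    return (enum sketch_cmid) (va_extinfo >> SKETCH_EXTSIZE_BITS);
}

/* Extract the 30-bit external size from the packed word. */
static inline uint32_t
sketch_get_extsize(uint32_t va_extinfo)
{
    return va_extinfo & SKETCH_EXTSIZE_MASK;
}

/* Pack a size and a method code into one 32-bit word. */
static inline uint32_t
sketch_make_extinfo(uint32_t extsize, enum sketch_cmid cmid)
{
    return (extsize & SKETCH_EXTSIZE_MASK) |
           ((uint32_t) cmid << SKETCH_EXTSIZE_BITS);
}
```

With this layout, a reader only needs to look past the 2-bit code when it equals 3; codes 0 and 1 keep their current meaning.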
This patch implements that idea. The new header format:
    struct varatt_external_extended
    {
        int32   va_rawsize;     /* same as legacy */
        uint32  va_extinfo;     /* cmid=3 signals extended format */
        uint8   va_flags;       /* feature flags */
        uint8   va_data[3];     /* va_data[0] = compression method */
        Oid     va_valueid;     /* same as legacy */
        Oid     va_toastrelid;  /* same as legacy */
    };
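As a sanity check on the 20-byte figure, the layout can be mirrored outside the tree with fixed-width types. This is only a sketch: sketch_oid stands in for Oid (assumed to be 32 bits), the struct name is hypothetical, and the helper assumes the method code occupies the top two bits of va_extinfo.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t sketch_oid;    /* stand-in for Oid, assumed 32 bits */

/* Standalone mirror of the proposed header, for layout checks only. */
struct sketch_varatt_external_extended
{
    int32_t     va_rawsize;
    uint32_t    va_extinfo;     /* top 2 bits == 3 means extended format */
    uint8_t     va_flags;
    uint8_t     va_data[3];     /* va_data[0] = compression method */
    sketch_oid  va_valueid;
    sketch_oid  va_toastrelid;
};

/* 4 + 4 + 1 + 3 + 4 + 4 = 20 bytes, with no padding expected. */
_Static_assert(sizeof(struct sketch_varatt_external_extended) == 20,
               "extended TOAST pointer payload should be 20 bytes");
_Static_assert(offsetof(struct sketch_varatt_external_extended, va_valueid) == 12,
               "va_valueid should start right after the 4 flag/data bytes");

/* True when the 2-bit code says the extended bytes carry the method. */
static inline bool
sketch_is_extended(const struct sketch_varatt_external_extended *p)
{
    return (p->va_extinfo >> 30) == 3;
}
```

The flag byte and the three data bytes together fill exactly one 4-byte slot, which is why the two Oid fields stay naturally aligned.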
A few notes:
- Zstd only applies to external TOAST, not inline compression. The 2-bit limit in va_tcinfo stays as-is for inline data, where pglz/lz4 work fine anyway. Zstd's wins show up on larger values.
- A GUC use_extended_toast_header controls whether pglz/lz4 also use the 20-byte format (defaults to off for compatibility, can enable it if you want consistency).
- Legacy 16-byte pointers continue to work; we check the vartag to determine which format to read.
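The read and write rules in these notes can be sketched roughly as follows. The tag names and their numeric values are placeholders chosen for illustration (the patch may use different ones), and both helper functions are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder tag values; the actual patch may differ. */
enum sketch_vartag
{
    SKETCH_VARTAG_ONDISK = 18,          /* legacy 16-byte pointer */
    SKETCH_VARTAG_ONDISK_EXTENDED = 19  /* hypothetical 20-byte pointer */
};

/* Read side: the tag alone tells us how many payload bytes follow. */
static inline size_t
sketch_pointer_payload_size(uint8_t tag)
{
    switch (tag)
    {
        case SKETCH_VARTAG_ONDISK:
            return 16;
        case SKETCH_VARTAG_ONDISK_EXTENDED:
            return 20;
        default:
            return 0;           /* not an on-disk TOAST pointer */
    }
}

/*
 * Write side: zstd always needs the extended format; pglz/lz4 use it
 * only when the use_extended_toast_header GUC is enabled.
 */
static inline enum sketch_vartag
sketch_choose_tag(bool is_zstd, bool use_extended_toast_header)
{
    return (is_zstd || use_extended_toast_header)
        ? SKETCH_VARTAG_ONDISK_EXTENDED
        : SKETCH_VARTAG_ONDISK;
}
```

Because the decision on the read side depends only on the tag, existing 16-byte pointers need no rewrite when the GUC is flipped.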
The 4 extra bytes per pointer are negligible for typical TOAST data sizes, and they give us room to grow.