Re: ZStandard (with dictionaries) compression support for TOAST compression - Mailing list pgsql-hackers
From | Nikhil Kumar Veldanda |
---|---|
Subject | Re: ZStandard (with dictionaries) compression support for TOAST compression |
Date | |
Msg-id | CAFAfj_HPvuzXqbkgQ6F=ocR=JHe9PA7iVazGaCcrtPm-hKZ6pQ@mail.gmail.com Whole thread Raw |
In response to | Re: ZStandard (with dictionaries) compression support for TOAST compression (Robert Haas <robertmhaas@gmail.com>) |
List | pgsql-hackers |
Hi Robert > But I don't quite understand the point of this > response: it seems like you're just restating what the design does > without really justifying it. The question here isn't whether a 3-byte > header can describe a length up to 16MB; I think we all know our > powers of two well enough to agree on the answer to that question. The > question is whether it's a good use of 3 bytes, and I don't think it > is. My initial decision to include a 3‑byte length field was driven by two goals: 1. Avoid introducing separate callbacks for each algorithm. 2. Provide a single, algorithm-agnostic mechanism for handling metadata length. After re-evaluating based on your feedback, I agree that the fixed overhead of a 3-byte length field outweighs its benefit; per-algorithm callbacks deliver the same functionality while saving three bytes per datum. > I did consider the fact that future compression algorithms might want > to use variable-length headers; but I couldn't see a reason why we > shouldn't let each of those compression algorithms decide for > themselves how to encode whatever information they need. If a > compression algorithm needs a variable-length header, then it just > needs to make that header self-describing. Worst case scenario, it can > make the first byte of that variable-length header a length byte, and > then go from there; but it's probably possible to be even smarter and > use less than a full byte. Say for example we store a dictionary ID > that in concept is a 32-bit quantity but we use a variable-length > integer representation for it. It's easy to see that we shouldn't ever > need more than 3 bits for that so a full length byte is overkill and, > in fact, would undermine the value of a variable-length representation > rather severely. (I suspect it's a bad idea anyway, but it's a worse > idea if you burn a full byte on a length header.) > I agree. Each compression algorithm can decide its own metadata size overhead. Callbacks can provide this information as well rather than storing in fixed length bytes(3 bytes). The revised patch introduces a "toast_cmpid_meta_size(const varatt_cmp_extended *hdr)", which calculates the metadata size. > But there's an even larger question here too, which is why we're > having some kind of discussion about generalized metadata when the > current project seemingly only requires a 4-byte dictionary OID. If > you have some other use of this space in mind, I don't think you've > told us what it is. If you don't, then I'm not sure why we're > designing around an up-to-16MB variable-length quantity when what we > have before us is a 4-byte fixed-length quantity. This project only requires 4 bytes of fixed-size metadata to store the dictionary ID. Updated design for extending varattrib_4b compression 1. extensible header /* * varatt_cmp_extended: an optional per‐datum header for extended compression method. * Only used when va_tcinfo's top two bits are "11". */ typedef struct varatt_cmp_extended { uint8 cmp_alg; char cmp_meta[FLEXIBLE_ARRAY_MEMBER]; /* algorithm‐specific metadata */ } varatt_cmp_extended; 2. Algorithm registry and metadata size dispatch static inline uint32 unsupported_meta_size(const varatt_cmp_extended *hdr) { elog(ERROR, "toast_cmpid_meta_size called for unsupported compression algorithm"); return 0; /* unreachable */ } /* no metadata for plain-ZSTD */ static inline uint32 zstd_nodict_meta_size(const varatt_cmp_extended *hdr) { return 0; } static inline uint32 zstd_dict_meta_size(const varatt_cmp_extended *hdr) { return sizeof(Oid); } /* * TOAST compression methods enumeration. * * NAME : algorithm identifier * VALUE : enum value * META-SIZE-FN : Calculates algorithm metadata size. */ #define TOAST_COMPRESSION_LIST \ X(PGLZ, 0, unsupported_meta_size) \ X(LZ4, 1, unsupported_meta_size) \ X(ZSTD_NODICT, 2, zstd_nodict_meta_size) \ X(ZSTD_DICT, 3, zstd_dict_meta_size) \ X(INVALID, 4, unsupported_meta_size) /* sentinel */ /* Compression algorithm identifiers */ typedef enum ToastCompressionId { #define X(name,val,fn) TOAST_##name##_COMPRESSION_ID = (val), TOAST_COMPRESSION_LIST #undef X } ToastCompressionId; /* lookup table to check if compression method uses extended format */ static const bool toast_cmpid_extended[] = { #define X(name,val,fn) \ /* PGLZ, LZ4 don't use extended format */ \ [TOAST_##name##_COMPRESSION_ID] = \ ((val) != TOAST_PGLZ_COMPRESSION_ID && \ (val) != TOAST_LZ4_COMPRESSION_ID && \ (val) != TOAST_INVALID_COMPRESSION_ID), TOAST_COMPRESSION_LIST #undef X }; #define TOAST_CMPID_EXTENDED(alg) (toast_cmpid_extended[alg]) /* * Prototype for a per-datum metadata-size callback: * given a pointer to the extended header, return * how many metadata bytes follow it. */ typedef uint32 (*ToastMetaSizeFn) (const varatt_cmp_extended *hdr); /* Callback table—indexed by ToastCompressionId */ static const ToastMetaSizeFn toast_meta_size_fns[] = { #define X(name,val,fn) [TOAST_##name##_COMPRESSION_ID] = fn, TOAST_COMPRESSION_LIST #undef X }; /* Calculates algorithm metadata size */ static inline uint32 toast_cmpid_meta_size(const varatt_cmp_extended *hdr) { Assert(hdr != NULL); return toast_meta_size_fns[hdr->cmp_alg] (hdr); } Each compression algorithm provides a static callback that returns the size of its metadata, given a pointer to the varatt_cmp_extended header. Algorithms with fixed-size metadata return a constant, while algorithms with variable-length metadata are responsible for defining and parsing their own internal headers to compute the metadata size. 3. Resulting on-disk layouts for zstd ZSTD (nodict) — datum on‑disk layout +----------------------------------+ | va_header (uint32) | +----------------------------------+ | va_tcinfo (uint32) | ← top two bits = 11 (extended) +----------------------------------+ | cmp_alg (uint8) | ← (ZSTD_NODICT) +----------------------------------+ | compressed bytes … | ← ZSTD frame +----------------------------------+ ZSTD(dict) — datum on‑disk layout +----------------------------------+ | va_header (uint32) | +----------------------------------+ | va_tcinfo (uint32) | ← top two bits = 11 (extended) +----------------------------------+ | cmp_alg (uint8) | ← (ZSTD_DICT) +----------------------------------+ | dict_id (uint32) | ← dictionary OID +----------------------------------+ | compressed bytes … | ← ZSTD frame +----------------------------------+ I hope this updated design addresses your concerns. I would appreciate any further feedback you may have. Thanks again for your guidance—it's been very helpful. v20-0001-varattrib_4b-design-proposal-to-make-it-extended.patch: varattrib_4b extensibility – adds varatt_cmp_extended, metadata size dispatch and useful macros; behaviour unchanged. v20-0002-zstd-nodict-compression.patch: Plain ZSTD (non dict) support. -- Nikhil Veldanda
Attachment
pgsql-hackers by date: