Refactor code around GUC default_toast_compression - Mailing list pgsql-hackers

From Michael Paquier
Subject Refactor code around GUC default_toast_compression
Msg-id afRbVhoYuw4RStIO@paquier.xyz
List pgsql-hackers

Hi all,

While hacking on the TOAST code, I have been annoyed more than once
by the following piece in toast_compression.h:
/*
 * Built-in compression method ID.  The toast compression header will store
 * this in the first 2 bits of the raw length.  These built-in compression
 * method IDs are directly mapped to the built-in compression methods.
 *
 * Don't use these values for anything other than understanding the meaning
 * of the raw bits from a varlena; in particular, if the goal is to identify
 * a compression method, use the constants TOAST_PGLZ_COMPRESSION, etc.
 * below. We might someday support more than 4 compression methods, but
 * we can never have more than 4 values in this enum, because there are
 * only 2 bits available in the places where this is stored.
 */
typedef enum ToastCompressionId
{
    TOAST_PGLZ_COMPRESSION_ID = 0,
    TOAST_LZ4_COMPRESSION_ID = 1,
    TOAST_INVALID_COMPRESSION_ID = 2,
} ToastCompressionId;

This is due to the fact that we have only two bits that can be used in
va_tcinfo or va_extinfo.  While looking at the addition of a new
compression method, this was causing a mess, so I have hacked the
attached patch, which makes the addition of more compression methods
easier.  The idea is centralized in toast_compression.c, with the
addition of a registry that knows about all the TOAST compression
methods and their metadata:
- Name.
- GUC enum values.
- attcompression char value.
- varatt on-disk value.

This is coupled with a set of translation routines, used in other code
paths.  This also has the merit of removing TOAST_INVALID_COMPRESSION_ID
from the list of GUC values, which did not really make sense to begin
with.  I don't deny that the addition of a new compression method
would require more tweaks, particularly for the decompression part,
but I think that this is a nice cleanup anyway.  This is added to the
next commit fest, to be considered for v20.
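
For illustration only, a registry of the kind described above might look
roughly like the sketch below.  This is not the code from the attached
patch: the struct, the sketch_* function names and the GUC enum values
are hypothetical.  Only the names "pglz" and "lz4", the attcompression
characters 'p' and 'l', and the on-disk IDs 0 and 1 match what exists in
the tree today.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch of a TOAST compression registry.  All identifiers
 * here are made up for illustration and are not taken from the patch.
 */

/* On-disk IDs have to fit in the 2 bits available in the varlena header. */
#define SKETCH_ONDISK_ID_BITS	2
#define SKETCH_ONDISK_ID_MAX	((1 << SKETCH_ONDISK_ID_BITS) - 1)

typedef struct SketchToastCompression
{
	const char *name;			/* user-visible name */
	int			guc_value;		/* GUC enum value (assumed numbering) */
	char		attcompression; /* per-attribute char value */
	int			ondisk_id;		/* value stored in the varlena header */
} SketchToastCompression;

static const SketchToastCompression sketch_registry[] = {
	{"pglz", 0, 'p', 0},
	{"lz4", 1, 'l', 1},
};

#define SKETCH_REGISTRY_LEN \
	(sizeof(sketch_registry) / sizeof(sketch_registry[0]))

/* Translate a method name into its registry entry, or NULL if unknown. */
static const SketchToastCompression *
sketch_lookup_by_name(const char *name)
{
	for (size_t i = 0; i < SKETCH_REGISTRY_LEN; i++)
	{
		if (strcmp(sketch_registry[i].name, name) == 0)
			return &sketch_registry[i];
	}
	return NULL;
}

/* Translate an attcompression char into its entry, or NULL if unknown. */
static const SketchToastCompression *
sketch_lookup_by_attcompression(char c)
{
	for (size_t i = 0; i < SKETCH_REGISTRY_LEN; i++)
	{
		if (sketch_registry[i].attcompression == c)
			return &sketch_registry[i];
	}
	return NULL;
}

/* Translate a 2-bit on-disk ID into its entry, or NULL if unassigned. */
static const SketchToastCompression *
sketch_lookup_by_ondisk_id(int id)
{
	/* Anything wider than 2 bits cannot have come from the header. */
	assert(id >= 0 && id <= SKETCH_ONDISK_ID_MAX);

	for (size_t i = 0; i < SKETCH_REGISTRY_LEN; i++)
	{
		if (sketch_registry[i].ondisk_id == id)
			return &sketch_registry[i];
	}
	return NULL;
}
```

In a layout like this, an unknown or unassigned value is reported by a
NULL lookup result, so no "invalid" entry has to appear among the GUC
values, and a new method is mostly one more row in the table plus its
compression and decompression code.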

Thanks,
--
Michael
