Home > mailing lists

Re: pglz performance - Mailing list pgsql-hackers

From	Andres Freund
Subject	Re: pglz performance
Date	August 5, 2019 10:26:25
Msg-id	20190805072625.pzebnrs3fvk5wjen@alap3.anarazel.de Whole thread Raw
In response to	Re: pglz performance (Petr Jelinek <petr@2ndquadrant.com>)
Responses	Re: pglz performance
List	pgsql-hackers

Tree view

Hi,

On 2019-08-05 00:19:28 +0200, Petr Jelinek wrote:
> It carries that information inside the compressed value, like I said in the
> other reply, that's why the extra byte.

I'm not convinced that that is a good plan - imo the reference to the
compressed data should carry that information.

I.e. in the case of toast, at least toast pointers should hold enough
information to determine the compression algorithm. And in the case of
WAL, the WAL record should contain that.

Consider e.g. adding support for slice fetching of datums compressed
with some algorithm - we should be able to determine whether that's
possible without fetching the datum (so we can take a non-exceptional
path for datums compressed otherwise). Similarly, for WAL, we should be
able to detect whether an incompatible compression format is used,
without having to invoke a generic compression routine that then fails
in some way. Or adding compression reporting for WAL to xlogdump.

I also don't particularly like baking in the assumption that we don't
support tuples larger than 1GB in further places. To me it seems likely
that we're going to have to fix that, and it's hard enough already... I
know that my patch did that too...

For external datums I suggest encoding the compression method as a
distinct VARTAG_ONDISK_COMPRESSED, and then have that include the
compression method as a field.

For in-line compressed values (so VARATT_IS_4B_C), doing something
roughly like you did, indicating the type of metadata following using
the high bit sounds reasonable. But I think I'd make it so that if the
highbit is set, the struct is instead entirely different, keeping a full
4 byte byte length, and including the compression type header inside the
struct. Perhaps I'd combine the compression type with the high-bit-set
part?  So when the high bit is set, it'd be something like

{
    int32        vl_len_;        /* varlena header (do not touch directly!) */

    /*
     * Actually only 7 bit, the high bit determines whether this
     * is the old compression header (if unset), or this type of header
     * (if set).
     */
    uint8 type;

    /*
     * Stored as uint8[4], to avoid unnecessary alignment padding.
     */
    uint8[4] length;

    char va_data[FLEXIBLE_ARRAY_MEMBER];
}

I think it's worth spending some effort trying to get this right - we'll
be bound by design choices for a while.

It's kinda annoying that toast datums aren't better designed. Creating
them from scratch, I'd make it something like:

1) variable-width integer describing the "physical length", so that
   tuple deforming can quickly determine length - all the ifs necessary
   to determine lengths are a bottleneck. I'd probably just use a 127bit
   encoding, if the high bit is set, there's a following length byte.

2) type of toasted datum, probably also in a variable width encoding,
   starting at 1byte. Not that I think it's likely we'd overrun 256
   types - but it's cheap enough to just declare the high bit as an
   length extension bit.

These are always stored unaligned. So there's no need to deal with
padding bytes having to be zero to determine whether we're dealing with
a 1byte datum etc.

Then, type dependant:

For In-line uncompressed datums
3a) alignment padding, amount determined by 2) above, i.e. we'd just
    have different types for different amounts of alignment. Probably
    using some heuristic to use unaligned when either dealing with data
    that doesn't need alignment, or when the datum is fairly small, so
    copying to get the data as unaligned won't be a significant penalty.
4a) data

For in-line compressed datums
3b) compression metadata {varint rawsize, varint compression algorithm}
4b) unaligned compressed data - there's no benefit in keeping it aligned

For External toast for uncompressed data:
3d) {toastrelid, valueid, varint rawsize}

For External toast for compressed data:
3e) {valueid, toastrelid, varint compression_algorithm, varint rawsize, varint extsize}

That'd make it a lot more extensible, easier to understand, faster to
decode in a lot of cases, remove a lot of arbitrary limits.  Yes, it'd
increase the header size for small datums to two bytes, but I think
that'd be more than bought back by the other improvements.

Greetings,

Andres Freund

pgsql-hackers by date:

From: Daniel Gustafsson
Date: 05 August 2019, 10:20:07
Subject: Re: Ought to use heap_multi_insert() for pg_attribute/dependinsertions?

From: Kyotaro Horiguchi
Date: 05 August 2019, 10:28:24
Subject: Re: Feature improvement: can we add queryId forpg_catalog.pg_stat_activity view?

Re: pglz performance - Mailing list pgsql-hackers

Previous

Next