Re: [HACKERS] Custom compression methods - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [HACKERS] Custom compression methods
Date
Msg-id CA+TgmoYszJaQivv1eG4JAV3S19t61Y68KLJPNiRUABNWurQANA@mail.gmail.com
In response to Re: [HACKERS] Custom compression methods  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: [HACKERS] Custom compression methods
List pgsql-hackers
On Wed, Dec 13, 2017 at 5:10 AM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
>> 2. If several data types can benefit from a similar approach, it has
>> to be separately implemented for each one.
>
> I don't think the current solution improves that, though. If you want to
> exploit internal features of individual data types, it pretty much
> requires code customized to every such data type.
>
> For example you can't take the tsvector compression and just slap it on
> tsquery, because it relies on knowledge of internal tsvector structure.
> So you need separate implementations anyway.

I don't think that's necessarily true.  Certainly, it's true that *if*
tsvector compression depends on knowledge of internal tsvector
structure, *then* you can't use the implementation for anything
else (this, by the way, means that there needs to be some way for a
compression method to reject being applied to a column of a data type
it doesn't like).  However, it seems possible to imagine compression
algorithms that can work for a variety of data types, too.  There
might be a compression algorithm that is theoretically a
general-purpose algorithm but has features which are particularly
well-suited to, say, JSON or XML data, because it looks for word
boundaries to decide on what strings to insert into the compression
dictionary.
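
To make both points concrete, here is a rough sketch in Python rather
than C, just to keep it short.  Everything in it is hypothetical:
CompressionMethod, check_type, TsvectorOnly and WordDictionary are
names invented for illustration and are not part of the patch or of
any PostgreSQL API.  It shows (a) a method that refuses columns of a
type it doesn't understand and (b) a type-agnostic method that picks
dictionary entries by splitting on word boundaries.

```python
import re
from collections import Counter

WORD = re.compile(r"\w+")


class CompressionMethod:
    """Hypothetical interface, for illustration only."""

    # None means the method accepts any data type.
    supported_types = None

    def check_type(self, type_name):
        # Lets a method reject a column of a data type it doesn't like.
        if self.supported_types is not None and type_name not in self.supported_types:
            raise ValueError(
                f"{type(self).__name__} cannot compress type {type_name}"
            )


class TsvectorOnly(CompressionMethod):
    # Stands in for a method that depends on tsvector internals.
    supported_types = {"tsvector"}


class WordDictionary(CompressionMethod):
    """General-purpose toy method: repeated words become dictionary refs."""

    def compress(self, text):
        # Alternating runs of word and non-word characters; joining the
        # tokens back together reproduces the input exactly.
        tokens = re.findall(r"\w+|\W+", text)
        counts = Counter(t for t in tokens if WORD.fullmatch(t))
        # Only words that repeat and are long enough are worth an entry.
        dictionary = sorted(w for w, n in counts.items() if n >= 2 and len(w) >= 3)
        index = {w: i for i, w in enumerate(dictionary)}
        # Stream holds an int (dictionary reference) or a literal string.
        stream = [index.get(t, t) for t in tokens]
        return dictionary, stream

    def decompress(self, dictionary, stream):
        return "".join(dictionary[x] if isinstance(x, int) else x for x in stream)
```

Nothing in WordDictionary knows about JSON or XML, but on input with
repeated keys or tag names the dictionary ends up holding exactly
those, which is the sort of thing I have in mind.  TsvectorOnly, by
contrast, would raise from check_type if someone tried to attach it
to a tsquery column.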

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company