Re: [HACKERS] Custom compression methods - Mailing list pgsql-hackers
From | Ildus Kurbangaliev |
---|---|
Subject | Re: [HACKERS] Custom compression methods |
Date | |
Msg-id | 20171102124101.5a28ecab@wp.localdomain |
In response to | Re: [HACKERS] Custom compression methods (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>) |
Responses |
Re: [HACKERS] Custom compression methods
|
List | pgsql-hackers |
On Wed, 1 Nov 2017 17:05:58 -0400
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:

> On 9/12/17 10:55, Ildus Kurbangaliev wrote:
> >> The patch also includes custom compression method for tsvector
> >> which is used in tests.
> >>
> >> [1]
> >> https://www.postgresql.org/message-id/CAPpHfdsdTA5uZeq6MNXL5ZRuNx%2BSig4ykWzWEAfkC6ZKMDy6%3DQ%40mail.gmail.com
> > Attached rebased version of the patch. Added support of pg_dump, the
> > code was simplified, and a separate cache for compression options
> > was added.
>
> I would like to see some more examples of how this would be used, so
> we can see how it should all fit together.
>
> So far, it's not clear to me that we need a compression method as a
> standalone top-level object. It would make sense, perhaps, to have a
> compression function attached to a type, so a type can provide a
> compression function that is suitable for its specific storage.

In this patch, compression methods apply to the MAIN and EXTENDED
storage types, as in the current implementation in Postgres. The only
difference is that instead of pglz alone you can specify any other
compression method.

The idea is not to change compression for particular types, but to give
users and extension developers the opportunity to change how the data in
a given attribute is compressed, because they know more about that data
than the database itself does.

> The proposal here is very general: You can use any of the eligible
> compression methods for any attribute. That seems very complicated to
> manage. Any attribute could be compressed using either a choice of
> general compression methods or a type-specific compression method, or
> perhaps another type-specific compression method. That's a lot. Is
> this about packing certain types better, or trying out different
> compression algorithms, or about changing the TOAST thresholds, and
> so on?

It is about the extensibility of Postgres. For example, if you need to
store a lot of time series data, you can create an extension that stores
an array of timestamps in a more optimized way, using delta encoding or
something else. I'm not sure that such specialized things should be in
core.

In the case of an array of timestamps, it could look like this:

    CREATE EXTENSION timeseries; -- some extension that provides a compression method

The extension installs a compression method:

    CREATE OR REPLACE FUNCTION timestamps_compression_handler(INTERNAL)
    RETURNS COMPRESSION_HANDLER
    AS 'MODULE_PATHNAME', 'timestamps_compression_handler'
    LANGUAGE C STRICT;

    CREATE COMPRESSION METHOD cm1 HANDLER timestamps_compression_handler;

And the user can specify it in a table:

    CREATE TABLE t1 (
        time_series_data timestamp[] COMPRESSED cm1
    );

I think that tying a compression method to a type is not a good idea.
For some attributes you could be happy with the built-in pglz, for
others you may need a better compression ratio, and so on.

-- 
---
Ildus Kurbangaliev
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
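
The delta encoding mentioned in the message above can be illustrated with a minimal sketch in C. It only shows the transform itself, assuming timestamps are held as 64-bit integers (as Postgres does internally, in microseconds); the function names ts_delta_encode and ts_delta_decode are hypothetical and are not part of the patch, and wiring them into a compression handler is not shown because that interface is defined by the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not from the patch: replace each timestamp with
 * its difference from the previous one (the first value is kept as-is).
 * Assumes the differences fit in int64_t, which holds for realistic
 * timestamp ranges. Safe to call with in == out.
 */
void
ts_delta_encode(const int64_t *in, int64_t *out, size_t n)
{
    int64_t prev = 0;

    assert(n == 0 || (in != NULL && out != NULL));

    for (size_t i = 0; i < n; i++)
    {
        int64_t cur = in[i];    /* read before writing: allows in == out */

        out[i] = cur - prev;
        prev = cur;
    }
}

/*
 * Hypothetical inverse of ts_delta_encode: a running sum over the
 * deltas restores the original values. Safe to call with in == out.
 */
void
ts_delta_decode(const int64_t *in, int64_t *out, size_t n)
{
    int64_t prev = 0;

    assert(n == 0 || (in != NULL && out != NULL));

    for (size_t i = 0; i < n; i++)
    {
        prev += in[i];
        out[i] = prev;
    }
}
```

The transform does not shrink anything on its own. The point is that regularly sampled series turn into runs of small, often identical numbers, which a variable-length integer encoding or a general-purpose compressor can then pack much more tightly than the raw 8-byte values.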