Re: [HACKERS] Custom compression methods - Mailing list pgsql-hackers
From | Ildus Kurbangaliev
---|---
Subject | Re: [HACKERS] Custom compression methods
Date |
Msg-id | 20180423213928.7c704810@gmail.com
In response to | Re: [HACKERS] Custom compression methods (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
List | pgsql-hackers
On Mon, 23 Apr 2018 19:34:38 +0300 Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:

> Sorry, I am really looking at this patch from a different angle, and
> this is why I have some doubts about the general idea. Postgres allows
> defining custom types, access methods, ... But do you know any
> production system using special data types or custom indexes which are
> not included in the standard Postgres distribution or popular
> extensions (like PostGIS)?
>
> IMHO end users do not have the skills and time to create their own
> compression algorithms. And without knowledge of the specifics of a
> particular data set, it is very hard to implement something more
> efficient than a universal compression library. But if you think that
> this is not the right place and time to discuss it, I do not insist.
>
> In any case, I think it would be useful to provide some more examples
> of custom compression API usage. From my point of view the most useful
> would be integration with zstd. But if it is possible to find an
> example of a data-specific compression algorithm which shows better
> results than universal compression, that would be even more impressive.

Ok, let me clear up the purpose of this patch. I understand that you
want to compress everything with it, but for now the idea is just to
bring in basic functionality for compressing TOASTed values with
external compression algorithms. It is unlikely that compression
algorithms like zstd, snappy and others will end up in the Postgres
core, but with this patch it is really easy to build an extension and
start compressing values right away, and the end user does not need to
be an expert in compression algorithms to write one. One of these
algorithms could still land in core if its license allows it. I am not
trying to cover all the places in Postgres that would benefit from
compression; this patch is only the first step.
It is quite big already, and every new feature that increases its size
lowers the chances of it being reviewed and committed. The API is very
simple now and covers what every compression method can do: take a
block of data and return a compressed form of it. It can be extended
with streaming and other features in the future.

Maybe the reason for your confusion is that there is no GUC that
switches the default from pglz to some custom compression method so
that all new attributes use it; I will think about adding one. There
was also a discussion about specifying the compression method per type,
and it was decided that this is better done later in a separate patch.

An example of specialized compression is the time-series compression
described in [1]. [2] is an example of an extension that adds lz4
compression using this patch.

[1] http://www.vldb.org/pvldb/vol8/p1816-teller.pdf
[2] https://github.com/zilder/pg_lz4

-- 
Regards,
Ildus Kurbangaliev