Re: Re: [HACKERS] Custom compression methods - Mailing list pgsql-hackers
From | Dilip Kumar |
---|---|
Subject | Re: Re: [HACKERS] Custom compression methods |
Date | |
Msg-id | CAFiTN-tkGOtVSJD0FDZCuVrAYCCURWwRtBxT_4KVduJu-9fueQ@mail.gmail.com Whole thread Raw |
In response to | Re: Re: [HACKERS] Custom compression methods (Amit Khandekar <amitdkhan.pg@gmail.com>) |
List | pgsql-hackers |
On Mon, Aug 31, 2020 at 10:45 AM Amit Khandekar <amitdkhan.pg@gmail.com> wrote: > > On Thu, 13 Aug 2020 at 17:18, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > I have rebased the patch on the latest head and currently, broken into 3 parts. > > > > v1-0001: As suggested by Robert, it provides the syntax support for > > setting the compression method for a column while creating a table and > > adding columns. However, we don't support changing the compression > > method for the existing column. As part of this patch, there is only > > one built-in compression method that can be set (pglz). In this, we > > have one in-build am (pglz) and the compressed attributes will directly > > store the oid of the AM. In this patch, I have removed the > > pg_attr_compresion as we don't support changing the compression > > for the existing column so we don't need to preserve the old > > compressions. > > v1-0002: Add another built-in compression method (zlib) > > v1:0003: Remaining patch set (nothing is changed except rebase on the > > current head, stabilizing check-world and 0001 and 0002 are pulled > > out of this) > > > > Next, I will be working on separating out the remaining patches as per > > the suggestion by Robert. > > Thanks for this new feature. Looks promising and very useful, with so > many good compression libraries already available. Thanks for looking into it. > I see that with the patch-set, I would be able to create an extension > that defines a PostgreSQL C handler function which assigns all the > required hook function implementations for compressing, decompressing > and validating, etc. In short, I would be able to use a completely > different compression algorithm to compress toast data if I write such > an extension. Correct me if I am wrong with my interpretation. > > Just a quick superficial set of review comments .... > > A minor re-base is required due to a conflict in a regression test Okay, I will do this. > ------------- > > In heap_toast_insert_or_update() and in other places, the comments for > new parameter preserved_am_info are missing. > > ------------- ok > +toast_compress_datum(Datum value, Oid acoid) > { > struct varlena *tmp = NULL; > int32 valsize; > - CompressionAmOptions cmoptions; > + CompressionAmOptions *cmoptions = NULL; > > I think tmp and cmoptions need not be initialized to NULL Right > ------------- > > - TOAST_COMPRESS_SET_RAWSIZE(tmp, valsize); > - SET_VARSIZE_COMPRESSED(tmp, len + TOAST_COMPRESS_HDRSZ); > /* successful compression */ > + toast_set_compressed_datum_info(tmp, amoid, valsize); > return PointerGetDatum(tmp); > > Any particular reason why is this code put in a new extern function ? > Is there a plan to re-use it ? Otherwise, it's not necessary to do > this. > > ------------ > > Also, not sure why "HTAB *amoptions_cache" and "MemoryContext > amoptions_cache_mcxt" aren't static declarations. They are being used > only in toast_internals.c > ----------- > > The tab-completion doesn't show COMPRESSION : > postgres=# create access method my_method TYPE > INDEX TABLE > postgres=# create access method my_method TYPE > > Also, the below syntax also would better be tab-completed so as to > display all the installed compression methods, in line with how we > show all the storage methods like plain,extended,etc: > postgres=# ALTER TABLE lztab ALTER COLUMN t SET COMPRESSION > > ------------ I will fix these comments in the next version of the patch. > I could see the differences in compression ratio, and the compression > and decompression speed when I use lz versus zib : > > CREATE TABLE zlibtab(t TEXT COMPRESSION zlib WITH (level '4')); > create table lztab(t text); > ALTER TABLE lztab ALTER COLUMN t SET COMPRESSION pglz; > > pgg:s2:pg$ time psql -c "\copy zlibtab from text.data" > COPY 13050 > > real 0m1.344s > user 0m0.031s > sys 0m0.026s > > pgg:s2:pg$ time psql -c "\copy lztab from text.data" > COPY 13050 > > real 0m2.088s > user 0m0.008s > sys 0m0.050s > > > pgg:s2:pg$ time psql -c "select pg_table_size('zlibtab'::regclass), > pg_table_size('lztab'::regclass)" > pg_table_size | pg_table_size > ---------------+--------------- > 1261568 | 1687552 > > pgg:s2:pg$ time psql -c "select NULL from zlibtab where t like '0000'" > > /dev/null > > real 0m0.127s > user 0m0.000s > sys 0m0.002s > > pgg:s2:pg$ time psql -c "select NULL from lztab where t like '0000'" > > /dev/null > > real 0m0.050s > user 0m0.002s > sys 0m0.000s > Thanks for testing this. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
pgsql-hackers by date: