Re: [HACKERS] Custom compression methods - Mailing list pgsql-hackers
From | Dilip Kumar |
---|---|
Subject | Re: [HACKERS] Custom compression methods |
Date | |
Msg-id | CAFiTN-vp=or=1x=DjttV=WfLPsSWnKUVecL2PFGJHOqrrXTsrg@mail.gmail.com Whole thread Raw |
In response to | Re: [HACKERS] Custom compression methods (Dilip Kumar <dilipbalaut@gmail.com>) |
List | pgsql-hackers |
On Mon, Oct 5, 2020 at 11:17 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Mon, Oct 5, 2020 at 3:37 AM Tomas Vondra > <tomas.vondra@2ndquadrant.com> wrote: > > Thanks, Tomas for your feedback. > > > 9) attcompression ... > > > > The main issue I see is what the patch does with attcompression. Instead > > of just using it to store a the compression method, it's also used to > > store the preserved compression methods. And using NameData to store > > this seems wrong too - if we really want to store this info, the correct > > way is either using text[] or inventing charvector or similar. > > The reason for using the NameData is the get it in the fixed part of > the data structure. > > > But to me this seems very much like a misuse of attcompression to track > > dependencies on compression methods, necessary because we don't have a > > separate catalog listing compression methods. If we had that, I think we > > could simply add dependencies between attributes and that catalog. > > Basically, up to this patch, we are having only built-in compression > methods and those can not be dropped so we don't need any dependency > at all. We just want to know what is the current compression method > and what is the preserve compression methods supported for this > attribute. Maybe we can do it better instead of using the NameData > but I don't think it makes sense to add a separate catalog? > > > Moreover, having the catalog would allow adding compression methods > > (from extensions etc) instead of just having a list of hard-coded > > compression methods. Which seems like a strange limitation, considering > > this thread is called "custom compression methods". > > I think I forgot to mention while submitting the previous patch that > the next patch I am planning to submit is, Support creating the custom > compression methods wherein we can use pg_am catalog to insert the new > compression method. And for dependency handling, we can create an > attribute dependency on the pg_am row. Basically, we will create the > attribute dependency on the current compression method AM as well as > on the preserved compression methods AM. As part of this, we will > add two build-in AMs for zlib and pglz, and the attcompression field > will be converted to the oid_vector (first OID will be of the current > compression method, followed by the preserved compression method's > oids). > > > 10) compression parameters? > > > > I wonder if we could/should allow parameters, like compression level > > (and maybe other stuff, depending on the compression method). PG13 > > allowed that for opclasses, so perhaps we should allow it here too. > > Yes, that is also in the plan. For doing this we are planning to add > an extra column in the pg_attribute which will store the compression > options for the current compression method. The original patch was > creating an extra catalog pg_column_compression, therein it maintains > the oid of the compression method as well as the compression options. > The advantage of creating an extra catalog is that we can keep the > compression options for the preserved compression methods also so that > we can support the options which can be used for decompressing the > data as well. Whereas if we want to avoid this extra catalog then we > can not use that compression option for decompressing. But most of > the options e.g. compression level are just for the compressing so it > is enough to store for the current compression method only. What's > your thoughts? > > Other comments look fine to me so I will work on them and post the > updated patch set. I have fixed the other comments except this, > 2) I see index_form_tuple does this: > > Datum cvalue = toast_compress_datum(untoasted_values[i], > DefaultCompressionMethod); > which seems wrong - why shouldn't the indexes use the same compression > method as the underlying table? I will fix this in the next version of the patch. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Attachment
pgsql-hackers by date: