Re: [HACKERS] Custom compression methods - Mailing list pgsql-hackers

From Ildar Musin
Subject Re: [HACKERS] Custom compression methods
Date
Msg-id 6fd5dea3-8889-c7bb-9df2-79493b2b6ab0@postgrespro.ru
Whole thread Raw
In response to Re: [HACKERS] Custom compression methods  (Ildus Kurbangaliev <i.kurbangaliev@postgrespro.ru>)
Responses Re: [HACKERS] Custom compression methods
List pgsql-hackers
Hello Ildus,

On 23.01.2018 16:04, Ildus Kurbangaliev wrote:
> On Mon, 22 Jan 2018 23:26:31 +0300
> Ildar Musin <i.musin@postgrespro.ru> wrote:
>
> Thanks for review! Attached new version of the patch. Fixed few bugs,
> added more documentation and rebased to current master.
>
>> You need to rebase to the latest master, there are some conflicts.
>> I've applied it to the three days old master to try it.
>
> Done.
>
>>
>> As I can see the documentation is not yet complete. For example, there
>> is no section for ALTER COLUMN ... SET COMPRESSION in ddl.sgml; and
>> section "Compression Access Method Functions" in compression-am.sgml
>> hasn't been finished.
>
> Not sure about ddl.sgml, it contains more common things, but since
> postgres contains only pglz by default there is not much to show.
>
>>
>> I've implemented an extension [1] to understand the way developer
>> would go to work with new infrastructure. And for me it seems clear.
>> (Except that it took me some effort to wrap my mind around varlena
>> macros but it is probably a different topic).
>>
>> I noticed that you haven't cover 'cmdrop' in the regression tests and
>> I saw the previous discussion about it. Have you considered using
>> event triggers to handle the drop of column compression instead of
>> 'cmdrop' function? This way you would kill two birds with one stone:
>> it still provides sufficient infrastructure to catch those events
>> (and it something postgres already has for different kinds of ddl
>> commands) and it would be easier to test.
>
> I have added support for event triggers for ALTER SET COMPRESSION in
> current version. Event trigger on ALTER can be used to replace cmdrop
> function but it will be far from trivial. There is not easy way to
> understand that's attribute compression is really dropping in the
> command.
>

I've encountered unexpected behavior in command 'CREATE TABLE ... (LIKE 
...)'. It seems that it copies compression settings of the table 
attributes no matter which INCLUDING options are specified. E.g.

create table xxx(id serial, msg text compression pg_lz4);
alter table xxx alter column msg set storage external;
\d+ xxx
                 Table "public.xxx"
  Column |  Type   |  ...  | Storage  | Compression |
--------+---------+  ...  +----------+-------------+
  id     | integer |  ...  | plain    |             |
  msg    | text    |  ...  | external | pg_lz4      |

Now copy the table structure with "INCLUDING ALL":

create table yyy (like xxx including all);
\d+ yyy
                 Table "public.yyy"
  Column |  Type   |  ...  | Storage  | Compression |
--------+---------+  ...  +----------+-------------+
  id     | integer |  ...  | plain    |             |
  msg    | text    |  ...  | external | pg_lz4      |

And now copy without "INCLUDING ALL":

create table zzz (like xxx);
\d+ zzz
                 Table "public.zzz"
  Column |  Type   |  ...  | Storage  | Compression |
--------+---------+  ...  +----------+-------------+
  id     | integer |  ...  | plain    |             |
  msg    | text    |  ...  | extended | pg_lz4      |

As you see, compression option is copied anyway. I suggest adding new 
INCLUDING COMPRESSION option to enable user to explicitly specify 
whether they want or not to copy compression settings.


I found a few phrases in documentation that can be improved. But the 
documentation should be checked by a native speaker.

In compression-am.sgml:
"an compression access method" -> "a compression access method"
"compression method method" -> "compression method"
"compability" -> "compatibility"
Probably "local-backend cached state" would be better to replace with 
"per backend cached state"?
"Useful to store the parsed view of the compression options" -> "It 
could be useful for example to cache compression options"
"and stores result of" -> "and stores the result of"
"Called when CompressionAmOptions is creating." -> "Called when 
<structname>CompressionAmOptions</structname> is being initialized"

"Note that in any system cache invalidation related with 
pg_attr_compression relation the options will be cleaned" -> "Note that 
any <literal>pg_attr_compression</literal> relation invalidation will 
cause all the cached <literal>acstate</literal> options cleared."
"Function used to ..." -> "Function is used to ..."

I think it would be nice to mention custom compression methods in 
storage.sgml. At this moment it only mentions built-in pglz compression.

-- 
Ildar Musin
i.musin@postgrespro.ru


pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath
Next
From: Alexander Korotkov
Date:
Subject: Re: [HACKERS] GSoC 2017 : Patch for predicate locking in Gist index