Re: [HACKERS] Custom compression methods - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: [HACKERS] Custom compression methods
Date
Msg-id CAPpHfduP77qxdETcPRc_FGOVJpeSCwNGoSUgNU3BCHXu57RVMA@mail.gmail.com
In response to Re: [HACKERS] Custom compression methods  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [HACKERS] Custom compression methods  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Mon, Dec 11, 2017 at 8:46 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Dec 11, 2017 at 12:41 PM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> Thus, in your example, if the user would like to further change awesome
> compression to evenbetter compression, she should write:
>
> SET COMPRESSION evenbetter PRESERVE pglz, awesome; -- full list of previous compression methods

Right.

> I wonder what we should do if the user specifies only part of the previous
> compression methods?  For instance, pglz is specified but awesome is
> missing.
>
> SET COMPRESSION evenbetter PRESERVE pglz; -- awesome is missing
>
> I think we should trigger an error in this case, because the query is written
> in a form that is assumed to work without a table rewrite, but we're unable to
> do this without a table rewrite.

I think that should just rewrite the table in that case.  PRESERVE
should specify the things that are allowed to be preserved -- its mere
presence should not be read to preclude a rewrite.  And it's
completely reasonable for someone to want to do this, if they are
thinking about de-installing awesome.
 
OK, but a NOTICE that a presumably unexpected table rewrite is taking place could still be useful.

Also, we probably should add a view that exposes the compression methods which are currently preserved for each column, so that the user can correctly construct a SET COMPRESSION query that doesn't rewrite the table without digging into internals (like directly querying pg_depend).
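
To make the rule discussed above concrete, here is a minimal sketch of the decision in Python. It is only an illustration of the semantics being proposed in this thread, not code from the patch: the function name plan_set_compression, its arguments, and the NOTICE wording are all hypothetical, and it assumes we can somehow obtain the list of methods that existing values of the column may still be compressed with (which is what the suggested view would expose).

```python
def plan_set_compression(in_use, new_method, preserve):
    """Decide whether a SET COMPRESSION command would need a table rewrite.

    in_use     -- methods existing values may be compressed with (assumed input)
    new_method -- method named in SET COMPRESSION
    preserve   -- methods listed after PRESERVE (may be empty)

    Returns (rewrite_needed, notice_text_or_None).
    """
    # Values may stay as they are only if their method is the new one
    # or is explicitly listed in PRESERVE.
    allowed = set(preserve) | {new_method}
    dropped = sorted(set(in_use) - allowed)
    rewrite = bool(dropped)

    notice = None
    # A non-empty PRESERVE list suggests the user expected no rewrite,
    # so that is the case where a NOTICE seems worth emitting.
    if rewrite and preserve:
        notice = ("table will be rewritten: PRESERVE list omits "
                  + ", ".join(dropped))
    return rewrite, notice
```

With in_use = ["pglz", "awesome"], preserving both methods avoids the rewrite, while PRESERVE pglz alone triggers a rewrite plus the notice.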

> I also think that we need some way to change the compression method for multiple
> columns in a single table rewrite, because that would be far more efficient
> than rewriting the table for each column.  So as an alternative to
>
> ALTER TABLE tbl ALTER COLUMN c1 SET COMPRESSION awesome; -- first table rewrite
> ALTER TABLE tbl ALTER COLUMN c2 SET COMPRESSION awesome; -- second table rewrite
>
> we could also provide
>
> ALTER TABLE tbl ALTER COLUMN c1 SET COMPRESSION awesome PRESERVE pglz; -- no rewrite
> ALTER TABLE tbl ALTER COLUMN c2 SET COMPRESSION awesome PRESERVE pglz; -- no rewrite
> VACUUM FULL tbl RESET COMPRESSION PRESERVE c1, c2; -- rewrite with recompression of c1 and c2 and removing dependencies
>
> ?

Hmm.  ALTER TABLE allows multiple comma-separated subcommands, so I don't
think we need to drag VACUUM into this.  The user can just say:

ALTER TABLE tbl ALTER COLUMN c1 SET COMPRESSION awesome, ALTER COLUMN
c2 SET COMPRESSION awesome;

If this is properly integrated into tablecmds.c, that should cause a
single rewrite affecting both columns.

OK.  Sorry, I didn't notice we can use multiple subcommands for ALTER TABLE in this case...

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
