Home > mailing lists

Re: [HACKERS] Custom compression methods - Mailing list pgsql-hackers

From	Ildus Kurbangaliev
Subject	Re: [HACKERS] Custom compression methods
Date	November 24, 2017 12:38:00
Msg-id	20171124123800.034c9208@wp.localdomain Whole thread Raw
In response to	Re: [HACKERS] Custom compression methods (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses	Re: [HACKERS] Custom compression methods
List	pgsql-hackers

Tree view

On Thu, 23 Nov 2017 21:54:32 +0100
Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> 
> Hmm, this seems to have fixed it, but only in one direction. Consider
> this:
> 
>     create table t_pglz (v text);
>     create table t_lz4 (v text compressed lz4);
> 
>     insert into t_pglz select repeat(md5(i::text),300)
>     from generate_series(1,100000) s(i);
> 
>     insert into t_lz4 select repeat(md5(i::text),300)
>     from generate_series(1,100000) s(i);
> 
>     \d+
> 
>      Schema |  Name  | Type  | Owner | Size  | Description
>     --------+--------+-------+-------+-------+-------------
>      public | t_lz4  | table | user  | 12 MB |
>      public | t_pglz | table | user  | 18 MB |
>     (2 rows)
> 
>     truncate t_pglz;
>     insert into t_pglz select * from t_lz4;
> 
>     \d+
> 
>      Schema |  Name  | Type  | Owner | Size  | Description
>     --------+--------+-------+-------+-------+-------------
>      public | t_lz4  | table | user  | 12 MB |
>      public | t_pglz | table | user  | 18 MB |
>     (2 rows)
> 
> which is fine. But in the other direction, this happens
> 
>     truncate t_lz4;
>     insert into t_lz4 select * from t_pglz;
> 
>      \d+
>                        List of relations
>      Schema |  Name  | Type  | Owner | Size  | Description
>     --------+--------+-------+-------+-------+-------------
>      public | t_lz4  | table | user  | 18 MB |
>      public | t_pglz | table | user  | 18 MB |
>     (2 rows)
> 
> which means the data is still pglz-compressed. That's rather strange,
> I guess, and it should compress the data using the compression method
> set for the target table instead.

That's actually an interesting issue. It happens because if tuple fits
to page then postgres just moves it as is. I've just added
recompression if it has custom compressed datums to keep dependencies
right. But look:
 create table t1(a text); create table t2(a text); alter table t2 alter column a set storage external; insert into t1
selectrepeat(md5(i::text),300) from   generate_series(1,100000) s(i); \d+
 
                     List of relations   Schema | Name | Type  | Owner |    Size    | Description
--------+------+-------+-------+------------+------------- public | t1   | table | ildus | 18 MB      |   public | t2
|table | ildus | 8192 bytes |  (2 rows)
 
 insert into t2 select * from t1;
 \d+
                   List of relations  Schema | Name | Type  | Owner | Size  | Description
--------+------+-------+-------+-------+------------- public | t1   | table | ildus | 18 MB |   public | t2   | table |
ildus| 18 MB |  (2 rows)
 

That means compressed datums now in the column with storage specified as
external. I'm not sure that's a bug or a feature. Lets insert them
usual way:
 delete from t2; insert into t2 select repeat(md5(i::text),300) from   generate_series(1,100000) s(i); \d+
                    List of relations  Schema | Name | Type  | Owner |  Size   | Description
--------+------+-------+-------+---------+------------- public | t1   | table | ildus | 18 MB   |   public | t2   |
table| ildus | 1011 MB | 
 

Maybe there should be more common solution like comparison of attribute
properties?

-- 
---
Ildus Kurbangaliev
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

pgsql-hackers by date:

From: 高增琦
Date: 24 November 2017, 10:17:24
Subject: Re: How is the PostgreSQL debuginfo file generated

From: Michael Paquier
Date: 24 November 2017, 12:40:09
Subject: Re: documentation is now XML

Re: [HACKERS] Custom compression methods - Mailing list pgsql-hackers

Previous

Next