Thread: Pg and compress
Hi all,
We are going to use PostgreSQL as a data warehouse, but after some tests we found that the data is about 3 times bigger in PostgreSQL than as plain-text CSV. We use COPY to load the data. After some optimization we reduced it to about 2.5 times bigger. Other databases can on average compress data to about 1/3 of the plain-text size; bigger data means heavier I/O.
So my question is: how can data be compressed in PostgreSQL? Can a filesystem with a compression feature, such as ZFS or btrfs, work well with PostgreSQL?
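One quick way to estimate what a compressing filesystem might buy is to compress a sample of the data files (or the source CSV itself) with a general-purpose compressor. A minimal sketch, assuming Python 3 is available; the file path and sample size are placeholders, and block-level filesystem compression (ZFS, btrfs) typically compresses somewhat less well than this whole-stream estimate:

```python
import gzip

def estimate_compression_ratio(path, sample_bytes=8 * 1024 * 1024):
    """Compress a sample of a file and return compressed/original size.

    This only approximates what a compressing filesystem could achieve:
    real filesystems compress small blocks independently, so the actual
    ratio is usually a bit worse than this whole-sample estimate.
    """
    with open(path, "rb") as f:
        sample = f.read(sample_bytes)
    if not sample:
        return 1.0
    compressed = gzip.compress(sample)
    return len(compressed) / len(sample)
```

For example, `estimate_compression_ratio("/path/to/table_dump.csv")` (a hypothetical path) returning around 0.3 would be consistent with the roughly 1/3 ratio mentioned above.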
On 09/26/11 6:59 AM, Jov wrote:
> Hi all,
> We are going to use PostgreSQL as a data warehouse, but after some tests we found that the data is about 3 times bigger in PostgreSQL than as plain-text CSV. We use COPY to load the data. After some optimization we reduced it to about 2.5 times bigger. Other databases can on average compress data to about 1/3 of the plain-text size; bigger data means heavier I/O.
> So my question is: how can data be compressed in PostgreSQL? Can a filesystem with a compression feature, such as ZFS or btrfs, work well with PostgreSQL?

Your source data is CSV; what data types are the fields in the table(s)? Do you have a lot of indexes on the table(s)?

--
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast
Most fields are bigint and one is varchar.
There are no indexes.
On 2011-9-27 at 3:34 AM, "John R Pierce" <pierce@hogranch.com> wrote:
>
> On 09/26/11 6:59 AM, Jov wrote:
>>
>>
>> Hi all,
>> We are going to use PostgreSQL as a data warehouse, but after some tests we found that the data is about 3 times bigger in PostgreSQL than as plain-text CSV. We use COPY to load the data. After some optimization we reduced it to about 2.5 times bigger. Other databases can on average compress data to about 1/3 of the plain-text size; bigger data means heavier I/O.
>> So my question is: how can data be compressed in PostgreSQL? Can a filesystem with a compression feature, such as ZFS or btrfs, work well with PostgreSQL?
>>
>
> your source data is CSV, what data types are the fields in the table(s) ? do you have a lot of indexes on this table(s)?
>
>
>
> --
> john r pierce N 37, W 122
> santa cruz ca mid-left coast
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
On 09/26/11 5:53 PM, Jov wrote:
> Most are bigint and one field is varchar.
> There is no index.

Well, scalar bigint values will be 8 bytes each, plus a bit or two of overhead per field. Each complete tuple has roughly two dozen bytes of header overhead. Tuples are stored as many as fit in an 8K block, unless you've specified a fillfactor, in which case that percentage of space is left free in each block. If your CSV has mostly small integer values that are just 1-2-3 digits, then yes, bigint will take more space than ASCII.

--
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast
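The arithmetic above can be sketched numerically. A back-of-the-envelope sketch, assuming a 23-byte heap tuple header padded to 24 bytes for alignment and ignoring the per-row item pointer and page header; the column count and sample values are made up for illustration:

```python
# Rough comparison of PostgreSQL heap storage vs. CSV text for a row of
# bigint columns. Assumptions (not exact on-disk accounting): the heap
# tuple header is 23 bytes, padded to 24 for 8-byte alignment; each
# bigint is 8 bytes; item pointers and page headers are ignored.

TUPLE_HEADER = 24  # 23-byte header rounded up to 8-byte alignment
BIGINT = 8

def pg_row_bytes(n_bigints):
    """Approximate on-disk size of a heap row of n bigint columns."""
    return TUPLE_HEADER + n_bigints * BIGINT

def csv_row_bytes(values):
    """Size of the same row as a CSV line (commas plus a newline)."""
    return len(",".join(str(v) for v in values)) + 1  # +1 for '\n'

# A row of ten small integers: a few characters each as text, but a
# full 8 bytes each once stored as bigint.
row = [1, 22, 333, 4, 55, 666, 7, 88, 999, 0]
print(pg_row_bytes(len(row)))   # 24 + 10*8 = 104 bytes in the heap
print(csv_row_bytes(row))       # 29 bytes as CSV text
```

With these made-up values the heap row is about 3.6 times the CSV line, which is in the same ballpark as the 3x growth reported at the top of the thread; rows with larger values (many digits per number) shrink the gap.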