Re: Table and Index compression - Mailing list pgsql-hackers

From Pierre Frédéric Caillaud
Subject Re: Table and Index compression
Date
Msg-id op.ux997pe7cke6l8@soyouz
Whole thread Raw
In response to Re: Table and Index compression  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
> On Thu, Aug 6, 2009 at 4:03 PM, Greg Stark<gsstark@mit.edu> wrote:
>> I like the idea too, but I think there are some major problems to
>> solve. In particular I think we need a better solution to blocks
>> growing than sparse files.
>
> How much benefit does this approach have over using TOAST compression
> more aggressively?
>
> ...Robert
>


The two are different :

- TOAST compresses a large column value.

To store a 100KB text file, TOAST is great.

- page compression compresses whole pages.

Suppose you have a table with a TIMESTAMP, and a few INT columns. The rows  
are small enough to make per-row compression useless, and TOAST cannot  
compress non-varlenas anyway. However, if (for instance) the timestamp is  
the row insertion date, and you INSERT several rows per second, most  
timestamps on a page will have lots of bytes in common. Also, row headers  
(which are larger than the rows) will have much redundant data. Page  
compression can exploit this, without the need for the rest of the code to  
know about it.

Page compression can also handle indexes, etc.

Also, External TOAST is nice if you seldom need the field : for instance,  
you search on in-page columns, get the row you need, and fetch it.
Suppose you have a forum : in this case, when you display a topic page,  
you need all the posts text. It would be a very bad idea to store them in  
a separate TOAST table, because it would create more random IO. Storing  
the posts in the page means less IO, and if you regularly CLUSTER your  
table, all the posts you need to display a topic page are on the same (or  
adjacent) postgres page. In this case, individual post text can be  
TOASTed, too, but compression tends to work better with longer blocks, so  
compressing the whole page will be more efficient.




pgsql-hackers by date:

Previous
From: Pierre Frédéric Caillaud
Date:
Subject: Re: Table and Index compression
Next
From: Sam Mason
Date:
Subject: Re: Table and Index compression