> On Thu, Aug 6, 2009 at 4:03 PM, Greg Stark<gsstark@mit.edu> wrote:
>> I like the idea too, but I think there are some major problems to
>> solve. In particular I think we need a better solution to blocks
>> growing than sparse files.
>
> How much benefit does this approach have over using TOAST compression
> more aggressively?
>
> ...Robert
>
The two are different :
- TOAST compresses a large column value.
To store a 100KB text file, TOAST is great.
- page compression compresses whole pages.
Suppose you have a table with a TIMESTAMP, and a few INT columns. The rows
are small enough to make per-row compression useless, and TOAST cannot
compress non-varlenas anyway. However, if (for instance) the timestamp is
the row insertion date, and you INSERT several rows per second, most
timestamps on a page will have lots of bytes in common. Also, row headers
(which are larger than the rows) will have much redundant data. Page
compression can exploit this, without the need for the rest of the code to
know about it.
Page compression can also handle indexes, etc.
Also, External TOAST is nice if you seldom need the field : for instance,
you search on in-page columns, get the row you need, and fetch it.
Suppose you have a forum : in this case, when you display a topic page,
you need all the posts text. It would be a very bad idea to store them in
a separate TOAST table, because it would create more random IO. Storing
the posts in the page means less IO, and if you regularly CLUSTER your
table, all the posts you need to display a topic page are on the same (or
adjacent) postgres page. In this case, individual post text can be
TOASTed, too, but compression tends to work better with longer blocks, so
compressing the whole page will be more efficient.