Re: Significantly larger toast tables on 8.4? - Mailing list pgsql-hackers

From Gregory Maxwell
Subject Re: Significantly larger toast tables on 8.4?
Date
Msg-id e692861c0901070644y6f55f441gb39397ab4aca736b@mail.gmail.com
In response to Re: Significantly larger toast tables on 8.4?  (Martijn van Oosterhout <kleptog@svana.org>)
List pgsql-hackers
On Fri, Jan 2, 2009 at 5:48 PM, Martijn van Oosterhout
<kleptog@svana.org> wrote:
> So you compromise. You split the data into say 1MB blobs and compress
> each individually. Then if someone does a substring at offset 3MB you
> can find it quickly. This barely costs you anything in compression
> ratio.
>
> Implementation though, that's harder. The size of the blobs is tunable
> also. I imagine the optimal value will probably be around 100KB. (12
> blocks uncompressed).

Or have the database do that internally: with the available fast
compression algorithms (zlib, LZO, LZF, etc.) the diminishing returns
from larger compression block sizes kick in rather quickly. Other
algorithms such as LZMA or bzip2 gain more from bigger block sizes, but
I expect all of them are too slow to ever consider using in PostgreSQL.

So I expect that the compression loss from compressing in chunks of
64 kB would be minimal. The database could then store a list of
offsets for the 64 kB chunks at the beginning of the field, or
something like that. A short substring would then require
decompressing just one or two chunks, far less overhead than
decompressing everything.
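
A minimal sketch of that idea, outside of PostgreSQL and purely for
illustration (the 64 kB chunk size and the offset-list layout are my
assumptions, and zlib stands in for whatever fast compressor is used):

    # Compress a value in fixed-size chunks and keep the offset of each
    # compressed chunk, so a substring only decompresses the chunks it touches.
    import zlib

    CHUNK = 64 * 1024  # 64 kB of uncompressed data per chunk (assumed)

    def compress_chunked(data: bytes):
        """Return (offsets, blob); offsets[i] is where compressed chunk i starts."""
        offsets, parts, pos = [], [], 0
        for i in range(0, len(data), CHUNK):
            comp = zlib.compress(data[i:i + CHUNK])
            offsets.append(pos)
            parts.append(comp)
            pos += len(comp)
        return offsets, b"".join(parts)

    def substring_chunked(offsets, blob, start: int, length: int) -> bytes:
        """Decompress only the chunks covering [start, start + length)."""
        first = start // CHUNK
        last = (start + length - 1) // CHUNK
        pieces = []
        for i in range(first, last + 1):
            lo = offsets[i]
            hi = offsets[i + 1] if i + 1 < len(offsets) else len(blob)
            pieces.append(zlib.decompress(blob[lo:hi]))
        joined = b"".join(pieces)
        skip = start - first * CHUNK
        return joined[skip:skip + length]

    # A substring at a multi-megabyte offset touches only one or two chunks:
    offsets, blob = compress_chunked(b"abcdefgh" * (1 << 20))  # ~8 MB of input
    print(substring_chunked(offsets, blob, 3 * 1024 * 1024, 16))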

It would probably be worthwhile to graph compression ratio vs. block
size for some reasonable input. I'd offer to do it, but I doubt I
have a reasonable test set for this.
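
For anyone who does have representative data, a rough sketch of that
measurement (again assuming zlib; the input is just whatever large file
you have at hand) could look like:

    # Compress the same data at several chunk sizes and print the resulting
    # ratios, to see where the diminishing returns set in.
    import sys
    import zlib

    def chunked_ratio(data: bytes, chunk: int) -> float:
        compressed = sum(len(zlib.compress(data[i:i + chunk]))
                         for i in range(0, len(data), chunk))
        return compressed / len(data)

    data = open(sys.argv[1], "rb").read()  # any reasonably large test file
    for kb in (8, 16, 32, 64, 128, 256, 1024):
        print(f"{kb:5d} kB chunks: {chunked_ratio(data, kb * 1024):.3f}")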

