Re: Repair cosmetic damage (done by pg_indent?) - Mailing list pgsql-patches

From Decibel!
Subject Re: Repair cosmetic damage (done by pg_indent?)
Msg-id 20070803231209.GW25704@nasby.net
In response to Re: Repair cosmetic damage (done by pg_indent?)  (Gregory Stark <stark@enterprisedb.com>)
Responses Re: Repair cosmetic damage (done by pg_indent?)  (Decibel! <decibel@decibel.org>)
List pgsql-patches
On Sun, Jul 29, 2007 at 12:06:50PM +0100, Gregory Stark wrote:
> "Decibel!" <decibel@decibel.org> writes:
>
> > On Fri, Jul 27, 2007 at 04:07:01PM +0100, Gregory Stark wrote:
> >> Fwiw, do we really not want to compress anything smaller than 256 bytes?
> >> (Everything in Postgres uses the default strategy, not the always strategy.)
> >
> > Is there actually a way to specify always compressing? I'm not seeing it
> > on http://www.postgresql.org/docs/8.2/interactive/storage-toast.html
>
> In the code there's an "always" strategy, but nothing in Postgres uses it so
> there's no way to set it using ALTER TABLE ... SET STORAGE.
>
> That might be an interesting approach though. We could add another SET STORAGE
> value "COMPRESSIBLE" which says to use the always strategy. The neat thing
> about this is we could set bpchar to use this storage type by default.

Yeah, we should have that. I'll add it to my TODO...

> >> ISTM that with things like CHAR(n) around we might very well have some
> >> databases where compression for smaller sized datums would be beneficial. I
> >> would suggest 32 for the minimum.
> >
> > CPU is generally cheaper than I/O nowadays, so I agree with something
> > less than 256. Not sure what would be best though.
>
> Well it depends a lot on how large your database is. If your whole database
> fits in RAM and you use datatypes like CHAR(n) only for storing data which is
> exactly n characters long then there's really no benefit to trying to compress
> smaller data.

Well, something else to consider is that this could make a big
difference between a database fitting in memory and not...
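
To give a feel for how much a short, blank-padded CHAR(n) value can shrink,
here is a rough sketch in Python. It uses zlib purely as a stand-in, since
pglz is not available from Python and its output sizes will differ; the helper
names (padded_char, compressed_size) and the sample values are made up for
illustration.

```python
import zlib

def padded_char(value: str, n: int) -> bytes:
    """Blank-pad value to n characters, the way a CHAR(n) column stores it."""
    return value.ljust(n).encode("ascii")

def compressed_size(datum: bytes) -> int:
    # zlib is only a stand-in here; PostgreSQL uses pglz, so real sizes differ.
    return len(zlib.compress(datum))

short_padded = padded_char("abc", 64)   # 3 characters plus 61 blanks
distinct = bytes(range(20))             # 20 distinct bytes, nothing repeats

# Print original size next to compressed size for each datum.
print(len(short_padded), compressed_size(short_padded))
print(len(distinct), compressed_size(distinct))
```

The padded value comes out well under its original 64 bytes, while the short
non-repetitive one grows because of the compressor's fixed overhead. That is
the tradeoff behind any minimum-size cutoff.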

> If on the other hand your database is heavily I/O-bound and you're using
> CHAR(n) or storing other highly repetitive short strings then compressing data
> will save I/O bandwidth at the expense of cpu cycles.
>
> > I do have a database that has both user-entered information as well as
> > things like email addresses, so I could do some testing on that if
> > people want.
>
> You would have to recompile with the value at line 214 of
> src/backend/utils/adt/pg_lzcompress.c set to a lower value.

Ok, I'll work up some numbers. The only tests that come to mind are how long it
takes to load a dump (with indexes) and the resulting table size. Other ideas?
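
Before recompiling, one cheap way to guess at the effect on table size is to
replay the "skip anything smaller than N bytes" rule over sample data. The
sketch below does that in Python with zlib standing in for pglz, so the
absolute numbers are not what Postgres would produce. The function names and
sample rows are hypothetical, and the real default strategy has other
parameters that this does not model.

```python
import zlib

def stored_size(datum: bytes, min_input_size: int) -> int:
    """Size under a simplified 'compress only if >= min_input_size bytes' rule."""
    if len(datum) < min_input_size:
        return len(datum)               # too small: stored uncompressed
    compressed = len(zlib.compress(datum))
    # Keep the compressed form only when it is actually smaller.
    return min(len(datum), compressed)

def total_size(data, min_input_size: int) -> int:
    """Sum of stored sizes for all data at the given threshold."""
    return sum(stored_size(d, min_input_size) for d in data)

# Made-up sample: three CHAR(64)-style padded values and one long repetitive one.
sample = [name.ljust(64).encode("ascii") for name in ("alice", "bob", "carol")]
sample.append(b"x" * 300)

for threshold in (256, 32):
    print(threshold, total_size(sample, threshold))
```

With a threshold of 256 only the 300-byte value is compressed; at 32 the
padded values are compressed as well, so the total drops further. Running the
same loop over a real column's values would give a first estimate before
measuring load time on a rebuilt server.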
--
Decibel!, aka Jim Nasby                        decibel@decibel.org
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)
