On Fri, 12 Nov 1999, Tom Lane wrote:
> wieck@debis.com (Jan Wieck) writes:
> > Tom Lane wrote:
> >> It occurred to me last night that applying compression to individual
> >> fields might not be the best approach. Certainly a "bytez" data type
> >> is the easiest thing to fit into the existing system, but it's leaving
> >> some space savings on the table. What about compressing the *whole*
> >> data contents of a tuple on-disk, as a single entity? That should save
> >> more space than field-by-field compression.
>
> > But it requires decompression of every tuple into palloc()'d
> > memory during heap access. AFAIK, the heap access routines
> > currently return a pointer to the tuple inside the shm
> > buffer. Don't know what its performance impact would be.
>
> Good point, but the same will be needed when a tuple is split across
> multiple blocks. I would expect that (given a reasonably fast
> decompressor) there will be a net performance *gain* due to having
> less disk I/O to do.
	Right now, we're dealing in theory... my concern is what Jan
points out: "what its performance impact would be". How much harder
would it be to extend our "CREATE TABLE" syntax to do something like:

CREATE TABLE classname ( .. ) COMPRESSED;

	Or something similar? Something that keeps the ability in the
core, but makes using it the choice of the admin?
	*Assuming* that I'm also reading this thread correctly, it could
almost be extended into "ALTER TABLE classname SET COMPRESSED on;", or
something like that, where all new records are *written* compressed (or
uncompressed), but any read checks whether the stored size equals the
uncompressed size and decompresses accordingly (a rough sketch of that
check is below)...
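	To make that heuristic concrete, here is a minimal standalone
sketch of the write/read paths, using zlib purely as a stand-in
compressor. The function names and layout are hypothetical, for
illustration only, and are not a proposal for actual backend routines:

#include <stdlib.h>
#include <string.h>
#include <zlib.h>

/*
 * Write path: try to compress the tuple body; if that saves nothing,
 * store it raw.  'out' must hold at least rawlen bytes, since we only
 * keep a compressed image when it is strictly smaller.  Returns the
 * number of bytes actually stored.
 */
static size_t
store_tuple_body(const char *raw, size_t rawlen, char *out)
{
    uLongf  clen = compressBound(rawlen);
    Bytef  *tmp = malloc(clen);

    if (tmp != NULL &&
        compress2(tmp, &clen, (const Bytef *) raw, rawlen,
                  Z_DEFAULT_COMPRESSION) == Z_OK &&
        clen < rawlen)
    {
        /* Compression won: store the smaller image. */
        memcpy(out, tmp, clen);
        free(tmp);
        return clen;
    }

    /* No gain (or no memory): store the tuple body as-is. */
    free(tmp);
    memcpy(out, raw, rawlen);
    return rawlen;
}

/*
 * Read path: 'rawlen' is the known uncompressed size kept in the
 * tuple header.  A stored size equal to it means the data was kept
 * uncompressed; anything smaller must be inflated first.
 */
static void
fetch_tuple_body(const char *stored, size_t storedlen,
                 char *raw, size_t rawlen)
{
    if (storedlen == rawlen)
    {
        /* Stored size == uncompressed size: it was kept raw. */
        memcpy(raw, stored, storedlen);
    }
    else
    {
        uLongf  dlen = rawlen;

        /* Otherwise the image is compressed; inflate it. */
        uncompress((Bytef *) raw, &dlen,
                   (const Bytef *) stored, storedlen);
    }
}

	The nice property is that data is stored compressed only when it
is strictly smaller, so a stored size equal to the raw size
unambiguously means "not compressed" and no extra header flag is
needed.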
Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org