Home > mailing lists

Compressing Fields? - Mailing list pgsql-hackers

From	Christopher Browne
Subject	Compressing Fields?
Date	May 31, 2003 01:55:10
Msg-id	20030531025256.409265EDC8@cbbrowne.com Whole thread Raw
Responses	Re: Compressing Fields?
List	pgsql-hackers

Tree view

I was wondering if anyone has had occasion to hook up a compression scheme 
(Huffman/gzip/zlib/whatever) as a PostgreSQL function.

I've got a case where there is a need to store fairly large chunks of XML in a 
database.  There seems little reason to parse it beforehand, as many seem wont 
to do.  In fact, there will often be no need to look at it at all.  Typically, 
it will be ignored, only to be looked at if a human specifically asks for it.

Ergo, it would be a slick idea to cut a 1700 byte field down to 253 by 
[somehow compressing it] before sticking it into a "bytea" field.  That will 
save on disk space, pack more records into pages, and generally lead to 
queries being a bit cheaper.

insert into log_table (id, txn_date, metadata, xml) values(4271324, '2003-07-01', 'Useless Data', compress('<xml> <a>
<b>This is worthless data </b> <c status="139"/> <d> <e>Foo </e> </d> <bar> And 8K of futile murmurings </bar> </a>
</xml>');

90% of the time, we need only:

select id, txn_date, metadata from log_table;

And the other 10%, we do
select id, txn_date, metadata, uncompress(xml) from logtable where id in 
(871009, 873281, 8321947);

It would surely be possible for the client software to do the compression, but 
it seems an interesting thought to do it on the server, thereby making it 
"language-neutral" such that I could write client software in Perl, Python, or 
even  use Pierre Mai's binding to CMU/CL without having to worry about whether 
or not there's a binding of the compression algorithm to whatever client 
language I might imagine using.

I don't see anything in contrib for this.  pgcrypto obviously does something 
similar for cryptographic functions, but I don't see compression on the list.

Presumably something could be constructed using zlib; if anyone has done this 
already, it would be nice to know of...
--
output = reverse("moc.enworbbc" "@" "enworbbc")
http://www.ntlug.org/~cbbrowne/advocacy.html
"Bureaucracies interpret communication as damage and route around it"
-- Jamie Zawinski

pgsql-hackers by date:

From: Bruce Momjian
Date: 31 May 2003, 01:54:54
Subject: Re: Testing the return value of fclose() in the backend

From: Gavin Sherry
Date: 31 May 2003, 02:08:28
Subject: Re: Testing the return value of fclose() in the backend

Compressing Fields? - Mailing list pgsql-hackers

Previous

Next