Re: Call for Google Summer of Code (GSoC) 2012: Project ideas? - Mailing list pgsql-general

From Andy Colson
Subject Re: Call for Google Summer of Code (GSoC) 2012: Project ideas?
Date
Msg-id 4F5A2DAA.3020100@squeakycode.net
In response to Re: Call for Google Summer of Code (GSoC) 2012: Project ideas?  (Merlin Moncure <mmoncure@gmail.com>)
Responses Re: Call for Google Summer of Code (GSoC) 2012: Project ideas?  (Merlin Moncure <mmoncure@gmail.com>)
List pgsql-general
On 3/9/2012 9:47 AM, Merlin Moncure wrote:
> On Thu, Mar 8, 2012 at 2:01 PM, Andy Colson<andy@squeakycode.net>  wrote:
>> I know toast compresses, but I believe its only one row.  page level would
>> compress better because there is more data, and it would also decrease the
>> amount of IO, so it might speed up disk access.
>
> er, but when data is toasted it's spanning pages.  page level
> compression is a super complicated problem.
>
> something that is maybe more attainable on the compression side of
> things is a userland api for compression -- like pgcrypto is for
> encryption.  even if it didn't make it into core, it could live on
> reasonably as a pgfoundry project.
>
> merlin

Agreed, it's probably too difficult for a GSoC project.  But a userland API
would still be row level, which, in my opinion, is useless.  Consider
rows from my Apache log that I'm dumping into a database:

date, url, status
2012-3-9 10:15:00, '/index.php?id=4', 202
2012-3-9 10:15:01, '/index.php?id=5', 202
2012-3-9 10:15:02, '/index.php?id=6', 202

That won't compress at all at the row level.  But it'll compress 99% at a
"larger" (page/multirow/whatever) level.
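
A rough way to check this outside the database is the sketch below. It
assumes Python's zlib as a stand-in compressor (not what TOAST uses) and
synthetic rows shaped like the sample above; make_rows, row_level_size and
block_level_size are made-up helper names for illustration only. It compares
compressing every row on its own against compressing all rows as one block.

```python
import zlib
from datetime import datetime, timedelta


def make_rows(n):
    """Build n synthetic log rows shaped like the sample (hypothetical data)."""
    start = datetime(2012, 3, 9, 10, 15, 0)
    return [
        f"{start + timedelta(seconds=i)}, '/index.php?id={i + 4}', 202".encode()
        for i in range(n)
    ]


def row_level_size(rows):
    """Total bytes when every row is compressed independently."""
    return sum(len(zlib.compress(row)) for row in rows)


def block_level_size(rows):
    """Total bytes when all rows are compressed together as one block."""
    return len(zlib.compress(b"\n".join(rows)))


rows = make_rows(1000)
raw = sum(len(row) for row in rows)
per_row = row_level_size(rows)
per_block = block_level_size(rows)

print(f"raw bytes:            {raw}")
print(f"row-level compressed: {per_row}")
print(f"block compressed:     {per_block}")
```

Short rows give the compressor almost nothing to work with, and the
per-stream framing can even make each row slightly larger, while the
multi-row block benefits from the repetition between rows.  The exact ratio
depends on the data and the compressor, so treat the numbers as illustrative.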

-Andy
