On Fri, Apr 19, 2013 at 10:17 AM, Will Childs-Klein <willck93@gmail.com> wrote:
> Hello All,
> I'm writing today to inquire about finding the exact point in the source
> where postgres writes to disk. I'm trying to implement some compression in
> postgres. The idea is to compress the data right when it's written to disk,
> to reduce the amount of data written to disk, reducing the amount of time of
> disk read/write. I'm hoping that this reduction in disk IO latency is
> greater than the CPU cost incurred by compression, resulting in a speedup. I
> will be testing various compression libraries to see which (if any) work
> well for various query types. I've been looking through the source code, in
> src/backend/storage specifically. I'm thinking something in smgr is where I
> want to make my de/compress calls. Specifically in
> src/backend/storage/smgr/md.c in the functions mdwrite(...) and mdread(...).
> Am I in the right place? If not where should I look?
this is not going to work. postgres tables are organized as fixed-size
pages (8kB by default) -- if you compress pages as they are written out
they become variable length. this in turn means everything after a page
in the file would have to shift if you wrote that page back and its
compressed size grew.
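
to make the offset problem concrete, here is a minimal sketch in python (not postgres code). the names page_offset and packed_offsets are made up for illustration, zlib stands in for whatever compression library you end up testing, and the back-to-back layout is just one hypothetical way of storing compressed pages:

```python
import random
import zlib

BLCKSZ = 8192  # postgres default page size in bytes


def page_offset(blockno: int) -> int:
    """With fixed-size pages, a block's file offset is plain arithmetic."""
    return blockno * BLCKSZ


def packed_offsets(compressed_pages):
    """Offsets if compressed pages were stored back to back (hypothetical layout)."""
    offsets, pos = [], 0
    for page in compressed_pages:
        offsets.append(pos)
        pos += len(page)
    return offsets


# Three 8kB pages; page 0 is all zeros, so it compresses very well.
pages = [bytes(BLCKSZ), bytes(range(256)) * 32, b"abc" * 2730 + b"ab"]
before = packed_offsets([zlib.compress(p) for p in pages])

# Rewrite page 0 with data that barely compresses (seeded pseudo-random bytes).
rng = random.Random(0)
pages[0] = bytes(rng.getrandbits(8) for _ in range(BLCKSZ))
after = packed_offsets([zlib.compress(p) for p in pages])

# Page 0 still starts at offset 0, but every later page has moved.
print("fixed-size offset of block 1:", page_offset(1))
print("packed offsets before rewrite:", before)
print("packed offsets after rewrite: ", after)
```

with fixed-size pages, rewriting block 0 never disturbs block 1. in the packed layout, one page growing changes the offset of every page behind it, which is the shifting described above.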
as noted, postgres already compresses the most interesting case --
when a single tuple is too large to fit comfortably in a page (aka TOAST).
merlin