On Sat, Sep 11, 2021 at 8:31 AM Andrey Borodin <x4mmm@yandex-team.ru> wrote:
> I've prototyped Random Access Compressed File for fun[0]. The code is a very dirty proof-of-concept.
> I compress BufFile one block at a time. There are directory pages to store information about the size of each
> compressed block. If any byte of the block is changed, the whole block is recompressed. Wasted space is never reused.
> If a compressed block is more than BLCKSZ - unknown bad things will happen :)
Just reading this description, I suppose it's also Bad if the block is
recompressed and the new compressed size is larger than the previous
compressed size. Or do you have some way to handle that?
I think it's probably quite tricky to make this work if the temporary
files can be modified after the data is first written. If you have a
temporary file that's never changed after the fact, then you could
compress all the blocks and maintain, on the side, an index that says
where the compressed version of each block starts. That could work
whether or not the blocks expand when you try to compress them, and
you could even skip compression for blocks that get bigger when
"compressed" or which don't compress nicely, just by including a
boolean flag in your index saying whether that particular block is
compressed or not. But as soon as you have a case where the blocks can
get modified after they are created, then I don't see how to make it
work nicely. You can't necessarily fit the new version of the block in
the space allocated for the old version of the block, and putting it
elsewhere could turn sequential I/O into random I/O.
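To make the write-once idea concrete, here's roughly the shape of the
per-block side index I have in mind. All of the names are invented for
illustration, nothing like this exists in the tree, and it's only a
sketch under the assumption that the file is never modified after it is
written:

#include <stdbool.h>
#include <stdint.h>

typedef struct CompressedBlockIndexEntry
{
	uint64_t	offset;			/* where this block's data starts in the file */
	uint32_t	stored_length;	/* number of bytes actually stored on disk */
	bool		is_compressed;	/* false if compression didn't help and the
								 * block was stored raw instead */
} CompressedBlockIndexEntry;

/*
 * Reading logical block blkno then amounts to: look up index[blkno],
 * seek to entry->offset, read entry->stored_length bytes, and
 * decompress only if entry->is_compressed is set.  Because the index is
 * built as the file is written and never changes afterward, blocks that
 * expand under compression are no problem: they're simply stored raw
 * with the flag cleared.
 */

But again, this only works because nothing ever gets rewritten; the
moment a block can change size in place, the whole scheme falls apart.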
Leaving all that aside, I think this feature has *some* potential,
because I/O is expensive and compression could let us do less of it.
The problem is that a lot of the I/O that PostgreSQL thinks it does
isn't real I/O. Everybody is pretty much forced to set work_mem
conservatively to avoid OOM, which means a large proportion of
operations that exceed work_mem and thus spill to files don't actually
result in real I/O. They end up fitting in memory after all; it's only
that the memory in question belongs to the OS rather than to
PostgreSQL. And for operations of that type, which I believe to be
very common, compression is strictly a loss. You're doing extra CPU
work to avoid I/O that isn't actually happening.
--
Robert Haas
EDB: http://www.enterprisedb.com