On Thu, Feb 13, 2014 at 10:20:46AM +0530, Amit Kapila wrote:
> > Why not *only* prefix/suffix?
>
> To represent a prefix/suffix match, we at least need a way to record
> the offset and length of the matched bytes, and then the length of
> the unmatched bytes we have copied.
> I agree that a simpler format could be devised if we just want to
> do prefix/suffix matching, but that would require much more testing
> during recovery to ensure everything is fine. The advantage of the LZ
> format is that we don't need to bother about decoding; it will work
> without much change to the LZ decode routine.

Based on the numbers I think prefix/suffix-only needs to be explored.
Consider if you just change one field of a row --- prefix/suffix would
find all the matching parts. If you change the first and last fields,
you get no compression at all, but your prefix/suffix test isn't going
to get that either.
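
To make the one-field case concrete, here is a minimal sketch of a
prefix/suffix-only delta between an old and a new row image. It is not
the patch's code or its WAL format; the function names and the
(prefix, suffix, middle) layout are assumptions made up for illustration.

```python
def prefix_suffix_delta(old: bytes, new: bytes):
    """Hypothetical helper: return (prefix_len, suffix_len, middle).

    `middle` is the part of `new` that is covered by neither the
    prefix nor the suffix it shares with `old`.
    """
    limit = min(len(old), len(new))

    # Count bytes shared at the start of both rows.
    prefix = 0
    while prefix < limit and old[prefix] == new[prefix]:
        prefix += 1

    # Count bytes shared at the end, never overlapping the prefix.
    suffix = 0
    while suffix < limit - prefix and old[-1 - suffix] == new[-1 - suffix]:
        suffix += 1

    return prefix, suffix, new[prefix:len(new) - suffix]


def apply_delta(old: bytes, prefix: int, suffix: int, middle: bytes) -> bytes:
    """Rebuild the new row from the old row plus the delta."""
    return old[:prefix] + middle + old[len(old) - suffix:]
```

With one field changed in the middle of the row, only that field ends
up in `middle`. With the first and last bytes changed, both lengths are
zero and `middle` is the whole new row.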

As I understand it, the only place prefix/suffix with LZ compression is
a win over prefix/suffix-only is when you change two middle fields, and
there are common fields unchanged between them. If we are looking at
11% CPU overhead for that, it isn't worth it.
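
The two-middle-fields case can be sketched the same way. The snippet
below is only an illustration under assumed names and a made-up row
layout: it computes the middle that prefix/suffix-only would have to
store in full, then uses Python's difflib to find the longest run inside
that middle which also appears in the old row, i.e. the part a
history-based LZ match could additionally reference.

```python
from difflib import SequenceMatcher


def unmatched_middle(old: bytes, new: bytes) -> bytes:
    """Hypothetical helper: bytes of `new` outside the shared prefix/suffix."""
    limit = min(len(old), len(new))
    prefix = 0
    while prefix < limit and old[prefix] == new[prefix]:
        prefix += 1
    suffix = 0
    while suffix < limit - prefix and old[-1 - suffix] == new[-1 - suffix]:
        suffix += 1
    return new[prefix:len(new) - suffix]


# Made-up row images: fields b and d change, field c between them does not.
old = b"a=1|b=2|c=3|d=4|e=5"
new = b"a=1|b=9|c=3|d=8|e=5"

middle = unmatched_middle(old, new)

# Longest run of the middle that also occurs somewhere in the old row.
match = SequenceMatcher(None, old, middle, autojunk=False).find_longest_match(
    0, len(old), 0, len(middle)
)
reusable = middle[match.b:match.b + match.size]

print(len(middle), reusable)
```

Here the middle is 9 bytes, and 7 of them (the unchanged field and its
separators) already exist in the old row. That is the only saving the LZ
step adds on top of prefix/suffix in this example.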

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +