Re: [REVIEW] Re: Compression of full-page-writes - Mailing list pgsql-hackers

From Pavan Deolasee
Subject Re: [REVIEW] Re: Compression of full-page-writes
Date
Msg-id CABOikdOc7J3t5DX+oGOxo6YNhGOFcpx_jMe36Meeqv0bxH6xTw@mail.gmail.com
In response to Re: [REVIEW] Re: Compression of full-page-writes  (Rahila Syed <rahilasyed90@gmail.com>)
Responses Re: [REVIEW] Re: Compression of full-page-writes  (Fujii Masao <masao.fujii@gmail.com>)
List pgsql-hackers

I'm trying to understand what it would take to get this patch into an acceptable form before the next commitfest. Both Abhijit and Andres have done extensive reviews of the patch and have given many useful suggestions to Rahila. While she has incorporated most of them, I feel we are still some distance away from having something that can be committed. Here are my observations based on the discussion on this thread so far.

1. Need for compressing full page backups:
A good number of benchmarks have been done by various people on this list, and they clearly show the need for the feature. Many people have already voiced their agreement on having this in core, even as a configurable parameter. There were some requests for more benchmarks, such as response times immediately after a checkpoint or CPU consumption; I'm not entirely sure whether those have already been done.

2. Need for different compression algorithms:
There were requests for comparing different compression algorithms such as LZ4 and snappy. Based on the numbers that Rahila has posted, I can see LZ4 has the best compression ratio, at least for the TPC-C benchmarks she tried. Having said that, I was hoping to see more numbers on CPU resource utilization, which would demonstrate the trade-off, if any. Anyway, there were also apprehensions expressed about whether to have pluggable algorithms in the final patch that gets committed. If we do decide to support more compression algorithms, I like what Andres had done before, i.e. store the compression algorithm information in the varlena header. So basically, we should have an abstract API which takes a buffer and the desired algorithm and returns compressed data, along with a varlena header carrying the encoded information. ISTM that the patch Andres had posted earlier was focused primarily on toast data, but I think we can make it more generic so that both toast and FPW can use it.

Having said that, IMHO we should go one step at a time. We have been using pglz for compressing toast data for a long time, so we can continue to use the same for compressing full page images. We can simultaneously work on adding more algorithms to core and choose the right candidate for different scenarios such as toast or FPW based on test evidence. But that work can happen independently of this patch.
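
To make the abstract API idea from this point a bit more concrete, here is a rough sketch in Python. It is only an illustration under stated assumptions: the header layout (1 byte algorithm id plus 4 bytes raw length) is made up and is not the varlena format, the names `compress_buffer` and `decompress_buffer` are hypothetical, and zlib and bz2 stand in for pglz, LZ4 and snappy because those are not in the Python standard library.

```python
import bz2
import struct
import zlib

# Hypothetical algorithm ids. zlib and bz2 are stand-ins for pglz/LZ4/snappy.
ALGO_NONE, ALGO_ZLIB, ALGO_BZ2 = 0, 1, 2

_COMPRESS = {ALGO_ZLIB: zlib.compress, ALGO_BZ2: bz2.compress}
_DECOMPRESS = {ALGO_ZLIB: zlib.decompress, ALGO_BZ2: bz2.decompress}

# Made-up header: 1 byte algorithm id + 4 bytes raw length (big-endian).
_HEADER = struct.Struct(">BI")


def compress_buffer(raw: bytes, algo: int) -> bytes:
    """Return header + payload; store the data raw if compression does not help."""
    if algo != ALGO_NONE:
        payload = _COMPRESS[algo](raw)
        if len(payload) < len(raw):
            return _HEADER.pack(algo, len(raw)) + payload
    return _HEADER.pack(ALGO_NONE, len(raw)) + raw


def decompress_buffer(data: bytes) -> bytes:
    """Read the algorithm id from the header and undo the compression."""
    algo, raw_len = _HEADER.unpack_from(data)
    payload = data[_HEADER.size:]
    raw = payload if algo == ALGO_NONE else _DECOMPRESS[algo](payload)
    if len(raw) != raw_len:
        raise ValueError("raw length does not match header")
    return raw
```

Because the reader learns the algorithm from the header alone, the same pair of functions could serve both toast and FPW callers, and a new algorithm only needs a new id and two table entries.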

3. Compressing one block vs all blocks:
Andres suggested that compressing all backup blocks in one go may give us a better compression ratio. This is worth trying. I'm wondering what would be the best way to do so with minimal changes to the xlog insertion code. Today, we add more rdata items for the backup block header(s) and the backup blocks themselves (if there is a "hole" then 2 per backup block) beyond what the caller has supplied. If we have to compress all the backup blocks together, then one approach is to copy the backup block headers and the blocks to a temp buffer, compress that, and replace the rdata entries added previously with a single rdata. Is there a better way to handle multiple blocks in one go?
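
The temp-buffer approach can be sketched roughly as follows. This is a toy model in Python, not xlog insertion code: the 4-byte "header" is a made-up stand-in for a backup block header, zlib stands in for pglz, and the pages are synthetic data built to share most of their content, so the sizes it produces say nothing about real WAL.

```python
import random
import zlib

BLCKSZ = 8192  # PostgreSQL's default page size


def compress_separately(blocks):
    """Compress each (header, page) pair on its own; return the total size."""
    return sum(len(zlib.compress(hdr + page)) for hdr, page in blocks)


def compress_together(blocks):
    """Copy all headers and pages into one temp buffer and compress it once."""
    temp = bytearray()
    for hdr, page in blocks:
        temp += hdr
        temp += page
    return zlib.compress(bytes(temp))


def make_demo_blocks(n=3, seed=0):
    """Build synthetic pages that share most of their content."""
    rng = random.Random(seed)
    base = bytes(rng.getrandbits(8) for _ in range(BLCKSZ))
    blocks = []
    for i in range(n):
        page = bytearray(base)
        page[0:4] = i.to_bytes(4, "little")  # make each page slightly different
        hdr = i.to_bytes(4, "little")        # stand-in for a backup block header
        blocks.append((hdr, bytes(page)))
    return blocks


if __name__ == "__main__":
    demo = make_demo_blocks()
    print("separate:", compress_separately(demo))
    print("together:", len(compress_together(demo)))
```

Compressing once lets the compressor find redundancy across blocks, which per-block compression cannot see. The cost is the extra copy into the temp buffer, and the result would replace the individual rdata entries with a single one.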

We still need a way to tell the restore path that the WAL data is compressed. One way is to always add a varlena header irrespective of whether the blocks are compressed or not. This looks like overkill. Another way is to add a new field to XLogRecord to record this information. It looks like we can do this without increasing the size of the header, since there are 2 bytes of padding after the xl_rmid field.
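
The padding argument can be checked with a small layout model. The sketch below uses Python's ctypes to mimic the record header. The field list and types are my approximation of the current XLogRecord and should be treated as an assumption, and `xl_compress` is a hypothetical name for the new field.

```python
import ctypes


class XLogRecord(ctypes.Structure):
    """Approximate model of the record header (field list is an assumption)."""
    _fields_ = [
        ("xl_tot_len", ctypes.c_uint32),
        ("xl_xid", ctypes.c_uint32),
        ("xl_len", ctypes.c_uint32),
        ("xl_info", ctypes.c_uint8),
        ("xl_rmid", ctypes.c_uint8),
        # 2 bytes of alignment padding end up here
        ("xl_prev", ctypes.c_uint64),
        ("xl_crc", ctypes.c_uint32),
    ]


class XLogRecordWithFlag(ctypes.Structure):
    """Same header with a hypothetical 2-byte field placed in the padding."""
    _fields_ = [
        ("xl_tot_len", ctypes.c_uint32),
        ("xl_xid", ctypes.c_uint32),
        ("xl_len", ctypes.c_uint32),
        ("xl_info", ctypes.c_uint8),
        ("xl_rmid", ctypes.c_uint8),
        ("xl_compress", ctypes.c_uint16),  # hypothetical name, not in the patch
        ("xl_prev", ctypes.c_uint64),
        ("xl_crc", ctypes.c_uint32),
    ]


if __name__ == "__main__":
    print(ctypes.sizeof(XLogRecord), ctypes.sizeof(XLogRecordWithFlag))
```

If the two sizes match and xl_prev stays at the same offset, the new field fits entirely inside the existing padding.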

4. Handling holes in backup blocks:
I think if we address (3) then this can be easily done. Alternatively, we can also memzero the "hole" and then compress the entire page. The compression algorithm should handle that well.
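
A minimal sketch of the memzero alternative, again in Python with zlib standing in for pglz. The function name and the `hole_offset`/`hole_length` parameters are illustrative only, and the page in the usage note would be synthetic data.

```python
import zlib

BLCKSZ = 8192  # PostgreSQL's default page size


def compress_page_with_zeroed_hole(page: bytes, hole_offset: int,
                                   hole_length: int) -> bytes:
    """Zero the unused hole in a copy of the page, then compress the whole page."""
    if len(page) != BLCKSZ:
        raise ValueError("expected a full page")
    if hole_offset < 0 or hole_length < 0 or hole_offset + hole_length > BLCKSZ:
        raise ValueError("hole does not fit inside the page")
    buf = bytearray(page)  # work on a copy; the caller's page is untouched
    buf[hole_offset:hole_offset + hole_length] = bytes(hole_length)
    return zlib.compress(bytes(buf))
```

Decompressing gives back a full BLCKSZ page with the hole zero-filled, so the restore path would not need to know where the hole was. A long run of zeros is about the easiest input a compressor can get, which is why this should cost little in the compressed size.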

Thoughts/comments?

Thanks,
Pavan
