Re: pglz performance - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: pglz performance |
Date | |
Msg-id | 20190802144345.d62jtiyyx6r2y73f@development |
In response to | Re: pglz performance (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>) |
Responses | Re: pglz performance |
List | pgsql-hackers |
On Fri, Aug 02, 2019 at 04:45:43PM +0300, Konstantin Knizhnik wrote:
>
>
>On 27.06.2019 21:33, Andrey Borodin wrote:
>>
>>>On 13 May 2019, at 12:14, Michael Paquier <michael@paquier.xyz> wrote:
>>>
>>>Decompression can matter a lot for mostly-read workloads and
>>>compression can become a bottleneck for heavy-insert loads, so
>>>improving compression or decompression should be two separate
>>>problems, not two problems linked. Any improvement in one or the
>>>other, or even both, is nice to have.
>>Here's a patch hacked by Vladimir for compression.
>>
>>Key differences (as far as I see, maybe Vladimir will post a more
>>complete list of optimizations):
>>1. Use functions instead of macro-functions: not surprisingly, they
>>are easier to optimize and put fewer constraints on the compiler.
>>2. More compact hash table: use indexes instead of pointers.
>>3. More robust segment comparison: like memcmp, but returns the index
>>of the first different byte.
>>
>>In a weighted mix of different data (same as for compression), the
>>overall speedup is x1.43 on my machine.
>>
>>The current implementation is integrated into the test_pglz suite for
>>benchmarking purposes[0].
>>
>>Best regards, Andrey Borodin.
>>
>>[0] https://github.com/x4m/test_pglz
>
>It took me some time to understand that your memcpy optimization is
>correct;)
>I have tested different ways of optimizing this fragment of code, but
>failed to outperform your implementation!
>Results on my computer are similar to yours:
>
>Decompressor score (summ of all times):
>NOTICE: Decompressor pglz_decompress_hacked result 6.627355
>NOTICE: Decompressor pglz_decompress_hacked_unrolled result 7.497114
>NOTICE: Decompressor pglz_decompress_hacked8 result 7.412944
>NOTICE: Decompressor pglz_decompress_hacked16 result 7.792978
>NOTICE: Decompressor pglz_decompress_vanilla result 10.652603
>
>Compressor score (summ of all times):
>NOTICE: Compressor pglz_compress_vanilla result 116.970005
>NOTICE: Compressor pglz_compress_hacked result 89.706105
>
>
>But ... below are results for lz4:
>
>Decompressor score (summ of all times):
>NOTICE: Decompressor lz4_decompress result 3.660066
>Compressor score (summ of all times):
>NOTICE: Compressor lz4_compress result 10.288594
>
>There is a 2 times advantage in decompress speed and a 10 times
>advantage in compress speed.
>So maybe instead of "hacking" the pglz algorithm we should better
>switch to lz4?
>

I think we should just bite the bullet and add an initdb option to pick
the compression algorithm. That's been discussed repeatedly, but we never
ended up actually doing that. See for example [1].

If there's anyone willing to put some effort into getting this feature
over the line, I'm willing to do reviews & commit. It's a seemingly
small change with rather insane potential impact.

But even if we end up doing that, it still makes sense to optimize the
hell out of pglz, because existing systems will still use that
(pg_upgrade can't switch from one compression algorithm to another).

regards

[1] https://www.postgresql.org/message-id/flat/55341569.1090107%402ndquadrant.com

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
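
For readers unfamiliar with item 3 in Andrey's list (a segment comparison that works like memcmp but returns the index of the first different byte), here is a minimal sketch in C. It is not taken from the patch: the name `common_prefix_len` and its signature are hypothetical, and the byte-at-a-time loop is only the simplest form of the idea, not the optimized version being benchmarked.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not from the patch under discussion).
 *
 * Compares two byte ranges like memcmp, but instead of an ordering result
 * it returns the index of the first byte that differs. If the first
 * `limit` bytes are all equal, it returns `limit`.
 */
size_t
common_prefix_len(const unsigned char *a, const unsigned char *b, size_t limit)
{
    size_t i = 0;

    /* Advance while we are inside the limit and the bytes still match. */
    while (i < limit && a[i] == b[i])
        i++;

    return i;
}
```

In an LZ-style compressor the returned index can serve directly as the length of a match against earlier input, which a plain memcmp result (less than, equal, or greater) does not provide.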