On Fri, Jan 6, 2017 at 11:07 PM, Magnus Hagander <magnus@hagander.net> wrote: > A few further notes:
Thanks for the review. > You are using the filemode to gzopen and the mode_compression variable to > set the compression level. The pre-existing code in pg_basebackup uses > gzsetparams(). Is there a particular reason you didn't do it the same way? > > Small comment: > - if (pad_to_size) > + if (pad_to_size && dir_data->compression == 0) > { > /* Always pre-pad on regular files */ > > > That "always" is not true anymore. Commit-time cleanup can be done of that. > > The rest of this looks good to me, but please comment on the gzopen part > before we proceed to commit :)
Yes using gzsetparams() looks cleaner. I actually thought about using the same logic as pg_dump. Attached is an updated patch.
There is something I forgot. With this patch, FindStreamingStart()@pg_receivexlog.c is actually broken. In short it forgets to consider files that have been compressed at the last run of pg_receivexlog and will try to stream changes from the beginning. I can see that gzip -l provides this information... But I have yet to find something in zlib that allows a cheap lookup as startup of streaming should be fast. Looking at how gzip -l does it may be faster than looking at the docs.
Do we really care though? As in, does statup of streaming have to be *that* fast? Even gunziping 16Mb (worst case) doesn't exactly take a long time. If your pg_receivexlog is restarting so often that it becomes a problem, I think you already have another and much bigger problem on your hands.
I found another problem with it -- it is completely broken in sync mode. You need to either forbid sync mode together with compression, or teach dir_sync() about it. The later would sound better, but I wonder how much that's going to kill compression due to the small blocks? Is it a reasonable use-case?