Re: refactoring basebackup.c - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: refactoring basebackup.c |
Date | |
Msg-id | CA+Tgmobm+k0VSeyCa9_TbB9FJfZM6evqfeB7_qC+fMrvUuVvCw@mail.gmail.com Whole thread Raw |
In response to | Re: refactoring basebackup.c (Dipesh Pandit <dipesh.pandit@gmail.com>) |
List | pgsql-hackers |
On Wed, Jan 19, 2022 at 7:16 AM Dipesh Pandit <dipesh.pandit@gmail.com> wrote: > I have added support for decompressing a gzip compressed tar file > at client. pg_basebackup can enable server side compression for > plain format backup with this change. > > Added a gzip extractor which decompresses the compressed archive > and forwards it to the next streamer. I have done initial testing and > working on updating the test coverage. Cool. It's going to need some documentation changes, too. I don't like the way you coded this in CreateBackupStreamer(). I would like the decision about whether to use bbstreamer_gzip_extractor_new(), and/or throw an error about not being able to parse an archive, to based on the file type i.e. "did we get a .tar.gz file?" rather than on whether we asked for server-side compression. Notice that the existing logic checks whether we actually got a .tar file from the server rather than assuming that's what must have happened. As a matter of style, I don't think it's good for the only thing inside of an "if" statement to be another "if" statement. The two could be merged, but we also don't want to have the "if" conditional be too complex. I am imagining that this should end up saying something like if (must_parse_archive && !is_tar && !is_tar_gz) { pg_log_error(... + * "windowBits" must be greater than or equal to "windowBits" value + * provided to deflateInit2 while compressing. It would be nice to clarify why we know the value we're using is safe. Maybe we're using the maximum possible value, in which case you could just add that to the end of the comment: "...so we use the maximum possible value for safety." + /* + * End of the stream, if there is some pending data in output buffers then + * we must forward it to next streamer. + */ + if (res == Z_STREAM_END) { + bbstreamer_content(mystreamer->base.bbs_next, member, mystreamer->base.bbs_buffer.data, + mystreamer->bytes_written, context); + } Uncuddle the brace. It probably doesn't make much difference, but I would be inclined to do the final flush in bbstreamer_gzip_extractor_finalize() rather than here. That way we rely on our own notion of when there's no more input data rather than zlib's notion. Probably terrible things are going to happen if those two ideas don't match up .... but there might be some other compression algorithm that doesn't return a distinguishing code at end-of-stream. Such an algorithm would have to take care of any leftover data in the finalize function, so I think we should do that here too, so the code can be similar in all cases. Perhaps we should move all the gzip stuff to a new file bbstreamer_gzip.c. -- Robert Haas EDB: http://www.enterprisedb.com
pgsql-hackers by date: