[HACKERS] pg_basebackup: Allow use of arbitrary compression program - Mailing list pgsql-hackers

From Michael Harris
Subject [HACKERS] pg_basebackup: Allow use of arbitrary compression program
Msg-id CADofcAX2=f=hW7-E_sVWObXRN80t0rvMeDr43WbZBrGx-v6y2w@mail.gmail.com
Responses Re: [HACKERS] pg_basebackup: Allow use of arbitrary compression program  (Jeff Janes <jeff.janes@gmail.com>)
Re: [HACKERS] pg_basebackup: Allow use of arbitrary compression program  (Magnus Hagander <magnus@hagander.net>)
List pgsql-hackers
Hello,

Back in PostgreSQL 9.2, we patched a private copy of pg_basebackup to
add a command-line option that lets the user specify an arbitrary
external program (optionally with arguments) to be used to compress
the tar backup.

Our motivation was to use pigz (a parallel gzip implementation) to
speed up compression. The option also allows tools such as bzip2 or
xz to be used instead of the built-in zlib.

I never ended up submitting that upstream, but now it looks like I
will have to repeat the exercise for 9.6, so I was wondering if such a
feature would be welcomed.

I found one or two references to people asking for this, e.g.:
https://www.commandprompt.com/blog/a_pg_basebackup_wish_list/

To do it properly would require:

1) Adding a command-line option as follows:
 -C, --compressprog=PROG    use supplied program for compression

2) The current logic either uses zlib, if it was compiled in, or
offers no compression at all; this is controlled by a series of
#ifdef/#endif blocks. I would prefer that the user be able to choose
between zlib and an external program without having to recompile, so
I would remove the #ifdefs and replace them with run-time branching.

3) When opening the output file, if the -C option was used, use
popen() to start a child process and write the tar stream to that.
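
To make the three steps concrete, here is a rough sketch in C. It is
not taken from pg_basebackup: the names parse_compressprog, TarOutput,
open_tar_output and close_tar_output are made up for illustration, the
zlib branch is left out, and sending the compressor's output to the
target file through a shell redirect is just one assumed way of wiring
up the child process.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for popen()/pclose() */
#endif
#include <assert.h>
#include <getopt.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical types and names; a sketch only, not pg_basebackup code. */
typedef enum { OUT_PLAIN, OUT_PIPE } OutputKind;

typedef struct
{
    FILE       *fp;     /* stream the tar data is written to */
    OutputKind  kind;   /* decides how the stream must be closed */
} TarOutput;

/* Step 1: pick up -C / --compressprog=PROG; returns NULL if not given. */
static const char *
parse_compressprog(int argc, char **argv)
{
    static const struct option long_options[] = {
        {"compressprog", required_argument, NULL, 'C'},
        {NULL, 0, NULL, 0}
    };
    const char *prog = NULL;
    int         c;

    optind = 1;         /* restart scanning so the function is reusable */
    while ((c = getopt_long(argc, argv, "C:", long_options, NULL)) != -1)
    {
        if (c == 'C')
            prog = optarg;
    }
    return prog;
}

/* Steps 2 and 3: decide at run time where the tar stream goes. */
static int
open_tar_output(TarOutput *out, const char *path, const char *compressprog)
{
    assert(out != NULL && path != NULL);

    if (compressprog != NULL)
    {
        char    cmd[1024];
        int     n;

        /*
         * Let the shell redirect the compressor's stdout to the file.
         * The quoting is naive: a path containing ' would break it.
         */
        n = snprintf(cmd, sizeof(cmd), "%s > '%s'", compressprog, path);
        if (n < 0 || n >= (int) sizeof(cmd))
            return -1;
        out->fp = popen(cmd, "w");
        out->kind = OUT_PIPE;
    }
    else
    {
        out->fp = fopen(path, "wb");
        out->kind = OUT_PLAIN;
    }
    return (out->fp != NULL) ? 0 : -1;
}

/* A popen() stream must be closed with pclose(), not fclose(). */
static int
close_tar_output(TarOutput *out)
{
    int     rc;

    rc = (out->kind == OUT_PIPE) ? pclose(out->fp) : fclose(out->fp);
    out->fp = NULL;
    return rc;
}
```

With something like this, running with -C "pigz -p 4" would write the
tar data to the pigz child, and pclose() would hand back the child's
wait status so that a failed compressor can be reported.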

My questions are:
- Has anything like this already been discussed?
- Would this be a welcome contribution?
- Can anyone see any problems with the above approach?

Thanks!

Regards
Mike Harris


