Thread: Simplifications for error messages related to compression

Simplifications for error messages related to compression

From
Michael Paquier
Date:
Hi all,

While looking at a different patch, I have noticed that the error
messages produced by pg_basebackup and pg_dump are a bit inconsistent
with the other others.  Why not switching all of them as of the
attached?  This reduces the overall translation effort, using more:
"this build does not support compression with %s"

Thoughts or objections?
--
Michael

Attachment

Re: Simplifications for error messages related to compression

From
Michael Paquier
Date:
On Mon, Dec 19, 2022 at 02:42:13PM +0900, Michael Paquier wrote:
> Thoughts or objections?

Hearing nothing, done..
--
Michael

Attachment

Re: Simplifications for error messages related to compression

From
Justin Pryzby
Date:
On Wed, Dec 21, 2022 at 10:43:19AM +0900, Michael Paquier wrote:
> On Mon, Dec 19, 2022 at 02:42:13PM +0900, Michael Paquier wrote:
> > Thoughts or objections?
> 
> Hearing nothing, done..

-                       pg_fatal("not built with zlib support");
+                       pg_fatal("this build does not support compression with %s", "gzip");

I tried to say in the other thread that gzip != zlib.

This message may be better for translation, but (for WriteDataToArchive
et al) the message is now less accurate, and I suspect will cause some
confusion.

5e73a6048 introduced a similar user-facing issue: pg_dump -Fc -Z gzip
does not output a gzip.

$ ./tmp_install/usr/local/pgsql/bin/pg_dump -h /tmp -Fc -Z gzip regression |xxd |head
00000000: 5047 444d 5001 0e00 0408 0101 0100 0000  PGDMP...........

I'm okay with it if you think this is no problem - maybe it's enough to
document that the output is zlib and not gzip.

Otherwise, one idea was to reject "gzip" with -Fc.  It could accept
integers only.

BTW I think it's helpful to include the existing participants when
forking a thread.

-- 
Justin



Re: Simplifications for error messages related to compression

From
Michael Paquier
Date:
On Tue, Dec 20, 2022 at 08:29:32PM -0600, Justin Pryzby wrote:
> -                       pg_fatal("not built with zlib support");
> +                       pg_fatal("this build does not support compression with %s", "gzip");
>
> I tried to say in the other thread that gzip != zlib.
>
> This message may be better for translation, but (for WriteDataToArchive
> et al) the message is now less accurate, and I suspect will cause some
> confusion.

Compression specifications use this term, so, like bbstreamer_gzip.c,
that does not sound like a big difference to me as everything depends
on HAVE_LIBZ, still we use gzip for all the user-facing terms.

> 5e73a6048 introduced a similar user-facing issue: pg_dump -Fc -Z gzip
> does not output a gzip.

We've never mentioned any compression method in the past docs, just
that things can be compressed.

> $ ./tmp_install/usr/local/pgsql/bin/pg_dump -h /tmp -Fc -Z gzip regression |xxd |head
> 00000000: 5047 444d 5001 0e00 0408 0101 0100 0000  PGDMP...........
>
> I'm okay with it if you think this is no problem - maybe it's enough to
> document that the output is zlib and not gzip.

Perhaps.

> Otherwise, one idea was to reject "gzip" with -Fc.  It could accept
> integers only.

I am not sure what we would gain by doing that, except complications
with the code surrounding the handling of compression specifications,
which is a backward-compatible thing as it can handle integer-only
inputs.

> BTW I think it's helpful to include the existing participants when
> forking a thread.

Err, okay..  Sorry about that.
--
Michael

Attachment

Re: Simplifications for error messages related to compression

From
Justin Pryzby
Date:
On Wed, Dec 21, 2022 at 01:52:21PM +0900, Michael Paquier wrote:
> On Tue, Dec 20, 2022 at 08:29:32PM -0600, Justin Pryzby wrote:
> > -                       pg_fatal("not built with zlib support");
> > +                       pg_fatal("this build does not support compression with %s", "gzip");
> > 
> > I tried to say in the other thread that gzip != zlib.
> >
> > This message may be better for translation, but (for WriteDataToArchive
> > et al) the message is now less accurate, and I suspect will cause some
> > confusion.
> 
> Compression specifications use this term, so, like bbstreamer_gzip.c,

Yes, and its current users (basebackup) output a gzip file, right ?

pg_dump -Fc doesn't output a gzip file, but now it's using user-facing
compression specifications referring to it as "gzip".

> that does not sound like a big difference to me as everything depends
> on HAVE_LIBZ, still we use gzip for all the user-facing terms.

postgres is using -lz to write both gzip files and non-gzip libz files;
its associated compiletime constant has nothing to do with which header
format is being output.

    * This file includes two APIs for dealing with compressed data. The first
    * provides more flexibility, using callbacks to read/write data from the
    * underlying stream. The second API is a wrapper around fopen/gzopen and
    * friends, providing an interface similar to those, but abstracts away
    * the possible compression. Both APIs use libz for the compression, but
    * the second API uses gzip headers, so the resulting files can be easily
    * manipulated with the gzip utility.

> > 5e73a6048 introduced a similar user-facing issue: pg_dump -Fc -Z gzip
> > does not output a gzip.
> 
> We've never mentioned any compression method in the past docs, just
> that things can be compressed.

What do you mean ?

The commit added:
+        The compression method can be set to <literal>gzip</literal> or ...

And the docs still say:

-        For plain text output, setting a nonzero compression level causes
-        the entire output file to be compressed, as though it had been
-        fed through <application>gzip</application>; but the default is not to compress.

If you tell someone they can write -Z gzip, they'll be justified in
expecting to see "gzip" as output.

-- 
Justin



Re: Simplifications for error messages related to compression

From
Michael Paquier
Date:
On Tue, Dec 20, 2022 at 11:12:22PM -0600, Justin Pryzby wrote:
> Yes, and its current users (basebackup) output a gzip file, right ?
>
> pg_dump -Fc doesn't output a gzip file, but now it's using user-facing
> compression specifications referring to it as "gzip".

Not all of them are compressed either, like the base TOC file.

> If you tell someone they can write -Z gzip, they'll be justified in
> expecting to see "gzip" as output.

That's the point where my interpretation is different than yours,
where I don't really see as an issue that we do not generate a gzip
file all the time in the output.  Honestly, I am not sure that there
is anything to win here by not using the same option interface for all
the binaries or have tweaks to make pg_dump cope with that (like using
zlib as an extra alias).  The custom, directory and tar formats of
pg_dumps have their own idea of the files to compress or not (like
the base TOC file is never compressed so as one can do a pg_restore
-l).
--
Michael

Attachment