Re: refactoring basebackup.c (zstd workers) - Mailing list pgsql-hackers

From Robert Haas
Subject Re: refactoring basebackup.c (zstd workers)
Date
Msg-id CA+TgmoaaDXOT+vFDcPrnXjPTU7oFPY2_pkeQFTdSGJfrqO2LNQ@mail.gmail.com
Whole thread Raw
In response to Re: refactoring basebackup.c  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: refactoring basebackup.c (zstd workers)
Re: refactoring basebackup.c (zstd workers)
List pgsql-hackers
On Thu, Mar 17, 2022 at 3:41 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
> gzip comma

I think it's fine the way it's written. If we made that change, then
we'd have a comma for gzip and not for the other two algorithms. Also,
I'm just moving that sentence, so any change that there is to be made
here is a job for some other patch.

> alphabetical

Fixed.

> -                                                errmsg("unrecognized compression algorithm: \"%s\"",
> +                                                errmsg("unrecognized compression algorithm \"%s\"",
>
> Most other places seem to say "compression method".  So I'd suggest to change
> that here, and in doc/src/sgml/ref/pg_basebackup.sgml.

I'm not sure that's really better, and I don't think this patch is
introducing an altogether novel usage. I think I would probably try to
standardize on algorithm rather than method if I were standardizing
the whole source tree, but I think we can leave that discussion for
another time.

> -       if (o_compression_level && !o_compression)
> +       if (o_compression_detail && !o_compression)
>                 ereport(ERROR,
>                                 (errcode(ERRCODE_SYNTAX_ERROR),
>                                  errmsg("compression level requires compression")));
>
> s/level/detail/

Fixed.


> It'd be great if this were re-usable for wal_compression, which I hope in pg16 will
> support at least level=N.  And eventually pg_dump.  But those clients shouldn't
> accept a client/server prefix.  Maybe the way to handle that is for those tools
> to check locationres and reject it if it was specified.

One thing I forgot to mention in my previous response is that I think
the parsing code is actually well set up for this the way I have it.
server- and client- gets parsed off in a different place than we
interpret the rest, which fits well with your observation that other
cases wouldn't have a client or server prefix.

> sp: sandwich

Fixed.

> star

Fixed.

> should be const ?

OK.

>
> +       /* As a special case, the specification can be a bare integer. */
> +       bare_level = strtol(specification, &bare_level_endp, 10);
>
> Should this call expect_integer_value()?
> See below.

I don't think that would be useful. We have no keyword to pass for the
error message, nor would we use the error message if one got
constructed.

> +                       result->parse_error =
> +                               pstrdup("found empty string where a compression option was expected");
>
> Needs to be localized with _() ?
> Also, document that it's pstrdup'd.

Did the latter. The former would need to be fixed in a bunch of places
and while I'm happy to accept an expert opinion on exactly what needs
to be done here, I don't want to try to do it and do it wrong. Better
to let someone with good knowledge of the subject matter patch it up
later than do a crummy job now.

> -1 isn't great, since it's also an integer, and, also a valid compression level
> for zstd (did you see my message about that?).  Maybe INT_MIN is ok.

It really doesn't matter. Could just return 42. The client shouldn't
use the value if there's an error.

> +{
> +       int             ivalue;
> +       char   *ivalue_endp;
> +
> +       ivalue = strtol(value, &ivalue_endp, 10);
>
> Should this also set/check errno ?
> And check if value != ivalue_endp ?
> See strtol(3)

Even after reading the man page for strtol, it's not clear to me that
this is needed. That page represents checking *endptr != '\0' as
sufficient to tell whether an error occurred. Maybe it wouldn't catch
an out of range value, but in practice all of the algorithms we
support now and any we support in the future are going to catch
something clamped to LONG_MIN or LONG_MAX as out of range and display
the correct error message. What's your specific thinking here?

> +       unsigned        options;                /* OR of BACKUP_COMPRESSION_OPTION constants */
>
> Should be "unsigned int" or "bits32" ?

I do not see why either of those would be better.

> The server crashes if I send an unknown option - you should hit that in the
> regression tests.

Turns out I was testing this on the client side but not the server
side. Fixed and added more tests.

v2 attached.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Attachment

pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: support for MERGE
Next
From: Tom Lane
Date:
Subject: Re: refactoring basebackup.c (zstd workers)