On 6/16/25 19:56, Tom Lane wrote:
> ...
>
> After looking around a bit more I realized that this API is a complete
> disaster: it's not only bug-prone, but there are actual bugs in
> multiple callers, eg _ReadBuf() totally fails to detect early EOF as
> it intends to. We need to fix it, not slap some band-aids on.
> Draft patch attached.
>
> The write_func side is a bit of a mess too. I fixed some obvious API
> violations in the attached, but I think we need to do more, because
> it seems that zero thought was given to reporting errors sanely.
> The callers all assume that strerror() is what to use to report an
> incomplete write, but this is not appropriate in every case, for
> instance we need to pay attention to gzerror() in gzip cases.
> I'm inclined to think that the best answer is to move error reporting
> into the write_funcs, and just make the API spec be "write or die".
> I've not tackled that here though.
>
> I was depressed to find that Zstd_getc reads one byte into
> an otherwise-uninitialized int and then returns the whole int.
> Even if we're lucky enough for the variable to start off zero,
> this would fail utterly on big-endian machines. The only
> reason we've not noticed is that the regression tests do not
> reach Zstd_getc, nor most of the other getc_func implementations.
>
> Now I'm afraid to look at the rest of the compress_io.h API.
> If it's as badly written as these parts, there are more bugs
> to be found.
>
Well, that doesn't sound great :-( Most of this is likely my fault, as
the one who pushed e9960732a961 (and some commits that built on it).
Sorry about that. I'll take a fresh look at the commits this week, but
considering I missed the issues before commit ...
For a moment I was worried about breaking ABI when fixing this in the
backbranches, but I guess that's not an issue for tools like pg_dump.
regards
--
Tomas Vondra