Thread: hash join error improvement (old)

hash join error improvement (old)

From

Alvaro Herrera

Date:

26 May 2020, 02:02:20

I recently noticed this in a customer log file:
  ERROR:  could not read from hash-join temporary file: Success

The problem is we're reporting with %m when the problem is a partial
read or write.

I propose the attached patch to solve it: report "wrote only X of X
bytes".  This caused a lot of other trouble, the cause of which I've
been unable to pinpoint as yet.  But in the meantime, this is already a
small improvement.

-- 
Álvaro Herrera

Attachment

hashjoin.patch

Re: hash join error improvement (old)

From

Tom Lane

Date:

26 May 2020, 02:33:12

Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> I recently noticed this in a customer log file:
>   ERROR:  could not read from hash-join temporary file: Success
> The problem is we're reporting with %m when the problem is a partial
> read or write.
> I propose the attached patch to solve it: report "wrote only X of X
> bytes".  This caused a lot of other trouble, the cause of which I've
> been unable to pinpoint as yet.  But in the meantime, this is already a
> small improvement.

+1 for the idea, but there's still one small problem with what you did
here: errcode_for_file_access() looks at errno, which we can presume
will not have a relevant value in the "wrote only %d bytes" paths.

Most places that are taking pains to deal with this scenario
do something like

    errno = 0;
    if (write(fd, data, len, xlrec->offset) != len)
    {
        /* if write didn't set errno, assume problem is no disk space */
        if (errno == 0)
            errno = ENOSPC;
        ereport(ERROR,
                (errcode_for_file_access(),
                 errmsg("could not write to file \"%s\": %m", path)));
    }

I don't mind if you want to extend that paradigm to also use "wrote only
%d bytes" wording, but the important point is to get the SQLSTATE set on
the basis of ENOSPC rather than whatever random value errno will have
otherwise.

            regards, tom lane

Re: hash join error improvement (old)

From

Alvaro Herrera

Date:

26 May 2020, 16:55:50

Hi Tom, thanks for looking.

On 2020-May-25, Tom Lane wrote:

> I don't mind if you want to extend that paradigm to also use "wrote only
> %d bytes" wording, but the important point is to get the SQLSTATE set on
> the basis of ENOSPC rather than whatever random value errno will have
> otherwise.

Hmm, right -- I was extending the partial read case to apply to a
partial write, and we deal with those very differently.  I changed the
write case to use our standard approach.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

hashjoin.patch

Re: hash join error improvement (old)

From

Tom Lane

Date:

26 May 2020, 17:43:31

Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Hmm, right -- I was extending the partial read case to apply to a
> partial write, and we deal with those very differently.  I changed the
> write case to use our standard approach.

Actually ... looking more closely, this proposed change in
ExecHashJoinSaveTuple flat out doesn't work, because it assumes that
BufFileWrite reports errors the same way as write(), which is not the
case.  In particular, written < 0 can't happen; moreover, you've
removed detection of a short write as opposed to a completely failed
write.

Digging further down, it looks like BufFileWrite calls BufFileDumpBuffer
which calls FileWrite which takes pains to set errno correctly after a
short write --- so other than the lack of commentary about these
functions' error-reporting API, I don't think there's any actual bug here.
Are you sure you correctly identified the source of the bogus error
report?

Similarly, I'm afraid you introduced rather than removed problems
in ExecHashJoinGetSavedTuple.  BufFileRead doesn't use negative
return values either.

            regards, tom lane

Re: hash join error improvement (old)

From

Alvaro Herrera

Date:

26 May 2020, 22:17:10

On 2020-May-26, Tom Lane wrote:

> Digging further down, it looks like BufFileWrite calls BufFileDumpBuffer
> which calls FileWrite which takes pains to set errno correctly after a
> short write --- so other than the lack of commentary about these
> functions' error-reporting API, I don't think there's any actual bug here.

Doh, you're right, this patch is completely broken ... aside from
carelessly writing the wrong "if" test, my unfamiliarity with the stdio
fread/fwrite interface is showing.  I'll look more carefully.

> Are you sure you correctly identified the source of the bogus error
> report?

Nope.  And I wish the bogus error report was all there was to it.  The
actual problem is a server crash.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: hash join error improvement (old)

From

Alvaro Herrera

Date:

27 May 2020, 01:27:53

On 2020-May-26, Tom Lane wrote:

> Are you sure you correctly identified the source of the bogus error
> report?

This version's better.  It doesn't touch the write side at all.
On the read side, only report a short read as such if errno's not set.

This error isn't frequently seen.  This page
https://blog.csdn.net/pg_hgdb/article/details/106279303
(A Postgres fork; blames the error on the temp hash files being encrypted,
suggests to increase temp_buffers) is the only one I found.

There are more uses of BufFileRead that don't bother to distinguish
these two cases apart, though -- logtape.c, tuplestore.c,
gistbuildbuffers.c all do the same.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

hashjoin-3.patch

Re: hash join error improvement (old)

From

Tom Lane

Date:

27 May 2020, 04:30:21

Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> There are more uses of BufFileRead that don't bother to distinguish
> these two cases apart, though -- logtape.c, tuplestore.c,
> gistbuildbuffers.c all do the same.

Yeah.  I rather suspect that callers of BufFileRead/Write are mostly
expecting that those functions will throw an ereport() for any interesting
error condition.  Maybe we should make it so, instead of piecemeal fixing
the callers?

            regards, tom lane

Re: hash join error improvement (old)

From

Thomas Munro

Date:

27 May 2020, 04:46:54

On Wed, May 27, 2020 at 1:30 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > There are more uses of BufFileRead that don't bother to distinguish
> > these two cases apart, though -- logtape.c, tuplestore.c,
> > gistbuildbuffers.c all do the same.
>
> Yeah.  I rather suspect that callers of BufFileRead/Write are mostly
> expecting that those functions will throw an ereport() for any interesting
> error condition.  Maybe we should make it so, instead of piecemeal fixing
> the callers?

Yeah.  I proposed that over here:

https://www.postgresql.org/message-id/CA+hUKGK0w+GTs8aDvsKDVu7cFzSE5q+0NP_9kPSxg2NA1NeZew@mail.gmail.com

But I got stuck trying to figure out whether to back-patch (arguably
yes: there are bugs here, but arguably no: the interfaces change).