Home > mailing lists

Re: BUG #17619: AllocSizeIsValid violation in parallel hash join - Mailing list pgsql-bugs

From	Dmitry Astapov
Subject	Re: BUG #17619: AllocSizeIsValid violation in parallel hash join
Date	September 22, 2022 16:38:00
Msg-id	CAFQUnFhEFTaMafweOG2P7FNToLeiDFrsYRfSBk7TrrkTv+3fMA@mail.gmail.com Whole thread
In response to	Re: BUG #17619: AllocSizeIsValid violation in parallel hash join (Dmitry Astapov <dastapov@gmail.com>)
Responses	Re: BUG #17619: AllocSizeIsValid violation in parallel hash join
List	pgsql-bugs

Tree view

And I think that if on line 322 should be ">" and not ">=" as well.

On Thu, 22 Sep 2022, 17:11 Dmitry Astapov, <dastapov@gmail.com> wrote:

Thomas, I think I know what is going on, and it is happening in sts_puttuple when it needs to write 32760 bytes exacly. All references below a taken in the tag REL_13_8.

Lets say that we enter sts_puttuple and on line 321 we compute a size of 32760 bytes. Lets say that buffer is currently partially filled so a write of this size would not fit.

We proceed to flush chunk on line 338, which resets the write pointer to 8 bytes past the start of SharedTuplestoreChunk, leaving exactly 32760 bytes free.

We take the if on line 342 (the one with the comment "It still might not be enough space", and then proceed to write whole 32760 bytes and reach the line 373 with size == 0.

It looks like this codepath is written with assumption that size >0 here and sts_flush_chunk would be called immediately after inside the loop on line 375. But because size == 0, loop will not execute, sts_flush_chunk will not be called, and since we haven't updated write pointer so far, it will stay at the beginning of the chunk. At this point we exit sts_puttuple, and next call of sts_putttuple would overwrite the tuple we just memcopied into the chank. If this new tuple is shorter, then remaining part of the chunk would be garbage from the overwritten tuple.

I think >= on line 342 should be just > . I tried this change locally, and it fixed the issue for me.

Do you agree with my analysis?

On Thu, Sep 22, 2022 at 1:51 PM Dmitry Astapov <dastapov@gmail.com> wrote:

On Thu, Sep 22, 2022 at 9:44 AM Thomas Munro <thomas.munro@gmail.com> wrote:
On Thu, Sep 22, 2022 at 7:46 PM PG Bug reporting form
<noreply@postgresql.org> wrote:
> (gdb) p size
> $2 = 1702125924

Thanks for the detailed report. Hmm. That size, on a little-endian
system, is equivalent to the byte sequence "date\0\0\0\0", which looks
pretty suspiciously like the inside of a tuple, and not its size. We
must have got out of sync somehow.

Great catch! I added some debugging output to the BufFileRead and sts_read_tuple and it looks to me that the sequence of events immediately before the problem happens is like this:

sts_read_tuple is called multiple times to read from a particular file, and a bunch of tuples are read this way
sts_read_tuple is called once more. In my experiments, I always had read_ntuples_available - 2 == read_ntuples at this point, but maybe this is just a coincidence.
inside this sts_read_tuple_call, BufFileRead is called to read metadata, and succeeds (and this chunk of the file looks -- to my naive eye -- exactly like any other tuple start)
Second BufFileRead call happens to get the size, and reads the correct size (I see what's in the file, I could find corresponding values in the database tables, and it all lines up).
Third BufFileRead happens to read the actual tuple. So far so good, and sts_read_tuple finishes.
At this point, the position in the file (buf_file -> currOffset + post) points right after the last tuple read, but what's there does not look like the start of the next tuple to me. It seems like there was another, longer tuple written there before (with quite long rhs.payload value) and somehow the tuple that was just read was written over it, so we have "<tuple read on step 2 above><current read position here><what looks like leftovers from another tuple><a bunch of zeroes><new tuple start>"
So another sts_read_tuple call happens, and it reads 4 bytes of this garbage as metadata, and the next 4 bytes as length, and this is when everything derails...
I also double-checked at the logs of the (test) database where this issue was first spotted (and where a lot of queries similar to the one I am using to trigger this issue are happening) and found out that along with "invalid memory alloc request size" we also got a couple of occasions of "Unexpected chunk in shared tuplestore temporary file. Expected overflow chunk", which looks like another error that could be potentially explained by BufFileRead's inside sts_read_tuple reading "garbage".

I'm happy to build a patched version of 13.8 (or another revision if you want me to) to collect more debug information for you if this could help.

--
D. Astapov

--
D. Astapov

pgsql-bugs by date:

From: PG Bug reporting form
Date: 22 September 2022, 16:34:28
Subject: BUG #17621: PostgreSQL 15 Windows installer still downloads Beta 3

From: Alvaro Herrera
Date: 22 September 2022, 17:31:02
Subject: Re: Regression in pipeline mode in libpq 14.5

Re: BUG #17619: AllocSizeIsValid violation in parallel hash join - Mailing list pgsql-bugs

Previous

Next