Home > mailing lists

Re: pgsql: Avoid valgrind complaint about write() of uninitalized bytes. - Mailing list pgsql-hackers

From	Peter Geoghegan
Subject	Re: pgsql: Avoid valgrind complaint about write() of uninitalized bytes.
Date	February 22, 2018 00:55:20
Msg-id	CAH2-Wz=SRX3uPYghPoP4oQwq8zwyBxCuA+zEpAk2PoCR_eOaMg@mail.gmail.com Whole thread Raw
In response to	Re: pgsql: Avoid valgrind complaint about write() of uninitalizedbytes. (Andres Freund <andres@anarazel.de>)
Responses	Re: pgsql: Avoid valgrind complaint about write() of uninitalized bytes. Re: pgsql: Avoid valgrind complaint about write() of uninitalized bytes.
List	pgsql-hackers

Tree view

On Tue, Feb 20, 2018 at 9:34 PM, Andres Freund <andres@anarazel.de> wrote:
> Doesn't appear to have fixed the problem entirely:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2018-02-20%2017%3A10%3A01
>
> relevant excerpt:
> ==12452== Syscall param write(buf) points to uninitialised byte(s)
> ==12452==    at 0x4E49C64: write (write.c:26)
> ==12452==    by 0x4BF8BF: FileWrite (fd.c:2017)
> ==12452==    by 0x4C1B69: BufFileDumpBuffer (buffile.c:513)
> ==12452==    by 0x4C1C61: BufFileFlush (buffile.c:657)
> ==12452==    by 0x4C21D6: BufFileRead (buffile.c:561)
> ==12452==    by 0x63ADA8: ltsReadBlock (logtape.c:274)
> ==12452==    by 0x63AEF6: ltsReadFillBuffer (logtape.c:304)
> ==12452==    by 0x63B560: LogicalTapeRewindForRead (logtape.c:771)
> ==12452==    by 0x640567: mergeruns (tuplesort.c:2671)
> ==12452==    by 0x645122: tuplesort_performsort (tuplesort.c:1866)
> ==12452==    by 0x23357D: _bt_parallel_scan_and_sort (nbtsort.c:1626)
> ==12452==    by 0x234F71: _bt_parallel_build_main (nbtsort.c:1527)

> Note that the above path doesn't appear to go through
> LogicalTapeFreeze(), therfore not hitting the VALGRIND_MAKE_MEM_DEFINED
> added in the above commit.

Sure, but it looks like it has the exact same underlying cause to the
LogicalTapeFreeze() issue. It shouldn't be very hard to write an
equivalent patch for LogicalTapeRewindForRead() -- I pointed out that
this could happen there instead before the first patch went in, in
fact. My mistake was to imagine that that could never happen during
the regression tests.

I think I see why this happened, and why it happens inconsistently on
skink. Because memtuples is generally much smaller during merging than
it is during initial run generation, you can have enough tuples to
have to spill a run to disk, and yet not enough to fully fill even a
small buffer during the subsequent merge (logtape.c buffers also don't
have per-tuple palloc() header overhead, which is a huge proportion of
total memory used for an int4 primary key CREATE INDEX). This failure
is sensitive to the scheduling of worker processes, which is why we
only see it occasionally on skink.

The multiple-row-versions isolation test, which is where we see the
failure, seems to want to use multiple parallel workers for its ALTER
TABLE ... ADD PRIMARY KEY, in part because the table fillfactor is
rather low -- you end up with a case that gets a parallel sort, and
yet still has very few tuples to sort, a bit like the
LogicalTapeFreeze() issue already fixed.

Should we even be doing a parallel external sort here? It's hardly
part of an intentional effort to test the code. I had to push to get
us to give external sorts test coverage at one point about 18 months
ago, because of concerns about the overhead/duration of external
sorts. Not that I feel strongly about this myself.

-- 
Peter Geoghegan

pgsql-hackers by date:

From: Andres Freund
Date: 22 February 2018, 00:14:06
Subject: Re: Add PGDLLIMPORT to enable_hashagg

From: "Rady, Doug"
Date: 22 February 2018, 01:11:15
Subject: Re: [PATCH] pgbench - refactor some connection finish/null intocommon function

Re: pgsql: Avoid valgrind complaint about write() of uninitalized bytes. - Mailing list pgsql-hackers

Previous

Next