Thread: Re: pgsql: Avoid valgrind complaint about write() of uninitalizedbytes.

Re: pgsql: Avoid valgrind complaint about write() of uninitalizedbytes.

From

Andres Freund

Date:

21 February 2018, 11:34:26

On 2018-02-06 19:25:04 +0000, Robert Haas wrote:
> Avoid valgrind complaint about write() of uninitalized bytes.
> 
> LogicalTapeFreeze() may write out its first block when it is dirty but
> not full, and then immediately read the first block back in from its
> BufFile as a BLCKSZ-width block.  This can only occur in rare cases
> where very few tuples were written out, which is currently only
> possible with parallel external tuplesorts.  To avoid valgrind
> complaints, tell it to treat the tail of logtape.c's buffer as
> defined.
> 
> Commit 9da0cc35284bdbe8d442d732963303ff0e0a40bc exposed this problem
> but did not create it.  LogicalTapeFreeze() has always tended to write
> out some amount of garbage bytes, but previously never wrote less than
> one block of data in total, so the problem was masked.
> 
> Per buildfarm members lousyjack and skink.
> 
> Peter Geoghegan, based on a suggestion from Tom Lane and me.  Some
> comment revisions by me.

Doesn't appear to have fixed the problem entirely:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2018-02-20%2017%3A10%3A01

relevant excerpt:
==12452== Syscall param write(buf) points to uninitialised byte(s)
==12452==    at 0x4E49C64: write (write.c:26)
==12452==    by 0x4BF8BF: FileWrite (fd.c:2017)
==12452==    by 0x4C1B69: BufFileDumpBuffer (buffile.c:513)
==12452==    by 0x4C1C61: BufFileFlush (buffile.c:657)
==12452==    by 0x4C21D6: BufFileRead (buffile.c:561)
==12452==    by 0x63ADA8: ltsReadBlock (logtape.c:274)
==12452==    by 0x63AEF6: ltsReadFillBuffer (logtape.c:304)
==12452==    by 0x63B560: LogicalTapeRewindForRead (logtape.c:771)
==12452==    by 0x640567: mergeruns (tuplesort.c:2671)
==12452==    by 0x645122: tuplesort_performsort (tuplesort.c:1866)
==12452==    by 0x23357D: _bt_parallel_scan_and_sort (nbtsort.c:1626)
==12452==    by 0x234F71: _bt_parallel_build_main (nbtsort.c:1527)
==12452==  Address 0xc3cd368 is 744 bytes inside a block of size 8,272 client-defined
==12452==    at 0x6362D1: palloc (mcxt.c:858)
==12452==    by 0x4C1DA6: BufFileCreateShared (buffile.c:249)
==12452==    by 0x63B062: LogicalTapeSetCreate (logtape.c:571)
==12452==    by 0x64489A: inittapes (tuplesort.c:2419)
==12452==    by 0x644B1C: puttuple_common (tuplesort.c:1695)
==12452==    by 0x644E01: tuplesort_putindextuplevalues (tuplesort.c:1545)
==12452==    by 0x233391: _bt_spool (nbtsort.c:514)
==12452==    by 0x2333CA: _bt_build_callback (nbtsort.c:574)
==12452==    by 0x286004: IndexBuildHeapRangeScan (index.c:2879)
==12452==    by 0x286366: IndexBuildHeapScan (index.c:2419)
==12452==    by 0x23356F: _bt_parallel_scan_and_sort (nbtsort.c:1615)
==12452==    by 0x234F71: _bt_parallel_build_main (nbtsort.c:1527)
==12452==  Uninitialised value was created by a heap allocation
==12452==    at 0x6362D1: palloc (mcxt.c:858)
==12452==    by 0x63B20E: LogicalTapeWrite (logtape.c:634)
==12452==    by 0x63EA46: writetup_index (tuplesort.c:4206)
==12452==    by 0x643769: dumptuples (tuplesort.c:2994)
==12452==    by 0x64511A: tuplesort_performsort (tuplesort.c:1865)
==12452==    by 0x23357D: _bt_parallel_scan_and_sort (nbtsort.c:1626)
==12452==    by 0x234F71: _bt_parallel_build_main (nbtsort.c:1527)
==12452==    by 0x252E2A: ParallelWorkerMain (parallel.c:1397)
==12452==    by 0x4614DF: StartBackgroundWorker (bgworker.c:841)
==12452==    by 0x46F400: do_start_bgworker (postmaster.c:5741)
==12452==    by 0x46F541: maybe_start_bgworkers (postmaster.c:5954)
==12452==    by 0x4700A6: sigusr1_handler (postmaster.c:5134)
==12452== 
==12452== VALGRINDERROR-END

Note that the above path doesn't appear to go through
LogicalTapeFreeze(), therfore not hitting the VALGRIND_MAKE_MEM_DEFINED
added in the above commit.

Greetings,

Andres Freund

Re: pgsql: Avoid valgrind complaint about write() of uninitalized bytes.

From

Peter Geoghegan

Date:

22 February 2018, 00:55:20

On Tue, Feb 20, 2018 at 9:34 PM, Andres Freund <andres@anarazel.de> wrote:
> Doesn't appear to have fixed the problem entirely:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2018-02-20%2017%3A10%3A01
>
> relevant excerpt:
> ==12452== Syscall param write(buf) points to uninitialised byte(s)
> ==12452==    at 0x4E49C64: write (write.c:26)
> ==12452==    by 0x4BF8BF: FileWrite (fd.c:2017)
> ==12452==    by 0x4C1B69: BufFileDumpBuffer (buffile.c:513)
> ==12452==    by 0x4C1C61: BufFileFlush (buffile.c:657)
> ==12452==    by 0x4C21D6: BufFileRead (buffile.c:561)
> ==12452==    by 0x63ADA8: ltsReadBlock (logtape.c:274)
> ==12452==    by 0x63AEF6: ltsReadFillBuffer (logtape.c:304)
> ==12452==    by 0x63B560: LogicalTapeRewindForRead (logtape.c:771)
> ==12452==    by 0x640567: mergeruns (tuplesort.c:2671)
> ==12452==    by 0x645122: tuplesort_performsort (tuplesort.c:1866)
> ==12452==    by 0x23357D: _bt_parallel_scan_and_sort (nbtsort.c:1626)
> ==12452==    by 0x234F71: _bt_parallel_build_main (nbtsort.c:1527)

> Note that the above path doesn't appear to go through
> LogicalTapeFreeze(), therfore not hitting the VALGRIND_MAKE_MEM_DEFINED
> added in the above commit.

Sure, but it looks like it has the exact same underlying cause to the
LogicalTapeFreeze() issue. It shouldn't be very hard to write an
equivalent patch for LogicalTapeRewindForRead() -- I pointed out that
this could happen there instead before the first patch went in, in
fact. My mistake was to imagine that that could never happen during
the regression tests.

I think I see why this happened, and why it happens inconsistently on
skink. Because memtuples is generally much smaller during merging than
it is during initial run generation, you can have enough tuples to
have to spill a run to disk, and yet not enough to fully fill even a
small buffer during the subsequent merge (logtape.c buffers also don't
have per-tuple palloc() header overhead, which is a huge proportion of
total memory used for an int4 primary key CREATE INDEX). This failure
is sensitive to the scheduling of worker processes, which is why we
only see it occasionally on skink.

The multiple-row-versions isolation test, which is where we see the
failure, seems to want to use multiple parallel workers for its ALTER
TABLE ... ADD PRIMARY KEY, in part because the table fillfactor is
rather low -- you end up with a case that gets a parallel sort, and
yet still has very few tuples to sort, a bit like the
LogicalTapeFreeze() issue already fixed.

Should we even be doing a parallel external sort here? It's hardly
part of an intentional effort to test the code. I had to push to get
us to give external sorts test coverage at one point about 18 months
ago, because of concerns about the overhead/duration of external
sorts. Not that I feel strongly about this myself.

-- 
Peter Geoghegan

Re: pgsql: Avoid valgrind complaint about write() of uninitalized bytes.

From

Peter Geoghegan

Date:

22 February 2018, 00:55:20

On Tue, Feb 20, 2018 at 9:34 PM, Andres Freund <andres@anarazel.de> wrote:
> Doesn't appear to have fixed the problem entirely:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2018-02-20%2017%3A10%3A01
>
> relevant excerpt:
> ==12452== Syscall param write(buf) points to uninitialised byte(s)
> ==12452==    at 0x4E49C64: write (write.c:26)
> ==12452==    by 0x4BF8BF: FileWrite (fd.c:2017)
> ==12452==    by 0x4C1B69: BufFileDumpBuffer (buffile.c:513)
> ==12452==    by 0x4C1C61: BufFileFlush (buffile.c:657)
> ==12452==    by 0x4C21D6: BufFileRead (buffile.c:561)
> ==12452==    by 0x63ADA8: ltsReadBlock (logtape.c:274)
> ==12452==    by 0x63AEF6: ltsReadFillBuffer (logtape.c:304)
> ==12452==    by 0x63B560: LogicalTapeRewindForRead (logtape.c:771)
> ==12452==    by 0x640567: mergeruns (tuplesort.c:2671)
> ==12452==    by 0x645122: tuplesort_performsort (tuplesort.c:1866)
> ==12452==    by 0x23357D: _bt_parallel_scan_and_sort (nbtsort.c:1626)
> ==12452==    by 0x234F71: _bt_parallel_build_main (nbtsort.c:1527)

> Note that the above path doesn't appear to go through
> LogicalTapeFreeze(), therfore not hitting the VALGRIND_MAKE_MEM_DEFINED
> added in the above commit.

Sure, but it looks like it has the exact same underlying cause to the
LogicalTapeFreeze() issue. It shouldn't be very hard to write an
equivalent patch for LogicalTapeRewindForRead() -- I pointed out that
this could happen there instead before the first patch went in, in
fact. My mistake was to imagine that that could never happen during
the regression tests.

I think I see why this happened, and why it happens inconsistently on
skink. Because memtuples is generally much smaller during merging than
it is during initial run generation, you can have enough tuples to
have to spill a run to disk, and yet not enough to fully fill even a
small buffer during the subsequent merge (logtape.c buffers also don't
have per-tuple palloc() header overhead, which is a huge proportion of
total memory used for an int4 primary key CREATE INDEX). This failure
is sensitive to the scheduling of worker processes, which is why we
only see it occasionally on skink.

The multiple-row-versions isolation test, which is where we see the
failure, seems to want to use multiple parallel workers for its ALTER
TABLE ... ADD PRIMARY KEY, in part because the table fillfactor is
rather low -- you end up with a case that gets a parallel sort, and
yet still has very few tuples to sort, a bit like the
LogicalTapeFreeze() issue already fixed.

Should we even be doing a parallel external sort here? It's hardly
part of an intentional effort to test the code. I had to push to get
us to give external sorts test coverage at one point about 18 months
ago, because of concerns about the overhead/duration of external
sorts. Not that I feel strongly about this myself.

-- 
Peter Geoghegan

Re: pgsql: Avoid valgrind complaint about write() of uninitalized bytes.

From

Peter Geoghegan

Date:

22 February 2018, 02:18:01

On Wed, Feb 21, 2018 at 10:55 AM, Peter Geoghegan <pg@bowt.ie> wrote:
> Sure, but it looks like it has the exact same underlying cause to the
> LogicalTapeFreeze() issue. It shouldn't be very hard to write an
> equivalent patch for LogicalTapeRewindForRead() -- I pointed out that
> this could happen there instead before the first patch went in, in
> fact. My mistake was to imagine that that could never happen during
> the regression tests.

Attached patch does this. I cannot recreate this issue locally, but
this should still fix it on skink.

> Should we even be doing a parallel external sort here? It's hardly
> part of an intentional effort to test the code. I had to push to get
> us to give external sorts test coverage at one point about 18 months
> ago, because of concerns about the overhead/duration of external
> sorts. Not that I feel strongly about this myself.

I suppose that we should commit this patch, even if we subsequently
suppress parallel CREATE INDEX in the multiple-row-versions isolation
test.

-- 
Peter Geoghegan

Attachment

0001-Avoid-another-Valgrind-complaint-about-a-logtape.c-w.patch

Re: pgsql: Avoid valgrind complaint about write() of uninitalized bytes.

From

Peter Geoghegan

Date:

22 February 2018, 02:18:01

On Wed, Feb 21, 2018 at 10:55 AM, Peter Geoghegan <pg@bowt.ie> wrote:
> Sure, but it looks like it has the exact same underlying cause to the
> LogicalTapeFreeze() issue. It shouldn't be very hard to write an
> equivalent patch for LogicalTapeRewindForRead() -- I pointed out that
> this could happen there instead before the first patch went in, in
> fact. My mistake was to imagine that that could never happen during
> the regression tests.

Attached patch does this. I cannot recreate this issue locally, but
this should still fix it on skink.

> Should we even be doing a parallel external sort here? It's hardly
> part of an intentional effort to test the code. I had to push to get
> us to give external sorts test coverage at one point about 18 months
> ago, because of concerns about the overhead/duration of external
> sorts. Not that I feel strongly about this myself.

I suppose that we should commit this patch, even if we subsequently
suppress parallel CREATE INDEX in the multiple-row-versions isolation
test.

-- 
Peter Geoghegan

Attachment

0001-Avoid-another-Valgrind-complaint-about-a-logtape.c-w.patch

Re: pgsql: Avoid valgrind complaint about write() of uninitalized bytes.

From

Robert Haas

Date:

22 February 2018, 20:32:32

On Wed, Feb 21, 2018 at 3:18 PM, Peter Geoghegan <pg@bowt.ie> wrote:
> On Wed, Feb 21, 2018 at 10:55 AM, Peter Geoghegan <pg@bowt.ie> wrote:
>> Sure, but it looks like it has the exact same underlying cause to the
>> LogicalTapeFreeze() issue. It shouldn't be very hard to write an
>> equivalent patch for LogicalTapeRewindForRead() -- I pointed out that
>> this could happen there instead before the first patch went in, in
>> fact. My mistake was to imagine that that could never happen during
>> the regression tests.
>
> Attached patch does this. I cannot recreate this issue locally, but
> this should still fix it on skink.

Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: pgsql: Avoid valgrind complaint about write() of uninitalized bytes.

From

Robert Haas

Date:

22 February 2018, 20:32:32

On Wed, Feb 21, 2018 at 3:18 PM, Peter Geoghegan <pg@bowt.ie> wrote:
> On Wed, Feb 21, 2018 at 10:55 AM, Peter Geoghegan <pg@bowt.ie> wrote:
>> Sure, but it looks like it has the exact same underlying cause to the
>> LogicalTapeFreeze() issue. It shouldn't be very hard to write an
>> equivalent patch for LogicalTapeRewindForRead() -- I pointed out that
>> this could happen there instead before the first patch went in, in
>> fact. My mistake was to imagine that that could never happen during
>> the regression tests.
>
> Attached patch does this. I cannot recreate this issue locally, but
> this should still fix it on skink.

Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: pgsql: Avoid valgrind complaint about write() of uninitalized bytes.

From

Peter Geoghegan

Date:

28 February 2018, 04:33:08

On Thu, Feb 22, 2018 at 6:32 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> Attached patch does this. I cannot recreate this issue locally, but
>> this should still fix it on skink.
>
> Committed.

Looks like it worked.

-- 
Peter Geoghegan

Re: pgsql: Avoid valgrind complaint about write() of uninitalized bytes.

From

Peter Geoghegan

Date:

28 February 2018, 04:33:08

On Thu, Feb 22, 2018 at 6:32 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> Attached patch does this. I cannot recreate this issue locally, but
>> this should still fix it on skink.
>
> Committed.

Looks like it worked.

-- 
Peter Geoghegan