Re: [HACKERS] valgrind errors around dsa.c - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: [HACKERS] valgrind errors around dsa.c
Msg-id CAEepm=0W8u+t52zgQkXvN-1yuCauZCbZmHy7F2ZmxYtj5zEN=A@mail.gmail.com
In response to Re: [HACKERS] valgrind errors around dsa.c  (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses Re: [HACKERS] valgrind errors around dsa.c  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Sat, Apr 8, 2017 at 8:57 AM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Sat, Apr 8, 2017 at 4:49 AM, Andres Freund <andres@anarazel.de> wrote:
>> Hi,
>>
>> newly added tests exercise parallel bitmap scans.  And they trigger
>> valgrind errors:
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2017-04-07%2007%3A10%3A01
>>
>>
>> ==4567== VALGRINDERROR-BEGIN
>> ==4567== Conditional jump or move depends on uninitialised value(s)
>> ==4567==    at 0x5FD62A: check_for_freed_segments (dsa.c:2219)
>> ==4567==    by 0x5FD97E: dsa_get_address (dsa.c:934)
>> ==4567==    by 0x5FDA2A: init_span (dsa.c:1339)
>> ==4567==    by 0x5FE6D1: ensure_active_superblock (dsa.c:1696)
>> ==4567==    by 0x5FEBBD: alloc_object (dsa.c:1452)
>> ==4567==    by 0x5FEBBD: dsa_allocate_extended (dsa.c:693)
>> ==4567==    by 0x3C7A83: pagetable_allocate (tidbitmap.c:1536)
>> ==4567==    by 0x3C7A83: pagetable_create (simplehash.h:342)
>> ==4567==    by 0x3C7A83: tbm_create_pagetable (tidbitmap.c:323)
>> ==4567==    by 0x3C8DAD: tbm_get_pageentry (tidbitmap.c:1246)
>> ==4567==    by 0x3C98A1: tbm_add_tuples (tidbitmap.c:432)
>> ==4567==    by 0x22510C: btgetbitmap (nbtree.c:460)
>> ==4567==    by 0x21A8D1: index_getbitmap (indexam.c:726)
>> ==4567==    by 0x38AD48: MultiExecBitmapIndexScan (nodeBitmapIndexscan.c:91)
>> ==4567==    by 0x37D353: MultiExecProcNode (execProcnode.c:621)
>> ==4567==  Uninitialised value was created by a heap allocation
>> ==4567==    at 0x602FD5: palloc (mcxt.c:872)
>> ==4567==    by 0x5FF73B: create_internal (dsa.c:1242)
>> ==4567==    by 0x5FF8F5: dsa_create_in_place (dsa.c:473)
>> ==4567==    by 0x37CA32: ExecInitParallelPlan (execParallel.c:532)
>> ==4567==    by 0x38C324: ExecGather (nodeGather.c:152)
>> ==4567==    by 0x37D247: ExecProcNode (execProcnode.c:551)
>> ==4567==    by 0x39870F: ExecNestLoop (nodeNestloop.c:156)
>> ==4567==    by 0x37D1B7: ExecProcNode (execProcnode.c:512)
>> ==4567==    by 0x3849D4: fetch_input_tuple (nodeAgg.c:686)
>> ==4567==    by 0x387764: agg_retrieve_direct (nodeAgg.c:2306)
>> ==4567==    by 0x387A11: ExecAgg (nodeAgg.c:2117)
>> ==4567==    by 0x37D217: ExecProcNode (execProcnode.c:539)
>> ==4567==
>>
>> It could be that these are spurious due to shared memory - valgrind
>> doesn't track definedness across processes - but the fact that memory
>> allocated by palloc is the source of the undefined memory makes me doubt
>> that.
>
> Thanks.  Will post a fix for this later today.

Fix attached.

Explanation:  Whenever segments are destroyed because they no longer
contain any live blocks, the shared variable
control->freed_segment_counter advances.  Each attached backend has
its own local variable area->freed_segment_counter, and if it sees
that the former differs from the latter it checks all attached
segments to see if any need to be detached.  I failed to initialise
the backend-local version, with the consequence that if you were very
unlucky (the uninitialised junk happening to equal the shared counter
after a segment had been freed) your backend could fail to detach from
a no-longer-needed segment until another segment was eventually freed,
causing the shared counter to move again.  More likely, it would notice that they
are different because one holds uninitialised junk, perform a spurious
scan for dead segments, and then get them in sync.
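
To illustrate the counter protocol described above, here is a minimal
sketch.  It is a toy model, not the attached patch and not the real
dsa.c code: the Toy* types and toy_* functions are hypothetical names
invented for the example, and copying the shared value at attach time
is an assumption about one reasonable way to initialise the local
counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the shared control block. */
typedef struct ToyControl
{
    size_t      freed_segment_counter;  /* advances when a segment is destroyed */
} ToyControl;

/* Hypothetical stand-in for the backend-local area handle. */
typedef struct ToyArea
{
    ToyControl *control;
    size_t      freed_segment_counter;  /* last shared value this backend saw */
    int         scans;                  /* number of detach scans performed */
} ToyArea;

/* Attach a backend to the shared control block. */
ToyArea *
toy_attach(ToyControl *control)
{
    ToyArea    *area = malloc(sizeof(ToyArea));

    assert(area != NULL);
    area->control = control;
    area->scans = 0;

    /*
     * The important line: give the local counter a defined value.
     * Without it the field holds whatever malloc() returned.
     * (Assumption: start in sync with the shared counter.)
     */
    area->freed_segment_counter = control->freed_segment_counter;
    return area;
}

/* Pretend a segment was destroyed: only the shared counter moves. */
void
toy_free_segment(ToyControl *control)
{
    control->freed_segment_counter++;
}

/*
 * Compare the shared and local counters.  If they differ, "scan" the
 * attached segments (modelled here as a simple tally) and resync.
 * Returns true if a scan was performed.
 */
bool
toy_check_for_freed_segments(ToyArea *area)
{
    size_t      shared = area->control->freed_segment_counter;

    if (area->freed_segment_counter == shared)
        return false;           /* nothing freed since the last check */

    area->scans++;              /* stand-in for detaching dead segments */
    area->freed_segment_counter = shared;
    return true;
}
```

With the local counter initialised, a freshly attached backend performs
no scan until toy_free_segment() has been called, and then performs
exactly one.  If the assignment in toy_attach() is removed, the first
check compares against an indeterminate value, which is the kind of read
valgrind flags in the report above.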

-- 
Thomas Munro
http://www.enterprisedb.com