Re: [HACKERS] valgrind errors around dsa.c - Mailing list pgsql-hackers
From | Thomas Munro |
---|---|
Subject | Re: [HACKERS] valgrind errors around dsa.c |
Date | |
Msg-id | CAEepm=0W8u+t52zgQkXvN-1yuCauZCbZmHy7F2ZmxYtj5zEN=A@mail.gmail.com |
In response to | Re: [HACKERS] valgrind errors around dsa.c (Thomas Munro <thomas.munro@enterprisedb.com>) |
Responses |
Re: [HACKERS] valgrind errors around dsa.c
(Andres Freund <andres@anarazel.de>)
|
List | pgsql-hackers |
On Sat, Apr 8, 2017 at 8:57 AM, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
> On Sat, Apr 8, 2017 at 4:49 AM, Andres Freund <andres@anarazel.de> wrote:
>> Hi,
>>
>> newly added tests exercise parallel bitmap scans. And they trigger
>> valgrind errors:
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2017-04-07%2007%3A10%3A01
>>
>>
>> ==4567== VALGRINDERROR-BEGIN
>> ==4567== Conditional jump or move depends on uninitialised value(s)
>> ==4567==    at 0x5FD62A: check_for_freed_segments (dsa.c:2219)
>> ==4567==    by 0x5FD97E: dsa_get_address (dsa.c:934)
>> ==4567==    by 0x5FDA2A: init_span (dsa.c:1339)
>> ==4567==    by 0x5FE6D1: ensure_active_superblock (dsa.c:1696)
>> ==4567==    by 0x5FEBBD: alloc_object (dsa.c:1452)
>> ==4567==    by 0x5FEBBD: dsa_allocate_extended (dsa.c:693)
>> ==4567==    by 0x3C7A83: pagetable_allocate (tidbitmap.c:1536)
>> ==4567==    by 0x3C7A83: pagetable_create (simplehash.h:342)
>> ==4567==    by 0x3C7A83: tbm_create_pagetable (tidbitmap.c:323)
>> ==4567==    by 0x3C8DAD: tbm_get_pageentry (tidbitmap.c:1246)
>> ==4567==    by 0x3C98A1: tbm_add_tuples (tidbitmap.c:432)
>> ==4567==    by 0x22510C: btgetbitmap (nbtree.c:460)
>> ==4567==    by 0x21A8D1: index_getbitmap (indexam.c:726)
>> ==4567==    by 0x38AD48: MultiExecBitmapIndexScan (nodeBitmapIndexscan.c:91)
>> ==4567==    by 0x37D353: MultiExecProcNode (execProcnode.c:621)
>> ==4567==  Uninitialised value was created by a heap allocation
>> ==4567==    at 0x602FD5: palloc (mcxt.c:872)
>> ==4567==    by 0x5FF73B: create_internal (dsa.c:1242)
>> ==4567==    by 0x5FF8F5: dsa_create_in_place (dsa.c:473)
>> ==4567==    by 0x37CA32: ExecInitParallelPlan (execParallel.c:532)
>> ==4567==    by 0x38C324: ExecGather (nodeGather.c:152)
>> ==4567==    by 0x37D247: ExecProcNode (execProcnode.c:551)
>> ==4567==    by 0x39870F: ExecNestLoop (nodeNestloop.c:156)
>> ==4567==    by 0x37D1B7: ExecProcNode (execProcnode.c:512)
>> ==4567==    by 0x3849D4: fetch_input_tuple (nodeAgg.c:686)
>> ==4567==    by 0x387764: agg_retrieve_direct (nodeAgg.c:2306)
>> ==4567==    by 0x387A11: ExecAgg (nodeAgg.c:2117)
>> ==4567==    by 0x37D217: ExecProcNode (execProcnode.c:539)
>> ==4567==
>>
>> It could be that these are spurious due to shared memory - valgrind
>> doesn't track definedness across processes - but the fact that memory
>> allocated by palloc is the source of the undefined memory makes me doubt
>> that.
>
> Thanks.  Will post a fix for this later today.

Fix attached.

Explanation:  Whenever segments are destroyed because they no longer
contain any live blocks, the shared variable
control->freed_segment_counter advances.  Each attached backend has
its own local variable area->freed_segment_counter, and if it sees
that the former differs from the latter it checks all attached
segments to see if any need to be detached.  I failed to initialise
the backend-local version, with the consequence that if you were very
unlucky (the uninitialised junk happening to equal the shared counter)
your backend could fail to detach from a no-longer-needed segment
until another segment was eventually freed, causing the shared counter
to move again.  More likely, it would notice that they are different
because one holds uninitialised junk, perform a spurious scan for dead
segments, and then get them in sync.

-- 
Thomas Munro
http://www.enterprisedb.com
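
The counter comparison described above can be sketched in a few lines of C. This is only an illustration, not the attached patch and not the real dsa.c code: the sketch_* names, the struct layouts and the scans field are hypothetical, and seeding the backend-local counter from the shared one at attach time is an assumption about what a reasonable initialisation might look like.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the shared control object. */
typedef struct
{
    size_t      freed_segment_counter;  /* advances when a segment is destroyed */
} sketch_control;

/* Hypothetical stand-in for the backend-local area object. */
typedef struct
{
    sketch_control *control;
    size_t      freed_segment_counter;  /* last shared value this backend saw */
    int         scans;                  /* number of segment scans performed */
} sketch_area;

/* Attach to the shared control object, initialising every local field. */
sketch_area *
sketch_attach(sketch_control *control)
{
    sketch_area *area = malloc(sizeof(sketch_area));

    assert(area != NULL);
    area->control = control;
    area->scans = 0;
    /* Assumed fix: without this line the field would hold junk. */
    area->freed_segment_counter = control->freed_segment_counter;
    return area;
}

/* Scan for dead segments only if the shared counter has moved. */
void
sketch_check_for_freed_segments(sketch_area *area)
{
    size_t      shared = area->control->freed_segment_counter;

    if (area->freed_segment_counter != shared)
    {
        area->scans++;          /* placeholder for detaching dead segments */
        area->freed_segment_counter = shared;
    }
}
```

With the local counter initialised, the first check after attaching does no work, and a scan happens only after the shared counter advances. If the initialisation line were missing, the comparison would read an indeterminate value, which is the kind of read valgrind flags in check_for_freed_segments.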