Re: pgsql: Add parallel-aware hash joins. - Mailing list pgsql-hackers

From Andres Freund
Subject Re: pgsql: Add parallel-aware hash joins.
Msg-id 20180104192033.irv4ppf44i2fesze@alap3.anarazel.de
In response to Re: pgsql: Add parallel-aware hash joins.  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: pgsql: Add parallel-aware hash joins.
List pgsql-hackers
On 2018-01-04 12:11:37 -0500, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > On Thu, Jan 4, 2018 at 11:00 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Also, what the devil is happening on skink?
> 
> > So, skink is apparently dying during shutdown of a user-connected
> > backend, and specifically the one that executed the 'tablespace' test.
> 
> Well, yeah, valgrind is burping: the postmaster log is full of
> 
> ==10544== VALGRINDERROR-BEGIN
> ==10544== Syscall param epoll_pwait(sigmask) points to unaddressable byte(s)
> ==10544==    at 0x7011490: epoll_pwait (epoll_pwait.c:42)
> ==10544==    by 0x4BF40B: WaitEventSetWaitBlock (latch.c:1048)
> ==10544==    by 0x4BF40B: WaitEventSetWait (latch.c:1000)
> ==10544==    by 0x3C0B3B: secure_read (be-secure.c:166)
> ==10544==    by 0x3CCD9E: pq_recvbuf (pqcomm.c:963)
> ==10544==    by 0x3CDA07: pq_getbyte (pqcomm.c:1006)
> ==10544==    by 0x4E2A2D: SocketBackend (postgres.c:339)
> ==10544==    by 0x4E444E: ReadCommand (postgres.c:512)
> ==10544==    by 0x4E7588: PostgresMain (postgres.c:4085)
> ==10544==    by 0x4641D0: BackendRun (postmaster.c:4412)
> ==10544==    by 0x467308: BackendStartup (postmaster.c:4084)
> ==10544==    by 0x4675F7: ServerLoop (postmaster.c:1757)
> ==10544==    by 0x4689D4: PostmasterMain (postmaster.c:1365)
> ==10544==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
> ==10544== 
> ==10544== VALGRINDERROR-END
> 
> But (a) this is happening in multiple branches, and (b) we've not
> changed anything near that code in awhile.  I think something broke
> in valgrind itself.

Some packages on skink have been upgraded. It appears that either a libc
or a valgrind change made valgrind stop recognizing that a pointer of 0
is not supposed to point anywhere :(

Let me check whether valgrind accepts multiple suppression files, in
which case I could add a suppression for this error to all
branches. I will also check whether I can reproduce it locally.
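
For illustration only, a suppression entry for the report quoted above
might look roughly like the sketch below. The entry name
epoll_pwait_null_sigmask is made up, and matching on the single frame
fun:epoll_pwait is an assumption about how broad the suppression should
be; the error kind and the parameter name are taken from the log itself.

```
# Hypothetical entry: hide the "epoll_pwait(sigmask) points to
# unaddressable byte(s)" report seen when sigmask is a null pointer.
{
   epoll_pwait_null_sigmask
   Memcheck:Param
   epoll_pwait(sigmask)
   fun:epoll_pwait
}
```

To my knowledge valgrind's --suppressions= option can be repeated, so an
entry like this could sit in a separate file (say, a hypothetical
extra.supp) passed alongside the tree's existing suppression file, but
that should be verified against the installed valgrind version.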

Greetings,

Andres Freund

