Re: pg11.5: ExecHashJoinNewBatch: glibc detected...double free or corruption (!prev) - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: pg11.5: ExecHashJoinNewBatch: glibc detected...double free or corruption (!prev)
Msg-id 20190826085701.y7rfxq3nxh2hzdju@development
In response to Re: pg11.5: ExecHashJoinNewBatch: glibc detected...double free or corruption (!prev)  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
On Mon, Aug 26, 2019 at 02:34:31PM +1200, Thomas Munro wrote:
>On Mon, Aug 26, 2019 at 1:44 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
>> On Mon, Aug 26, 2019 at 01:09:19PM +1200, Thomas Munro wrote:
>> > On Sun, Aug 25, 2019 at 3:15 PM Peter Geoghegan <pg@bowt.ie> wrote:
>> > > I was reminded of this issue from last year, which also appeared to
>> > > involve BufFileClose() and a double-free:
>> > >
>> > > https://postgr.es/m/87y3hmee19.fsf@news-spur.riddles.org.uk
>> > >
>> > > That was a BufFile that was under the control of a tuplestore, so it
>> > > was similar to but different from your case. I suspect it's related.
>> >
>> > Hmm.  tuplestore.c follows the same coding pattern as nodeHashjoin.c:
>> > it always nukes its pointer after calling BufFileClose(), so it
>> > shouldn't be capable of calling it twice for the same pointer, unless
>> > we have two copies of that pointer somehow.
>> >
>> > Merlin reported a double-free, apparently in ExecHashJoin(), not
>> > ExecHashJoinNewBatch() like this report.  Unfortunately that tells us
>> > very little.
>
>Here's another one:
>
>https://www.postgresql.org/message-id/flat/20170601081104.1500.56202%40wrigleys.postgresql.org
>
>Hmm.  Also on RHEL/CentOS 6, and also involving sorting, hashing, and
>BufFileClose(), but this time the glibc double free error is in
>repalloc().
>
>And another one (repeatedly happening):
>
>https://www.postgresql.org/message-id/flat/3976998C-8D3B-4825-9B10-69ECB70A597A%40appnexus.com
>
>Also on RHEL/CentOS 6, this time a sort in one case and a hash join
>in another case.
>
>Of course it's entirely possible that we have a bug here and I'm very
>keen to find it, but I can't help noticing the common factor here is
>that they're all running ancient RHEL 6.x releases, except Merlin who
>didn't say.  Merlin?
>

It'd be interesting to know the exact glibc version for those machines.
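
For anyone who wants to check from C rather than through the package
manager, here is a minimal sketch. gnu_get_libc_version() is a glibc
extension declared in <gnu/libc-version.h>; the helper names
runtime_glibc_version() and report_glibc_version() are made up for
illustration and are not from any of the reports. On RHEL/CentOS,
"rpm -q glibc" should give the full package version as well.

```c
#include <stdio.h>

#if defined(__GLIBC__)
#include <gnu/libc-version.h>   /* glibc extension: gnu_get_libc_version() */
#endif

/*
 * Hypothetical helper: version string of the C library loaded at run
 * time (for example "2.12").  Returns a placeholder when the program
 * was not built against glibc.
 */
const char *
runtime_glibc_version(void)
{
#if defined(__GLIBC__)
    return gnu_get_libc_version();
#else
    return "unknown (not glibc)";
#endif
}

/*
 * Hypothetical helper: write both the build-time and the run-time
 * glibc version to 'out'.  The two can differ if the binary was built
 * on one machine and runs on another.
 */
void
report_glibc_version(FILE *out)
{
#if defined(__GLIBC__)
    fprintf(out, "built against glibc %d.%d\n", __GLIBC__, __GLIBC_MINOR__);
#endif
    fprintf(out, "running with glibc %s\n", runtime_glibc_version());
}
```

Calling report_glibc_version(stdout) from a small main() on each
affected machine would print both values. This only reports the
upstream version number, not distribution patch levels, so the package
version is still worth collecting.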


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 