Re: pgsql: Generational memory allocator - Mailing list pgsql-committers

From Tomas Vondra
Subject Re: pgsql: Generational memory allocator
Msg-id bf84d940-90d4-de91-19dd-612e011007f4@fuzzy.cz
In response to Re: pgsql: Generational memory allocator  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: pgsql: Generational memory allocator  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-committers
Hi,

On 11/25/2017 02:25 AM, Tom Lane wrote:
> I wrote:
>> For me, this patch fixes the valgrind failures inside generation.c
>> itself, but I still see one more in the test_decoding run: ...
>> Not sure what to make of this: the stack traces make it look unrelated
>> to the GenerationContext changes, but if it's not related, how come
>> skink was passing before that patch went in?
> 
> I've pushed fixes for everything that I could find wrong in generation.c
> (and there was a lot :-().  But I'm still seeing the "invalid read in
> SnapBuildProcessNewCid" failure when I run test_decoding under valgrind.
> Somebody who has more familiarity with the logical decoding stuff than
> I do needs to look into that.
> 
> I tried to narrow down exactly which fetch in SnapBuildProcessNewCid was
> triggering the failure, with the attached patch.  Weirdly, *it does not
> fail* with this.  I have no explanation for that.
> 

I have no explanation for that either. FWIW I don't think this is 
related to the new memory contexts. I can reproduce it on 3bae43c (i.e. 
before the Generation memory context was introduced), and with Slab 
removed from ReorderBuffer.

I wonder if this might be a valgrind issue. I'm not sure which version 
skink is using, but I'm running with valgrind-3.12.0-9.el7_4.x86_64.

BTW I also see these failures in hstore:

==15168== Source and destination overlap in memcpy(0x5d0fed0, 0x5d0fed0, 40)
==15168==    at 0x4C2E00C: memcpy@@GLIBC_2.14 (vg_replace_strmem.c:1018)
==15168==    by 0x15419A06: hstoreUniquePairs (hstore_io.c:343)
==15168==    by 0x15419EE4: hstore_in (hstore_io.c:416)
==15168==    by 0x9ED11A: InputFunctionCall (fmgr.c:1635)
==15168==    by 0x9ED3C2: OidInputFunctionCall (fmgr.c:1738)
==15168==    by 0x6014A2: stringTypeDatum (parse_type.c:641)
==15168==    by 0x5E1ADC: coerce_type (parse_coerce.c:304)
==15168==    by 0x5E17A9: coerce_to_target_type (parse_coerce.c:103)
==15168==    by 0x5EDD6D: transformTypeCast (parse_expr.c:2724)
==15168==    by 0x5E8860: transformExprRecurse (parse_expr.c:203)
==15168==    by 0x5E8601: transformExpr (parse_expr.c:156)
==15168==    by 0x5FCF95: transformTargetEntry (parse_target.c:103)
==15168==    by 0x5FD15D: transformTargetList (parse_target.c:191)
==15168==    by 0x5A5EEC: transformSelectStmt (analyze.c:1214)
==15168==    by 0x5A4453: transformStmt (analyze.c:297)
==15168==    by 0x5A4381: transformOptionalSelectInto (analyze.c:242)
==15168==    by 0x5A423F: transformTopLevelStmt (analyze.c:192)
==15168==    by 0x5A4097: parse_analyze (analyze.c:112)
==15168==    by 0x87E0AF: pg_analyze_and_rewrite (postgres.c:664)
==15168==    by 0x87E6EE: exec_simple_query (postgres.c:1045)

It seems hstoreUniquePairs may call memcpy with identical source and
destination pointers in some cases (which looks a bit dubious). But the
code is ancient, so it's strange it didn't fail before.
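
To illustrate how a report like the one above can arise, here is a minimal sketch of an in-place "unique" loop over a sorted array. It is not the hstore code: the Pair struct and unique_pairs function are hypothetical stand-ins. The point is only that until the first duplicate has been dropped, the write and read cursors are the same pointer, so an unconditional copy would be memcpy(p, p, size), which valgrind flags as overlapping (and which the C standard leaves undefined). A simple pointer comparison avoids the call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a key/value entry; not the real hstore struct. */
typedef struct
{
    int key;
    int val;
} Pair;

/*
 * Collapse runs of equal keys in an array sorted by key, keeping the first
 * entry of each run.  Returns the number of entries kept.
 */
static size_t
unique_pairs(Pair *a, size_t n)
{
    Pair *res;   /* last entry kept so far (write cursor) */
    Pair *ptr;   /* entry being examined (read cursor) */

    assert(a != NULL || n == 0);
    if (n == 0)
        return 0;

    res = a;
    ptr = a + 1;
    while ((size_t) (ptr - a) < n)
    {
        if (ptr->key != res->key)
        {
            res++;
            /*
             * Before any duplicate has been skipped, res == ptr here.
             * Copying an entry onto itself is the overlapping memcpy,
             * so only copy when the cursors have diverged.
             */
            if (res != ptr)
                memcpy(res, ptr, sizeof(Pair));
        }
        ptr++;
    }
    return (size_t) (res - a) + 1;
}
```

With input that contains no duplicates the cursors never diverge, so without the guard every iteration would issue a self-copy; with the guard no memcpy is performed at all in that case.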

regards
Tomas

