Re: Memory leak in GIN index build - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: Memory leak in GIN index build
Date
Msg-id 5712B509.7040403@dalibo.com
Whole thread Raw
In response to Re: Memory leak in GIN index build  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Memory leak in GIN index build
List pgsql-hackers
On 16/04/2016 20:45, Tom Lane wrote:
> Julien Rouhaud <julien.rouhaud@dalibo.com> writes:
> 
>> Also, in dataPlaceToPageLeaf() and ginVacuumPostingTreeLeaf(), shouldn't
>> the START_CRIT_SECTION() calls be placed before the xlog code?
> 
> Yeah, they should.  Evidently somebody kluged it to avoid doing a palloc
> inside a critical section, while ignoring every other rule about where to
> place critical sections.  (Maybe we should put an assert about being
> within a critical section into XLogBeginInsert.)
> 

After a quick test, it appears that at least log_smgrcreate() calls
XLogBeginInsert() without being in a critical section, from the various
DDL commands.

> This code is pretty fundamentally broken, isn't it.  elog() calls
> inside a critical section?  Really?  Even worse, a MemoryContextDelete,
> which hardly seems like something that should be assumed error-proof.
> 
> Not to mention the unbelievable ugliness of a function that sometimes
> returns with an open critical section (and WAL insertion started) and
> sometimes doesn't.
> 
> It kinda looks like this was hacked up without bothering to keep the
> comment block in ginPlaceToPage in sync with reality, either.
> 
> I think this needs to be redesigned so that the critical section and WAL
> insertion calls all happen within a single straight-line piece of code.
> 
> We could try making that place be ginPlaceToPage().  So far as
> registerLeafRecompressWALData is concerned, that does not look that hard;
> it could palloc and fill the WAL-data buffer the same as it's doing now,
> then pass it back up to be registered (and eventually pfreed) in
> ginPlaceToPage.  However, that would also mean postponing the call of
> dataPlaceToPageLeafRecompress(), which seems much messier.  I assume
> the data it's working from is mostly in the tmpCtx that
> dataPlaceToPageLeaf wants to free before returning.  Maybe we could
> move creation/destruction of the tmpCtx out to ginPlaceToPage, as well?
> 
> The other line of thought is to move the WAL work that ginPlaceToPage
> does down into the subroutines.  That would probably result in some
> duplication of code, but it might end up being a cleaner solution.
> 
> Anyway, I think memory leakage is just the start of what's broken here,
> and fixing it won't be a very small patch.
> 

I'm not really confident, but I'll give a try.  Probably with your
second solution which looks easier.

Any pointer is welcome, unless someone more familiar with this code
wants to take it.

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold <
Next
From: Andres Freund
Date:
Subject: Re: Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold <