Re: Memory leak in GIN index build - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Memory leak in GIN index build
Date
Msg-id 17456.1460832307@sss.pgh.pa.us
Whole thread Raw
In response to Memory leak in GIN index build  (Julien Rouhaud <julien.rouhaud@dalibo.com>)
Responses Re: Memory leak in GIN index build  (Julien Rouhaud <julien.rouhaud@dalibo.com>)
List pgsql-hackers
Julien Rouhaud <julien.rouhaud@dalibo.com> writes:
> After some digging, the leak comes from walbufbegin palloc in
> registerLeafRecompressWALData().
> IIUC, walbufbegin isn't pfree-d and can't be before XLogInsert() is
> called, which happens in ginPlaceToPage().

Hmm.

> I don't see a simple way to fix that. My first idea would be to change
> GinBtreeData's placeToPage (and all other needed) functions signature to
> pass a pointer to pfree in ginPlaceToPage() if needed, but it doesn't
> really seem appealing. In any case, I'd be happy to work on a patch if
> needed.

> Also, in dataPlaceToPageLeaf() and ginVacuumPostingTreeLeaf(), shouldn't
> the START_CRIT_SECTION() calls be placed before the xlog code?

Yeah, they should.  Evidently somebody kluged it to avoid doing a palloc
inside a critical section, while ignoring every other rule about where to
place critical sections.  (Maybe we should put an assert about being
within a critical section into XLogBeginInsert.)

This code is pretty fundamentally broken, isn't it.  elog() calls
inside a critical section?  Really?  Even worse, a MemoryContextDelete,
which hardly seems like something that should be assumed error-proof.

Not to mention the unbelievable ugliness of a function that sometimes
returns with an open critical section (and WAL insertion started) and
sometimes doesn't.

It kinda looks like this was hacked up without bothering to keep the
comment block in ginPlaceToPage in sync with reality, either.

I think this needs to be redesigned so that the critical section and WAL
insertion calls all happen within a single straight-line piece of code.

We could try making that place be ginPlaceToPage().  So far as
registerLeafRecompressWALData is concerned, that does not look that hard;
it could palloc and fill the WAL-data buffer the same as it's doing now,
then pass it back up to be registered (and eventually pfreed) in
ginPlaceToPage.  However, that would also mean postponing the call of
dataPlaceToPageLeafRecompress(), which seems much messier.  I assume
the data it's working from is mostly in the tmpCtx that
dataPlaceToPageLeaf wants to free before returning.  Maybe we could
move creation/destruction of the tmpCtx out to ginPlaceToPage, as well?

The other line of thought is to move the WAL work that ginPlaceToPage
does down into the subroutines.  That would probably result in some
duplication of code, but it might end up being a cleaner solution.

Anyway, I think memory leakage is just the start of what's broken here,
and fixing it won't be a very small patch.
        regards, tom lane



pgsql-hackers by date:

Previous
From: Julien Rouhaud
Date:
Subject: Memory leak in GIN index build
Next
From: Noah Misch
Date:
Subject: Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold <