Re: Should pg 11 use a lot more memory building an spgist index? - Mailing list pgsql-hackers

From Amit Langote
Subject Re: Should pg 11 use a lot more memory building an spgist index?
Date
Msg-id 023391a7-55db-6948-71ff-328218b540de@lab.ntt.co.jp
Whole thread Raw
In response to Re: Should pg 11 use a lot more memory building an spgist index?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 2018/10/30 4:48, Tom Lane wrote:
> I was confused about why the memory leak in Bruno's example is so much
> larger in HEAD than v11; spgbeginscan does allocate more stuff than
> before, but only if numberOfOrderBys > 0, which surely doesn't apply for
> the exclusion-check code path.  Eventually I realized that the difference
> is because commit 2a6368343 made SpGistScanOpaqueData a good deal larger
> than it had been (10080 vs 6264 bytes, on my x86_64 box), so it was just
> the failure to pfree that struct that accounted for the bigger leak.
> 
> There's another issue with 2a6368343, though: it added a couple of
> fmgr_info_copy calls to spgbeginscan.  I don't understand why it'd be a
> good idea to do that rather than using the FmgrInfos in the index's
> relcache entry directly.  Making a copy every time risks a memory leak,
> because spgendscan has no way to free any fn_extra data that gets
> allocated by the called function; and by the same token it's inefficient,
> because the function has no way to cache fn_extra data across queries.
> 
> If we consider only the leak aspect, this coding presents a reason why
> we should try to change things as Alvaro suggests.  However, because of
> the poor-caching aspect, I think this code is pretty broken anyway,
> and once we fix it the issue is much less clear-cut.
> 
> (Looking around, it seems like we're very schizophrenic about whether
> to copy index relcache support function FmgrInfos or just use them
> directly.  But nbtree and hash seem to always do the latter, so I think
> it's probably reasonable to standardize on that.)

Agreed about trying to avoid fmgr_info_copy(), at least in this case.

By the way, do I get it right that the reason to want to avoid copying in
this instance is that the currently active context could be a long-lived
one, as in the case of create index, alter table add constraint, etc.
calling the copying code for every tuple?  There are other instances of
fmgr_info_copy throughout index AM implementations, including the helper
function ScanKeyEntryInitializeWithInfo() called from them, but the copies
made in those cases don't appear very leak-prone.

> Anyway, the attached proposed patch for HEAD makes Bruno's problem go
> away, and it also fixes an unrelated leak introduced by 2a6368343
> because it reallocates so->orderByTypes each time through spgrescan.
> I think we should apply this, and suitable subsets in the back branches,
> to fix the immediate problem.  Then we can work at leisure on a more
> invasive HEAD-only patch, if anyone is excited about that.

Makes sense to fix it like this for back-patching.

Thanks,
Amit



pgsql-hackers by date:

Previous
From: Amit Langote
Date:
Subject: Re: ToDo: show size of partitioned table
Next
From: David Rowley
Date:
Subject: Super PathKeys (Allowing sort order through precision loss functions)