Re: Should pg 11 use a lot more memory building an spgist index? - Mailing list pgsql-hackers

From Amit Langote
Subject Re: Should pg 11 use a lot more memory building an spgist index?
Date
Msg-id CA+HiwqEYqPsNnxYUDL6J=CJJnzOz-S_pJ4EJt+-hX0cDf1q4OA@mail.gmail.com
Whole thread Raw
In response to Re: Should pg 11 use a lot more memory building an spgist index?  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
Responses Re: Should pg 11 use a lot more memory building an spgist index?  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
List pgsql-hackers
On Tue, Oct 30, 2018 at 7:11 PM Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
>
> On 2018/10/30 4:48, Tom Lane wrote:
> > I wrote:
> >> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> >>> How about modifying SysScanDescData to have a memory context member,
> >>> which is created by systable_beginscan and destroyed by endscan?
> >
> >> I think it would still have problems, in that it would affect in which
> >> context index_getnext delivers its output.  We could reasonably make
> >> these sorts of changes in places where the entire index_beginscan/
> >> index_getnext/index_endscan call structure is in one place, but that
> >> doesn't apply for the systable functions.
> >
> > Actually, after looking more closely, index_getnext doesn't return a
> > palloc'd object anyway, or at least not one that the caller is responsible
> > for managing.  So maybe that could work.
>
> Instead of SysScanDescData, could we add one to IndexScanData?  It's
> somewhat clear that catalog scans don't leak much, but user index scans
> can, as seen in this case.
>
> In this particular case, I see that it's IndexCheckExclusion() that
> invokes check_unique_or_exclusion_constraint() repeatedly for each heap
> tuple after finishing building index on the heap.  The latter performs
> index_beginscan/endscan() for every heap tuple, but doesn't bother to
> release the memory allocated for IndexScanDesc and its members.
>
> I've tried to fix that with the attached patches.
>
> 0001 adds the ability for the callers of index_beginscan to specify a
> memory context.  index_beginscan_internals switches to that context before
> calling ambeginscan and stores into a new field xs_scan_cxt of
> IndexScanData, but maybe storing it is unnecessary.
>
> 0002 builds on that and adds the ability for the callers of
> check_exclusion_or_unique_constraints to specify a memory context.  Using
> that, it fixes the leak by teaching IndexCheckExclusion and
> ExecIndexInsertTuples to pass a memory context that's reset before calling
> check_exclusion_or_unique_constraints() for the next tuple.
>
> The following example shows that the patch works.
>
> With HEAD:
>
> create table foo (a int4range);
> alter table foo add exclude using spgist (a with &&);
> -- this consumes about 1GB
> insert into foo select int4range(i,i+1) from generate_series(1, 100000) i;
> alter table foo drop constraint foo_a_excl;
> -- this too
> alter table foo add exclude using spgist (a with &&);
>
> Patched:
>
> create table foo (a int4range);
> alter table foo add exclude using spgist (a with &&);
> -- memory consumption doesn't grow above 37MB
> insert into foo select int4range(i,i+1) from generate_series(1, 100000) i;
> alter table foo drop constraint foo_a_excl;
> -- ditto
> alter table foo add exclude using spgist (a with &&);

Sorry I forgot to try the example with your patch.  Maybe, it will
have more or less the same behavior as mine, although I didn't realize
that when I started writing my patch.

Thanks,
Amit


pgsql-hackers by date:

Previous
From: Amit Langote
Date:
Subject: Re: ToDo: show size of partitioned table
Next
From: Tomas Vondra
Date:
Subject: Re: shared-memory based stats collector