Re: BUG #19018: high memory usage and "stack depth limit exceeded", with GiST index on ltree - Mailing list pgsql-bugs

From Arseniy Mukhin
Subject Re: BUG #19018: high memory usage and "stack depth limit exceeded", with GiST index on ltree
Date
Msg-id CAE7r3MJV=2DTdO0GLe8Ear=eFaPWmT4qtSHJUrV8h4R+yEisAg@mail.gmail.com
In response to Re: BUG #19018: high memory usage and "stack depth limit exceeded", with GiST index on ltree  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: BUG #19018: high memory usage and "stack depth limit exceeded", with GiST index on ltree
List pgsql-bugs
On Wed, Aug 13, 2025 at 7:46 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Aug 12, 2025 at 5:44 PM PG Bug reporting form
> <noreply@postgresql.org> wrote:
> >
> > The following bug has been logged on the website:
> >
> > Bug reference:      19018
> > Logged by:          Joseph Silva
> > Email address:      dull.bananas0@gmail.com
> > PostgreSQL version: 17.5
> > Operating system:   Fedora
> > Description:
> >
> > If I run this, then the postgres process's memory usage approaches 6 GB, and
> > the insertion when
> > i=253 fails with "stack depth limit exceeded":
> >
> > ```
> > CREATE EXTENSION ltree;
> >
> > CREATE TABLE comment (path ltree);
> >
> > CREATE INDEX ON comment USING gist (path);
> >
> > DO $$
> >     DECLARE
> >         i int := 1;
> >         p text := '0';
> >     BEGIN
> >         WHILE i < 1000 LOOP
> >             p := p || '.' || i::text;
> >             i := i + 1;
> >             INSERT INTO comment (path) VALUES (p::ltree);
> >             COMMIT;
> >         END LOOP;
> >     END
> > $$;
> > ```
> >
> > If index creation is delayed until after insertions, then the insertions
> > succeed but index creation
> > fails.
> >
>
> Thanks for reporting, I didn't analyze it fully but here is what I
> have analyzed so far.  While debugging I have noticed that it is
> recursively trying to complete the previously incomplete split,
> ideally there should not be any incomplete split because I am just
> running this from a single sessions so there should not be any issue
> in acquiring parent page lock and there is no crash so
> GistFollowRight() must be cleared but it is not in some cases and its
> keep recursively splitting until it hits the stack overflow [2].  So
> this seems like somewhere we have missed to call
> GistClearFollowRight() after splitting.

Hi,

I managed to reproduce it too. I agree that the problem here is an
endless split. I ran into similar issues while trying to insert big
tuples into a GiST index. I think the main problem is that the GiST
insert code does not limit the index tuple size (of course it cannot
be larger than the page size). There is a macro,
GISTMaxIndexTupleSize, but it's not used. There are probably some
limits on the operator class side, but they don't seem to work here.

I added several log messages while investigating. Here is how the
stack overflow happens in this case:

During the insert that leads to the stack overflow, we have a split.
To finish the split we need to insert the last two downlinks into the
parent. Here are the sizes of these downlinks (all sizes are in bytes):

2025-08-13 17:38:38.614 MSK [829680] DEBUG:  finish split: insert child left size: 3776, right size: 4016

You can see that these are huge index tuples.

At the moment the two new downlinks are inserted, the parent has 2 tuples in it:
2025-08-13 17:38:38.614 MSK [829680] DEBUG:  number of tups on the page before split: 2

We want to add two more, but they do not fit into the page, so the
parent is split. The split algorithm decided that we need 3 pages for
this split:
2025-08-13 17:38:38.614 MSK [829680] DEBUG:  split during insert, children number: 3

And guess what, here are the sizes of the new downlinks that we have for these 3 pages:
2025-08-13 17:38:38.614 MSK [829680] DEBUG:  downlink size: 3776
2025-08-13 17:38:38.614 MSK [829680] DEBUG:  downlink size: 4016
2025-08-13 17:38:38.614 MSK [829680] DEBUG:  downlink size: 4160

Here we can see that the sizes of the last two downlinks, which we
will try to insert into the parent of the parent, are 3776 and 4016.
These are the same sizes that we were trying to insert into the
initial parent.

In short, when we try to insert 3 huge tuples into the parent page, we
get a new split, which results in 3 new huge tuples that we need to
insert into the parent of the parent, and so on.

This way we have a never-ending split.

Here is a draft patch that checks the index tuple size before insert
and during the split; with it, the reproducer now fails on the check.


Best regards,
Arseniy Mukhin

Attachment
