Thread: [PATCH] nodeindexscan with reorder memory leak

[PATCH] nodeindexscan with reorder memory leak

From
Aliaksandr Kalenik
Date:
Hey!

I was investigating a leak reported in the PostGIS issues tracker [1] which led me to the Postgres side where the problem really is. The leak is reproducible with query from original ticket [1]:
WITH latitudes AS (	SELECT generate_series AS latitude	FROM generate_series(-90, 90, 0.1)
), longitudes AS (	SELECT generate_series AS longitude	FROM generate_series(-180, 180, 0.1)
), points AS (	SELECT ST_SetSRID(ST_Point(longitude, latitude), 4326)::geography AS geog	FROM latitudes	CROSS JOIN longitudes
)
SELECT	geog,	(		SELECT name		FROM ne_110m_admin_0_countries AS ne		ORDER BY p.geog <-> ne.geog		LIMIT 1	)
FROM points AS p
;
The leak is only noticeable when index scan with reorder happens as part of subquery plan which is explained by the fact that heap tuples cloned in reorderqueue_push are not freed during flush of reorder queue in ExecReScanIndex.

[1] https://trac.osgeo.org/postgis/ticket/4720
Attachment

Re: [PATCH] nodeindexscan with reorder memory leak

From
Tom Lane
Date:
Aliaksandr Kalenik <akalenik@kontur.io> writes:
> The leak is only noticeable when index scan with reorder happens as part of
> subquery plan which is explained by the fact that heap tuples cloned in
> reorderqueue_push are not freed during flush of reorder queue in
> ExecReScanIndex.

Hmm ... I see from the code coverage report[1] that that part of
ExecReScanIndexScan isn't even reached by our existing regression
tests.  Seems bad :-(

            regards, tom lane

[1] https://coverage.postgresql.org/src/backend/executor/nodeIndexscan.c.gcov.html#577



Re: [PATCH] nodeindexscan with reorder memory leak

From
Tom Lane
Date:
Aliaksandr Kalenik <akalenik@kontur.io> writes:
> I was investigating a leak reported in the PostGIS issues tracker [1] which
> led me to the Postgres side where the problem really is. The leak is
> reproducible with query from original ticket [1]:
> ...
> The leak is only noticeable when index scan with reorder happens as part of
> subquery plan which is explained by the fact that heap tuples cloned in
> reorderqueue_push are not freed during flush of reorder queue in
> ExecReScanIndex.

Actually, that code has got worse problems than that.  I tried to improve
our regression tests to exercise that code path, as attached.  What I got
was

+SELECT point(x,x), (SELECT circle_center(f1) FROM gcircle_tbl ORDER BY f1 <-> p
oint(x,x) LIMIT 1) as c FROM generate_series(0,1000,1) x;
+ERROR:  index returned tuples in wrong order

(The error doesn't always appear depending on what generate_series
parameters you use, but it seems to show up consistently with
a step of 1 and a limit of 1000 or more.)

Fixing this is well beyond my knowledge of that code, so I'm punting
it to the original authors.

            regards, tom lane

diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index 8b353be16e..0b60caabe4 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -255,6 +255,10 @@ EXPLAIN (COSTS OFF)
 SELECT circle_center(f1), round(radius(f1)) as radius FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
 SELECT circle_center(f1), round(radius(f1)) as radius FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;

+EXPLAIN (COSTS OFF)
+SELECT point(x,x), (SELECT circle_center(f1) FROM gcircle_tbl ORDER BY f1 <-> point(x,x) LIMIT 1) as c FROM
generate_series(0,1000,1)x; 
+SELECT point(x,x), (SELECT circle_center(f1) FROM gcircle_tbl ORDER BY f1 <-> point(x,x) LIMIT 1) as c FROM
generate_series(0,1000,1)x; 
+
 -- Now check the results from bitmap indexscan
 SET enable_seqscan = OFF;
 SET enable_indexscan = OFF;

Re: [PATCH] nodeindexscan with reorder memory leak

From
Aliaksandr Kalenik
Date:
Thanks for your review!

On Sun, Jan 30, 2022 at 7:24 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Actually, that code has got worse problems than that.  I tried to improve
> our regression tests to exercise that code path, as attached.  What I got
> was
>
> +SELECT point(x,x), (SELECT circle_center(f1) FROM gcircle_tbl ORDER BY f1 <-> p
> oint(x,x) LIMIT 1) as c FROM generate_series(0,1000,1) x;
> +ERROR:  index returned tuples in wrong order

I tried to figure out what is the problem with this query. This error
happens when actual distance is less than estimated distance.
For this specific query it happened while comparing these values:
50.263279680219099532223481219262 (actual distance returned by
dist_cpoint in geo_ops.c) and 50.263279680219113743078196421266
(bounding box distance returned by computeDistance in gistproc.c).

So for me it looks like this error is not really related to KNN scan
code but to some floating-arithmetic issue in distance calculation
functions for geometry type.Would be great to figure out a fix but
for now I didn’t manage to find a better way than comparing the
difference of distance with FLT_EPSILON which definitely doesn't seem
like the way to fix :(



Re: [PATCH] nodeindexscan with reorder memory leak

From
Alexander Korotkov
Date:
Hi!

On Mon, Feb 7, 2022 at 11:42 AM Aliaksandr Kalenik <akalenik@kontur.io> wrote:
> Thanks for your review!
>
> On Sun, Jan 30, 2022 at 7:24 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Actually, that code has got worse problems than that.  I tried to improve
> > our regression tests to exercise that code path, as attached.  What I got
> > was
> >
> > +SELECT point(x,x), (SELECT circle_center(f1) FROM gcircle_tbl ORDER BY f1 <-> p
> > oint(x,x) LIMIT 1) as c FROM generate_series(0,1000,1) x;
> > +ERROR:  index returned tuples in wrong order
>
> I tried to figure out what is the problem with this query. This error
> happens when actual distance is less than estimated distance.
> For this specific query it happened while comparing these values:
> 50.263279680219099532223481219262 (actual distance returned by
> dist_cpoint in geo_ops.c) and 50.263279680219113743078196421266
> (bounding box distance returned by computeDistance in gistproc.c).
>
> So for me it looks like this error is not really related to KNN scan
> code but to some floating-arithmetic issue in distance calculation
> functions for geometry type.Would be great to figure out a fix but
> for now I didn’t manage to find a better way than comparing the
> difference of distance with FLT_EPSILON which definitely doesn't seem
> like the way to fix :(

Probably, this is caused by some compiler optimization.  Could you
re-check the issue with different compilers and optimization levels?

Regarding the memory leak, could you add a corresponding regression
test to the patch (probably similar to Tom's query upthread)?

------
Regards,
Alexander Korotkov



Re: [PATCH] nodeindexscan with reorder memory leak

From
Aliaksandr Kalenik
Date:
On Mon, Feb 7, 2022 at 11:20 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> Regarding the memory leak, could you add a corresponding regression
> test to the patch (probably similar to Tom's query upthread)?

Yes, added similar to Tom's query but with polygons instead of circles.

Attachment

Re: [PATCH] nodeindexscan with reorder memory leak

From
Alexander Korotkov
Date:
On Thu, Feb 10, 2022 at 1:19 AM Aliaksandr Kalenik <akalenik@kontur.io> wrote:
> On Mon, Feb 7, 2022 at 11:20 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> > Regarding the memory leak, could you add a corresponding regression
> > test to the patch (probably similar to Tom's query upthread)?
>
> Yes, added similar to Tom's query but with polygons instead of circles.

I've rechecked that the now patched code branch is covered by
regression tests.  I think the memory leak issue is independent of the
computational errors we've observed.

So, I'm going to push and backpatch this if no objections.

------
Regards,
Alexander Korotkov



Re: [PATCH] nodeindexscan with reorder memory leak

From
Tom Lane
Date:
Alexander Korotkov <aekorotkov@gmail.com> writes:
> I've rechecked that the now patched code branch is covered by
> regression tests.  I think the memory leak issue is independent of the
> computational errors we've observed.
> So, I'm going to push and backpatch this if no objections.

+1.  We should work on the roundoff-error issue as well, but
as you say, it's an independent problem.

            regards, tom lane



Re: [PATCH] nodeindexscan with reorder memory leak

From
Alexander Korotkov
Date:
On Thu, Feb 10, 2022 at 2:35 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alexander Korotkov <aekorotkov@gmail.com> writes:
> > I've rechecked that the now patched code branch is covered by
> > regression tests.  I think the memory leak issue is independent of the
> > computational errors we've observed.
> > So, I'm going to push and backpatch this if no objections.
>
> +1.  We should work on the roundoff-error issue as well, but
> as you say, it's an independent problem.

Pushed, thank you.

------
Regards,
Alexander Korotkov