Re: IndexJoin memory problem using spgist and boxes - Mailing list pgsql-hackers

From Anton Dignös
Subject Re: IndexJoin memory problem using spgist and boxes
Date
Msg-id CALNdv1gfKVidXz3EQBY5EcEAnNTxELM1LM6q+2rd3szKxwXhPw@mail.gmail.com
Whole thread Raw
In response to Re: IndexJoin memory problem using spgist and boxes  (Anton Dignös <dignoes@inf.unibz.it>)
Responses Re: IndexJoin memory problem using spgist and boxes  (Alexander Kuzmenkov <a.kuzmenkov@postgrespro.ru>)
List pgsql-hackers
Hi,

attached is the patch that uses two memory contexts.
One for calling the inner consistent function,
and a new one for keeping the traversal memory of the inner consistent function.

I run some test to compare the memory footprints. I report the total maximum
memory usage (sum of all children) of the per-query memory context
that is the parent of all memory
contexts used. The test datasets are described below.

 TEST |  HEAD | previous patch | patch V2
------+-------+----------------+----------
 T1   | 3.4GB | 98kB           | 81kB
 T2   | 3.4GB | 17MB           | 17MB
 T3   | 3.5GB | 17MB           | 17MB
 T4   | 7GB   | 17MB           | 17MB
 T5   | 8GB   | 34MB           | 25MB

T1: 1M x 1M tuples with relatively few overlaps
T2: as T1, but with 1 tuple in the outer relation for which the entire
index reports matches
T3: as T1, but with 10 tuples in the outer relation for which the
entire index reports matches
T4: as T3, but the outer relation has double the number of tuples
T5: as T4, but the inner relation has double the number of tuples

TEST dataset creation queries (executed incrementally)

T1:
-- create table r with 1M tuples
CREATE TABLE r AS SELECT 1 i, box(point(generate_series,
generate_series), point(generate_series+10, generate_series+10)) FROM
generate_series(1, 1000000);
-- create table s with 1M tuples
CREATE TABLE s AS SELECT 1 i, box(point(generate_series,
generate_series), point(generate_series+10, generate_series+10)) FROM
generate_series(1, 1000000) ORDER BY random(); -- random sort just
speeds up index creation
CREATE INDEX s_idx ON s USING spgist(box);

T2:
-- inserts a tuple for which the entire index is a match
INSERT INTO r VALUES (2, box(point(1, 1), point(1000000, 1000000)));

T3:
-- inserts 9 more tuples as in T2.

T4:
-- doubles the outer relations
INSERT INTO r SELECT * FROM r;

T5:
-- doubles the inner relation
INSERT INTO s SELECT * FROM s;


The new patch is a bit better in terms of memory by using the two
memory contexts.
I also checked the query times using explain analyze, both patches
have approximately the same runtime,
but run 5-6 times faster than HEAD.

Best regards,
Anton

Attachment

pgsql-hackers by date:

Previous
From: David Fetter
Date:
Subject: Re: Implementing SQL ASSERTION
Next
From: Dmitry Dolgov
Date:
Subject: jsonb nesting level edge case