Home > mailing lists

How to craft a query that uses memory? - Mailing list pgsql-general

From
Subject	How to craft a query that uses memory?
Date	June 19, 2015 10:42:36
Msg-id	C5DBACC6DCC7604C9E4875FD9C7968B11A148D131B@ITXS01EVS.service.it.nrw.de Whole thread Raw
Responses	Re: How to craft a query that uses memory?
List	pgsql-general

Tree view

Hello list,

What would be the best way of setting up a query that uses lots of memory, in order to study the effects of resource usage parameters on the performance of the query?

I thought that a query on a large table involving JOINs would be a way, but while the backend initially used 1.7 gigs of memory, it only uses a couple dozen megs of memory anymore after I restarted PostgreSQL because the checkpointer process had 2.4 gigs of memory in use. (The table consumes nearly 1.3 gigs of disk space, and the query returns one billion as the join used results in one billion rows.)

The query I was using is this one:

SELECT count(*) FROM pgbench_accounts a JOIN pgbench_accounts b ON (a.abalance = b.abalance);

which I ran after initializing the table using pgbench –I –s 100, and populating the abalance field with

UPDATE pgbench_accounts SET abalance = aid % 100000 + 1;

The query plan reads,

postgres=# EXPLAIN ANALYZE SELECT count(*) FROM pgbench_accounts a JOIN pgbench_accounts b ON (a.abalance = b.abalance);

QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------------

Aggregate (cost=22479359.09..22479359.10 rows=1 width=0) (actual time=787769.456..787769.456 rows=1 loops=1)

-> Merge Join (cost=3126660.67..19725402.17 rows=1101582767 width=0) (actual time=62254.349..621119.691 rows=1000000000 loops=1)

Merge Cond: (a.abalance = b.abalance)

-> Sort (cost=1563330.33..1588330.33 rows=10000000 width=4) (actual time=27706.051..42305.488 rows=10000000 loops=1)

Sort Key: a.abalance

Sort Method: external merge Disk: 136624kB

-> Seq Scan on pgbench_accounts a (cost=0.00..263935.00 rows=10000000 width=4) (actual time=0.260..11592.979 rows=10000000 loops=1)

-> Materialize (cost=1563330.33..1613330.33 rows=10000000 width=4) (actual time=34548.224..209123.713 rows=999999901 loops=1)

-> Sort (cost=1563330.33..1588330.33 rows=10000000 width=4) (actual time=34548.202..43988.283 rows=10000000 loops=1)

Sort Key: b.abalance

Sort Method: external merge Disk: 136624kB

-> Seq Scan on pgbench_accounts b (cost=0.00..263935.00 rows=10000000 width=4) (actual time=0.017..16807.894 rows=10000000 loops=1)

Total runtime: 787836.841 ms

(13 Zeilen)

The point of the SELECT count(*) (as opposed to something like SELECT a.*) is, of course, to make sure we are not measuring the impact of the memory consumption of pgbench.

There are sorts involved, which I heard might require working memory but could also “spill to disk” as someone put it. So how do I craft a query that actually does use lots of memory?

Cheers,

Holger Friedrich

pgsql-general by date:

From: Albe Laurenz
Date: 19 June 2015, 08:19:10
Subject: Re: valgrind

From: Albe Laurenz
Date: 19 June 2015, 10:56:30
Subject: Re: How to craft a query that uses memory?

How to craft a query that uses memory? - Mailing list pgsql-general

Previous

Next