Re: Bug? Query plans / EXPLAIN using gigabytes of memory - Mailing list pgsql-general
From | Toby Corkindale |
---|---|
Subject | Re: Bug? Query plans / EXPLAIN using gigabytes of memory |
Date | |
Msg-id | c4d89858-13a7-461d-9ead-021b58a4f608@dmz03.strategicdata.internal Whole thread Raw |
In response to | Re: Bug? Query plans / EXPLAIN using gigabytes of memory (Willy-Bas Loos <willybas@gmail.com>) |
Responses |
Re: Bug? Query plans / EXPLAIN using gigabytes of memory
|
List | pgsql-general |
Hi Willy-Bas, Thanks for your reply. I realise that stacking the views up like this complicates matters, but the actual views are fairly simple queries, and eachone individually is only looking at a few dozen rows. (Eg. selecting min, max or average value from a small set, groupedby one column) From the point of view of creating reporting queries, it's a nice and logical way to build up a query, and we didn't thinkit would present any problems.. and even on a well-populated database, the query runs very fast. It's just the astoundingamount of memory used that presents difficulties. Looking at the postgresql.conf for non-default settings, the notable ones are: max_connections = 200 ssl = false shared_buffers = 256MB max_prepared_transactions = 16 # although they aren't used for the group of queries in question maintenance_work_mem = 128MB # work_mem is left at default of 1MB effective_io_concurrency = 2 random_page_cost = 3.0 effective_cache_size = 512MB geqo = on geqo_threshold = 12 geqo_effort = 7 Some other things are non-default, like checkpoints, streaming-replication stuff, but those shouldn't have any effect. The memory settings (shared buffers, effective cache) might seem to be set quite conservatively at the moment, given thememory available in the machine -- but since we can exhaust that memory with just a few connections, it seems fair. Cheers, Toby ----- Original Message ----- From: "Willy-Bas Loos" <willybas@gmail.com> To: "Toby Corkindale" <toby.corkindale@strategicdata.com.au> Cc: "pgsql-general" <pgsql-general@postgresql.org> Sent: Wednesday, 25 April, 2012 6:05:37 PM Subject: Re: [GENERAL] Bug? Query plans / EXPLAIN using gigabytes of memory Stacking views is a bad practice. It usually means that you are making the db do a lot of unnecessary work, scanning tablesmore than once when you don't even need them. According to your description, you have 3 layers of views on partitioned tables. I can imagine that that leaves the planner with a lot of possible query plans, a lot of interaction and a lot of statisticsto read. do you have any special settings for the statistics on these tables? and could you please post the non-default settings in your postgresql.conf file? $ grep ^[^#] /etc/ postgresql /9.1/main/ postgresql . conf | grep -e ^[^[:space:]] Would be helpful to see if you have any statistics or planner stuff altered. Cheers, WBL On Wed, Apr 25, 2012 at 8:18 AM, Toby Corkindale < toby.corkindale@strategicdata.com.au > wrote: Hi, I'm hitting some peculiar behaviour. I'm currently on Pg 9.1.3 on a 64bit Debian system. I have a database which is moderately large - 20 GByte or so - and contains that data split up over dozens of tables, whichare themselves partitioned. Queries are usually only run against fairly small, partitioned, sets of data. These queries generally run fairly fast. Performance is not a problem. However Postgres is chewing up huge amounts of memory just to create the query plan! For example, even if I just run EXPLAIN SELECT a_column FROM a_view WHERE partition_id = 1; Then the postgres backend process takes several seconds to return, and in the worst example here, is hogging more than 3Gbyteonce it comes back. (It doesn't free that up until you close the connection) The query plan that comes back does seem quite convoluted, but then, the view is a query run over about eight other views,each of which is pulling data from a few other views. The actual underlying data being touched is only *a few dozen*small rows. As I said, the query runs fast enough.. however we only need a handful of these queries to get run in separate connections,and the database server will be exhausted of memory. Especially since the memory isn't returned until the endof the connection, yet these connections typically stay up for a while. I wondered if there's anything I can do to reduce this memory usage? And, is this a bug? I've posted the output of the query plan here: https://gist.github.com/2487097 Thanks in advance, Toby -- Sent via pgsql-general mailing list ( pgsql-general@postgresql.org ) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general -- "Quality comes from focus and clarity of purpose" -- Mark Shuttleworth
pgsql-general by date: