Thread: High CPU usage after partitioning
I was under the impression that the default_statistics_target was a percentage of rows to analyze. Maybe this is not the case?

I ran an analyze during a "quiet point" last night and for a few of my large tables, I didn't get what I consider a reasonable sampling of rows. When running with "verbose" enabled, it appeared that a maximum of 240000 rows were being analyzed, including on tables exceeding 4-8mm rows. My default_statistics_target = 80.

Shouldn't I be analyzing a larger percentage of these big tables? What is the unit-of-measure used for default_statistics_target?

Thanks in advance,
AJ
On 21.01.2013 17:29, AJ Weber wrote:
> I was under the impression that the default_statistics_target was a
> percentage of rows to analyze. Maybe this is not the case?

Nope.

> I ran an analyze during a "quiet point" last night and for a few of my
> large tables, I didn't get what I consider a reasonable sampling of
> rows. When running with "verbose" enabled, it appeared that a maximum of
> 240000 rows were being analyzed, including on tables exceeding 4-8mm
> rows. My default_statistics_target = 80.
>
> Shouldn't I be analyzing a larger percentage of these big tables?

Analyze only needs a fairly small random sample of the rows in the table to get a picture of what the data looks like. Compare with, e.g., opinion polls; you only need to sample a few thousand people to get a result with a reasonable error bound.

That's for estimating the histogram. Estimating ndistinct is a different story, and it's well known that the estimates of ndistinct are sometimes wildly wrong.

> What is the unit-of-measure used for default_statistics_target?

It's the number of entries stored in the histogram and most-common-values list in pg_statistic. See also http://www.postgresql.org/docs/devel/static/planner-stats.html:

"The amount of information stored in pg_statistic by ANALYZE, in particular the maximum number of entries in the most_common_vals and histogram_bounds arrays for each column, can be set on a column-by-column basis using the ALTER TABLE SET STATISTICS command, or globally by setting the default_statistics_target configuration variable. The default limit is presently 100 entries."

- Heikki
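For example, to store more detail for a single skewed column and then re-gather stats (hypothetical table and column names), something like this should work:

    ALTER TABLE big_table ALTER COLUMN some_col SET STATISTICS 500;
    ANALYZE VERBOSE big_table;

(As a rule of thumb, ANALYZE samples on the order of 300 rows per unit of statistics target, so raising the target also enlarges the sample.)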
AJ Weber wrote:
> What is the unit-of-measure used for default_statistics_target?

Number of entries in pg_stats.histogram_bounds or pg_stats.most_common_vals.

Yours,
Laurenz Albe
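To inspect what actually got stored for a column (hypothetical table and column names), you can query the pg_stats view directly, e.g.:

    SELECT attname,
           array_length(most_common_vals, 1)  AS mcv_entries,
           array_length(histogram_bounds, 1)  AS histogram_entries
    FROM pg_stats
    WHERE tablename = 'big_table' AND attname = 'some_col';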
On Mon, Jan 21, 2013 at 9:05 AM, rudi <rudolone@gmail.com> wrote:
> Hello,
>
> I'm running postgresql 9.0. After partitioning a big table, CPU usage raised
> from average 5-10% to average 70-80%.
>
> - the table contains about 20.000.000 rows
> - partitions are selected using a trigger, based on an indexed field, a date
>   (IF date_taken >= x AND date_taken < y)
> - I created 5 partitions, the 2012 one now contains most of the rows. The
>   2013 partition is the "live" partition, mostly insert, a few select based
>   on the above indexed field. The 2013, 2014, 2015 partitions are empty
> - constraint exclusion is on.
>
> I have 2 weeks of CPU usage reports and the pattern definitely changed after
> I made the partitions. Any idea?

First thing that jumps to mind is that you have some seq-scan-heavy plans that were not seq scans before. Could be due to the query fooling the constraint exclusion (CE) mechanism, or some other CE issue (probably fixable). To diagnose, we need to see some EXPLAIN ANALYZE plans of the queries that are using higher than expected CPU.

Second possible cause is trigger overhead from the inserts. Not likely to cause so much of a jump, but if this is the issue, the suggested optimization path is to insert directly into the partition.

merlin
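As a hedged sketch of that second suggestion, using the partition naming that shows up later in this thread (sb_logs_2013 as the live partition), the application would bypass the parent table and its routing trigger entirely:

    -- instead of: INSERT INTO sb_logs (device_id, date_taken) VALUES (901, now());
    -- target the live partition directly, skipping the trigger:
    INSERT INTO sb_logs_2013 (device_id, date_taken) VALUES (901, now());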
Merlin Moncure wrote:
>> I'm running postgresql 9.0. After partitioning a big table, CPU
>> usage raised from average 5-10% to average 70-80%.

> First thing that jumps to mind is you have some seq-scan heavy
> plans that were not seq-scan before.

Make sure that all indexes are defined for each partition. It is not enough to define them on just the parent level.

-Kevin
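For instance, a sketch using the index names that appear in the plans later in this thread -- each child table needs its own copy of the parent's indexes:

    CREATE UNIQUE INDEX sb_logs_2012_on_date_taken_and_device_id
        ON sb_logs_2012 (date_taken, device_id);
    CREATE UNIQUE INDEX sb_logs_2013_on_date_taken_and_device_id
        ON sb_logs_2013 (date_taken, device_id);
    -- ...and likewise for sb_logs_2014 and sb_logs_2015; an index on the
    -- parent sb_logs does not cascade to the children on 9.0.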
On 01/21/2013 10:05 AM, rudi wrote:
> Hello,
>
> I'm running postgresql 9.0. After partitioning a big table, CPU usage
> raised from average 5-10% to average 70-80%.
>
> - the table contains about 20.000.000 rows
> - partitions are selected using a trigger, based on an indexed field,
>   a date (IF date_taken >= x AND date_taken < y)
> - I created 5 partitions, the 2012 one now contains most of the rows.
>   The 2013 partition is the "live" partition, mostly insert, a few
>   select based on the above indexed field. The 2013, 2014, 2015
>   partitions are empty
> - constraint exclusion is on.
>
> I have 2 weeks of CPU usage reports and the pattern definitely changed
> after I made the partitions. Any idea?

Well, the first question that comes to my mind is whether it's the inserts that are causing the load or the reads. If it's the inserts then you should show us the whole trigger. Does it by any chance use 'execute'?

cheers

andrew
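For reference, a purely hypothetical sketch of the style Andrew is asking about (not rudi's actual trigger): a static IF/ELSIF router like this can use cached plans, whereas a variant that assembles the INSERT with EXECUTE pays dynamic-SQL overhead on every row:

    CREATE OR REPLACE FUNCTION sb_logs_insert_trigger() RETURNS trigger AS $$
    BEGIN
        -- static routing: these statements can be planned once and cached
        IF NEW.date_taken >= DATE '2013-01-01' AND NEW.date_taken < DATE '2014-01-01' THEN
            INSERT INTO sb_logs_2013 VALUES (NEW.*);
        ELSIF NEW.date_taken >= DATE '2012-01-01' AND NEW.date_taken < DATE '2013-01-01' THEN
            INSERT INTO sb_logs_2012 VALUES (NEW.*);
        ELSE
            RAISE EXCEPTION 'date_taken out of range: %', NEW.date_taken;
        END IF;
        RETURN NULL;  -- row was routed to a child; suppress insert into the parent
    END;
    $$ LANGUAGE plpgsql;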
On Tue, Jan 22, 2013 at 7:34 AM, rudi <rudolone@gmail.com> wrote:
> Every query has been carefully optimized, child tables are indexed. The
> table(s) has a UNIQUE index on ("date_taken", "device_id") and "date_taken"
> is the partitioning column (one partition per year).
> There are a few well-known access paths to this table: INSERTs (40-50.000
> each day), SELECTs on a specific device_id AND on a specific day.
>
> BUT, I discovered an access path used by a process every few secs. to get
> the last log for a given device, and this query became really slow after
> partitioning:
>
> Result (cost=341156.04..341182.90 rows=4 width=86) (actual time=1132.326..1132.329 rows=1 loops=1)
>   InitPlan 1 (returns $0)
>     -> Aggregate (cost=341156.03..341156.04 rows=1 width=8) (actual time=1132.295..1132.296 rows=1 loops=1)
>       -> Append (cost=0.00..341112.60 rows=17371 width=8) (actual time=45.600..1110.057 rows=19016 loops=1)
>         -> Seq Scan on sb_logs (cost=0.00..0.00 rows=1 width=8) (actual time=0.000..0.000 rows=0 loops=1)
>             Filter: (device_id = 901)
>         -> Index Scan using sb_logs_2012_on_date_taken_and_device_id on sb_logs_2012 sb_logs (cost=0.00..319430.51 rows=16003 width=8) (actual time=45.599..1060.143 rows=17817 loops=1)
>             Index Cond: (device_id = 901)
>         -> Index Scan using sb_logs_2013_on_date_taken_and_device_id on sb_logs_2013 sb_logs (cost=0.00..21663.39 rows=1363 width=8) (actual time=0.022..47.661 rows=1199 loops=1)
>             Index Cond: (device_id = 901)
>         -> Bitmap Heap Scan on sb_logs_2014 sb_logs (cost=10.25..18.71 rows=4 width=8) (actual time=0.011..0.011 rows=0 loops=1)
>             Recheck Cond: (device_id = 901)
>             -> Bitmap Index Scan on sb_logs_2014_on_date_taken_and_device_id (cost=0.00..10.25 rows=4 width=0) (actual time=0.008..0.008 rows=0 loops=1)
>                 Index Cond: (device_id = 901)
>   -> Append (cost=0.00..26.86 rows=4 width=86) (actual time=1132.325..1132.328 rows=1 loops=1)
>     -> Seq Scan on sb_logs (cost=0.00..0.00 rows=1 width=90) (actual time=0.002..0.002 rows=0 loops=1)
>         Filter: ((device_id = 901) AND (date_taken = $0))
>     -> Index Scan using sb_logs_2012_on_date_taken_and_device_id on sb_logs_2012 sb_logs (cost=0.00..10.20 rows=1 width=90) (actual time=1132.314..1132.314 rows=0 loops=1)
>         Index Cond: ((date_taken = $0) AND (device_id = 901))
>     -> Index Scan using sb_logs_2013_on_date_taken_and_device_id on sb_logs_2013 sb_logs (cost=0.00..8.39 rows=1 width=91) (actual time=0.007..0.008 rows=1 loops=1)
>         Index Cond: ((date_taken = $0) AND (device_id = 901))
>     -> Index Scan using sb_logs_2014_on_date_taken_and_device_id on sb_logs_2014 sb_logs (cost=0.00..8.27 rows=1 width=72) (actual time=0.002..0.002 rows=0 loops=1)
>         Index Cond: ((date_taken = $0) AND (device_id = 901))
> Total runtime: 1132.436 ms
>
> I must find a better way to get that information, but I wonder if there
> could be a better plan. The same query over a table with the same structure
> but not partitioned gives a far better plan:
>
> Index Scan using index_iv_logs_on_date_taken_and_device_id on iv_logs (cost=12.35..21.88 rows=1 width=157) (actual time=0.065..0.066 rows=1 loops=1)
>   Index Cond: ((date_taken = $1) AND (device_id = 1475))
>   InitPlan 2 (returns $1)
>     -> Result (cost=12.34..12.35 rows=1 width=0) (actual time=0.059..0.059 rows=1 loops=1)
>       InitPlan 1 (returns $0)
>         -> Limit (cost=0.00..12.34 rows=1 width=8) (actual time=0.055..0.056 rows=1 loops=1)
>           -> Index Scan Backward using index_iv_logs_on_date_taken_and_device_id on iv_logs (cost=0.00..261052.53 rows=21154 width=8) (actual time=0.055..0.055 rows=1 loops=1)
>             Index Cond: ((date_taken IS NOT NULL) AND (device_id = 1475))
> Total runtime: 0.110 ms

let's see the query -- it's probably written in such a way so as to not be able to be optimized through CE.

merlin
On Tue, Jan 22, 2013 at 3:04 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> let's see the query -- it's probably written in such a way so as to
> not be able to be optimized through CE.

The query is pretty simple and standard; the behaviour (and the plan) is totally different when it comes to a partitioned table.

Partitioned table query => explain analyze SELECT "sb_logs".* FROM "sb_logs" WHERE (device_id = 901 AND date_taken = (SELECT MAX(date_taken) FROM sb_logs WHERE device_id = 901));

--
rd

This is the way the world ends.
Not with a bang, but a whimper.
On 01/22/2013 09:21 AM, rudi wrote:
> On Tue, Jan 22, 2013 at 3:04 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>
>     let's see the query -- it's probably written in such a way so as to
>     not be able to be optimized through CE.
>
> The query is pretty simple and standard, the behaviour (and the plan)
> is totally different when it comes to a partitioned table.
>
> Partitioned table query => explain analyze SELECT "sb_logs".* FROM
> "sb_logs" WHERE (device_id = 901 AND date_taken = (SELECT
> MAX(date_taken) FROM sb_logs WHERE device_id = 901));

And there you have it. Constraint exclusion does not work in cases like this. It only works with static expressions (such as a literal date in this case).

cheers

andrew
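One hedged workaround sketch along those lines (two queries instead of one, assuming the application can tolerate the extra round trip): fetch the MAX first, then feed the result back in as a literal so constraint exclusion sees a static value on date_taken:

    -- step 1: find the latest reading for the device
    SELECT MAX(date_taken) FROM sb_logs WHERE device_id = 901;
    -- step 2: re-issue the lookup with the result as a literal, e.g.:
    SELECT * FROM sb_logs
    WHERE device_id = 901 AND date_taken = '2013-01-22 09:21:00';

Note this only prunes partitions for the outer lookup; as Tom Lane points out below, on 9.0 the MAX() itself is the expensive part.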
In PG 9.2 I'm getting "Index Only Scan Backward" for every partition in the first part of the execution plan when looking for MAX in a partitioned table on a similar query:

  -> Index Only Scan Backward using pk_cycle_200610 on gp_cycle_200610 gp_cycle (cost=0.00..8.34 rows=5 width=8) (actual time=0.021..0.021 rows=1 loops=1)
        Index Cond: (cycle_date_time IS NOT NULL)
        Heap Fetches: 0

Maybe you should upgrade to 9.2.

Regards,
Igor Neyman
From: rudi [mailto:rudolone@gmail.com]
Sent: Tuesday, January 22, 2013 10:08 AM
To: pgsql-performance@postgresql.org
Subject: Re: High CPU usage after partitioning
On Tue, Jan 22, 2013 at 3:46 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>> The query is pretty simple and standard, the behaviour (and the plan) is
>> totally different when it comes to a partitioned table.
>>
>> Partitioned table query => explain analyze SELECT "sb_logs".* FROM
>> "sb_logs" WHERE (device_id = 901 AND date_taken = (SELECT MAX(date_taken)
>> FROM sb_logs WHERE device_id = 901));
>
> And there you have it. Constraint exclusion does not work in cases like
> this. It only works with static expressions (such as a literal date in
> this case).

Ok, but I would have expected the same plan repeated 4 times. When the table is not partitioned, the plan is definitely smarter: it knows that the index is reversed and looks for the max with an index scan backward. When the table is partitioned, it scans forward, and I guess it will always do a full index scan.
--
rd
This is the way the world ends.
Not with a bang, but a whimper.
Andrew Dunstan <andrew@dunslane.net> writes:
> On 01/22/2013 09:21 AM, rudi wrote:
>> The query is pretty simple and standard, the behaviour (and the plan)
>> is totally different when it comes to a partitioned table.
>>
>> Partitioned table query => explain analyze SELECT "sb_logs".* FROM
>> "sb_logs" WHERE (device_id = 901 AND date_taken = (SELECT
>> MAX(date_taken) FROM sb_logs WHERE device_id = 901));

> And there you have it. Constraint exclusion does not work in cases like
> this. It only works with static expressions (such as a literal date in
> this case).

This isn't about constraint exclusion, I think. The main problem is in the sub-select: 9.0 isn't able to index-optimize a MAX() across a partitioned table, for lack of MergeAppend, so you end up scanning lots of rows there. 9.1 or 9.2 should be better.

			regards, tom lane
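For readers stuck on 9.0, one hedged workaround sketch: ask each partition for its own MAX (each child can then use a backward index scan, as the non-partitioned plan above does) and take the greatest, rather than running MAX() over the parent. Partition names follow this thread's convention:

    SELECT MAX(m) FROM (
        SELECT MAX(date_taken) AS m FROM sb_logs_2012 WHERE device_id = 901
        UNION ALL
        SELECT MAX(date_taken) AS m FROM sb_logs_2013 WHERE device_id = 901
        UNION ALL
        SELECT MAX(date_taken) AS m FROM sb_logs_2014 WHERE device_id = 901
    ) AS per_partition;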