Thread: Statistics mismatch between n_live_tup and actual row count
Hi,

we're currently investigating a statistics issue on Postgres. We have some tables which frequently show up with strange values for n_live_tup. If you compare those values with a count on that particular table, there is a mismatch of factor 10-30. This causes the planner to come up with very bad plans (we also have this issue on bigger tables like the one below).

db=# SELECT relname, n_live_tup, last_analyze, last_autoanalyze FROM pg_stat_user_tables WHERE schemaname = 's' AND relname = 't';
 relname | n_live_tup |         last_analyze          |       last_autoanalyze
---------+------------+-------------------------------+-------------------------------
 t       |       7252 | 2011-12-08 03:00:02.556088+01 | 2011-12-01 18:29:00.536321+01

db=# SELECT COUNT(*) FROM s.t;
 count
-------
   280

The strange thing is, if we run an ANALYZE on this table, the statistics are good. 10 minutes later they are bad again. We suspect one of our processes which might do unnecessary (i.e. blind) updates. Can this be the cause of the statistics problem?

PostgreSQL 9.1.1 on x86_64-unknown-linux-gnu, compiled by gcc-4.6.real (Debian 4.6.2-4) 4.6.2, 64-bit

The migration to 9.1 did not fix this problem; we have already had it since 8.4.

I can provide all Postgres configuration, but I don't see anything we changed (compared to the default config) which might be related here.

Any clue appreciated here!

Best regards
Andy
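For illustration, a "blind" update of the kind suspected above might look like the following sketch; the id and status columns are made up, only the table s.t comes from the post:

-- Hypothetical no-op update: the row is rewritten even though nothing changes,
-- creating a new tuple version every time the process runs.
UPDATE s.t SET status = 'active' WHERE id = 42;

-- One common way to avoid the useless write is to update only when the value
-- actually differs:
UPDATE s.t SET status = 'active'
WHERE id = 42
  AND status IS DISTINCT FROM 'active';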
Andreas Brandl <ml@3.141592654.de> writes:
> we're currently investigating a statistics issue on postgres. We have
> some tables which frequently show up with strange values for
> n_live_tup. If you compare those values with a count on that
> particular table, there is a mismatch of factor 10-30. This causes the
> planner to come up with very bad plans (we also have this issue on
> bigger tables like the one below).

The planner doesn't use n_live_tup; the only thing that that's used for
is decisions about when to autovacuum/autoanalyze.  So you have two
problems here not one.

Can you provide a test case for the n_live_tup drift?  That is,
something that when done over and over causes n_live_tup to get further
and further from reality?

			regards, tom lane
Hi,

> Andreas Brandl <ml@3.141592654.de> writes:
> > we're currently investigating a statistics issue on postgres. We
> > have some tables which frequently show up with strange values for
> > n_live_tup. If you compare those values with a count on that
> > particular table, there is a mismatch of factor 10-30. This causes
> > the planner to come up with very bad plans (we also have this issue
> > on bigger tables like the one below).
>
> The planner doesn't use n_live_tup; the only thing that that's used
> for is decisions about when to autovacuum/autoanalyze. So you have
> two problems here not one.

So you're saying that having a mismatch between n_live_tup and the actual row count is not that much of a problem (besides influencing when to auto-vacuum/analyze), right?

I'm just curious: where does the planner take the (approximate) row count from?

> Can you provide a test case for the n_live_tup drift? That is,
> something that when done over and over causes n_live_tup to get
> further and further from reality?

I'll try to implement a minimal piece of code showing this, although I'm not sure if this will work.

Might there be a link between n_live_tup drifting and doing unnecessary (blind) updates, which do not change any information of a row?

Thank you!

Best regards
Andreas
Andreas Brandl <ml@3.141592654.de> writes:
>> The planner doesn't use n_live_tup;

> I'm just curious: where does the planner take the (approximate) row count from?

It uses the tuple density estimated by the last vacuum or analyze (viz,
reltuples/relpages) and multiplies that by the current relation size.
There are various reasons for not using n_live_tup, some historical and
some still pretty relevant.

> Might there be a link between n_live_tup drifting and doing unnecessary (blind) updates, which do not change any information of a row?

Possibly.  It's premature to speculate with no test case, but I'm
wondering if HOT updates confuse that arithmetic.  No-op updates
would follow the HOT path as long as there was room on the page...

			regards, tom lane
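As a rough sketch of the estimate described above, the density from the last VACUUM/ANALYZE (reltuples/relpages) can be scaled by the table's current size in pages and compared against n_live_tup. The 8192-byte block size is an assumption (the default); use current_setting('block_size') if yours differs:

-- Reconstruct the density-based row estimate for s.t and compare it with
-- the stats collector's live-tuple count.
SELECT c.relname,
       c.reltuples,
       c.relpages,
       pg_relation_size(c.oid) / 8192                    AS current_pages,
       round(c.reltuples / nullif(c.relpages, 0)
             * (pg_relation_size(c.oid) / 8192))         AS density_based_estimate,
       pg_stat_get_live_tuples(c.oid)                    AS n_live_tup
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 's' AND c.relname = 't';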
> Andreas Brandl <ml@3.141592654.de> writes:
> >> The planner doesn't use n_live_tup;
>
> > I'm just curious: where does the planner take the (approximate)
> > row count from?
>
> It uses the tuple density estimated by the last vacuum or analyze
> (viz, reltuples/relpages) and multiplies that by the current relation
> size. There are various reasons for not using n_live_tup, some
> historical and some still pretty relevant.

Thanks. I checked pg_class.reltuples against the corresponding count(*) and it shows the same drift: reltuples equals n_live_tup for our problematic tables and is way off the actual row count.

So I guess it's only one problem again (or it's likely that the problem of bad plans is correlated with the reltuples estimation).

> > Might there be a link between n_live_tup drifting and doing
> > unnecessary (blind) updates, which do not change any information of
> > a row?
>
> Possibly. It's premature to speculate with no test case, but I'm
> wondering if HOT updates confuse that arithmetic. No-op updates
> would follow the HOT path as long as there was room on the page...

I'll try to come up with a test case, if possible.

Thank you!

Best regards,
Andreas
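A minimal sketch of the kind of test case being asked for, assuming 9.0+ for the DO block (8.4 would need a function or a client-side loop); table and column names are invented, and the numbers are arbitrary:

-- Hammer a small table with no-op (HOT-eligible) updates, vacuum it, and see
-- whether reltuples/n_live_tup drift away from count(*).
CREATE TABLE drift_test (id int PRIMARY KEY, payload text);
INSERT INTO drift_test SELECT g, 'x' FROM generate_series(1, 1000) g;

DO $$
BEGIN
    FOR i IN 1..10000 LOOP
        -- blind update: the value never changes
        UPDATE drift_test SET payload = 'x' WHERE id = (i % 1000) + 1;
    END LOOP;
END
$$;
-- (Running the updates in separate transactions from a client script may
-- mimic the real workload better than a single DO block.)

VACUUM drift_test;

-- Note: the stats collector's n_live_tup may lag behind by a moment.
SELECT (SELECT count(*) FROM drift_test)  AS actual_rows,
       reltuples,
       pg_stat_get_live_tuples(oid)       AS n_live_tup
FROM pg_class
WHERE relname = 'drift_test';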
I am also seeing a drift in the n_live_tup value compared to the actual row count on the table, on PG 9.0.6.

It drifts after a vacuum; you can bring it back closer to the actual number by running ANALYZE several times, and you can lock it back into the right value with a VACUUM FULL. But then, if you run a vacuum, it shows an n_live_tup less than the actual rows in the table.

This is on a table seeing 80% HOT updates. It's a table that is heavily updated.

The server is not running autovacuum.

Tim
This drift gets more confusing.

My small table A (60K rows) is not being inserted to (except one or two rows); it is getting thousands of updates a minute. Analyze and vacuum on the table are running regularly. But sometimes, every time the vacuum runs, the reltuples value jumps up, sometimes bloating to 4 times the actual size of the table (I have seen 10 times).

The impact of this on the query plans is evident straight away. Joins from all of small table A to large table B (55 million rows) on B's primary key start seq scanning. Some sort of threshold is being hit where the planner believes that A is big enough that, given we are requesting all of A, we might as well scan all of B.

Also, even though the table has been clustered and even VACUUM FULL'ed, the relpages value continues to grow.

Any answers?
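For what it's worth, the plan flip described here can be observed directly with EXPLAIN; the table and column names below (small_a, big_b, big_b_id) are purely hypothetical stand-ins for the tables in question:

-- With an inflated reltuples on the small table, the join against the big
-- table may switch from an index-based plan to a sequential scan.
EXPLAIN
SELECT *
FROM small_a a
JOIN big_b b ON b.id = a.big_b_id;

-- Re-running after a manual ANALYZE of the small table shows whether a
-- corrected row estimate brings the index plan back.
ANALYZE small_a;
EXPLAIN
SELECT *
FROM small_a a
JOIN big_b b ON b.id = a.big_b_id;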
On 12/06/2012 06:13 PM, tim_wilson wrote:
> This drift gets more confusing.
>
> My small table A (60K rows) is not being inserted to (except one or two
> rows) it is getting thousands of updates a minute. Analyze and vacuum on the
> table are running regularly. But sometimes, every time the vacuum runs the
> reltuples value jumps up. Sometimes bloating by 4 times the actual size of
> the table (have seen 10 times)
>
> The impact of this on the query plans is evident straight away. Joins from
> all small A to large table B (55 million rows) on the large B's primary key
> start seq scanning.
>
> Some sort of threshold is being hit where the planner believes that the size
> of A is big enough that given we are requesting all of A we might as well
> scan all of B.
>
> Also even though the table has been clustered and even vacuum full'ed the
> relpages value continues to grow.
>
> Any answers?

1) FYI an update is a delete/insert, so you are actually getting
thousands of inserts a minute.

2) As to why the size is growing: are there open transactions holding
old row versions around?

--
Adrian Klaver
adrian.klaver@gmail.com
<http://postgresql.1045698.n5.nabble.com/file/n5735593/pg_drift.png>

86% of the updates are HOT updates!

The difference between the first and second line of the image above is that 366 updates happened, of which 299 were HOT, and a vacuum on the table was run. Look at the reltuples number change after the vacuum!

Open transactions... how do they impact those numbers?
On 12/06/2012 06:45 PM, tim_wilson wrote:
> <http://postgresql.1045698.n5.nabble.com/file/n5735593/pg_drift.png>
>
> 86% of the updates are HOT updates!
>
> The difference between the first and second line of the image above is that
> 366 updates happened, of which 299 were HOT, and a vacuum on the table was
> run. Look at the reltuples number change after the vacuum!
>
> Open transactions... how do they impact those numbers?

The very simple version: because of MVCC, a particular row may have many
tuple versions associated with it. If those tuple versions are visible to
open transactions, they may not be marked for removal until those
transactions are completed.

Others on this list are more capable than I of filling in the fine detail.

--
Adrian Klaver
adrian.klaver@gmail.com
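A quick way to check for such open transactions is pg_stat_activity; the sketch below uses the column names as they exist in 8.4-9.1 (procpid, current_query), while 9.2 renames them to pid and query and adds a state column:

-- List sessions with an open transaction, oldest first.
SELECT procpid,
       usename,
       xact_start,
       now() - xact_start AS xact_age,
       current_query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
ORDER BY xact_start;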
Hi,

I have questions about Linux write cache sizing:

1) /proc/sys/vm/dirty_ratio: current value (default) 20
2) /proc/sys/vm/dirty_background_ratio: current value (default) 10

I am using Ubuntu 12.04 (8GB RAM) and PostgreSQL 9.2.1. What values of these kernel ratios should be set for better PostgreSQL database performance?

Thanks
ac
tim_wilson <tim.wilson@telogis.com> writes:
> This drift gets more confusing.

In recent releases, reltuples (and relpages) are updated via a "moving
average" calculation that is meant to converge on the true value over
time.  The reason vacuum has to act that way is that it typically
doesn't scan the whole table anymore, but only the pages that have been
dirtied recently.  So it measures the tuple density in the pages it
scans, and updates the previous value more or less aggressively
depending on the percentage of the pages that it looked at.

It's possible that there's some bug in that algorithm, but what seems
more likely is that the pages that are getting dirtied are
nonrepresentative of the whole table for some reason.  Or maybe both.
Can you put together a self-contained test case showing this behavior?

			regards, tom lane
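One way to watch this moving-average behaviour in practice is to record the stored density before a vacuum, let VACUUM VERBOSE report how much of the table it actually scanned, and compare the estimate afterwards. The drift_test table is the hypothetical one from the earlier sketch:

-- Density estimate before the vacuum.
SELECT relname, reltuples, relpages FROM pg_class WHERE relname = 'drift_test';

-- VACUUM VERBOSE reports "found ... row versions in N out of M pages"; the
-- fewer pages it scanned, the more the new reltuples leans on the previous
-- density estimate rather than on what it just measured.
VACUUM VERBOSE drift_test;

-- Density estimate after the vacuum.
SELECT relname, reltuples, relpages FROM pg_class WHERE relname = 'drift_test';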
Thanks for the reply Tom, I will try to construct a test case.

I have been unable to replicate, in a simple test, the sort of updates that the table out in the wild is seeing, so that may bear on the issue of vacuum finding an unrepresentative sample. Will try harder!

When you say recent releases, does that include 8.4?
Hi,

On 7 December 2012 14:17, ac@hsk.hk <ac@hsk.hk> wrote:
> I have questions about Linux Write cache sizing:
>
> 1) /proc/sys/vm/dirty_ratio : current value (default) 20
> 2) /proc/sys/vm/dirty_background_ratio: current value (default) 10
>
> I am using Ubuntu 12.04 (8GB RAM) and PostgreSQL 9.2.1, what values of these kernel ratios
> should be set for better PostgreSQL database performance?

Read this thread:
http://archives.postgresql.org/pgsql-performance/2011-10/msg00076.php
Mainly,
- http://archives.postgresql.org/pgsql-performance/2011-10/msg00078.php
- and http://archives.postgresql.org/pgsql-performance/2011-10/msg00080.php

To sum up:
- with 8G RAM you should be fine with the defaults
- keep dirty_ratio lower than the size of your disk controller cache
- you can use dirty_bytes and dirty_background_bytes in recent kernels

--
Ondrej Ivanic
(http://www.linkedin.com/in/ondrejivanic)
tim_wilson <tim.wilson@telogis.com> writes:
> When you say recent releases, does that include 8.4?

8.4.9 and later --- anything including this commit:

Author: Tom Lane <tgl@sss.pgh.pa.us>
Branch: master Release: REL9_1_BR [b4b6923e0] 2011-05-30 17:06:52 -0400
Branch: REL9_0_STABLE Release: REL9_0_5 [73bd34c81] 2011-05-30 17:07:07 -0400
Branch: REL8_4_STABLE Release: REL8_4_9 [b503da135] 2011-05-30 17:07:19 -0400

    Fix VACUUM so that it always updates pg_class.reltuples/relpages.

    When we added the ability for vacuum to skip heap pages by consulting the
    visibility map, we made it just not update the reltuples/relpages
    statistics if it skipped any pages.  But this could leave us with
    extremely out-of-date stats for a table that contains any unchanging
    areas, especially for TOAST tables which never get processed by ANALYZE.
    In particular this could result in autovacuum making poor decisions about
    when to process the table, as in recent report from Florian Helmberger.
    And in general it's a bad idea to not update the stats at all.

    Instead, use the previous values of reltuples/relpages as an estimate of
    the tuple density in unvisited pages.  This approach results in a "moving
    average" estimate of reltuples, which should converge to the correct value
    over multiple VACUUM and ANALYZE cycles even when individual measurements
    aren't very good.

    This new method for updating reltuples is used by both VACUUM and ANALYZE,
    with the result that we no longer need the grotty interconnections that
    caused ANALYZE to not update the stats depending on what had happened in
    the parent VACUUM command.

    Also, fix the logic for skipping all-visible pages during VACUUM so that
    it looks ahead rather than behind to decide what to do, as per a
    suggestion from Greg Stark.  This eliminates useless scanning of
    all-visible pages at the start of the relation or just after a
    not-all-visible page.  In particular, the first few pages of the relation
    will not be invariably included in the scanned pages, which seems to help
    in not overweighting them in the reltuples estimate.

    Back-patch to 8.4, where the visibility map was introduced.

			regards, tom lane
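Before investing in a test case, it may be worth confirming whether the running server already includes that fix; a trivial check:

-- Show the full version string and the short server version.
SELECT version();
SHOW server_version;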
Hi, thank you very much!

On 7 Dec 2012, at 11:47 AM, Ondrej Ivanič wrote:

> Hi,
>
> On 7 December 2012 14:17, ac@hsk.hk <ac@hsk.hk> wrote:
>> I have questions about Linux Write cache sizing:
>>
>> 1) /proc/sys/vm/dirty_ratio : current value (default) 20
>> 2) /proc/sys/vm/dirty_background_ratio: current value (default) 10
>>
>> I am using Ubuntu 12.04 (8GB RAM) and PostgreSQL 9.2.1, what values of these kernel ratios
>> should be set for better PostgreSQL database performance?
>
> Read this thread:
> http://archives.postgresql.org/pgsql-performance/2011-10/msg00076.php
> Mainly,
> - http://archives.postgresql.org/pgsql-performance/2011-10/msg00078.php
> - and http://archives.postgresql.org/pgsql-performance/2011-10/msg00080.php
>
> To sum up:
> - with 8G RAM you should be fine with the defaults
> - keep dirty_ratio lower than the size of your disk controller cache
> - you can use dirty_bytes and dirty_background_bytes in recent kernels
>
> --
> Ondrej Ivanic
> (http://www.linkedin.com/in/ondrejivanic)
It seems that we are currently running 8.4.3 on the server where we are encountering the problem. We will upgrade to 8.4.9 and then come back with a test case if we still see the issue.

Thanks again for your help.