Thread: auto-vacuum & Negative "anl" Values
Greetings all,
I have been spending some time looking into how auto-vacuum is performing on one of our servers. After raising the PostgreSQL log level to DEBUG, I noticed that the threshold for ANALYZE was never being hit for a particular table because the calculated value becomes increasingly negative.
We have an entry in the pg_autovacuum table for a table that has quite frequent inserts and updates, but not deletes. Here are the values:
-[ RECORD 1 ]----+------
vacrelid | #####
enabled | t
vac_base_thresh | 500
vac_scale_factor | 0.1
anl_base_thresh | 200
anl_scale_factor | 0.05
vac_cost_delay | -1
vac_cost_limit | -1
I've noticed that the threshold for ANALYZE never gets met because the threshold for VACUUM is hit first, therefore resetting the counters. Here is a snippet of the log that shows what's happening:
DEBUG: tablename: vac: 961 (threshold 14217), anl: -9756 (threshold 7058)
DEBUG: tablename: vac: 1924 (threshold 14217), anl: -8792 (threshold 7058)
DEBUG: tablename: vac: 2953 (threshold 14217), anl: -7763 (threshold 7058)
DEBUG: tablename: vac: 3998 (threshold 14217), anl: -6718 (threshold 7058)
DEBUG: tablename: vac: 5170 (threshold 14217), anl: -5546 (threshold 7058)
DEBUG: tablename: vac: 6405 (threshold 14217), anl: -4311 (threshold 7058)
DEBUG: tablename: vac: 7635 (threshold 14217), anl: -3081 (threshold 7058)
DEBUG: tablename: vac: 8818 (threshold 14217), anl: -1898 (threshold 7058)
DEBUG: tablename: vac: 9917 (threshold 14217), anl: -798 (threshold 7058)
DEBUG: tablename: vac: 10987 (threshold 14217), anl: 272 (threshold 7058)
DEBUG: tablename: vac: 12016 (threshold 14217), anl: 1301 (threshold 7058)
DEBUG: tablename: vac: 12929 (threshold 14217), anl: 2214 (threshold 7058)
DEBUG: tablename: vac: 13717 (threshold 14217), anl: 3002 (threshold 7058)
DEBUG: tablename: vac: 14441 (threshold 14217), anl: 3727 (threshold 7058)
...vacuum threshold is hit...
DEBUG: tablename: vac: 752 (threshold 14217), anl: -9962 (threshold 7058)
DEBUG: tablename: vac: 1491 (threshold 14217), anl: -9223 (threshold 7058)
DEBUG: tablename: vac: 2213 (threshold 14217), anl: -8501 (threshold 7058)
DEBUG: tablename: vac: 2984 (threshold 14217), anl: -7730 (threshold 7058)
The outcome of this is that an ANALYZE is never run: after the counters are reset at each VACUUM, the ANALYZE counter restarts from a large negative value and never reaches its threshold before the next VACUUM. But as you can see from our entries in pg_autovacuum above, an ANALYZE should occur much more frequently than a VACUUM.
We're running PostgreSQL 8.1.4 on RHEL 3 with a 2.4.21-27.0.2.ELsmp kernel.
Any input explaining exactly what is happening here would be appreciated. Thanks so much!
--
Dylan Hansen
Enterprise Systems Developer
Dylan Hansen <dhansen@pixpo.com> writes:
> I have been spending some time looking into how auto-vacuum is
> performing on one of our servers. After putting the PostgreSQL logs
> in debug I noticed that the threshold for ANALYZE was never being hit
> for a particular table because the calculated value becomes
> increasingly negative.

Hmm, it shouldn't ever be negative at all, I would think. The
calculation in question is

	anltuples = tabentry->n_live_tuples + tabentry->n_dead_tuples -
		tabentry->last_anl_tuples;

Apparently somehow last_anl_tuples has managed to get to be bigger than
n_live_tuples, which maybe could happen after a delete. Should we be
clamping last_anl_tuples to not exceed n_live_tuples somewhere?
Alvaro and Matthew, what do you think?

			regards, tom lane
Tom Lane wrote:
> Dylan Hansen <dhansen@pixpo.com> writes:
>> I have been spending some time looking into how auto-vacuum is
>> performing on one of our servers. After putting the PostgreSQL logs
>> in debug I noticed that the threshold for ANALYZE was never being hit
>> for a particular table because the calculated value becomes
>> increasingly negative.
>
> Hmm, it shouldn't ever be negative at all, I would think. The
> calculation in question is
>
> 	anltuples = tabentry->n_live_tuples + tabentry->n_dead_tuples -
> 		tabentry->last_anl_tuples;
>
> Apparently somehow last_anl_tuples has managed to get to be bigger than
> n_live_tuples, which maybe could happen after a delete. Should we be
> clamping last_anl_tuples to not exceed n_live_tuples somewhere?
> Alvaro and Matthew, what do you think?

Hmm ... I'd think that the number of dead tuples plus live tuples should
never be smaller than the number of tuples seen at last analyze. Unless
some stats messages are lost (say, stop the server, start with stats
disabled, do a big DELETE, stop, restart normally).

I think there's a bug elsewhere. In pgstat_recv_tabstat, I see

	tabentry->n_live_tuples += tabmsg[i].t_tuples_inserted;
	tabentry->n_dead_tuples += tabmsg[i].t_tuples_updated +
		tabmsg[i].t_tuples_deleted;

But really n_live_tuples should be decreased by t_tuples_deleted as
well, so this should read

	tabentry->n_live_tuples += tabmsg[i].t_tuples_inserted -
		tabmsg[i].t_tuples_deleted;
	tabentry->n_dead_tuples += tabmsg[i].t_tuples_updated +
		tabmsg[i].t_tuples_deleted;

On the other hand, I don't see how this would explain the problem Dylan
is seeing, because the effect would be the inverse -- anltuples would be
erroneously calculated too high, so it wouldn't become negative.

I think we should catch a negative anltuples value, log a WARNING, and
force an analyze, because it's indicative of a bug.

I'm interested in what Dylan can tell us about the activity on this
table. What does pg_stat_user_tables say about it?
-- 
Alvaro Herrera                          http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane wrote:
>> Apparently somehow last_anl_tuples has managed to get to be bigger than
>> n_live_tuples, which maybe could happen after a delete. Should we be
>> clamping last_anl_tuples to not exceed n_live_tuples somewhere?
>> Alvaro and Matthew, what do you think?

> Hmm ... I'd think that the number of dead tuples plus live tuples should
> never be smaller than the number of tuples seen at last analyze.

The scenario I was imagining was big delete followed by
VACUUM-without-ANALYZE. In this situation, it looks to me like
pgstat_recv_vacuum updates n_live_tuples to the new smaller value and
doesn't do anything to last_anl_tuples. I'm thinking we need

	tabentry->n_live_tuples = msg->m_tuples;
	tabentry->n_dead_tuples = 0;
	if (msg->m_analyze)
	{
		tabentry->last_anl_tuples = msg->m_tuples;
		if (msg->m_autovacuum)
			tabentry->autovac_analyze_timestamp = msg->m_vacuumtime;
		else
			tabentry->analyze_timestamp = msg->m_vacuumtime;
	}
+	else
+	{
+		/* last_anl_tuples must never exceed n_live_tuples */
+		tabentry->last_anl_tuples = Min(tabentry->last_anl_tuples,
+										msg->m_tuples);
+	}

but perhaps I'm confused.

			regards, tom lane
Hi Tom, Alvaro. Thanks so much for your replies.
On 22-Jun-06, at 12:56 PM, Alvaro Herrera wrote:
> Hmm ... I'd think that the number of dead tuples plus live tuples
> should never be smaller than the number of tuples seen at last
> analyze. Unless some stats messages are lost (say, stop the server,
> start with stats disabled, do a big DELETE, stop, restart normally).
We had to restart the server to upgrade to 8.1.4 when it was released, but we have not restarted it since then. We haven't changed our stats gathering recently, although we've made some changes to auto-vacuum to run it more frequently (outlined in my first e-mail). Unfortunately, I do not have any data from before the upgrade to supply, as we only recently changed our logging to be more verbose.
> I'm interested in what Dylan can tell about the activity over this
> table? What does pg_stat_user_tables say about it?
Here's the pg_stat_user_tables entry for this table:
-[ RECORD 1 ]-+----------
relid | #####
schemaname | public
relname | tablename
seq_scan | 345
seq_tup_read | 46699390
idx_scan | 33731562
idx_tup_fetch | 171166937
n_tup_ins | 3359
n_tup_upd | 8205786
n_tup_del | 0
As you can see, the number of deleted tuples is 0, so I don't believe that it could be an issue with DELETEs on the table because there aren't any!
It has to be something that occurs when the VACUUM threshold is hit, because hitting it resets both the VACUUM and ANALYZE counters, and it decreases the ANALYZE counter to a negative value.
Yesterday we tried running the ANALYZE command and it did reset the counter to a non-negative value. However, the next time that the VACUUM threshold was reached, the ANALYZE counter was decreased to a negative number.
Thanks for looking into this, please let me know if there's any information I can provide!
--
Dylan Hansen
Enterprise Systems Developer
Tom Lane wrote:
> Dylan Hansen <dhansen@pixpo.com> writes:
>> I have been spending some time looking into how auto-vacuum is
>> performing on one of our servers. After putting the PostgreSQL logs
>> in debug I noticed that the threshold for ANALYZE was never being hit
>> for a particular table because the calculated value becomes
>> increasingly negative.
>
> Hmm, it shouldn't ever be negative at all, I would think. The
> calculation in question is
>
> 	anltuples = tabentry->n_live_tuples + tabentry->n_dead_tuples -
> 		tabentry->last_anl_tuples;
>
> Apparently somehow last_anl_tuples has managed to get to be bigger than
> n_live_tuples, which maybe could happen after a delete. Should we be
> clamping last_anl_tuples to not exceed n_live_tuples somewhere?
> Alvaro and Matthew, what do you think?

I think I had something in the contrib version that checked this. I
always assumed it would be caused by a stats reset, which was more
common in earlier PGSQL releases since stats_reset_on_startup (or
whatever the correct spelling of that is) was enabled by default.
So can I assume that this is a bug?
The only resolution I can see right now is to set up a cron job that will perform an ANALYZE periodically, as the pg_autovacuum ANALYZE threshold is never reached.
Any other suggestions? Thanks for the input!
--
Dylan Hansen
Enterprise Systems Developer
On 24-Jun-06, at 4:09 PM, Matthew T. O'Connor wrote:
> Tom Lane wrote:
>> Dylan Hansen <dhansen@pixpo.com> writes:
>>> I have been spending some time looking into how auto-vacuum is
>>> performing on one of our servers. After putting the PostgreSQL logs
>>> in debug I noticed that the threshold for ANALYZE was never being hit
>>> for a particular table because the calculated value becomes
>>> increasingly negative.
>>
>> Hmm, it shouldn't ever be negative at all, I would think. The
>> calculation in question is
>>
>> 	anltuples = tabentry->n_live_tuples + tabentry->n_dead_tuples -
>> 		tabentry->last_anl_tuples;
>>
>> Apparently somehow last_anl_tuples has managed to get to be bigger than
>> n_live_tuples, which maybe could happen after a delete. Should we be
>> clamping last_anl_tuples to not exceed n_live_tuples somewhere?
>> Alvaro and Matthew, what do you think?
>
> I think I had something in the contrib version that checked this. I
> always assumed it would be caused by a stats reset which was more
> common in earlier PGSQL releases since stats_reset_on_startup (or
> whatever the correct spelling of that is) was enabled by default.
Dylan Hansen wrote:
> So can I assume that this is a bug?

Definitely a bug.

> The only resolution I can see right now is to setup a cron job that
> will perform an ANALYZE periodically, as the pg_autovacuum ANALYZE
> threshold is never reached.
>
> Any other suggestions? Thanks for the input!

I just committed a fix, so the other alternative is to get a CVS
checkout from the 8.1 branch and run it to see if it fixes your
problem. The relevant patch is below.

Index: src/backend/postmaster/pgstat.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/postmaster/pgstat.c,v
retrieving revision 1.111.2.3
diff -c -p -r1.111.2.3 pgstat.c
*** src/backend/postmaster/pgstat.c	19 May 2006 15:15:38 -0000	1.111.2.3
--- src/backend/postmaster/pgstat.c	27 Jun 2006 03:36:03 -0000
*************** pgstat_recv_vacuum(PgStat_MsgVacuum *msg
*** 2919,2924 ****
--- 2919,2930 ----
  	tabentry->n_dead_tuples = 0;
  	if (msg->m_analyze)
  		tabentry->last_anl_tuples = msg->m_tuples;
+ 	else
+ 	{
+ 		/* last_anl_tuples must never exceed n_live_tuples */
+ 		tabentry->last_anl_tuples = Min(tabentry->last_anl_tuples,
+ 										msg->m_tuples);
+ 	}
  }

*************** pgstat_recv_tabstat(PgStat_MsgTabstat *m
*** 3055,3061 ****
  		tabentry->tuples_updated += tabmsg[i].t_tuples_updated;
  		tabentry->tuples_deleted += tabmsg[i].t_tuples_deleted;

! 		tabentry->n_live_tuples += tabmsg[i].t_tuples_inserted;
  		tabentry->n_dead_tuples += tabmsg[i].t_tuples_updated +
  			tabmsg[i].t_tuples_deleted;
--- 3061,3068 ----
  		tabentry->tuples_updated += tabmsg[i].t_tuples_updated;
  		tabentry->tuples_deleted += tabmsg[i].t_tuples_deleted;

! 		tabentry->n_live_tuples += tabmsg[i].t_tuples_inserted -
! 			tabmsg[i].t_tuples_deleted;
  		tabentry->n_dead_tuples += tabmsg[i].t_tuples_updated +
  			tabmsg[i].t_tuples_deleted;

-- 
Alvaro Herrera                          http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes:
> + 		/* last_anl_tuples must never exceed n_live_tuples */

If we actually believe the above statement, it seems like your patch to
pgstat_recv_tabstat() opens a new issue: with that patch, it is possible
for pgstat_recv_tabstat() to decrease n_live_tuples, and therefore a
clamp needs to be applied in pgstat_recv_tabstat() too. No?

The reason I didn't patch it myself is that I'm not quite clear on what
*should* be happening here. What effect should a large delete have on
the ANALYZE threshold, exactly? You could argue that a deletion
potentially changes the statistics (by omission), and therefore inserts,
updates, and deletes should equally count +1 towards the analyze
threshold. I don't think we are implementing that though. If we want
to do it that way, I suspect last_anl_tuples as currently defined is not
the right comparison point.

			regards, tom lane
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> + 		/* last_anl_tuples must never exceed n_live_tuples */
>
> If we actually believe the above statement, it seems like your patch
> to pgstat_recv_tabstat() opens a new issue: with that patch, it is
> possible for pgstat_recv_tabstat() to decrease n_live_tuples, and
> therefore a clamp needs to be applied in pgstat_recv_tabstat() too.
> No?

Hmm, yeah.

> The reason I didn't patch it myself is that I'm not quite clear on what
> *should* be happening here. What effect should a large delete have on
> the ANALYZE threshold, exactly? You could argue that a deletion
> potentially changes the statistics (by omission), and therefore inserts,
> updates, and deletes should equally count +1 towards the analyze
> threshold. I don't think we are implementing that though. If we want
> to do it that way, I suspect last_anl_tuples as currently defined is not
> the right comparison point.

Maybe what we should do is revert the pgstat_recv_tabstat() part of the
patch in 8.1, and consider redefining last_anl_tuples in HEAD. Caffeine
is not high enough yet to propose anything sensible, but I'll think
about it a bit later.

-- 
Alvaro Herrera                          http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Hi Tom, Alvaro,
Thanks for your work on this. Please keep me posted as to which version in CVS this fix will be applied to and I will do my best to test it.
Thanks again!
--
Dylan Hansen
Enterprise Systems Developer
On 27-Jun-06, at 5:42 AM, Alvaro Herrera wrote:
> Tom Lane wrote:
>> Alvaro Herrera <alvherre@commandprompt.com> writes:
>>> + 		/* last_anl_tuples must never exceed n_live_tuples */
>>
>> If we actually believe the above statement, it seems like your patch
>> to pgstat_recv_tabstat() opens a new issue: with that patch, it is
>> possible for pgstat_recv_tabstat() to decrease n_live_tuples, and
>> therefore a clamp needs to be applied in pgstat_recv_tabstat() too.
>> No?
>
> Hmm, yeah.
>
>> The reason I didn't patch it myself is that I'm not quite clear on what
>> *should* be happening here. What effect should a large delete have on
>> the ANALYZE threshold, exactly? You could argue that a deletion
>> potentially changes the statistics (by omission), and therefore inserts,
>> updates, and deletes should equally count +1 towards the analyze
>> threshold. I don't think we are implementing that though. If we want
>> to do it that way, I suspect last_anl_tuples as currently defined is not
>> the right comparison point.
>
> Maybe what we should do is revert the pgstat_recv_tabstat() part of the
> patch in 8.1, and consider redefining last_anl_tuples in HEAD. Caffeine
> is not high enough yet to propose anything sensible, but I'll think
> about it a bit later.
Tom Lane wrote:
> The reason I didn't patch it myself is that I'm not quite clear on what
> *should* be happening here. What effect should a large delete have on
> the ANALYZE threshold, exactly? You could argue that a deletion
> potentially changes the statistics (by omission), and therefore inserts,
> updates, and deletes should equally count +1 towards the analyze
> threshold. I don't think we are implementing that though. If we want
> to do it that way, I suspect last_anl_tuples as currently defined is not
> the right comparison point.

Just as a point of reference, the old contrib pg_autovacuum counts
ins + upd + del against the analyze threshold, whereas the vacuum
threshold only compares against upd + del.