Thread: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means
Hi,

It seems to me that VACUUM and ANALYZE somewhat disagree on what exactly
reltuples means. VACUUM seems to think that

    reltuples = live + dead

while ANALYZE apparently believes that

    reltuples = live

This causes somewhat bizarre changes in the value, depending on which of
those commands was executed last.

To demonstrate the issue, let's create a simple table with 1M rows, delete
10% of the rows, and then do a bunch of VACUUM / ANALYZE runs while
checking reltuples, n_live_tup and n_dead_tup in the catalogs. I've
disabled autovacuum so that it won't interfere, and there's another
transaction blocking VACUUM from actually cleaning any dead tuples.

    test=# create table t as select i from generate_series(1,1000000) s(i);
    test=# select reltuples, n_live_tup, n_dead_tup
             from pg_stat_user_tables join pg_class using (relname)
            where relname = 't';
     reltuples | n_live_tup | n_dead_tup
    -----------+------------+------------
         1e+06 |    1000000 |          0

So, that's nice. Now let's delete 10% of the rows, and run VACUUM and
ANALYZE a few times.

    test=# delete from t where random() < 0.1;
    test=# vacuum t;
    test=# select reltuples, n_live_tup, n_dead_tup
             from pg_stat_user_tables join pg_class using (relname)
            where relname = 't';
     reltuples | n_live_tup | n_dead_tup
    -----------+------------+------------
         1e+06 |     900413 |      99587

    test=# analyze t;
     reltuples | n_live_tup | n_dead_tup
    -----------+------------+------------
        900413 |     900413 |      99587

    test=# vacuum t;
     reltuples | n_live_tup | n_dead_tup
    -----------+------------+------------
         1e+06 |     900413 |      99587

So, ANALYZE and VACUUM disagree. To further confuse the poor DBA, VACUUM
always simply ignores the old value, while ANALYZE combines the old and
new values on large tables (and converges to the "correct" value after a
few steps). This table is small (less than 30k pages), so ANALYZE does not
do that here.

This is quite annoying, because people tend to look at reltuples while
investigating bloat (e.g. because the check_postgres query mentioned on
our wiki [1] uses reltuples in the formula).

[1] https://wiki.postgresql.org/wiki/Show_database_bloat

And when the cleanup is blocked for some reason (as in the example above),
VACUUM tends to run much more often (because it can't clean up anything),
so reltuples tends to get set to the higher value, which I'd argue is the
wrong value for estimating bloat.

I haven't looked at the code yet, but I've confirmed this happens on both
9.6 and 10. I haven't checked older versions, but I guess those are
affected too.

The question is - which of the reltuples definitions is the right one?
I've always assumed that "reltuples = live + dead" but perhaps not?

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> It seems to me that VACUUM and ANALYZE somewhat disagree on what exactly
> reltuples means. VACUUM seems to be thinking that
>     reltuples = live + dead
> while ANALYZE apparently believes that
>     reltuples = live

> The question is - which of the reltuples definitions is the right one?
> I've always assumed that "reltuples = live + dead" but perhaps not?

I think the planner basically assumes that reltuples is the live tuple
count, so maybe we'd better change VACUUM to get in step.

			regards, tom lane
On 7/25/17 12:55 AM, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> It seems to me that VACUUM and ANALYZE somewhat disagree on what
>> exactly reltuples means. VACUUM seems to be thinking that reltuples
>> = live + dead while ANALYZE apparently believes that reltuples =
>> live
>
>> The question is - which of the reltuples definitions is the right
>> one? I've always assumed that "reltuples = live + dead" but perhaps
>> not?
>
> I think the planner basically assumes that reltuples is the live
> tuple count, so maybe we'd better change VACUUM to get in step.
>

Attached is a patch that (I think) does just that. The disagreement was
caused by VACUUM treating recently dead tuples as live, while ANALYZE
treats both of those as dead.

At first I was worried that this will negatively affect plans in the
long-running transaction, as it will get underestimates (due to reltuples
not including rows it can see). But that's a problem we already have
anyway, you just need to run ANALYZE in the other session.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> On 7/25/17 12:55 AM, Tom Lane wrote:
>> I think the planner basically assumes that reltuples is the live
>> tuple count, so maybe we'd better change VACUUM to get in step.

> Attached is a patch that (I think) does just that. The disagreement was
> caused by VACUUM treating recently dead tuples as live, while ANALYZE
> treats both of those as dead.

> At first I was worried that this will negatively affect plans in the
> long-running transaction, as it will get underestimates (due to
> reltuples not including rows it can see). But that's a problem we
> already have anyway, you just need to run ANALYZE in the other session.

This definitely will have some impact on plans, at least in cases where
there's a significant number of unvacuumable dead tuples. So I think
it's a bit late for v10, and I wouldn't want to back-patch at all.
Please add to the next commitfest.

			regards, tom lane
On 7/25/17 5:04 PM, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> On 7/25/17 12:55 AM, Tom Lane wrote:
>>> I think the planner basically assumes that reltuples is the live
>>> tuple count, so maybe we'd better change VACUUM to get in step.
>
>> Attached is a patch that (I think) does just that. The disagreement
>> was caused by VACUUM treating recently dead tuples as live, while
>> ANALYZE treats both of those as dead.
>
>> At first I was worried that this will negatively affect plans in
>> the long-running transaction, as it will get underestimates (due
>> to reltuples not including rows it can see). But that's a problem
>> we already have anyway, you just need to run ANALYZE in the other
>> session.
>
> This definitely will have some impact on plans, at least in cases
> where there's a significant number of unvacuumable dead tuples. So I
> think it's a bit late for v10, and I wouldn't want to back-patch at
> all. Please add to the next commitfest.
>

I dare to disagree here, for two reasons.

Firstly, the impact *is* already there, it only takes running ANALYZE
(or VACUUM ANALYZE). In both those cases we already end up with
reltuples = n_live_tup.

Secondly, I personally strongly prefer stable, predictable behavior over
intermittent oscillation between two values. That's a major PITA on
production systems, both to investigate and to fix.

So people already have this issue, although it only strikes randomly,
and there is no way to fix it (well, except for fixing the cleanup, but
that may not be possible).

It is true we tend to run VACUUM more often than ANALYZE, particularly
in situations where the cleanup can't proceed - ANALYZE will do its work
while VACUUM gets triggered over and over again, so it "wins" this way.
But I'm not sure that's something we should rely on.

FWIW I personally see this as a fairly annoying bug, and would vote to
backpatch it, although I understand people might object. But I don't
quite see a reason not to fix this in v10.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Jul 25, 2017 at 07:02:28PM +0200, Tomas Vondra wrote:
> On 7/25/17 5:04 PM, Tom Lane wrote:
> >Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> >>Attached is a patch that (I think) does just that. The disagreement
> >>was caused by VACUUM treating recently dead tuples as live, while
> >>ANALYZE treats both of those as dead.
> >
> >>At first I was worried that this will negatively affect plans in
> >>the long-running transaction, as it will get underestimates (due
> >>to reltuples not including rows it can see). But that's a problem
> >>we already have anyway, you just need to run ANALYZE in the other
> >>session.
> >
> >This definitely will have some impact on plans, at least in cases
> >where there's a significant number of unvacuumable dead tuples. So I
> >think it's a bit late for v10, and I wouldn't want to back-patch at
> >all. Please add to the next commitfest.
>
> I dare to disagree here, for two reasons.
>
> Firstly, the impact *is* already there, it only takes running ANALYZE. Or
> VACUUM ANALYZE. In both those cases we already end up with
> reltuples=n_live_tup.
>
> Secondly, I personally strongly prefer stable predictable behavior over
> intermittent oscillations between two values. That's a major PITA on
> production, both to investigate and fix.

> FWIW I personally see this as a fairly annoying bug, and would vote to
> backpatch it, although I understand people might object.

I tend to agree. If you have a setup that somehow never uses ANALYZE or
never uses VACUUM on a particular table, you might prefer today's
(accidental) behavior. However, the discrepancy arises only on a table
that gets dead tuples, and a table that gets dead tuples will see both
VACUUM and ANALYZE soon enough. It does seem like quite a stretch to
imagine someone wanting plans to depend on which of VACUUM or ANALYZE
ran most recently.
On Tue, Jul 25, 2017 at 9:33 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> On 7/25/17 12:55 AM, Tom Lane wrote:
>> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>>> It seems to me that VACUUM and ANALYZE somewhat disagree on what
>>> exactly reltuples means. VACUUM seems to be thinking that reltuples
>>> = live + dead while ANALYZE apparently believes that reltuples =
>>> live
>>>
>>> The question is - which of the reltuples definitions is the right
>>> one? I've always assumed that "reltuples = live + dead" but perhaps
>>> not?
>>
>> I think the planner basically assumes that reltuples is the live
>> tuple count, so maybe we'd better change VACUUM to get in step.
>
> Attached is a patch that (I think) does just that. The disagreement was
> caused by VACUUM treating recently dead tuples as live, while ANALYZE
> treats both of those as dead.
>
> At first I was worried that this will negatively affect plans in the
> long-running transaction, as it will get underestimates (due to
> reltuples not including rows it can see). But that's a problem we
> already have anyway, you just need to run ANALYZE in the other session.
Thanks for the patch.
From the mail, I understand that this patch tries to improve how the
vacuum command updates the reltuples value in the catalog table, so that
it counts the properly visible tuples the same way the analyze command
does.
- num_tuples);
+ num_tuples - nkeep);
With the above correction, there is a problem in reporting the number
of live tuples to the stats.
postgres=# select reltuples, n_live_tup, n_dead_tup
from pg_stat_user_tables join pg_class using (relname)
where relname = 't';
reltuples | n_live_tup | n_dead_tup
-----------+------------+------------
899818 | 799636 | 100182
(1 row)
The live-tuple value is decremented by the dead-tuple count a second
time before being sent to the stats collector, in lazy_vacuum_rel():

    /* report results to the stats collector, too */
    new_live_tuples = new_rel_tuples - vacrelstats->new_dead_tuples;

So the fix needs a correction here as well - or alternatively, apply the
correction in lazy_vacuum_rel() itself, before updating the catalog
table, the same way it is done for the stats.
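
To see the double subtraction in numbers, here is a minimal standalone
sketch (not PostgreSQL code; it assumes the table still holds 1,000,000
scanned tuples, as in the original 1M-row test case, and only borrows the
variable names from lazy_scan_heap() and lazy_vacuum_rel()):

    #include <stdio.h>

    int main(void)
    {
        /* numbers taken from the query output above; the scanned total of
         * 1,000,000 tuples is assumed from the original test table */
        double num_tuples      = 1000000;  /* tuples seen by lazy_scan_heap */
        double new_dead_tuples = 100182;   /* nkeep: dead but not removable */

        /* with the patch, lazy_scan_heap feeds (num_tuples - nkeep) into
         * vac_estimate_reltuples, so pg_class.reltuples becomes ~899818 */
        double new_rel_tuples  = num_tuples - new_dead_tuples;

        /* lazy_vacuum_rel then subtracts the dead tuples once more before
         * reporting to the stats collector, giving the 799636 in n_live_tup */
        double new_live_tuples = new_rel_tuples - new_dead_tuples;

        printf("reltuples = %.0f, n_live_tup = %.0f\n",
               new_rel_tuples, new_live_tuples);
        return 0;
    }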
While testing this patch, I found another problem that is not related to
the patch itself. When the vacuum command is executed multiple times on
a table with no dead rows, the reltuples value slowly decreases.
postgres=# select reltuples, n_live_tup, n_dead_tup
from pg_stat_user_tables join pg_class using (relname)
where relname = 't';
reltuples | n_live_tup | n_dead_tup
-----------+------------+------------
899674 | 899674 | 0
(1 row)
postgres=# vacuum t;
VACUUM
postgres=# select reltuples, n_live_tup, n_dead_tup
from pg_stat_user_tables join pg_class using (relname)
where relname = 't';
reltuples | n_live_tup | n_dead_tup
-----------+------------+------------
899622 | 899622 | 0
(1 row)
postgres=# vacuum t;
VACUUM
postgres=# select reltuples, n_live_tup, n_dead_tup
from pg_stat_user_tables join pg_class using (relname)
where relname = 't';
reltuples | n_live_tup | n_dead_tup
-----------+------------+------------
899570 | 899570 | 0
(1 row)
In lazy_scan_heap() we force a scan of the last page of the relation, so
that lazy_truncate_heap() can avoid taking the access exclusive lock
when there are tuples on the last page. Because of this, the
scanned_pages value will never be 0, and vac_estimate_reltuples()
will estimate the tuple count based on the number of tuples found on
that last page. This estimation keeps nudging reltuples downward.

I am wondering whether this problem really happens in real-world
scenarios, and whether it is worth producing a fix?
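
Either way, to illustrate how scanning just the last page nudges the
estimate down, here is a rough standalone sketch. The weighting formula
is only an approximation of what vac_estimate_reltuples() does in the
9.6/10 code, and both the page count (4,425) and the live-tuple count on
the last page (151) are assumed values chosen to reproduce the numbers
above, not taken from the actual test:

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        double old_rel_tuples = 899674;  /* current pg_class.reltuples */
        double total_pages    = 4425;    /* assumed relpages for ~900k int rows */
        double scanned_pages  = 1;       /* only the last heap page gets scanned */
        double scanned_tuples = 151;     /* assumed live tuples on that last page */

        /* move the old density toward the density of the scanned pages,
         * weighted by the fraction of the table that was actually scanned */
        double old_density = old_rel_tuples / total_pages;
        double new_density = scanned_tuples / scanned_pages;
        double multiplier  = scanned_pages / total_pages;
        double updated     = old_density + (new_density - old_density) * multiplier;

        /* each VACUUM repeats this, so the estimate drifts down by roughly
         * (old_density - new_density), i.e. ~52 tuples per run here */
        printf("new reltuples estimate: %.0f\n",
               floor(updated * total_pages + 0.5));
        return 0;
    }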
Regards,
Hari Babu
Fujitsu Australia
> On 06 Sep 2017, at 09:45, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>
> On Tue, Jul 25, 2017 at 9:33 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> On 7/25/17 12:55 AM, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> It seems to me that VACUUM and ANALYZE somewhat disagree on what
> exactly reltuples means. VACUUM seems to be thinking that reltuples
> = live + dead while ANALYZE apparently believes that reltuples =
> live
>
> The question is - which of the reltuples definitions is the right
> one? I've always assumed that "reltuples = live + dead" but perhaps
> not?
>
> I think the planner basically assumes that reltuples is the live
> tuple count, so maybe we'd better change VACUUM to get in step.
>
> Attached is a patch that (I think) does just that. The disagreement was
> caused by VACUUM treating recently dead tuples as live, while ANALYZE
> treats both of those as dead.
>
> At first I was worried that this will negatively affect plans in the
> long-running transaction, as it will get underestimates (due to
> reltuples not including rows it can see). But that's a problem we
> already have anyway, you just need to run ANALYZE in the other session.
>
> Thanks for the patch.
> From the mail, I understand that this patch tries to improve the
> reltuples value update in the catalog table by the vacuum command
> to consider the proper visible tuples similar like analyze command.
>
>     - num_tuples);
>     + num_tuples - nkeep);
>
> With the above correction, there is a problem in reporting the number
> of live tuples to the stats.
>
> postgres=# select reltuples, n_live_tup, n_dead_tup
>              from pg_stat_user_tables join pg_class using (relname)
>             where relname = 't';
>  reltuples | n_live_tup | n_dead_tup
> -----------+------------+------------
>     899818 |     799636 |     100182
> (1 row)
>
> The live tuples data value is again decremented with dead tuples
> value before sending them to stats in function lazy_vacuum_rel(),
>
>     /* report results to the stats collector, too */
>     new_live_tuples = new_rel_tuples - vacrelstats->new_dead_tuples;
>
> The fix needs a correction here also. Or change the correction in
> lazy_vacuum_rel() function itself before updating catalog table similar
> like stats.

This patch is marked Waiting for Author, have you had a chance to look
at this to address the comments in the above review?

cheers ./daniel
On 09/06/2017 09:45 AM, Haribabu Kommi wrote:
> On Tue, Jul 25, 2017 at 9:33 PM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>> On 7/25/17 12:55 AM, Tom Lane wrote:
>>> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>>>> It seems to me that VACUUM and ANALYZE somewhat disagree on what
>>>> exactly reltuples means. VACUUM seems to be thinking that reltuples
>>>> = live + dead while ANALYZE apparently believes that reltuples =
>>>> live
>>>>
>>>> The question is - which of the reltuples definitions is the right
>>>> one? I've always assumed that "reltuples = live + dead" but perhaps
>>>> not?
>>>
>>> I think the planner basically assumes that reltuples is the live
>>> tuple count, so maybe we'd better change VACUUM to get in step.
>>
>> Attached is a patch that (I think) does just that. The disagreement
>> was caused by VACUUM treating recently dead tuples as live, while
>> ANALYZE treats both of those as dead.
>>
>> At first I was worried that this will negatively affect plans in the
>> long-running transaction, as it will get underestimates (due to
>> reltuples not including rows it can see). But that's a problem we
>> already have anyway, you just need to run ANALYZE in the other session.
>
> Thanks for the patch.
> From the mail, I understand that this patch tries to improve the
> reltuples value update in the catalog table by the vacuum command
> to consider the proper visible tuples similar like analyze command.
>
>     - num_tuples);
>     + num_tuples - nkeep);
>
> With the above correction, there is a problem in reporting the number
> of live tuples to the stats.
>
> postgres=# select reltuples, n_live_tup, n_dead_tup
>              from pg_stat_user_tables join pg_class using (relname)
>             where relname = 't';
>  reltuples | n_live_tup | n_dead_tup
> -----------+------------+------------
>     899818 |     799636 |     100182
> (1 row)
>
> The live tuples data value is again decremented with dead tuples
> value before sending them to stats in function lazy_vacuum_rel(),
>
>     /* report results to the stats collector, too */
>     new_live_tuples = new_rel_tuples - vacrelstats->new_dead_tuples;
>
> The fix needs a correction here also. Or change the correction in
> lazy_vacuum_rel() function itself before updating catalog table similar
> like stats.
>

Ah, haven't noticed that for some reason - you're right, we estimate the
reltuples based on (num_tuples - nkeep), so it doesn't make much sense
to subtract nkeep again. Attached is v2 of the fix.

I've removed the subtraction from lazy_vacuum_rel(), leaving just

    new_live_tuples = new_rel_tuples;

and now it behaves as expected (no second subtraction). That means we
can get rid of new_live_tuples altogether (and the protection against
negative values), and use new_rel_tuples directly. Which pretty much
means that in this case

    (pg_class.reltuples == pg_stat_user_tables.n_live_tup)

but I guess that's fine, based on the initial discussion in this thread.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi,

Apologies, I forgot to respond to the second part of your message.

On 09/06/2017 09:45 AM, Haribabu Kommi wrote:
>
> While testing this patch, I found another problem that is not related to
> this patch. When the vacuum command is executed multiple times on
> a table with no dead rows, the reltuples value is slowly
> reducing.
>
> postgres=# select reltuples, n_live_tup, n_dead_tup
>              from pg_stat_user_tables join pg_class using (relname)
>             where relname = 't';
>  reltuples | n_live_tup | n_dead_tup
> -----------+------------+------------
>     899674 |     899674 |          0
> (1 row)
>
> postgres=# vacuum t;
> VACUUM
> postgres=# select reltuples, n_live_tup, n_dead_tup
>              from pg_stat_user_tables join pg_class using (relname)
>             where relname = 't';
>  reltuples | n_live_tup | n_dead_tup
> -----------+------------+------------
>     899622 |     899622 |          0
> (1 row)
>
> postgres=# vacuum t;
> VACUUM
> postgres=# select reltuples, n_live_tup, n_dead_tup
>              from pg_stat_user_tables join pg_class using (relname)
>             where relname = 't';
>  reltuples | n_live_tup | n_dead_tup
> -----------+------------+------------
>     899570 |     899570 |          0
> (1 row)
>
> In lazy_scan_heap() function, we force to scan the last page of the
> relation to avoid the access exclusive lock in lazy_truncate_heap if
> there are tuples in the last page. Because of this reason, the
> scanned_pages value will never be 0, so the vac_estimate_reltuples
> function will estimate the tuples based on the number of tuples from
> the last page of the relation. This estimation is leading to reduce
> the number of reltuples.
>

Hmmm, that's annoying. Perhaps we should not update the values in this
case, then? I mean, if we only scan the last page, how reliable can the
derived values be?

For the record - AFAICS this issue is unrelated to the patch (i.e. it's
not introduced by it).

> I am thinking whether this problem really happen in real world
> scenarios to produce a fix?
>

Not sure. Each VACUUM run decrements the value only a little bit, so
you'd have to run VACUUM many times to be actually bitten by it. For
people relying on autovacuum that won't happen, as it only runs on
tables with a certain number of dead tuples. So you'd have to be
running VACUUM in a loop (but not VACUUM ANALYZE, because that works
fine) from a script, or something like that.

That being said, fixing a bug is always a good thing I guess.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Sep 25, 2017 at 4:39 AM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
On 09/06/2017 09:45 AM, Haribabu Kommi wrote:
>
>
> On Tue, Jul 25, 2017 at 9:33 PM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>
> On 7/25/17 12:55 AM, Tom Lane wrote:
>
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>
> It seems to me that VACUUM and ANALYZE somewhat disagree on what
> exactly reltuples means. VACUUM seems to be thinking that
> reltuples
> = live + dead while ANALYZE apparently believes that reltuples =
> live
>
>
> The question is - which of the reltuples definitions is the
> right
> one? I've always assumed that "reltuples = live + dead" but
> perhaps
> not?
>
>
> I think the planner basically assumes that reltuples is the live
> tuple count, so maybe we'd better change VACUUM to get in step.
>
>
> Attached is a patch that (I think) does just that. The disagreement
> was caused by VACUUM treating recently dead tuples as live, while
> ANALYZE treats both of those as dead.
>
> At first I was worried that this will negatively affect plans in the
> long-running transaction, as it will get underestimates (due to
> reltuples not including rows it can see). But that's a problem we
> already have anyway, you just need to run ANALYZE in the other session.
>
>
> Thanks for the patch.
> From the mail, I understand that this patch tries to improve the
> reltuples value update in the catalog table by the vacuum command
> to consider the proper visible tuples similar like analyze command.
>
> -num_tuples);
> +num_tuples - nkeep);
>
> With the above correction, there is a problem in reporting the number
> of live tuples to the stats.
>
> postgres=# select reltuples, n_live_tup, n_dead_tup
> from pg_stat_user_tables join pg_class using (relname)
> where relname = 't';
> reltuples | n_live_tup | n_dead_tup
> -----------+------------+------------
> 899818 | 799636 | 100182
> (1 row)
>
>
> The live tuples data value is again decremented with dead tuples
> value before sending them to stats in function lazy_vacuum_rel(),
>
> /* report results to the stats collector, too */
> new_live_tuples = new_rel_tuples - vacrelstats->new_dead_tuples;
>
> The fix needs a correction here also. Or change the correction in
> lazy_vacuum_rel() function itself before updating catalog table similar
> like stats.
>
Ah, haven't noticed that for some reason - you're right, we estimate the
reltuples based on (num_tuples - nkeep), so it doesn't make much sense
to subtract nkeep again. Attached is v2 of the fix.
I've removed the subtraction from lazy_vacuum_rel(), leaving just
new_live_tuples = new_rel_tuples;
and now it behaves as expected (no second subtraction). That means we
can get rid of new_live_tuples altogether (and the protection against
negative values), and use new_rel_tuples directly.
Which pretty much means that in this case
(pg_class.reltuples == pg_stat_user_tables.n_live_tup)
but I guess that's fine, based on the initial discussion in this thread.
The changes are fine, and now the proper live tuple count is reported in
both pg_class and the stats. The other issue, where repeated vacuum
operations gradually decrease the reported number of live tuples, is not
related to this patch and can be handled separately.

I changed the patch status to ready for committer.
Regards,
Hari Babu
Fujitsu Australia
Haribabu Kommi <kommi.haribabu@gmail.com> writes: > The changes are fine and now it reports proper live tuples in both > pg_class and stats. The other issue of continuous vacuum operation > leading to decrease of number of live tuples is not related to this > patch and that can be handled separately. I did not like this patch too much, because it did nothing to fix the underlying problem of confusion between "live tuples" and "total tuples"; in fact, it made that worse, because now the comment on LVRelStats.new_rel_tuples is a lie. And there's at least one usage of that field value where I think we do want total tuples not live tuples: where we pass it to index AM cleanup functions. Indexes don't really care whether heap entries are live or not, only that they're there. Plus the VACUUM VERBOSE log message that uses the field is supposed to be reporting total tuples not live tuples. I hacked up the patch trying to make that better, finding along the way that contrib/pgstattuple shared in the confusion about what it was trying to count. Results attached. However, I'm not sure we're there yet, because there remains a fairly nasty discrepancy even once we've gotten everyone onto the same page about reltuples counting just live tuples: VACUUM and ANALYZE have different definitions of what's "live". In particular they do not treat INSERT_IN_PROGRESS and DELETE_IN_PROGRESS tuples the same. Should we try to do something about that? If so, what? It looks like ANALYZE's behavior is pretty tightly constrained, judging by the comments in acquire_sample_rows. Another problem is that it looks like CREATE INDEX will set reltuples to the total number of heap entries it chose to index, because that is what IndexBuildHeapScan counts. Maybe we should adjust that? regards, tom lane diff --git a/contrib/pgstattuple/pgstatapprox.c b/contrib/pgstattuple/pgstatapprox.c index 5bf0613..68211c6 100644 *** a/contrib/pgstattuple/pgstatapprox.c --- b/contrib/pgstattuple/pgstatapprox.c *************** statapprox_heap(Relation rel, output_typ *** 68,74 **** Buffer vmbuffer = InvalidBuffer; BufferAccessStrategy bstrategy; TransactionId OldestXmin; - uint64 misc_count = 0; OldestXmin = GetOldestXmin(rel, PROCARRAY_FLAGS_VACUUM); bstrategy = GetAccessStrategy(BAS_BULKREAD); --- 68,73 ---- *************** statapprox_heap(Relation rel, output_typ *** 153,177 **** tuple.t_tableOid = RelationGetRelid(rel); /* ! * We count live and dead tuples, but we also need to add up ! * others in order to feed vac_estimate_reltuples. */ switch (HeapTupleSatisfiesVacuum(&tuple, OldestXmin, buf)) { - case HEAPTUPLE_RECENTLY_DEAD: - misc_count++; - /* Fall through */ - case HEAPTUPLE_DEAD: - stat->dead_tuple_len += tuple.t_len; - stat->dead_tuple_count++; - break; case HEAPTUPLE_LIVE: stat->tuple_len += tuple.t_len; stat->tuple_count++; break; ! case HEAPTUPLE_INSERT_IN_PROGRESS: ! case HEAPTUPLE_DELETE_IN_PROGRESS: ! misc_count++; break; default: elog(ERROR, "unexpected HeapTupleSatisfiesVacuum result"); --- 152,172 ---- tuple.t_tableOid = RelationGetRelid(rel); /* ! * We follow VACUUM's lead in counting INSERT_IN_PROGRESS and ! * DELETE_IN_PROGRESS tuples as "live". */ switch (HeapTupleSatisfiesVacuum(&tuple, OldestXmin, buf)) { case HEAPTUPLE_LIVE: + case HEAPTUPLE_INSERT_IN_PROGRESS: + case HEAPTUPLE_DELETE_IN_PROGRESS: stat->tuple_len += tuple.t_len; stat->tuple_count++; break; ! case HEAPTUPLE_DEAD: ! case HEAPTUPLE_RECENTLY_DEAD: ! stat->dead_tuple_len += tuple.t_len; ! 
stat->dead_tuple_count++; break; default: elog(ERROR, "unexpected HeapTupleSatisfiesVacuum result"); *************** statapprox_heap(Relation rel, output_typ *** 184,191 **** stat->table_len = (uint64) nblocks * BLCKSZ; stat->tuple_count = vac_estimate_reltuples(rel, false, nblocks, scanned, ! stat->tuple_count + misc_count); /* * Calculate percentages if the relation has one or more pages. --- 179,190 ---- stat->table_len = (uint64) nblocks * BLCKSZ; + /* + * Extrapolate the live-tuple count to the whole table in the same way + * that VACUUM does. (XXX shouldn't we also extrapolate tuple_len?) + */ stat->tuple_count = vac_estimate_reltuples(rel, false, nblocks, scanned, ! stat->tuple_count); /* * Calculate percentages if the relation has one or more pages. diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index ef60a58..87bba31 100644 *** a/doc/src/sgml/catalogs.sgml --- b/doc/src/sgml/catalogs.sgml *************** SCRAM-SHA-256$<replaceable><iteration *** 1739,1746 **** <entry><type>float4</type></entry> <entry></entry> <entry> ! Number of rows in the table. This is only an estimate used by the ! planner. It is updated by <command>VACUUM</command>, <command>ANALYZE</command>, and a few DDL commands such as <command>CREATE INDEX</command>. </entry> --- 1739,1746 ---- <entry><type>float4</type></entry> <entry></entry> <entry> ! Number of live rows in the table. This is only an estimate used by ! the planner. It is updated by <command>VACUUM</command>, <command>ANALYZE</command>, and a few DDL commands such as <command>CREATE INDEX</command>. </entry> diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c index cbd6e9b..067034d 100644 *** a/src/backend/commands/vacuum.c --- b/src/backend/commands/vacuum.c *************** vacuum_set_xid_limits(Relation rel, *** 771,776 **** --- 771,779 ---- * we take the old value of pg_class.reltuples as a measurement of the * tuple density in the unscanned pages. * + * Note: scanned_tuples should count only *live* tuples, since + * pg_class.reltuples is defined that way. + * * This routine is shared by VACUUM and ANALYZE. */ double diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c index 6587db7..ed14db2 100644 *** a/src/backend/commands/vacuumlazy.c --- b/src/backend/commands/vacuumlazy.c *************** typedef struct LVRelStats *** 115,122 **** BlockNumber frozenskipped_pages; /* # of frozen pages we skipped */ BlockNumber tupcount_pages; /* pages whose tuples we counted */ double scanned_tuples; /* counts only tuples on tupcount_pages */ ! double old_rel_tuples; /* previous value of pg_class.reltuples */ double new_rel_tuples; /* new estimated total # of tuples */ double new_dead_tuples; /* new estimated total # of dead tuples */ BlockNumber pages_removed; double tuples_deleted; --- 115,123 ---- BlockNumber frozenskipped_pages; /* # of frozen pages we skipped */ BlockNumber tupcount_pages; /* pages whose tuples we counted */ double scanned_tuples; /* counts only tuples on tupcount_pages */ ! 
double old_live_tuples; /* previous value of pg_class.reltuples */ double new_rel_tuples; /* new estimated total # of tuples */ + double new_live_tuples; /* new estimated total # of live tuples */ double new_dead_tuples; /* new estimated total # of dead tuples */ BlockNumber pages_removed; double tuples_deleted; *************** lazy_vacuum_rel(Relation onerel, int opt *** 196,202 **** TransactionId xidFullScanLimit; MultiXactId mxactFullScanLimit; BlockNumber new_rel_pages; - double new_rel_tuples; BlockNumber new_rel_allvisible; double new_live_tuples; TransactionId new_frozen_xid; --- 197,202 ---- *************** lazy_vacuum_rel(Relation onerel, int opt *** 245,251 **** vacrelstats = (LVRelStats *) palloc0(sizeof(LVRelStats)); vacrelstats->old_rel_pages = onerel->rd_rel->relpages; ! vacrelstats->old_rel_tuples = onerel->rd_rel->reltuples; vacrelstats->num_index_scans = 0; vacrelstats->pages_removed = 0; vacrelstats->lock_waiter_detected = false; --- 245,251 ---- vacrelstats = (LVRelStats *) palloc0(sizeof(LVRelStats)); vacrelstats->old_rel_pages = onerel->rd_rel->relpages; ! vacrelstats->old_live_tuples = onerel->rd_rel->reltuples; vacrelstats->num_index_scans = 0; vacrelstats->pages_removed = 0; vacrelstats->lock_waiter_detected = false; *************** lazy_vacuum_rel(Relation onerel, int opt *** 311,321 **** * since then we don't know for certain that all tuples have a newer xmin. */ new_rel_pages = vacrelstats->rel_pages; ! new_rel_tuples = vacrelstats->new_rel_tuples; if (vacrelstats->tupcount_pages == 0 && new_rel_pages > 0) { new_rel_pages = vacrelstats->old_rel_pages; ! new_rel_tuples = vacrelstats->old_rel_tuples; } visibilitymap_count(onerel, &new_rel_allvisible, NULL); --- 311,321 ---- * since then we don't know for certain that all tuples have a newer xmin. */ new_rel_pages = vacrelstats->rel_pages; ! new_live_tuples = vacrelstats->new_live_tuples; if (vacrelstats->tupcount_pages == 0 && new_rel_pages > 0) { new_rel_pages = vacrelstats->old_rel_pages; ! new_live_tuples = vacrelstats->old_live_tuples; } visibilitymap_count(onerel, &new_rel_allvisible, NULL); *************** lazy_vacuum_rel(Relation onerel, int opt *** 327,333 **** vac_update_relstats(onerel, new_rel_pages, ! new_rel_tuples, new_rel_allvisible, vacrelstats->hasindex, new_frozen_xid, --- 327,333 ---- vac_update_relstats(onerel, new_rel_pages, ! new_live_tuples, new_rel_allvisible, vacrelstats->hasindex, new_frozen_xid, *************** lazy_vacuum_rel(Relation onerel, int opt *** 335,344 **** false); /* report results to the stats collector, too */ - new_live_tuples = new_rel_tuples - vacrelstats->new_dead_tuples; - if (new_live_tuples < 0) - new_live_tuples = 0; /* just in case */ - pgstat_report_vacuum(RelationGetRelid(onerel), onerel->rd_rel->relisshared, new_live_tuples, --- 335,340 ---- *************** lazy_scan_heap(Relation onerel, int opti *** 1275,1284 **** vacrelstats->new_dead_tuples = nkeep; /* now we can compute the new value for pg_class.reltuples */ ! vacrelstats->new_rel_tuples = vac_estimate_reltuples(onerel, false, ! nblocks, ! vacrelstats->tupcount_pages, ! num_tuples); /* * Release any remaining pin on visibility map page. --- 1271,1284 ---- vacrelstats->new_dead_tuples = nkeep; /* now we can compute the new value for pg_class.reltuples */ ! vacrelstats->new_live_tuples = vac_estimate_reltuples(onerel, false, ! nblocks, ! vacrelstats->tupcount_pages, ! num_tuples - nkeep); ! ! /* indexes probably care more about the total number of heap entries */ ! vacrelstats->new_rel_tuples = ! 
vacrelstats->new_live_tuples + vacrelstats->new_dead_tuples; /* * Release any remaining pin on visibility map page. *************** lazy_vacuum_index(Relation indrel, *** 1614,1620 **** ivinfo.analyze_only = false; ivinfo.estimated_count = true; ivinfo.message_level = elevel; ! ivinfo.num_heap_tuples = vacrelstats->old_rel_tuples; ivinfo.strategy = vac_strategy; /* Do bulk deletion */ --- 1614,1621 ---- ivinfo.analyze_only = false; ivinfo.estimated_count = true; ivinfo.message_level = elevel; ! /* We can only provide an approximate value of num_heap_tuples here */ ! ivinfo.num_heap_tuples = vacrelstats->old_live_tuples; ivinfo.strategy = vac_strategy; /* Do bulk deletion */ *************** lazy_cleanup_index(Relation indrel, *** 1645,1650 **** --- 1646,1652 ---- ivinfo.analyze_only = false; ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages); ivinfo.message_level = elevel; + /* Now we can provide a better estimate of total # of surviving tuples */ ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples; ivinfo.strategy = vac_strategy; -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Hi,

On 11/02/2017 08:15 PM, Tom Lane wrote:
> Haribabu Kommi <kommi.haribabu@gmail.com> writes:
>> The changes are fine and now it reports proper live tuples in both
>> pg_class and stats. The other issue of continuous vacuum operation
>> leading to decrease of number of live tuples is not related to this
>> patch and that can be handled separately.
>
> I did not like this patch too much, because it did nothing to fix
> the underlying problem of confusion between "live tuples" and "total
> tuples"; in fact, it made that worse, because now the comment on
> LVRelStats.new_rel_tuples is a lie. And there's at least one usage
> of that field value where I think we do want total tuples not live
> tuples: where we pass it to index AM cleanup functions. Indexes
> don't really care whether heap entries are live or not, only that
> they're there. Plus the VACUUM VERBOSE log message that uses the
> field is supposed to be reporting total tuples not live tuples.
>
> I hacked up the patch trying to make that better, finding along the
> way that contrib/pgstattuple shared in the confusion about what
> it was trying to count. Results attached.
>

Thanks for looking into this. I agree your patch is more consistent and
generally cleaner.

> However, I'm not sure we're there yet, because there remains a fairly
> nasty discrepancy even once we've gotten everyone onto the same page
> about reltuples counting just live tuples: VACUUM and ANALYZE have
> different definitions of what's "live". In particular they do not treat
> INSERT_IN_PROGRESS and DELETE_IN_PROGRESS tuples the same. Should we
> try to do something about that? If so, what? It looks like ANALYZE's
> behavior is pretty tightly constrained, judging by the comments in
> acquire_sample_rows.
>

Yeah :-(

For the record (and for people following this thread who are too lazy to
open analyze.c and search for the comments), the problem is that
acquire_sample_rows does not count HEAPTUPLE_INSERT_IN_PROGRESS as live
(and HEAPTUPLE_DELETE_IN_PROGRESS as dead), as it assumes the transaction
will send the stats at the end. Which is a correct assumption, but it
means that when there is a long-running transaction that inserted/deleted
many rows, the reltuples value will oscillate during VACUUM / ANALYZE
runs.

ISTM we need to unify those definitions, probably so that VACUUM adopts
what acquire_sample_rows does. I mean, if ANALYZE assumes that the stats
will be updated at the end of the transaction, why shouldn't VACUUM do
the same thing?

The one reason I can think of is that VACUUM is generally expected to run
longer than ANALYZE, so the assumption that large transactions complete
later is not quite reliable. Or is there some other reason why VACUUM
can't treat the in-progress tuples the same way?

> Another problem is that it looks like CREATE INDEX will set reltuples
> to the total number of heap entries it chose to index, because that
> is what IndexBuildHeapScan counts. Maybe we should adjust that?
>

You mean by only counting live tuples in IndexBuildHeapRangeScan,
following whatever definition we end up using in VACUUM/ANALYZE? Seems
like a good idea I guess, although CREATE INDEX is not as frequent as
the other commands.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> On 11/02/2017 08:15 PM, Tom Lane wrote:
>> However, I'm not sure we're there yet, because there remains a fairly
>> nasty discrepancy even once we've gotten everyone onto the same page
>> about reltuples counting just live tuples: VACUUM and ANALYZE have
>> different definitions of what's "live". In particular they do not treat
>> INSERT_IN_PROGRESS and DELETE_IN_PROGRESS tuples the same. Should we
>> try to do something about that? If so, what? It looks like ANALYZE's
>> behavior is pretty tightly constrained, judging by the comments in
>> acquire_sample_rows.

> ISTM we need to unify those definitions, probably so that VACUUM adopts
> what acquire_sample_rows does. I mean, if ANALYZE assumes that the stats
> will be updated at the end of transaction, why shouldn't VACUUM do the
> same thing?

That was the way I was leaning. I haven't thought very hard about the
implications, but as long as the change in VACUUM's behavior extends only
to the live-tuple count it reports, it seems like adjusting it couldn't
break anything too badly.

>> Another problem is that it looks like CREATE INDEX will set reltuples
>> to the total number of heap entries it chose to index, because that
>> is what IndexBuildHeapScan counts. Maybe we should adjust that?

> You mean by only counting live tuples in IndexBuildHeapRangeScan,
> following whatever definition we end up using in VACUUM/ANALYZE?

Right. One issue is that, as I mentioned, the index AMs probably want to
think about total-tuples-indexed not live-tuples; so for their purposes,
what IndexBuildHeapScan currently counts is the right thing. We need to
look and see if any AMs are actually using that value rather than just
silently passing it back. If they are, we might need to go to the
trouble of computing/returning two values.

			regards, tom lane
BTW see bug #14863 which is related to this:
https://postgr.es/m/CAEBTBzu5j_E1K1jb9OKwTZj98MPeM7V81-vadp5adRm=NhJnwA@mail.gmail.com

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sun, Nov 19, 2017 at 6:40 AM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> Thanks for looking into this. I agree your patch is more consistent and
> generally cleaner.

This is classified as a bug fix, and is marked as waiting on author. I
am moving it to next CF as work continues.
--
Michael
Tomas,

* Michael Paquier (michael.paquier@gmail.com) wrote:
> On Sun, Nov 19, 2017 at 6:40 AM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>> Thanks for looking into this. I agree your patch is more consistent and
>> generally cleaner.
>
> This is classified as a bug fix, and is marked as waiting on author. I
> am moving it to next CF as work continues.

This is still in waiting-on-author state; it'd be great to get your
thoughts on where this patch is and what the next steps are to move it
forward. If you need additional feedback or there needs to be more
discussion (perhaps with Tom) then maybe this should be in needs-review
instead, but it'd be great if you could review the discussion and patch
again and at least provide an update on it (last update from you was in
November).

Thanks!

Stephen
On 01/05/2018 05:25 AM, Stephen Frost wrote:
> Tomas,
>
> * Michael Paquier (michael.paquier@gmail.com) wrote:
>> On Sun, Nov 19, 2017 at 6:40 AM, Tomas Vondra
>> <tomas.vondra@2ndquadrant.com> wrote:
>>> Thanks for looking into this. I agree your patch is more consistent and
>>> generally cleaner.
>>
>> This is classified as a bug fix, and is marked as waiting on author. I
>> am moving it to next CF as work continues.
>
> This is still in waiting-on-author state; it'd be great to get your
> thoughts on where this patch is and what the next steps are to move
> it forward. If you need additional feedback or there needs to be
> more discussion (perhaps with Tom) then maybe this should be in
> needs-review instead, but it'd be great if you could review the
> discussion and patch again and at least provide an update on it (last
> update from you was in November).
>

Hmmm, I'm not sure, TBH. As I already mentioned, Tom's updated patch is
better than what I posted initially, and I agree with his approach to
the remaining issues he pointed out. But I somehow assumed that he's
already looking into that.

Tom, do you plan to look into this patch soon, or should I?

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> As I already mentioned, Tom's updated patch is better than what I posted
> initially, and I agree with his approach to the remaining issues he
> pointed out. But I somehow assumed that he's already looking into that.
> Tom, do you plan to look into this patch soon, or should I?

No, I thought you were going to run with those ideas. I have a lot of
other stuff on my plate ...

			regards, tom lane
On 01/08/2018 08:39 PM, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> As I already mentioned, Tom's updated patch is better than what I
>> posted initially, and I agree with his approach to the remaining
>> issues he pointed out. But I somehow assumed that he's already
>> looking into that. Tom, do you plan to look into this patch soon,
>> or should I?
>
> No, I thought you were going to run with those ideas. I have a lot
> of other stuff on my plate ...
>

OK, will do.

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi Tomas,

On 1/8/18 3:28 PM, Tomas Vondra wrote:
>
> On 01/08/2018 08:39 PM, Tom Lane wrote:
>> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>>> As I already mentioned, Tom's updated patch is better than what I
>>> posted initially, and I agree with his approach to the remaining
>>> issues he pointed out. But I somehow assumed that he's already
>>> looking into that. Tom, do you plan to look into this patch soon,
>>> or should I?
>>
>> No, I thought you were going to run with those ideas. I have a lot
>> of other stuff on my plate ...
>
> OK, will do.

What is the status of this patch? It's been waiting on author since last
November, though I can see there was some confusion over who was working
on it.

Given that it's a bug fix it would be good to see a patch for this CF,
or very soon after.

Thanks,
--
-David
david@pgmasters.net
On 03/05/2018 04:12 PM, David Steele wrote:
> Hi Tomas,
>
> On 1/8/18 3:28 PM, Tomas Vondra wrote:
>> On 01/08/2018 08:39 PM, Tom Lane wrote:
>>> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>>>> As I already mentioned, Tom's updated patch is better than what I
>>>> posted initially, and I agree with his approach to the remaining
>>>> issues he pointed out. But I somehow assumed that he's already
>>>> looking into that. Tom, do you plan to look into this patch soon,
>>>> or should I?
>>>
>>> No, I thought you were going to run with those ideas. I have a lot
>>> of other stuff on my plate ...
>>
>> OK, will do.
>
> What is the status of this patch? It's been waiting on author since last
> November, though I can see there was some confusion over who was working
> on it.
>
> Given that it's a bug fix it would be good to see a patch for this CF,
> or very soon after.
>

Yeah, it's the next thing on my TODO.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi,

So here is an updated version of the patch/fix, addressing the remaining
issues in v3 posted by Tom in November.

The 0001 part is actually a bugfix in the bloom and spgist index AMs,
which did something like this:

    reltuples = IndexBuildHeapScan(...)
    result->heap_tuples = result->index_tuples = reltuples;

That is, these two AMs simply used the number of heap rows for the index.
That does not work for partial indexes, of course, where the correct
index reltuples value is likely much lower.

0001 fixes this by tracking the number of actually indexed rows in the
build states, just like in the other index AMs. A VACUUM or ANALYZE will
fix the estimate, of course, but for tables that are not changing very
much it may take quite a while. So I think this is something we
definitely need to back-patch.

The 0002 part is the main part, unifying the definition of reltuples in
three main places:

a) acquire_sample_rows (ANALYZE)
b) lazy_scan_heap (VACUUM)
c) IndexBuildHeapRangeScan (CREATE INDEX)

As the ANALYZE case seems the most constrained, the other two places were
updated to use the same criteria for which rows to include in the
reltuples estimate:

* HEAPTUPLE_LIVE
* HEAPTUPLE_INSERT_IN_PROGRESS (same transaction)
* HEAPTUPLE_DELETE_IN_PROGRESS (not the same transaction)

This resolves the issue with oscillating reltuples estimates produced by
VACUUM and ANALYZE (with many non-removable dead tuples).

I've checked all IndexBuildHeapRangeScan callers, and none of them uses
the reltuples estimate for anything except passing it to
index_update_stats. Aside from the bug fixed in 0001, of course.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
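
For readers skimming the thread, here is a minimal standalone sketch of
the unified rule listed above. This is not the patch itself: the enum is
only a stand-in for PostgreSQL's HTSV_Result, and the two boolean
parameters stand in for the xmin/xmax checks against the current
transaction.

    #include <stdbool.h>
    #include <stdio.h>

    /* stand-ins for the HeapTupleSatisfiesVacuum() results discussed above */
    typedef enum
    {
        HEAPTUPLE_DEAD,
        HEAPTUPLE_LIVE,
        HEAPTUPLE_RECENTLY_DEAD,
        HEAPTUPLE_INSERT_IN_PROGRESS,
        HEAPTUPLE_DELETE_IN_PROGRESS
    } HTSV_Result;

    /* Should this tuple count toward reltuples, under the unified rule? */
    static bool
    counts_as_live(HTSV_Result state, bool inserted_by_us, bool deleted_by_us)
    {
        switch (state)
        {
            case HEAPTUPLE_LIVE:
                return true;
            case HEAPTUPLE_INSERT_IN_PROGRESS:
                /* only count rows our own transaction is inserting */
                return inserted_by_us;
            case HEAPTUPLE_DELETE_IN_PROGRESS:
                /* count rows some other transaction is deleting; if we are
                 * the deleter, treat them as already gone */
                return !deleted_by_us;
            default:
                /* HEAPTUPLE_DEAD and HEAPTUPLE_RECENTLY_DEAD don't count */
                return false;
        }
    }

    int main(void)
    {
        printf("%d\n", counts_as_live(HEAPTUPLE_INSERT_IN_PROGRESS, true, false));  /* 1 */
        printf("%d\n", counts_as_live(HEAPTUPLE_DELETE_IN_PROGRESS, false, true));  /* 0 */
        return 0;
    }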
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> 0001 fixes this by tracking the number of actually indexed rows in the
> build states, just like in the other index AMs.

> A VACUUM or ANALYZE will fix the estimate, of course, but for tables
> that are not changing very much it may take quite a while. So I think
> this is something we definitely need to back-patch.

Agreed, and done. I noticed a second bug in contrib/bloom, too: it'd
forget to write out the last index page if that contained only one tuple,
because the new-page code path in bloomBuildCallback failed to increment
the "count" field after clearing it.

> The 0002 part is the main part, unifying the definition of reltuples in
> three main places:

On to this part ...

			regards, tom lane
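
To make that failure mode concrete, here is a tiny self-contained sketch
of the bug pattern with made-up names - it is not the actual
contrib/bloom code, just the shape of the mistake:

    #include <stdbool.h>
    #include <stdio.h>

    #define PAGE_CAPACITY 3              /* tuples per page, made up for the demo */

    struct build_state { int count; };   /* tuples accumulated on current page */

    static void flush_page(struct build_state *st)
    {
        printf("write page holding %d tuple(s)\n", st->count);
        st->count = 0;
    }

    static void add_tuple(struct build_state *st, bool buggy)
    {
        if (st->count >= PAGE_CAPACITY)
        {
            flush_page(st);              /* page full: write it, start a new one */
            if (buggy)
                return;                  /* the tuple went onto the fresh page,
                                          * but the buggy path forgot to count it */
        }
        st->count++;
    }

    int main(void)
    {
        struct build_state st = { 0 };

        /* 4 tuples: the last page ends up holding exactly one tuple */
        for (int i = 0; i < 4; i++)
            add_tuple(&st, true);

        /* end of build: a page whose count stayed 0 never gets written out */
        if (st.count > 0)
            flush_page(&st);
        return 0;
    }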
I wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> The 0002 part is the main part, unifying the definition of reltuples
>> in three main places:

> On to this part ...

I've pushed 0002 with several corrections: it did not seem to me that
you'd correctly propagated what ANALYZE is doing into CREATE INDEX or
pgstattuple. Also, since one of the things VACUUM does with num_tuples
is to report it as "the total number of non-removable tuples", I thought
we'd better leave that calculation alone. I made the added code count
live tuples in a new variable live_tuples, instead.

			regards, tom lane
On 03/22/2018 08:51 PM, Tom Lane wrote:
> I wrote:
>> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>>> The 0002 part is the main part, unifying the definition of reltuples
>>> in three main places:
>
>> On to this part ...
>
> I've pushed 0002 with several corrections: it did not seem to me that
> you'd correctly propagated what ANALYZE is doing into CREATE INDEX or
> pgstattuple. Also, since one of the things VACUUM does with num_tuples
> is to report it as "the total number of non-removable tuples", I thought
> we'd better leave that calculation alone. I made the added code count
> live tuples in a new variable live_tuples, instead.
>

OK, makes sense. Thanks.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Just want to add for the archive that I happened to run across what
appears to be a 7-year-old report of (I think) both of these
vacuum/analyze bugs:

Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.
Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means

https://www.postgresql.org/message-id/flat/6373488.29.1323717222303.JavaMail.root%40store1.zcs.ext.wpsrv.net#6373488.29.1323717222303.JavaMail.root@store1.zcs.ext.wpsrv.net
On Mon, 23 Apr 2018 20:09:21 -0500
Justin Pryzby <pryzby@telsasoft.com> wrote:

> Just want to add for the archive that I happened to run across what appears to
> be a 7-year old report of (I think) both of these vacuum/analyze bugs:
>
> Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.
> Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means
>
> https://www.postgresql.org/message-id/flat/6373488.29.1323717222303.JavaMail.root%40store1.zcs.ext.wpsrv.net#6373488.29.1323717222303.JavaMail.root@store1.zcs.ext.wpsrv.net

Nice find. It does look like both, although, since the info is not in the
thread, it is not certain that relpages was large enough to hit the analyze
issue. Unless that was with the old default statistics target, in which case
it would be pretty easy to hit.

-dg

--
David Gould                   daveg@sonic.net
If simplicity worked, the world would be overrun with insects.