Thread: No stats after promoting standby?
I created a test standby that was recovering from our prod primary via
restored WAL files. When I broke recovery and opened up that test standby,
I saw that records in pg_stat_all_tables for tables that contain many
millions of rows (and have for years) were saying a few hundred rows, with
last_analyze and last_autoanalyze being null. I know this isn't the case
for pg_stat_all_tables on prod.
Do stats not automatically get preserved when a standby is promoted? Did
something go wrong here or should I know/expect to have to manually gather
DB stats after a failover (like we do with major upgrades)?
This is PG 9.2.22 on CentOS.
Don.
--
Don Seiler
www.seiler.us
On Mon, Dec 4, 2017 at 3:37 PM, Adrien Nayrat
wrote:
>
> Stats are deleted after promotion :
> " The stats file is deleted at the start of recovery, so stats from
> primary and
> standby will differ; this is considered a feature, not a bug."
> https://www.postgresql.org/docs/current/static/hot-standby.html
>
> Or : https://www.postgresql.org/message-id/31781.1508448417%
> 40sss.pgh.pa.us
>
> You have to do an ANALYZE just after promotion.
>
OK good to know. Interesting point but I think I disagree about the standby
having its own stats. Until replication is broken, wouldn't the data be
identical to the primary, and so the statistics should be identical as well?
In the meantime I'll know to manually analyze.
Don.
--
Don Seiler
www.seiler.us
On 12/04/2017 08:26 PM, Don Seiler wrote: > Do stats not automatically get preserved when a standby is promoted? Did > something go wrong here or should I know/expect to have to manually gather DB > stats after a failover (like we do with major upgrades)? Hi, Stats are deleted after promotion : " The stats file is deleted at the start of recovery, so stats from primary and standby will differ; this is considered a feature, not a bug." https://www.postgresql.org/docs/current/static/hot-standby.html Or : https://www.postgresql.org/message-id/31781.1508448417%40sss.pgh.pa.us You have to do an ANALYZE just after promotion. Regards, -- Adrien NAYRAT
Attachment
Don Seiler wrote: > On Mon, Dec 4, 2017 at 3:37 PM, Adrien Nayrat <adrien.nayrat@dalibo.com> > wrote: > > > > Stats are deleted after promotion : > > " The stats file is deleted at the start of recovery, so stats from > > primary and > > standby will differ; this is considered a feature, not a bug." > > https://www.postgresql.org/docs/current/static/hot-standby.html > OK good to know. Interesting point but I think I disagree about the standby > having its own stats. Until replication is broken, wouldn't the data be > identical to the primary, and so the statistics should be identical as well? Those stats aren't part of regular data, so no they shouldn't. The docs are telling half the truth when they say this is a not-bug-but-feature, though. It's true that we don't want to overwrite some of these counters, such as number of scans, number of tuples accessed, etc; but ideally we should keep things such as vacuum/analyze counts and last times, and also the counts of dead/live/ mod_since_analyze tuples, to keep autovacuum informed without requiring an ANALYZE. This seems pretty difficult to implement, though; as far as I know we haven't even discussed it. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Dec 4, 2017 at 4:02 PM, Alvaro Herrera
wrote:
> Those stats aren't part of regular data, so no they shouldn't.
>
I would consider object statistics to be part of a database, and a very
important part at that.
> The docs are telling half the truth when they say this is a
> not-bug-but-feature, though. It's true that we don't want to overwrite
> some of these counters, such as number of scans, number of tuples
> accessed, etc; but ideally we should keep things such as vacuum/analyze
> counts and last times, and also the counts of dead/live/
> mod_since_analyze tuples, to keep autovacuum informed without requiring
> an ANALYZE.
>
I'm not so worried about the time of last vacuum or analyze but the
statistics for optimizer usage would be necessary. Otherwise I would think
the time to complete a failover would have to include that "vacuumdb --all
--analyze-only" run as queries run before that could be very out-of-sorts
with no information on the data. Although perhaps I'm misunderstanding
where the optimizer gets its information from and these pg_stat_% tables
are not what I seem to think they are.
Anyway, sounds like you've all been over this one before. I'll just make a
note of it for my own reference for now.
Don.
--
Don Seiler
www.seiler.us
Don Seiler wrote: > On Mon, Dec 4, 2017 at 4:02 PM, Alvaro Herrera <alvherre@alvh.no-ip.org> > wrote: > > The docs are telling half the truth when they say this is a > > not-bug-but-feature, though. It's true that we don't want to overwrite > > some of these counters, such as number of scans, number of tuples > > accessed, etc; but ideally we should keep things such as vacuum/analyze > > counts and last times, and also the counts of dead/live/ > > mod_since_analyze tuples, to keep autovacuum informed without requiring > > an ANALYZE. > > I'm not so worried about the time of last vacuum or analyze but the > statistics for optimizer usage would be necessary. The optimizer stats are not lost on crash or promote actually; these are stored in pg_statistic (not in the stats collector) and *are* part of regular "data". > Otherwise I would think > the time to complete a failover would have to include that "vacuumdb --all > --analyze-only" run as queries run before that could be very out-of-sorts > with no information on the data. Although perhaps I'm misunderstanding > where the optimizer gets its information from and these pg_stat_% tables > are not what I seem to think they are. I think it's pretty common misunderstanding -- it's not easy to see the boundary between pg_statistic and pg_stat_* tables. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Dec 4, 2017 at 4:36 PM, Alvaro Herrera
wrote:
> The optimizer stats are not lost on crash or promote actually; these are
> stored in pg_statistic (not in the stats collector) and *are* part of
> regular "data".
>
> ...
>
> I think it's pretty common misunderstanding -- it's not easy to see the
> boundary between pg_statistic and pg_stat_* tables.
Thanks for this. I thought I remembered being corrected on this before
(probably on this very list) and just came to think of it at the end of my
last message.
So what, exactly ARE the pg_stat_* tables? Is that the data stored in the
files in stats_temp_directory? How do they differ from pg_statistic?
I've found this blog post that seems to answer that question with a "Yes":
https://blog.pgaddict.com/posts/the-two-kinds-of-stats-in-postgresql
--
Don Seiler
www.seiler.us
I'm pretty late on this, but as it seems you didn't get any answer...
2017-12-04 23:49 GMT+01:00 Don Seiler <don@seiler.us>:
On Mon, Dec 4, 2017 at 4:36 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:The optimizer stats are not lost on crash or promote actually; these are
stored in pg_statistic (not in the stats collector) and *are* part of
regular "data".
...
I think it's pretty common misunderstanding -- it's not easy to see the
boundary between pg_statistic and pg_stat_* tables.Thanks for this. I thought I remembered being corrected on this before (probably on this very list) and just came to think of it at the end of my last message.So what, exactly ARE the pg_stat_* tables?
Activity statistics, collected by the stats collector process.
Is that the data stored in the files in stats_temp_directory?
Yes.
How do they differ from pg_statistic?
Those in pg_statistic are data statistics. They're used by the planner, and collected by the ANALYZE SQL command.
I've found this blog post that seems to answer that question with a "Yes": https://blog.pgaddict.com/posts/the-two-kinds-of- stats-in-postgresql
+1
--
Guillaume.