Re: Auto-analyse on insert operations - Mailing list pgsql-general
From | Bertrand Roos |
---|---|
Subject | Re: Auto-analyse on insert operations |
Date | |
Msg-id | 563CDFC6.3010101@areal.fr Whole thread Raw |
In response to | Re: Auto-analyse on insert operations (Adrian Klaver <adrian.klaver@aklaver.com>) |
Responses |
Re: Auto-analyse on insert operations
|
List | pgsql-general |
Le 04/11/2015 16:57, Adrian Klaver a écrit : > On 11/04/2015 07:43 AM, Bertrand Roos wrote: >> >> >> Le 04/11/2015 14:55, Bill Moran a écrit : >>> On Wed, 4 Nov 2015 14:32:37 +0100 >>> Bertrand Roos <bertrand.roos@areal.fr> wrote: >>>> I try to configure auto-analyse task with postgresql 9.4. >>>> I have the following configuration (default configuration): >>>> track_counts = on >>>> autovacuum = on >>>> log_autovacuum_min_duration = -1 >>>> autovacuum_max_workers = 3 >>>> autovacuum_naptime = 300s >>>> autovacuum_vacuum_threshold = 50 >>>> autovacuum_analyze_threshold = 50 >>>> autovacuum_vacuum_scale_factor = 0.2 >>>> autovacuum_analyze_scale_factor = 0.2 >>>> autovacuum_freeze_max_age = 200000000 >>>> autovacuum_multixact_freeze_max_age = 400000000 >>>> autovacuum_vacuum_cost_delay = 20ms >>>> autovacuum_vacuum_cost_limit = -1 >>>> >>>> With this configuration, I can observe that some tables are >>>> auto-analysed, but some others are not. Even if there are millions of >>>> insert operations on an empty table (all tables are in cluster mode). >>>> In fact it seems that tables with update operations are the only ones >>>> that are auto-analysed. >>>> I'm quite suprised because the documentation says that daemon check >>>> the >>>> count of insert, update and delete operations. >>>> What could it be the reason ? Why tables which have only update >>>> operation, aren't analysed ? >>>> Are update operations really taken into account ? >>> Given that autoanalyze is pretty critical to the way the system >>> functions, >>> it's unlikely that it just doesn't work (someone else would have >>> noticed). >>> >>> A more likely scenario is that you've found some extremely obscure edge >>> case. If that's the case, you're going to have to give very specific >>> details as to how you're testing it before anyone is liable to be able >>> to help you. >>> >>> I get the impression that you're somewhat new to Postgres, in which >>> case >>> it's very likely that the problem is that you're not testing the >>> situation >>> correctly. In that case, we're going to need specific details on how >>> you're >>> observing that tables are or are not being analysed. >>> >>> As a wild-guess theory: the process that does the analyze only wakes up >>> to check tables every 5 minutes (based on the config you show) ... >>> so are >>> you doing the inserts then checking the table without leaving enough >>> time >>> in between for the system to wake up and notice the change? >>> >> Thanks for your answer Bill. >> Indeed, I'm pretty new to Postgres and I don't exclude that I'm doing >> something wrong. But I did my test on a more than 1 day duration, so >> it's not an issue of autovacuum_naptime (I insert 760 lignes each 30 >> seconds during 36 hours). >> I can't give all the details of this test because it is to complicated >> with triggers and partman (and your objective is not to solve >> configuration issues of others). >> >> In fact, I was telling the question because I have read on some forums >> that the auto vacuum deamon only count dead tuple so only update and >> delete operations can cause the scheduling of auto-analyse. >> So if it's the case it perfectly explain why my test case doesn't work. >> But in the other hand the documentation says that delete, update and >> insert operations are counted. >> Is it an know issue that insert operations are not counted for the >> trigger of auto-analyse ? > > No, see below: > > http://www.postgresql.org/docs/9.4/interactive/routine-vacuuming.html#AUTOVACUUM > > > 23.1.3. Updating Planner Statistics > > "The autovacuum daemon, if enabled, will automatically issue ANALYZE > commands whenever the content of a table has changed sufficiently. > However, administrators might prefer to rely on manually-scheduled > ANALYZE operations, particularly if it is known that update activity > on a table will not affect the statistics of "interesting" columns. > The daemon schedules ANALYZE strictly as a function of the number of > rows inserted or updated; it has no knowledge of whether that will > lead to meaningful statistical changes." > > > Is the partman you refer to this?: > > https://github.com/keithf4/pg_partman > > Can you give an outline view of what you are doing and how you are > determining the status of analyze? > >> >> If it's not, I can try to reproduce this weird behaviour with a simpler >> test and give you all the details of the test. >> >> Bertrand >> >> >> >> > > I have done multiple tests trying to reproduce my issue and finally find a partial explanation. My non-working use case is : With a database with configuration by default (autovacuum=on, autovacuum_max_workers=3 ...) A table ('table_name') partioned with partman with a child table per month (@Adrian Klaver, yes it's https://github.com/keithf4/pg_partman) A remote deamon that calls an analyze every day on the table 'table_name' A remote application that calls multiple stored procedures doing an insert operation on 'table_name' (785 inserts each 30s), with no delete, no update operation This stored procedures are sometimes blocked by a short lock (they are doing some update operations on a cache table of 'table_name') It happened that : The current child table has no statistics : - SELECT * FROM pg_stat WHERE tablename = '$table_name_month'; => empty row - SELECT * FROM pg_stat_user_tables WHERE relname = '$table_name_month'; => statistics are good (coherent with values I should have) but I saw no value for last_autoanalyze or last_analyze (even if the test was running for a few days and the table start with no data and grow to 6 million rows) The behaviour of the auto-analyze I have observed : Insert operations on a table trigger auto-analyze (with track_counts=on). Stored procedure on a table trigger auto-analyze (even if track_functions=off) Insert operations on a partitioned table trigger auto-analyze. Insert operations on a daily analyze table may not trigger auto-analyze, because an analyze reset the count of operation for auto-analyze. An analyze on a partitioned table update statistics of the parent table but not for children tables. An analyze on a partitioned table do not reset the count of operations for auto-analyze of children tables. I may not be able yet to explain the reason why the auto-analyze is rarely not triggered. But at least I found a way to fix pg_stat of the current table. @Bill Moran, Are you sure that a dump of the table could help you ? The values of the table are correct, the main issue is that statistics are becoming invalid and so postgres is choosing dummy request paths. Bertrand
pgsql-general by date: