Re: PG12 autovac issues - Mailing list pgsql-general

From Justin King
Subject Re: PG12 autovac issues
Date
Msg-id CAE39h23X=09_PBvyvC3yy1MBVu6pkLuDj9KpaQUzoSmi_2+iKw@mail.gmail.com
Whole thread Raw
In response to Re: PG12 autovac issues  (Justin King <kingpin867@gmail.com>)
List pgsql-general
On Mon, Mar 23, 2020 at 4:31 PM Justin King <kingpin867@gmail.com> wrote:
>
> On Mon, Mar 23, 2020 at 3:00 PM Andres Freund <andres@anarazel.de> wrote:
> >
> > Hi,
> >
> > On 2020-03-23 20:47:25 +0100, Julien Rouhaud wrote:
> > > > - relfrozenxid, age(relfrozenxid) for the oldest table in the oldest database
> > > >   SELECT oid::regclass, age(relfrozenxid), relfrozenxid FROM pg_class WHERE relfrozenxid <> 0 ORDER BY
age(relfrozenxid)DESC LIMIT 1;
 
> > >
> > > The vm has been trashed since, and I don't have that level of detail available
> > > in the gathered stats unfortunately (and the available information I have is a
> > > little bit messy, sorry for that).
> > >
> > > > - Oldest backend xmin
> > > >   SELECT pid, backend_xmin, age(backend_xmin) FROM pg_stat_activity WHERE backend_xmin <> 0 ORDER BY
age(backend_xmin)DESC LIMIT 3;
 
> > > > - oldest replication xmin:
> > > >   SELECT pid, backend_xmin, age(backend_xmin) FROM pg_stat_replication WHERE backend_xmin <> 0 ORDER BY
age(backend_xmin)DESC LIMIT 3;
 
> > > > - oldest slot xmin:
> > > >   SELECT slot_name, xmin, age(xmin), catalog_xmin, age(catalog_xmin) FROM pg_replication_slots WHERE xmin <> 0
ORcatalog_xmin <> 0 ORDER BY greatest(age(xmin), age(catalog_xmin)) DESC LIMIT 3;
 
> > > > - oldest prepared transaction (unfortunately xmin not available)
> > > >   SELECT gid, database, transaction FROM pg_prepared_xacts ORDER BY age(transaction) LIMIT 3;
> > >
> > > I have output from this query:
https://github.com/OPMDG/check_pgactivity/blob/oldest_xmin/check_pgactivity#L5669-L5732
> > > (which may be dumb).  Here are the non-null rows after 8AM GMT+1:
> >
> > Could you share what the config of the server was?
> >
> >
> > > > > The same bench was run against pg11 many times and never triggered this issue.
> > > > > So far our best guess is a side effect of 2aa6e331ead7.
> > > >
> > > > In that case you'd likely see DEBUG1 output, right? Did you try running
> > > > with that?
> > >
> > > That's unfortunately not an option, as while the issue is reproducible, it
> > > happens after roughly 16h with an average of 6.5ktps, and we definitely don't
> > > have enough disk space for that amount of logs.  We'll try next time with a
> > > recompiled version with the DEBUG1 promoted to WARNING, and maybe other elog.
> >
> > You wouldn't need to run it with DEBUG1 the whole time, you could just
> > change the config and reload config once you hit the issue.
> >
> >
> > > > > Michael and I have been trying to reproduce this issue locally (drastically
> > > > > reducing the various freeze_age parameters) for hours, but no luck for now.
> > > >
> > > > Have you considered using gdb to investigate?
> > >
> > > Not for now, because it's hard to be around for the ~ 90min interval the system
> > > stays alive after atuvacuum get stuck, especially since it's a weekend bench.
> > >
> > > > > This is using a vanilla pg 12.1, with some OLTP workload.  The only possibly
> > > > > relevant configuration changes are quite aggressive autovacuum settings on some
> > > > > tables (no delay, analyze/vacuum threshold to 1k and analyze/vacuum scale
> > > > > factor to 0, for both heap and toast).
> > > >
> > > > That, uh, is not an insignificant set of changes for an autovac
> > > > scheduling issues.
> > > >
> > > > It sounds like what might be happening is that you have something
> > > > holding back the "oldest needed transaction" horizon. Before
> > > > 2aa6e331ead7, if the xmin horizon hasn't increased, there'd be repeated
> > > > vacuums unable to increase the horizon, but afterwards they'll all
> > > > immediately exit without any messages.
> > >
> > > The xmin horizon is increasing AFAICS.  Grepping the autovacuum logs, I can see
> > > for the 08:00 AM UTC - 08:59 AM UTC interval the oldest xmin going from
> > > 200069684 to 210363848.  E.g.:
> >
> > Which database is this on?
> >
> >
> > > [...]
> > > 266603:2020-03-14 08:49:59.927 UTC LOG:  automatic aggressive vacuum to prevent wraparound of table
"pg_toast.pg_toast_27683":index scans: 0
 
> > > 266604-       pages: 0 removed, 0 remain, 0 skipped due to pins, 0 skipped frozen
> > > 266605-       tuples: 0 removed, 0 remain, 0 are dead but not yet removable, oldest xmin: 209635921
> > > 266606-       buffer usage: 25 hits, 1 misses, 1 dirtied
> > > 266607-       avg read rate: 15.440 MB/s, avg write rate: 15.440 MB/s
> > > 266608-       system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
> > > 266609:2020-03-14 08:49:59.929 UTC LOG:  automatic aggressive vacuum to prevent wraparound of table
"pg_toast.pg_toast_27694":index scans: 0
 
> > > 266610-       pages: 0 removed, 0 remain, 0 skipped due to pins, 0 skipped frozen
> > > 266611-       tuples: 0 removed, 0 remain, 0 are dead but not yet removable, oldest xmin: 209635934
> > > 266612-       buffer usage: 25 hits, 1 misses, 1 dirtied
> > > 266613-       avg read rate: 21.058 MB/s, avg write rate: 21.058 MB/s
> > > 266614-       system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
> > > 266615:2020-03-14 08:49:59.931 UTC LOG:  automatic aggressive vacuum to prevent wraparound of table
"pg_toast.pg_toast_24845":index scans: 0
 
> > > 266616-       pages: 0 removed, 0 remain, 0 skipped due to pins, 0 skipped frozen
> > > 266617-       tuples: 0 removed, 0 remain, 0 are dead but not yet removable, oldest xmin: 209635934
> > > 266618-       buffer usage: 25 hits, 1 misses, 2 dirtied
> > > 266619-       avg read rate: 8.005 MB/s, avg write rate: 16.009 MB/s
> > > 266620-       system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
> > > 266621:2020-03-14 08:49:59.945 UTC LOG:  automatic aggressive vacuum to prevent wraparound of table
"pg_toast.pg_toast_27712":index scans: 0
 
> > > 266622-       pages: 0 removed, 0 remain, 0 skipped due to pins, 0 skipped frozen
> > > 266623-       tuples: 0 removed, 0 remain, 0 are dead but not yet removable, oldest xmin: 209635953
> > > [...]
> >
> > Do you have any non-toast ones?
>
> This is occurring in our environment right now (started about 30 min
> ago).  Here 's the latest logs (grepped by vacuum):
>
> Mar 23 20:54:16 cowtn postgres[15569]: [12-1] 2020-03-23 20:54:16.542
> GMT [15569] LOG:  automatic vacuum of table "feedi.production.tita":
> index scans: 1
> Mar 23 20:54:27 cowtn postgres[15654]: [8-1] 2020-03-23 20:54:27.964
> GMT [15654] LOG:  automatic vacuum of table
> "feedi.production.distributed_virtual_schedule": index scans: 1
> Mar 23 20:54:29 cowtn postgres[15654]: [10-1] 2020-03-23 20:54:29.103
> GMT [15654] LOG:  automatic vacuum of table "feedi.production.tita":
> index scans: 1
> Mar 23 20:54:29 cowtn postgres[15654]: [12-1] 2020-03-23 20:54:29.284
> GMT [15654] LOG:  automatic vacuum of table
> "feedi.pg_toast.pg_toast_2619": index scans: 1
> Mar 23 20:54:44 cowtn postgres[15759]: [8-1] 2020-03-23 20:54:44.005
> GMT [15759] LOG:  automatic vacuum of table
> "feedi.production.flightplans": index scans: 1
> Mar 23 20:54:45 cowtn postgres[15759]: [11-1] 2020-03-23 20:54:45.979
> GMT [15759] LOG:  automatic vacuum of table "feedi.production.tita":
> index scans: 1
> Mar 23 20:54:54 cowtn postgres[15814]: [7-1] 2020-03-23 20:54:54.367
> GMT [15814] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_statistic": index scans: 0
> Mar 23 20:54:54 cowtn postgres[15814]: [8-1] 2020-03-23 20:54:54.378
> GMT [15814] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_type": index scans: 0
> Mar 23 20:54:54 cowtn postgres[15814]: [9-1] 2020-03-23 20:54:54.389
> GMT [15814] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_foreign_server": index scans: 0
> Mar 23 20:54:54 cowtn postgres[15814]: [10-1] 2020-03-23 20:54:54.410
> GMT [15814] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_statistic_ext_data": index scans: 0
> Mar 23 20:54:54 cowtn postgres[15814]: [11-1] 2020-03-23 20:54:54.420
> GMT [15814] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_user_mapping": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [7-1] 2020-03-23 20:55:09.194
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_attribute": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [8-1] 2020-03-23 20:55:09.205
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_proc": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [9-1] 2020-03-23 20:55:09.216
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_class": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [10-1] 2020-03-23 20:55:09.227
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_attrdef": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [11-1] 2020-03-23 20:55:09.239
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_constraint": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [12-1] 2020-03-23 20:55:09.250
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_inherits": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [13-1] 2020-03-23 20:55:09.260
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_index": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [14-1] 2020-03-23 20:55:09.272
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_operator": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [15-1] 2020-03-23 20:55:09.283
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_opfamily": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [16-1] 2020-03-23 20:55:09.294
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_opclass": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [17-1] 2020-03-23 20:55:09.305
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_am": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [18-1] 2020-03-23 20:55:09.317
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_amop": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [19-1] 2020-03-23 20:55:09.328
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_amproc": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [20-1] 2020-03-23 20:55:09.339
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_language": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [21-1] 2020-03-23 20:55:09.349
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_largeobject_metadata": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [22-1] 2020-03-23 20:55:09.360
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_aggregate": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [23-1] 2020-03-23 20:55:09.371
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_largeobject": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [24-1] 2020-03-23 20:55:09.382
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_statistic_ext": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [25-1] 2020-03-23 20:55:09.393
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_rewrite": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [26-1] 2020-03-23 20:55:09.404
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_trigger": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [27-1] 2020-03-23 20:55:09.415
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_event_trigger": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [28-1] 2020-03-23 20:55:09.426
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_description": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [29-1] 2020-03-23 20:55:09.437
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_cast": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [30-1] 2020-03-23 20:55:09.448
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_enum": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [31-1] 2020-03-23 20:55:09.459
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_namespace": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [32-1] 2020-03-23 20:55:09.470
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_conversion": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [33-1] 2020-03-23 20:55:09.481
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_depend": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [34-1] 2020-03-23 20:55:09.502
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_db_role_setting": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [35-1] 2020-03-23 20:55:09.524
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_pltemplate": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [36-1] 2020-03-23 20:55:09.545
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_shdepend": index scans: 1
> Mar 23 20:55:09 cowtn postgres[15890]: [37-1] 2020-03-23 20:55:09.556
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_shdescription": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [38-1] 2020-03-23 20:55:09.567
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_ts_config": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [39-1] 2020-03-23 20:55:09.578
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_ts_config_map": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [40-1] 2020-03-23 20:55:09.589
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_ts_dict": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [41-1] 2020-03-23 20:55:09.600
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_ts_parser": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [42-1] 2020-03-23 20:55:09.611
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_ts_template": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [43-1] 2020-03-23 20:55:09.622
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_extension": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [44-1] 2020-03-23 20:55:09.633
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_foreign_data_wrapper": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [45-1] 2020-03-23 20:55:09.643
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_foreign_table": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [46-1] 2020-03-23 20:55:09.654
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_policy": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [47-1] 2020-03-23 20:55:09.675
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_default_acl": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [48-1] 2020-03-23 20:55:09.686
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_init_privs": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [49-1] 2020-03-23 20:55:09.696
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_seclabel": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [50-1] 2020-03-23 20:55:09.719
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_collation": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [51-1] 2020-03-23 20:55:09.729
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_partitioned_table": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [52-1] 2020-03-23 20:55:09.740
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_range": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [53-1] 2020-03-23 20:55:09.751
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_transform": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [54-1] 2020-03-23 20:55:09.762
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_sequence": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [55-1] 2020-03-23 20:55:09.772
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_publication": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [56-1] 2020-03-23 20:55:09.783
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_publication_rel": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [57-1] 2020-03-23 20:55:09.794
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_catalog.pg_subscription_rel": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [58-1] 2020-03-23 20:55:09.805
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.information_schema.sql_packages": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [59-1] 2020-03-23 20:55:09.816
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.information_schema.sql_features": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [60-1] 2020-03-23 20:55:09.827
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.information_schema.sql_implementation_info": index
> scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [61-1] 2020-03-23 20:55:09.838
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.information_schema.sql_parts": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [62-1] 2020-03-23 20:55:09.849
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.information_schema.sql_languages": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [63-1] 2020-03-23 20:55:09.859
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.information_schema.sql_sizing": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [64-1] 2020-03-23 20:55:09.870
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.information_schema.sql_sizing_profiles": index scans:
> 0
> Mar 23 20:55:09 cowtn postgres[15890]: [65-1] 2020-03-23 20:55:09.880
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_2600": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [66-1] 2020-03-23 20:55:09.891
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_2604": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [67-1] 2020-03-23 20:55:09.902
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_3456": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [68-1] 2020-03-23 20:55:09.912
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_2606": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [69-1] 2020-03-23 20:55:09.923
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_826": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [70-1] 2020-03-23 20:55:09.933
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_2609": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [71-1] 2020-03-23 20:55:09.944
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_3466": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [72-1] 2020-03-23 20:55:09.955
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_3079": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [73-1] 2020-03-23 20:55:09.965
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_2328": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [74-1] 2020-03-23 20:55:09.976
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_1417": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [75-1] 2020-03-23 20:55:09.987
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_3118": index scans: 0
> Mar 23 20:55:09 cowtn postgres[15890]: [76-1] 2020-03-23 20:55:09.997
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_3394": index scans: 0
> Mar 23 20:55:10 cowtn postgres[15890]: [77-1] 2020-03-23 20:55:10.008
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_2612": index scans: 0
> Mar 23 20:55:10 cowtn postgres[15890]: [78-1] 2020-03-23 20:55:10.018
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_2615": index scans: 0
> Mar 23 20:55:10 cowtn postgres[15890]: [79-1] 2020-03-23 20:55:10.029
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_3350": index scans: 0
> Mar 23 20:55:10 cowtn postgres[15890]: [80-1] 2020-03-23 20:55:10.040
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_3256": index scans: 0
> Mar 23 20:55:10 cowtn postgres[15890]: [81-1] 2020-03-23 20:55:10.050
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_1255": index scans: 0
> Mar 23 20:55:10 cowtn postgres[15890]: [82-1] 2020-03-23 20:55:10.062
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_2618": index scans: 0
> Mar 23 20:55:10 cowtn postgres[15890]: [83-1] 2020-03-23 20:55:10.072
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_3596": index scans: 0
> Mar 23 20:55:10 cowtn postgres[15890]: [84-1] 2020-03-23 20:55:10.083
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_2619": index scans: 0
> Mar 23 20:55:10 cowtn postgres[15890]: [85-1] 2020-03-23 20:55:10.094
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_3381": index scans: 0
> Mar 23 20:55:10 cowtn postgres[15890]: [86-1] 2020-03-23 20:55:10.105
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_3429": index scans: 0
> Mar 23 20:55:10 cowtn postgres[15890]: [87-1] 2020-03-23 20:55:10.115
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_2620": index scans: 0
> Mar 23 20:55:10 cowtn postgres[15890]: [88-1] 2020-03-23 20:55:10.126
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_3600": index scans: 0
> Mar 23 20:55:10 cowtn postgres[15890]: [89-1] 2020-03-23 20:55:10.137
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_1247": index scans: 0
> Mar 23 20:55:10 cowtn postgres[15890]: [90-1] 2020-03-23 20:55:10.147
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_1418": index scans: 0
> Mar 23 20:55:10 cowtn postgres[15890]: [91-1] 2020-03-23 20:55:10.158
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_1260": index scans: 0
> Mar 23 20:55:10 cowtn postgres[15890]: [92-1] 2020-03-23 20:55:10.169
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_1262": index scans: 0
> Mar 23 20:55:10 cowtn postgres[15890]: [93-1] 2020-03-23 20:55:10.179
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_2964": index scans: 0
> Mar 23 20:55:10 cowtn postgres[15890]: [94-1] 2020-03-23 20:55:10.190
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_1136": index scans: 0
> Mar 23 20:55:10 cowtn postgres[15890]: [95-1] 2020-03-23 20:55:10.201
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_6000": index scans: 0
> Mar 23 20:55:10 cowtn postgres[15890]: [96-1] 2020-03-23 20:55:10.211
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_2396": index scans: 0
> Mar 23 20:55:10 cowtn postgres[15890]: [97-1] 2020-03-23 20:55:10.222
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_3592": index scans: 0
> Mar 23 20:55:10 cowtn postgres[15890]: [98-1] 2020-03-23 20:55:10.232
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_6100": index scans: 0
> Mar 23 20:55:10 cowtn postgres[15890]: [99-1] 2020-03-23 20:55:10.243
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_1213": index scans: 0
> Mar 23 20:55:10 cowtn postgres[15890]: [100-1] 2020-03-23 20:55:10.254
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_13267": index scans: 0
> Mar 23 20:55:10 cowtn postgres[15890]: [101-1] 2020-03-23 20:55:10.264
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_13272": index scans: 0
> Mar 23 20:55:10 cowtn postgres[15890]: [102-1] 2020-03-23 20:55:10.275
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_13287": index scans: 0
> Mar 23 20:55:10 cowtn postgres[15890]: [103-1] 2020-03-23 20:55:10.286
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_13277": index scans: 0
> Mar 23 20:55:10 cowtn postgres[15890]: [104-1] 2020-03-23 20:55:10.296
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_13282": index scans: 0
> Mar 23 20:55:10 cowtn postgres[15890]: [105-1] 2020-03-23 20:55:10.307
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_13292": index scans: 0
> Mar 23 20:55:10 cowtn postgres[15890]: [106-1] 2020-03-23 20:55:10.317
> GMT [15890] LOG:  automatic aggressive vacuum to prevent wraparound of
> table "postgres.pg_toast.pg_toast_13297": index scans: 0
>
> >
> >
> > > > I wonder if what might be happening is that we're somehow missed/failed
> > > > to update relfrozenxid and/or datfrozenxid. If you manually vacuum some
> > > > table in the oldest database, but that is *NOT* the oldest table itself,
> > > > does the problem "resolve" itself?

Andres-

I did confirm that if you run a manual vacuum on a table in the oldest
database but NOT the oldest table, the problem does resolve itself.
Here is the before and after:

postgres=# SELECT datname
    , age(datfrozenxid)
    , current_setting('autovacuum_freeze_max_age')
FROM pg_database;
  datname  |    age    | current_setting
-----------+-----------+-----------------
 postgres  | 202375735 | 200000000
 template1 | 202345459 | 200000000
 template0 | 132459914 | 200000000
 feedi     | 132459914 | 200000000
(4 rows)

postgres=#  SELECT c.oid::regclass
    , age(c.relfrozenxid)
    , pg_size_pretty(pg_total_relation_size(c.oid))
    , age(c.relfrozenxid)/200000000::float AS pct
FROM pg_class c
JOIN pg_namespace n on c.relnamespace = n.oid
WHERE relkind IN ('r', 't', 'm')
ORDER BY 2 DESC LIMIT 10;
          oid          |    age    | pg_size_pretty |     pct
-----------------------+-----------+----------------+-------------
 pg_subscription       | 203640681 | 24 kB          | 1.018203405
 pg_authid             | 203640681 | 80 kB          | 1.018203405
 pg_tablespace         | 203620449 | 80 kB          | 1.018102245
 pg_shseclabel         | 203620449 | 16 kB          | 1.018102245
 pg_auth_members       | 203620449 | 72 kB          | 1.018102245
 pg_database           | 203620449 | 80 kB          | 1.018102245
 pg_replication_origin | 203620449 | 24 kB          | 1.018102245
 pg_statistic          |  53646423 | 456 kB         | 0.268232115
 pg_statistic_ext_data |  53646423 | 16 kB          | 0.268232115
 pg_user_mapping       |  53646423 | 24 kB          | 0.268232115
(10 rows)

postgres=# vacuum FREEZE VERBOSE ANALYZE pg_statistic;
INFO:  aggressively vacuuming "pg_catalog.pg_statistic"
INFO:  "pg_statistic": found 0 removable, 18 nonremovable row versions
in 1 out of 38 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 39602876
There were 18 unused item identifiers.
Skipped 0 pages due to buffer pins, 37 frozen pages.
0 pages are entirely empty.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  aggressively vacuuming "pg_toast.pg_toast_2619"
INFO:  "pg_toast_2619": found 0 removable, 13 nonremovable row
versions in 4 out of 4 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 39602876
There were 12 unused item identifiers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
VACUUM

postgres=# SELECT datname
    , age(datfrozenxid)
    , current_setting('autovacuum_freeze_max_age')
FROM pg_database;
  datname  |    age    | current_setting
-----------+-----------+-----------------
 postgres  |  54640707 | 200000000
 template1 |     38248 | 200000000
 template0 | 134719144 | 200000000
 feedi     | 134719144 | 200000000
(4 rows)

Immediately after running this, all the autovac's starting working
again.  I've attached logfiles that might be easier to read.

We are going to try and recreate the issue, in the meantime, I've
turned on DEBUG1 and when it reoccurs, I'll run the queries in the
"What Is" section as well.

I also have a backtrace of the autovac launcher, but there appears to
be nothing useful at all there (as you'd probably expect) and there
were no autovac workers running.

Thank you again-
Justin

>
> postgres=# SELECT datname
>     , age(datfrozenxid)
>     , current_setting('autovacuum_freeze_max_age')
> FROM pg_database;
>   datname  |    age    | current_setting
> -----------+-----------+-----------------
>  postgres  | 202375735 | 200000000
>  template1 | 202345459 | 200000000
>  template0 | 132459914 | 200000000
>  feedi     | 132459914 | 200000000
> (4 rows)
>
>
> > >
> > > I'll also add pg_class snapshot for next time we run the bench.  I'm not sure
> > > if we'll be able to catch the 1h45 interval when the system stays alive
> > > after the issue though.
> >
> > Could you just script something to stop the benchmark once the disk is
> > 90% full or so?
> >
> > Did you see any errors / fatals around the time autovacuum stopped
> > working?
>
> There are no ERROR or FATAL messages around the time -- (or at all).
>
> Since this is occurring right now, what else would be useful to
> capture?  You'd asked about a GDB -- do you want that of the main
> process or the autovac worker?
>
> >
> > Greetings,
> >
> > Andres Freund

Attachment

pgsql-general by date:

Previous
From: Andres Freund
Date:
Subject: Re: PG12 autovac issues
Next
From: pinker
Date:
Subject: Re: Loading 500m json files to database