Thread: How am I supposed to fix this?
I'm getting the below, and am unaware of how to fix it.... 11.4 on FreeBSD 12. ler=# reindex (verbose) table dns_query ; INFO: index "dns_query_pkey" was reindexed DETAIL: CPU: user: 114.29 s, system: 207.94 s, elapsed: 698.87 s ERROR: index "pg_toast_17760_index" contains unexpected zero page at block 23686 HINT: Please REINDEX it. CONTEXT: parallel worker ler=# reindex index pg_toast_17760_index; ERROR: relation "pg_toast_17760_index" does not exist ler=# reindex (verbose) database ler; INFO: index "pg_class_oid_index" was reindexed DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s INFO: index "pg_class_relname_nsp_index" was reindexed DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s INFO: index "pg_class_tblspc_relfilenode_index" was reindexed DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s INFO: table "pg_catalog.pg_class" was reindexed load: 14.53 cmd: psql 2675 [select] 2765.27r 0.01u 0.01s 0% 8292k INFO: index "dns_query_pkey" was reindexed DETAIL: CPU: user: 112.91 s, system: 205.51 s, elapsed: 688.28 s ERROR: index "pg_toast_17760_index" contains unexpected zero page at block 23686 HINT: Please REINDEX it. ler=# ler=# select version(); version ----------------------------------------------------------------------------------------------------------------------------------------------------- PostgreSQL 11.4 on amd64-portbld-freebsd12.0, compiled by FreeBSD clang version 8.0.0 (tags/RELEASE_800/final 356365) (based on LLVM 8.0.0), 64-bit (1 row) ler=# -- Larry Rosenman http://www.lerctr.org/~ler Phone: +1 214-642-9640 E-Mail: ler@lerctr.org US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106
On Tue, Aug 06, 2019 at 12:06:45PM -0500, Larry Rosenman wrote: >I'm getting the below, and am unaware of how to fix it.... > >11.4 on FreeBSD 12. > > > >ler=# reindex (verbose) table dns_query ; >INFO: index "dns_query_pkey" was reindexed >DETAIL: CPU: user: 114.29 s, system: 207.94 s, elapsed: 698.87 s >ERROR: index "pg_toast_17760_index" contains unexpected zero page at >block 23686 >HINT: Please REINDEX it. >CONTEXT: parallel worker >ler=# reindex index pg_toast_17760_index; >ERROR: relation "pg_toast_17760_index" does not exist You probably need to explicitly say pg_toast.pg_toast_17760_index here, because pg_toast schema is not part of the search_path by default. >ler=# reindex (verbose) database ler; >INFO: index "pg_class_oid_index" was reindexed >DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s >INFO: index "pg_class_relname_nsp_index" was reindexed >DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s >INFO: index "pg_class_tblspc_relfilenode_index" was reindexed >DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s >INFO: table "pg_catalog.pg_class" was reindexed >load: 14.53 cmd: psql 2675 [select] 2765.27r 0.01u 0.01s 0% 8292k >INFO: index "dns_query_pkey" was reindexed >DETAIL: CPU: user: 112.91 s, system: 205.51 s, elapsed: 688.28 s >ERROR: index "pg_toast_17760_index" contains unexpected zero page at >block 23686 >HINT: Please REINDEX it. >ler=# > Assuming the toast index is corrupted, this is kinda expected (when trying to reindex an index on the toasted data). The question is how much other data corruption is there ... regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2019-Aug-06, Larry Rosenman wrote: > ler=# reindex index pg_toast_17760_index; > ERROR: relation "pg_toast_17760_index" does not exist Maybe try "reindex index pg_toast.pg_toast_17760_index" > ler=# reindex (verbose) database ler; [...] > ERROR: index "pg_toast_17760_index" contains unexpected zero page at block > 23686 > HINT: Please REINDEX it. I suspect REINDEX is trying to access that index for some other reason than reindexing it. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Aug 6, 2019 at 10:19 AM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: > The question is how much other data corruption is there ... Larry could try running amcheck on the other indexes. Just the basic bt_check_index() checks should be enough to detect problems like this. They can be run fairly non-disruptively. Something like this should do it: SELECT bt_index_check(index => c.oid), c.relname, c.relpages FROM pg_index i JOIN pg_opclass op ON i.indclass[0] = op.oid JOIN pg_am am ON op.opcmethod = am.oid JOIN pg_class c ON i.indexrelid = c.oid JOIN pg_namespace n ON c.relnamespace = n.oid WHERE am.amname = 'btree' -- Don't check temp tables, which may be from another session: AND c.relpersistence != 't' -- Function may throw an error when this is omitted: AND c.relkind = 'i' AND i.indisready AND i.indisvalid ORDER BY c.relpages DESC; If this takes too long, you can always adjust the query to only verify system indexes or TOAST indexes. -- Peter Geoghegan
On 08/06/2019 12:30 pm, Peter Geoghegan wrote: > On Tue, Aug 6, 2019 at 10:19 AM Tomas Vondra > <tomas.vondra@2ndquadrant.com> wrote: >> The question is how much other data corruption is there ... > > Larry could try running amcheck on the other indexes. Just the basic > bt_check_index() checks should be enough to detect problems like this. > They can be run fairly non-disruptively. Something like this should do > it: > > SELECT bt_index_check(index => c.oid), > c.relname, > c.relpages > FROM pg_index i > JOIN pg_opclass op ON i.indclass[0] = op.oid > JOIN pg_am am ON op.opcmethod = am.oid > JOIN pg_class c ON i.indexrelid = c.oid > JOIN pg_namespace n ON c.relnamespace = n.oid > WHERE am.amname = 'btree' > -- Don't check temp tables, which may be from another session: > AND c.relpersistence != 't' > -- Function may throw an error when this is omitted: > AND c.relkind = 'i' AND i.indisready AND i.indisvalid > ORDER BY c.relpages DESC; > > If this takes too long, you can always adjust the query to only verify > system indexes or TOAST indexes. ler=# SELECT bt_index_check(index => c.oid), ler-# c.relname, ler-# c.relpages ler-# FROM pg_index i ler-# JOIN pg_opclass op ON i.indclass[0] = op.oid ler-# JOIN pg_am am ON op.opcmethod = am.oid ler-# JOIN pg_class c ON i.indexrelid = c.oid ler-# JOIN pg_namespace n ON c.relnamespace = n.oid ler-# WHERE am.amname = 'btree' ler-# -- Don't check temp tables, which may be from another session: ler-# AND c.relpersistence != 't' ler-# -- Function may throw an error when this is omitted: ler-# AND c.relkind = 'i' AND i.indisready AND i.indisvalid ler-# ORDER BY c.relpages DESC; ERROR: function bt_index_check(index => oid) does not exist LINE 1: SELECT bt_index_check(index => c.oid), ^ HINT: No function matches the given name and argument types. You might need to add explicit type casts. ler=# -- Larry Rosenman http://www.lerctr.org/~ler Phone: +1 214-642-9640 E-Mail: ler@lerctr.org US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106
On Tue, Aug 6, 2019 at 10:34 AM Larry Rosenman <ler@lerctr.org> wrote: > ERROR: function bt_index_check(index => oid) does not exist > LINE 1: SELECT bt_index_check(index => c.oid), > ^ > HINT: No function matches the given name and argument types. You might > need to add explicit type casts. It's a contrib extension, so you have to "create extension amcheck" first. -- Peter Geoghegan
On 08/06/2019 12:35 pm, Peter Geoghegan wrote: > On Tue, Aug 6, 2019 at 10:34 AM Larry Rosenman <ler@lerctr.org> wrote: >> ERROR: function bt_index_check(index => oid) does not exist >> LINE 1: SELECT bt_index_check(index => c.oid), >> ^ >> HINT: No function matches the given name and argument types. You >> might >> need to add explicit type casts. > > It's a contrib extension, so you have to "create extension amcheck" > first. the check is running (this is a HUGE table). For the initial error, it would be nice if: 1) the pg_toast schema was mentioned or 2) reindex searched pg_toast as well. I did do the reindex pg_toast. index. -- Larry Rosenman http://www.lerctr.org/~ler Phone: +1 214-642-9640 E-Mail: ler@lerctr.org US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106
On 08/06/2019 12:45 pm, Larry Rosenman wrote: > On 08/06/2019 12:35 pm, Peter Geoghegan wrote: >> On Tue, Aug 6, 2019 at 10:34 AM Larry Rosenman <ler@lerctr.org> wrote: >>> ERROR: function bt_index_check(index => oid) does not exist >>> LINE 1: SELECT bt_index_check(index => c.oid), >>> ^ >>> HINT: No function matches the given name and argument types. You >>> might >>> need to add explicit type casts. >> >> It's a contrib extension, so you have to "create extension amcheck" >> first. > > > the check is running (this is a HUGE table). > > For the initial error, it would be nice if: > 1) the pg_toast schema was mentioned > or > 2) reindex searched pg_toast as well. > > I did do the reindex pg_toast. index. As a followup, btcheck found another index that had issues, and a toast table was missing a chunk. I have ALL the data I used to create this table still around so I just dropped it and am reloading the data. I still think that the error message should mention the fully qualified index name. -- Larry Rosenman http://www.lerctr.org/~ler Phone: +1 214-642-9640 E-Mail: ler@lerctr.org US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106
On Tue, Aug 6, 2019 at 11:11 AM Larry Rosenman <ler@lerctr.org> wrote: > As a followup, btcheck found another index that had issues, and a toast > table was missing a chunk. > > I have ALL the data I used to create this table still around so I just > dropped it and am reloading the data. It sounds like there is a generic storage issue at play here. Often TOAST data is the apparent first thing that gets corrupted, because that's only because the inconsistencies are relatively obvious. I suggest that you rerun amcheck using the same query, though this time specify "heapallindexed=true" to bt_check_index(). Increase maintenance_work_mem if it's set to a low value first (ideally you can crank it up to 600MB). This type of verification will take a lot longer, but will find more subtle inconsistencies that could easily be missed. Please let us know how this goes. I am always keen to hear about how much the tooling helps in the real world. -- Peter Geoghegan
On 08/06/2019 1:16 pm, Peter Geoghegan wrote: > On Tue, Aug 6, 2019 at 11:11 AM Larry Rosenman <ler@lerctr.org> wrote: >> As a followup, btcheck found another index that had issues, and a >> toast >> table was missing a chunk. >> >> I have ALL the data I used to create this table still around so I just >> dropped it and am reloading the data. > > It sounds like there is a generic storage issue at play here. Often > TOAST data is the apparent first thing that gets corrupted, because > that's only because the inconsistencies are relatively obvious. > > I suggest that you rerun amcheck using the same query, though this > time specify "heapallindexed=true" to bt_check_index(). Increase > maintenance_work_mem if it's set to a low value first (ideally you can > crank it up to 600MB). This type of verification will take a lot > longer, but will find more subtle inconsistencies that could easily be > missed. > > Please let us know how this goes. I am always keen to hear about how > much the tooling helps in the real world. I've already dropped and re-created the table involved (a metric crapton of DNS queries). I know why and how this happened as well. I had to fully restore my system, and bacula didn't catch all the data etc since it was being modified, and I didn't do the smart thing then and restore from a pg_dump. -- Larry Rosenman http://www.lerctr.org/~ler Phone: +1 214-642-9640 E-Mail: ler@lerctr.org US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106