Re: ERROR: unexpected chunk number 452 (expected 0) for toast value94674063 in pg_toast_56980977 - Mailing list pgsql-admin

From Alvaro Herrera
Subject Re: ERROR: unexpected chunk number 452 (expected 0) for toast value94674063 in pg_toast_56980977
Date
Msg-id 20180718195850.axhzokgbggoxpcmr@alvherre.pgsql
Whole thread Raw
In response to ERROR: unexpected chunk number 452 (expected 0) for toast value94674063 in pg_toast_56980977  (Naveen Kumar <naveenchowdaryon@gmail.com>)
Responses Re: ERROR: unexpected chunk number 452 (expected 0) for toast value 94674063 in pg_toast_56980977  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-admin
On 2018-Jul-18, Naveen Kumar wrote:

> Hello Experts,
> 
> What could be the reason for bellow errors. Can one help me to fix this?
> 
> PostgreSQL Database: [~rptdb~] Data Backup Failed with PostgreSQL Error:
> [~pg_dump: Dumping the contents of table "document" failed: PQgetResult()
> failed.pg_dump: Error message from server: ERROR: unexpected chunk number
> 452 (expected 0) for toast value 94674063 in pg_toast_56980977pg_dump: The
> command was: COPY reports_extended.document (document_id, access_key,
> created_date, document_name, document_size, document_status, document_type,
> encryption_type, external_system_storage_id, external_system_storage_url,
> last_updated_date, md5_hash, source_system_name, storage_type, created_by,
> customer_org_id, content_type, tags, file, ownedby) TO stdout;~].

There are two related bugfixes in 9.6.9 (and all other versions released
with it).  Maybe you'd do well to upgrade to the latest minor of
whatever supported branch you're using, and then let's discuss more.  If
you were victim to this bug I'm not sure there's direct mitigation, or
you'll need to UPDATE the offending rows to set the columns to null (or
some other value).  Eight years ago (!) I wrote a function to scan for
those, quoted in this blog post (sorry about my Spanish):
  https://alvherre.livejournal.com/4404.html
May be helpful.  (I think it needs a minor fix to run in current
releases.)

Author: Tom Lane <tgl@sss.pgh.pa.us>
Branch: master Release: REL_11_BR [d1e907929] 2018-04-11 18:11:29 -0400
Branch: REL_10_STABLE Release: REL_10_4 [08e6cda1c] 2018-04-11 18:11:29 -0400
Branch: REL9_6_STABLE Release: REL9_6_9 [060bb38d0] 2018-04-11 18:11:30 -0400
Branch: REL9_5_STABLE Release: REL9_5_13 [efbe36a2c] 2018-04-11 18:11:30 -0400
Branch: REL9_4_STABLE Release: REL9_4_18 [6943fb927] 2018-04-11 18:11:30 -0400
Branch: REL9_3_STABLE Release: REL9_3_23 [66d4b6bb8] 2018-04-11 18:11:30 -0400

    Ignore nextOid when replaying an ONLINE checkpoint.
    
    The nextOid value is from the start of the checkpoint and may well be stale
    compared to values from more recent XLOG_NEXTOID records.  Previously, we
    adopted it anyway, allowing the OID counter to go backwards during a crash.
    While this should be harmless, it contributed to the severity of the bug
    fixed in commit 0408e1ed5, by allowing duplicate TOAST OIDs to be assigned
    immediately following a crash.  Without this error, that issue would only
    have arisen when TOAST objects just younger than a multiple of 2^32 OIDs
    were deleted and then not vacuumed in time to avoid a conflict.
    
    Pavan Deolasee
    
    Discussion: https://postgr.es/m/CABOikdOgWT2hHkYG3Wwo2cyZJq2zfs1FH0FgX-=h4OLosXHf9w@mail.gmail.com

Author: Tom Lane <tgl@sss.pgh.pa.us>
Branch: master Release: REL_11_BR [0408e1ed5] 2018-04-11 17:41:22 -0400
Branch: REL_10_STABLE Release: REL_10_4 [5a11bf970] 2018-04-11 17:41:23 -0400
Branch: REL9_6_STABLE Release: REL9_6_9 [8bba10f7e] 2018-04-11 17:41:25 -0400
Branch: REL9_5_STABLE Release: REL9_5_13 [3767216fb] 2018-04-11 17:41:26 -0400
Branch: REL9_4_STABLE Release: REL9_4_18 [5b3ed6b78] 2018-04-11 17:41:27 -0400
Branch: REL9_3_STABLE Release: REL9_3_23 [7448e7e23] 2018-04-11 17:41:28 -0400

    Do not select new object OIDs that match recently-dead entries.
    
    When selecting a new OID, we take care to avoid picking one that's already
    in use in the target table, so as not to create duplicates after the OID
    counter has wrapped around.  However, up to now we used SnapshotDirty when
    scanning for pre-existing entries.  That ignores committed-dead rows, so
    that we could select an OID matching a deleted-but-not-yet-vacuumed row.
    While that mostly worked, it has two problems:
    
    * If recently deleted, the dead row might still be visible to MVCC
    snapshots, creating a risk for duplicate OIDs when examining the catalogs
    within our own transaction.  Such duplication couldn't be visible outside
    the object-creating transaction, though, and we've heard few if any field
    reports corresponding to such a symptom.
    
    * When selecting a TOAST OID, deleted toast rows definitely *are* visible
    to SnapshotToast, and will remain so until vacuumed away.  This leads to
    a conflict that will manifest in errors like "unexpected chunk number 0
    (expected 1) for toast value nnnnn".  We've been seeing reports of such
    errors from the field for years, but the cause was unclear before.
    
    The fix is simple: just use SnapshotAny to search for conflicting rows.
    This results in a slightly longer window before object OIDs can be
    recycled, but that seems unlikely to create any large problems.
    
    Pavan Deolasee
    
    Discussion: https://postgr.es/m/CABOikdOgWT2hHkYG3Wwo2cyZJq2zfs1FH0FgX-=h4OLosXHf9w@mail.gmail.com

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


pgsql-admin by date:

Previous
From: "David G. Johnston"
Date:
Subject: Re: find objects stored on a specific tablespace
Next
From: Tom Lane
Date:
Subject: Re: ERROR: unexpected chunk number 452 (expected 0) for toast value 94674063 in pg_toast_56980977