Thread: ERROR: unexpected chunk number 452 (expected 0) for toast value94674063 in pg_toast_56980977

Hello Experts,

What could be the reason for bellow errors. Can one help me to fix this?

PostgreSQL Database: [~rptdb~] Data Backup Failed with PostgreSQL Error: [~pg_dump: Dumping the contents of table "document" failed: PQgetResult() failed.pg_dump: Error message from server: ERROR: unexpected chunk number 452 (expected 0) for toast value 94674063 in pg_toast_56980977pg_dump: The command was: COPY reports_extended.document (document_id, access_key, created_date, document_name, document_size, document_status, document_type, encryption_type, external_system_storage_id, external_system_storage_url, last_updated_date, md5_hash, source_system_name, storage_type, created_by, customer_org_id, content_type, tags, file, ownedby) TO stdout;~].

Thanks & Regards,

Naveen Kumar .M,
Sr. PostgreSQL Database Administrator,
Mobile: 7755929449.

My attitude will always be based on how you treat me. 
Hi Naveen,

Normally we see the chunk errors comes due to the disk IO storage level issues and it leads towards the data corruption.

Try reindexing or vacuuming otherwise it may need a detailed analysis to resolve these chunk errors.



On Wed, Jul 18, 2018 at 9:15 PM, Naveen Kumar <naveenchowdaryon@gmail.com> wrote:
Hello Experts,

What could be the reason for bellow errors. Can one help me to fix this?

PostgreSQL Database: [~rptdb~] Data Backup Failed with PostgreSQL Error: [~pg_dump: Dumping the contents of table "document" failed: PQgetResult() failed.pg_dump: Error message from server: ERROR: unexpected chunk number 452 (expected 0) for toast value 94674063 in pg_toast_56980977pg_dump: The command was: COPY reports_extended.document (document_id, access_key, created_date, document_name, document_size, document_status, document_type, encryption_type, external_system_storage_id, external_system_storage_url, last_updated_date, md5_hash, source_system_name, storage_type, created_by, customer_org_id, content_type, tags, file, ownedby) TO stdout;~].

Thanks & Regards,

Naveen Kumar .M,
Sr. PostgreSQL Database Administrator,
Mobile: 7755929449.

My attitude will always be based on how you treat me. 

If the reindex does not work, You need to loop through the data and identify which rows are bad and delete them. 

Modify the following script to suit your needs and it should help you identify which rows are bad.

DO $f$
DECLARE
    baddata TEXT;
    badid INT;
BEGIN
FOR badid IN SELECT id FROM badtable LOOP
    BEGIN
        SELECT badcolumn
        INTO columndata
        FROM badtable where id = badid;
    EXCEPTION
        WHEN OTHERS THEN
            RAISE NOTICE 'Data for ID % is corrupt', badid;
            CONTINUE;
    END;
END LOOP;
END;
$f$


On Wed, Jul 18, 2018 at 9:01 AM, Shreeyansh Dba <shreeyansh2014@gmail.com> wrote:
Hi Naveen,

Normally we see the chunk errors comes due to the disk IO storage level issues and it leads towards the data corruption.

Try reindexing or vacuuming otherwise it may need a detailed analysis to resolve these chunk errors.



On Wed, Jul 18, 2018 at 9:15 PM, Naveen Kumar <naveenchowdaryon@gmail.com> wrote:
Hello Experts,

What could be the reason for bellow errors. Can one help me to fix this?

PostgreSQL Database: [~rptdb~] Data Backup Failed with PostgreSQL Error: [~pg_dump: Dumping the contents of table "document" failed: PQgetResult() failed.pg_dump: Error message from server: ERROR: unexpected chunk number 452 (expected 0) for toast value 94674063 in pg_toast_56980977pg_dump: The command was: COPY reports_extended.document (document_id, access_key, created_date, document_name, document_size, document_status, document_type, encryption_type, external_system_storage_id, external_system_storage_url, last_updated_date, md5_hash, source_system_name, storage_type, created_by, customer_org_id, content_type, tags, file, ownedby) TO stdout;~].

Thanks & Regards,

Naveen Kumar .M,
Sr. PostgreSQL Database Administrator,
Mobile: 7755929449.

My attitude will always be based on how you treat me. 




--
Thanks,

Jorge Torralba
----------------------------

Note: This communication may contain privileged or other confidential information. If you are not the intended recipient, please do not print, copy, retransmit, disseminate or otherwise use the information. Please indicate to the sender that you have received this email in error and delete the copy you received. Thank You.
On 2018-Jul-18, Naveen Kumar wrote:

> Hello Experts,
> 
> What could be the reason for bellow errors. Can one help me to fix this?
> 
> PostgreSQL Database: [~rptdb~] Data Backup Failed with PostgreSQL Error:
> [~pg_dump: Dumping the contents of table "document" failed: PQgetResult()
> failed.pg_dump: Error message from server: ERROR: unexpected chunk number
> 452 (expected 0) for toast value 94674063 in pg_toast_56980977pg_dump: The
> command was: COPY reports_extended.document (document_id, access_key,
> created_date, document_name, document_size, document_status, document_type,
> encryption_type, external_system_storage_id, external_system_storage_url,
> last_updated_date, md5_hash, source_system_name, storage_type, created_by,
> customer_org_id, content_type, tags, file, ownedby) TO stdout;~].

There are two related bugfixes in 9.6.9 (and all other versions released
with it).  Maybe you'd do well to upgrade to the latest minor of
whatever supported branch you're using, and then let's discuss more.  If
you were victim to this bug I'm not sure there's direct mitigation, or
you'll need to UPDATE the offending rows to set the columns to null (or
some other value).  Eight years ago (!) I wrote a function to scan for
those, quoted in this blog post (sorry about my Spanish):
  https://alvherre.livejournal.com/4404.html
May be helpful.  (I think it needs a minor fix to run in current
releases.)

Author: Tom Lane <tgl@sss.pgh.pa.us>
Branch: master Release: REL_11_BR [d1e907929] 2018-04-11 18:11:29 -0400
Branch: REL_10_STABLE Release: REL_10_4 [08e6cda1c] 2018-04-11 18:11:29 -0400
Branch: REL9_6_STABLE Release: REL9_6_9 [060bb38d0] 2018-04-11 18:11:30 -0400
Branch: REL9_5_STABLE Release: REL9_5_13 [efbe36a2c] 2018-04-11 18:11:30 -0400
Branch: REL9_4_STABLE Release: REL9_4_18 [6943fb927] 2018-04-11 18:11:30 -0400
Branch: REL9_3_STABLE Release: REL9_3_23 [66d4b6bb8] 2018-04-11 18:11:30 -0400

    Ignore nextOid when replaying an ONLINE checkpoint.
    
    The nextOid value is from the start of the checkpoint and may well be stale
    compared to values from more recent XLOG_NEXTOID records.  Previously, we
    adopted it anyway, allowing the OID counter to go backwards during a crash.
    While this should be harmless, it contributed to the severity of the bug
    fixed in commit 0408e1ed5, by allowing duplicate TOAST OIDs to be assigned
    immediately following a crash.  Without this error, that issue would only
    have arisen when TOAST objects just younger than a multiple of 2^32 OIDs
    were deleted and then not vacuumed in time to avoid a conflict.
    
    Pavan Deolasee
    
    Discussion: https://postgr.es/m/CABOikdOgWT2hHkYG3Wwo2cyZJq2zfs1FH0FgX-=h4OLosXHf9w@mail.gmail.com

Author: Tom Lane <tgl@sss.pgh.pa.us>
Branch: master Release: REL_11_BR [0408e1ed5] 2018-04-11 17:41:22 -0400
Branch: REL_10_STABLE Release: REL_10_4 [5a11bf970] 2018-04-11 17:41:23 -0400
Branch: REL9_6_STABLE Release: REL9_6_9 [8bba10f7e] 2018-04-11 17:41:25 -0400
Branch: REL9_5_STABLE Release: REL9_5_13 [3767216fb] 2018-04-11 17:41:26 -0400
Branch: REL9_4_STABLE Release: REL9_4_18 [5b3ed6b78] 2018-04-11 17:41:27 -0400
Branch: REL9_3_STABLE Release: REL9_3_23 [7448e7e23] 2018-04-11 17:41:28 -0400

    Do not select new object OIDs that match recently-dead entries.
    
    When selecting a new OID, we take care to avoid picking one that's already
    in use in the target table, so as not to create duplicates after the OID
    counter has wrapped around.  However, up to now we used SnapshotDirty when
    scanning for pre-existing entries.  That ignores committed-dead rows, so
    that we could select an OID matching a deleted-but-not-yet-vacuumed row.
    While that mostly worked, it has two problems:
    
    * If recently deleted, the dead row might still be visible to MVCC
    snapshots, creating a risk for duplicate OIDs when examining the catalogs
    within our own transaction.  Such duplication couldn't be visible outside
    the object-creating transaction, though, and we've heard few if any field
    reports corresponding to such a symptom.
    
    * When selecting a TOAST OID, deleted toast rows definitely *are* visible
    to SnapshotToast, and will remain so until vacuumed away.  This leads to
    a conflict that will manifest in errors like "unexpected chunk number 0
    (expected 1) for toast value nnnnn".  We've been seeing reports of such
    errors from the field for years, but the cause was unclear before.
    
    The fix is simple: just use SnapshotAny to search for conflicting rows.
    This results in a slightly longer window before object OIDs can be
    recycled, but that seems unlikely to create any large problems.
    
    Pavan Deolasee
    
    Discussion: https://postgr.es/m/CABOikdOgWT2hHkYG3Wwo2cyZJq2zfs1FH0FgX-=h4OLosXHf9w@mail.gmail.com

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> On 2018-Jul-18, Naveen Kumar wrote:
>> What could be the reason for bellow errors. Can one help me to fix this?
>> Error message from server: ERROR: unexpected chunk number
>> 452 (expected 0) for toast value 94674063 in pg_toast_56980977

> There are two related bugfixes in 9.6.9 (and all other versions released
> with it).  Maybe you'd do well to upgrade to the latest minor of
> whatever supported branch you're using, and then let's discuss more.

It would certainly be helpful to upgrade to latest minor if not there
already, not to mention tell us which PG version this is.  But I'm
not really convinced that the 0408e1ed5 bug matches this symptom.
That would've led to duplicate TOAST rows, but it's hard to see how
duplicates could produce the above.

I'm inclined to think that this is either a corrupt toast index
(which'd be fixable with REINDEX) or actually-missing TOAST data.
Either way, if it's due to a PG bug rather than storage system
malfeasance, it's some other bug than that one.

            regards, tom lane