Re: BUG #18351: VACUUM FULL fails with error: missing chunk number 0 for toast value XXX - Mailing list pgsql-bugs

From Alexander Lakhin
Subject Re: BUG #18351: VACUUM FULL fails with error: missing chunk number 0 for toast value XXX
Date
Msg-id 80abbbf0-588b-f808-c264-77c3ebee83f3@gmail.com
Whole thread Raw
In response to BUG #18351: VACUUM FULL fails with error: missing chunk number 0 for toast value XXX  (PG Bug reporting form <noreply@postgresql.org>)
Responses Re: BUG #18351: VACUUM FULL fails with error: missing chunk number 0 for toast value XXX  (Alexander Lakhin <exclusion@gmail.com>)
List pgsql-bugs
17.02.2024 23:00, PG Bug reporting form wrote:
> The following bug has been logged on the website:
>
> Bug reference:      18351
>
> The following script:
> ...
> VACUUM tbl;
> VACUUM FULL tbl;
> " | psql -d db$c >psql-2-$c.log 2>&1 &
>    done
> wait
> grep 'missing chunk number' server.log  && break;
> done
>

> So it looks like recently dead tuples might have no correspondent toast
> values to detoast attributes of such tuples.
>
> Reproduced on REL_12_STABLE .. master.
> Probably, the same issue was reported some time ago:
> https://www.postgresql.org/message-id/flat/CALdSSPhmqoN02ciT4UxS6ax0N84NpRwPWm87nKJ_%2B0G-Na8qOQ%40mail.gmail.com

I've discovered the old bug #5998
https://www.postgresql.org/message-id/flat/201104291506.p3TF6dA9003698%40wwwmaster.postgresql.org
with similar symptoms and a very relevant question:
 > So, how is it that the vacuum removed the underlying toast data but not
 > the putatively-recently-dead tuple?

In fact, the script presented here, produces the error even on 83b758494,
so this anomaly is pretty old.
But for the given case, the answer to that question is different.
With additional debug and verbose logging (on REL_13_STABLE, as
lazy_scan_heap() was more flat back then) I see:

2024-02-27 08:22:04.859 UTC|law|db9|65dd9bac.12914b|LOG:  statement: VACUUM (VERBOSE) tbl;

2024-02-27 08:22:04.859 UTC|law|db9|65dd9bac.12914b|LOG: !!!vacuum_rel: relid: 16980, snapshot->xmin: 845, 
snapshot->xmax: 865

2024-02-27 08:22:04.859 UTC|law|db9|65dd9bac.12914b|LOG: !!!lazy_scan_heap| relname: tbl, blkno: 0, offnum: 1, 
ItemIdIsDead(itemid): 0

2024-02-27 08:22:04.859 UTC|law|db9|65dd9bac.12914b|LOG: !!!lazy_scan_heap| relname: tbl, blkno: 0, offnum: 1, 
tuple.t_data->t_choice.t_heap.t_xmin: 859, tuple.t_data->t_choice.t_heap.t_xmax: 864, OldestXmin: 845, 
HeapTupleSatisfiesVacuum(&tuple, OldestXmin, buf): 2
### The tbl's tuple is RECENTLY_DEAD ###

2024-02-27 08:22:04.860 UTC|law|db9|65dd9bac.12914b|INFO:  "tbl": found 0 removable, 1 nonremovable row versions in 1 
out of 1 pages
2024-02-27 08:22:04.860 UTC|law|db9|65dd9bac.12914b|DETAIL:  1 dead row versions cannot be removed yet, oldest xmin:
845
     There were 0 unused item identifiers.
     Skipped 0 pages due to buffer pins, 0 frozen pages.
     0 pages are entirely empty.
     CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.

2024-02-27 08:22:04.860 UTC|law|db9|65dd9bac.12914b|LOG: !!!lazy_scan_heap| relname: pg_toast_16980, blkno: 0, offnum:

1, ItemIdIsDead(itemid): 1
2024-02-27 08:22:04.860 UTC|law|db9|65dd9bac.12914b|CONTEXT:  while scanning block 0 of relation
"pg_toast.pg_toast_16980"

2024-02-27 08:22:04.860 UTC|law|db9|65dd9bac.12914b|LOG: !!!lazy_scan_heap| relname: pg_toast_16980, blkno: 0, offnum:

2, ItemIdIsDead(itemid): 1
2024-02-27 08:22:04.860 UTC|law|db9|65dd9bac.12914b|CONTEXT:  while scanning block 0 of relation
"pg_toast.pg_toast_16980"
### But the toast's tuples are DEAD according to their lp_flags ###

2024-02-27 08:22:04.860 UTC|law|db9|65dd9bac.12914b|INFO: "pg_toast_16980": found 2 removable, 0 nonremovable row 
versions in 1 out of 1 pages
2024-02-27 08:22:04.860 UTC|law|db9|65dd9bac.12914b|DETAIL:  0 dead row versions cannot be removed yet, oldest xmin:
866
     There were 0 unused item identifiers.
     Skipped 0 pages due to buffer pins, 0 frozen pages.
     0 pages are entirely empty.
     CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.

2024-02-27 08:22:04.861 UTC|law|db9|65dd9bac.12914b|STATEMENT: VACUUM FULL tbl;

2024-02-27 08:22:04.862 UTC|law|db9|65dd9bac.12914b|ERROR:  missing chunk number 0 for toast value 17045 in
pg_toast_16980

So in this case, the heap tuple and the pg_toast_xxx tuples are out-of-sync
in regard to their DEAD status after DELETE, then VACUUM removes dead
tuples from pg_toast_xxx, and the following VACUUM FULL
(heapam_relation_copy_for_cluster()) fails to copy the heap tuple in
absence of corresponding toast tuples.

Best regards,
Alexander



pgsql-bugs by date:

Previous
From: Matti Aarnio
Date:
Subject: Re: BUG #18366: Not present at postgresql.org YUM repository
Next
From: jian he
Date:
Subject: Re: BUG #18314: PARALLEL UNSAFE function does not prevent parallel index build