jboes@nexcerpt.com (Jeff Boes) writes:
> We have a few tables that seem to have suffered some kind of
> corruption in their TOAST files. Symptoms include:
> + Runaway CPU usage during VACUUM ANALYZE (load average
> spikes of 20x normal for 10-15 minutes at a time) and
> more rarely during other operations
This seems odd. Can you attach to one of the runaway backends with a
debugger and get a stack trace? That might give some clue what the
problem is.
> + Recurring messages of "ERROR: Parent tuple was not found"
> during VACUUM FULL
The cause of this (or a cause of this, anyway) is known and fixed for
7.3. If you like you could try the attached back-patch for 7.2.3,
which fixes the most common case. (Plug: this patch is straight out of
the RHDB sources; if you are on a Red Hat platform I'd suggest trying
7.2.3-RH as soon as it's out.)
If that doesn't seem to help then you can go on with plan B:
> what I would like to try is:
> 1. pg_dump the table
> 2. truncate the table
> 3. VACUUM FULL (is this necessary?)
No, not if you truncated the table.
> 4. reload from pg_dump file
This should work to remove any data corruption in the table, assuming
you are able to get a clean dump. (You want a data-only dump of
course.)
> Can I use the same snippet of code generated by pg_dump to disable the
> triggers?
Should work. I think pg_dump may provide that for free in a data-only
dump, anyway.
BTW, there is an oversight in 7.2's TRUNCATE code: it only truncates the
given relation and not the TOAST table for same. But fortunately it
also fails to check just what you're truncating, so you can manually
issue a TRUNCATE against the TOAST table after truncating the main
table. (But I think you might have to run a standalone postgres with
-O -P to be allowed to do the latter.)
regards, tom lane
*** src/backend/commands/vacuum.c.orig Mon Sep 30 15:45:57 2002
--- src/backend/commands/vacuum.c Sat Oct 12 14:21:29 2002
***************
*** 187,192 ****
--- 187,196 ----
if (IsTransactionBlock())
elog(ERROR, "%s cannot run inside a BEGIN/END block", stmttype);
+ /* Running VACUUM from a function would free the function context */
+ if (!MemoryContextContains(QueryContext, vacstmt))
+ elog(ERROR, "%s cannot be executed from a function", stmttype);
+
/*
* Send info about dead objects to the statistics collector
*/
***************
*** 1320,1326 ****
usable_free_size = 0;
}
! if (usable_free_size > 0 && num_vtlinks > 0)
{
qsort((char *) vtlinks, num_vtlinks, sizeof(VTupleLinkData),
vac_cmp_vtlinks);
--- 1324,1331 ----
usable_free_size = 0;
}
! /* don't bother to save vtlinks if we will not call repair_frag */
! if (fraged_pages->num_pages > 0 && num_vtlinks > 0)
{
qsort((char *) vtlinks, num_vtlinks, sizeof(VTupleLinkData),
vac_cmp_vtlinks);
***************
*** 1602,1608 ****
*/
if ((tuple.t_data->t_infomask & HEAP_UPDATED &&
!TransactionIdPrecedes(tuple.t_data->t_xmin, OldestXmin)) ||
! (!(tuple.t_data->t_infomask & HEAP_XMAX_INVALID) &&
!(ItemPointerEquals(&(tuple.t_self),
&(tuple.t_data->t_ctid)))))
{
--- 1607,1614 ----
*/
if ((tuple.t_data->t_infomask & HEAP_UPDATED &&
!TransactionIdPrecedes(tuple.t_data->t_xmin, OldestXmin)) ||
! (!(tuple.t_data->t_infomask & (HEAP_XMAX_INVALID |
! HEAP_MARKED_FOR_UPDATE)) &&
!(ItemPointerEquals(&(tuple.t_self),
&(tuple.t_data->t_ctid)))))
{
***************
*** 1633,1639 ****
* If this tuple is in the begin/middle of the chain then
* we have to move to the end of chain.
*/
! while (!(tp.t_data->t_infomask & HEAP_XMAX_INVALID) &&
!(ItemPointerEquals(&(tp.t_self),
&(tp.t_data->t_ctid))))
{
--- 1639,1646 ----
* If this tuple is in the begin/middle of the chain then
* we have to move to the end of chain.
*/
! while (!(tp.t_data->t_infomask & (HEAP_XMAX_INVALID |
! HEAP_MARKED_FOR_UPDATE)) &&
!(ItemPointerEquals(&(tp.t_self),
&(tp.t_data->t_ctid))))
{