Thread: rewriteheap.c bug: toast rows don't get XIDs matching their parents

rewriteheap.c bug: toast rows don't get XIDs matching their parents

From: Tom Lane
While working on bug #6393 I was reminded of the truth of $SUBJECT: any
rows inserted into the new toast table will have the xmin of the CLUSTER
or VACUUM FULL operation, and invalid xmax, whereas their parent heap
rows will have xmin/xmax copied from the previous instance of the table.
This does not matter much for ordinary live heap rows, but it's also
necessary for CLUSTER/VACUUM FULL to copy recently-dead,
insert-in-progress, and delete-in-progress rows.  In such cases, a later
plain VACUUM might reap the parent heap rows and not the toast rows,
leading to a storage leak that won't be recovered short of another
CLUSTER/VACUUM FULL.
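
To make the mismatch concrete, here is a toy model in C. It is only a
sketch and not the rewriteheap.c code: ToyTuple, rewrite_heap_row,
rewrite_toast_row, and vacuum_can_remove are hypothetical names, every
XID is assumed to have committed, and XID wraparound is ignored.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Xid;
#define INVALID_XID ((Xid) 0)

/* Toy tuple header: only the two fields the argument depends on. */
typedef struct {
    Xid xmin;   /* inserting transaction */
    Xid xmax;   /* deleting transaction, or INVALID_XID if never deleted */
} ToyTuple;

/* Heap rows are copied into the new table with their original xmin/xmax. */
static ToyTuple rewrite_heap_row(ToyTuple old_row)
{
    return old_row;
}

/*
 * Toast rows are inserted fresh by the rewriting transaction, so they get
 * its XID as xmin and an invalid xmax, whatever the parent row carried.
 */
static ToyTuple rewrite_toast_row(Xid rewrite_xid)
{
    ToyTuple t = { rewrite_xid, INVALID_XID };
    return t;
}

/*
 * Grossly simplified removability test for a later plain VACUUM: a row can
 * be reaped once its deleter (assumed committed) is older than oldest_xmin.
 */
static bool vacuum_can_remove(ToyTuple t, Xid oldest_xmin)
{
    return t.xmax != INVALID_XID && t.xmax < oldest_xmin;
}
```

Take a recently-dead parent row with xmin 100 and xmax 200, rewritten by
a CLUSTER running as XID 300.  The heap copy keeps (100, 200) while its
toast row becomes (300, invalid).  Once the oldest xmin has advanced to
400, vacuum_can_remove() is true for the heap row and false for the toast
row, which is the leak described above.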

I can't remember if we discussed this risk when the heap rewrite code
was written.  I'm not sure it's worth fixing, but at the least it ought
to be documented in the comments in rewriteheap.c.
        regards, tom lane


Re: rewriteheap.c bug: toast rows don't get XIDs matching their parents

From: Robert Haas
On Thu, Jan 12, 2012 at 4:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> While working on bug #6393 I was reminded of the truth of $SUBJECT: any
> rows inserted into the new toast table will have the xmin of the CLUSTER
> or VACUUM FULL operation, and invalid xmax, whereas their parent heap
> rows will have xmin/xmax copied from the previous instance of the table.
> This does not matter much for ordinary live heap rows, but it's also
> necessary for CLUSTER/VACUUM FULL to copy recently-dead,
> insert-in-progress, and delete-in-progress rows.  In such cases, a later
> plain VACUUM might reap the parent heap rows and not the toast rows,
> leading to a storage leak that won't be recovered short of another
> CLUSTER/VACUUM FULL.
>
> I can't remember if we discussed this risk when the heap rewrite code
> was written.  I'm not sure it's worth fixing, but at the least it ought
> to be documented in the comments in rewriteheap.c.

People run CLUSTER and VACUUM FULL to recover wasted storage, so it's
a bit unfortunate if those operations can themselves introduce a
storage leak.  So I think it would be nice to fix this, but if that's
more than we can manage right now, then I agree we should at least add
a code comment so that it has a better chance of getting fixed later.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company