Home > mailing lists

回复：POC: Cleaning up orphaned files using undo logs - Mailing list pgsql-hackers

From	孔凡深(云梳)
Subject	回复：POC: Cleaning up orphaned files using undo logs
Date	March 29, 2022 13:07:56
Msg-id	09930dc0-aa70-43a2-8b8f-1ea534e1b473.fanshen.kfs@alibaba-inc.com Whole thread Raw
In response to	Re: POC: Cleaning up orphaned files using undo logs (Antonin Houska <ah@cybertec.at>)
Responses	Re: 回复：POC: Cleaning up orphaned files using undo logs (Antonin Houska <ah@cybertec.at>)
List	pgsql-hackers

Tree view

Hi, Antonin. I am more interested in zheap. Recently reviewing the patch you submitted.

When I use pg_undodump-tool to dump the undo page chunk, I found that some chunk header is abnormal.

After reading the relevant codes in 0006-The-initial-implementation-of-the-pg_undodump-tool.patch,

I feel that there is a bug in the function parse_undo_page.

According to my understanding The size in chunk Header includes chunk header + type-specific header + undo record.

If the entire chunk spans pages, also need to add the size of the page header.

But I found Now only the scenario of chunk header spanning pages is considered, and the scenario of type-specific header is not considered.

* The page header size must eventually be subtracted from

* chunk_bytes_left because it's included in the chunk size. However,

* since chunk_bytes_left is unsigned, we do not subtract anything from it

* if it's still zero. This can happen if we're still reading the chunk

* header or the type-specific header. (The underflow should not be a

* problem because the chunk size will eventually be added, but it seems

* ugly and it makes debugging less convenient.)

if (s->chunk_bytes_left > 0)

{

/* Chunk should not end within page header. */

Assert(s->chunk_bytes_left >= SizeOfUndoPageHeaderData);

s->chunk_bytes_left -= SizeOfUndoPageHeaderData;

s->chunk_bytes_to_skip = 0;

}

/* Processing the chunk header?

else if (s->chunk_hdr_bytes_left > 0 )

s->chunk_bytes_to_skip = SizeOfUndoPageHeaderData;

------------------------------------------------

Should this code be fixed as this?When the type-specific header spans the undo page, the page header should be skipped.

else if (s->chunk_hdr_bytes_left > 0 || s->type_hdr_bytes_left > 0)

s->chunk_bytes_to_skip = SizeOfUndoPageHeaderData;

------------------------------------------------------------------
发件人：Antonin Houska <ah@cybertec.at>
发送时间：2022年3月29日(星期二) 17:25
收件人：Dmitry Dolgov <9erthalion6@gmail.com>; pgsql-hackers <pgsql-hackers@postgresql.org>
主　题：Re: POC: Cleaning up orphaned files using undo logs

The cfbot complained that the patch series no longer applies, so I've rebased
it and also tried to make sure that the other flags become green.

One particular problem was that pg_upgrade complained that "live undo data"
remains in the old cluster. I found out that the temporary undo log causes the
problem, so I've adjusted the query in check_for_undo_data() accordingly until
the problem gets fixed properly.

The problem of the temporary undo log is that it's loaded into local buffers
and that backend can exit w/o flushing local buffers to disk, and thus we are
not guaranteed to find enough information when trying to discard the undo log
the backend wrote. I'm thinking about the following solutions:

1. Let the backend manage temporary undo log on its own (even the slot
metadata would stay outside the shared memory, and in particular the
insertion pointer could start from 1 for each session) and remove the
segment files at the same moment the temporary relations are removed.

However, by moving the temporary undo slots away from the shared memory,
computation of oldestFullXidHavingUndo (see the PROC_HDR structure) would
be affected. It might seem that a transaction which only writes undo log
for temporary relations does not need to affect oldestFullXidHavingUndo,
but it needs to be analyzed thoroughly. Since oldestFullXidHavingUndo
prevents transactions to be truncated from the CLOG too early, I wonder if
the following is possible (This scenario is only applicable to the zheap
storage engine [1], which is not included in this patch, but should already
be considered.):

A transaction creates a temporary table, does some (many) changes and then
gets rolled back. The undo records are being applied and it takes some
time. Since XID of the transaction did not affect oldestFullXidHavingUndo,
the XID can disappear from the CLOG due to truncation. However zundo.c in
[1] indicates that the transaction status *is* checked during undo
execution, so we might have a problem.

Or do I miss something? UndoDiscard() in zheap seems to ignore temporary
undo:

/* We can't process temporary undo logs. */
if (log->meta.persistence == UNDO_TEMP)
continue;

2. Do not load the temporary undo into local buffers. If it's always in the
shared buffers, we should never see incomplete data when trying to discard
undo. In this case, persistence levels UNDOPERSISTENCE_UNLOGGED and
UNDOPERSISTENCE_TEMP could be merged into a single level.

3. Implement the discarding in another way, but I don't have new idea right
now.

Suggestions are welcome.

[1] https://github.com/EnterpriseDB/zheap/tree/master

--
Antonin Houska
Web: https://www.cybertec-postgresql.com

Attachment

undo-20210909.tgz

pgsql-hackers by date:

From: Amit Kapila
Date: 29 March 2022, 13:00:22
Subject: Re: Column Filtering in Logical Replication

From: Julien Rouhaud
Date: 29 March 2022, 13:08:46
Subject: Re: Add parameter jit_warn_above_fraction

回复：POC: Cleaning up orphaned files using undo logs - Mailing list pgsql-hackers

Attachment

Previous

Next