Re: POC: Cleaning up orphaned files using undo logs - Mailing list pgsql-hackers

From Antonin Houska
Subject Re: POC: Cleaning up orphaned files using undo logs
Msg-id 45176.1631199861@antos
In response to Re: POC: Cleaning up orphaned files using undo logs  (Antonin Houska <ah@cybertec.at>)
Responses Re: POC: Cleaning up orphaned files using undo logs  (Amit Kapila <amit.kapila16@gmail.com>)
Re: POC: Cleaning up orphaned files using undo logs  ("孔凡深(云梳)" <fanshen.kfs@alibaba-inc.com>)
List pgsql-hackers
The cfbot complained that the patch series no longer applies, so I've rebased
it and also tried to make sure that the other flags become green.

One particular problem was that pg_upgrade complained that "live undo data"
remains in the old cluster. I found out that the temporary undo log causes the
problem, so I've adjusted the query in check_for_undo_data() accordingly until
the problem gets fixed properly.

The problem with the temporary undo log is that it's loaded into local buffers,
and a backend can exit without flushing its local buffers to disk. Thus we are
not guaranteed to find enough information when trying to discard the undo log
that the backend wrote. I'm thinking about the following solutions:

1. Let the backend manage temporary undo log on its own (even the slot
   metadata would stay outside the shared memory, and in particular the
   insertion pointer could start from 1 for each session) and remove the
   segment files at the same moment the temporary relations are removed.

   However, by moving the temporary undo slots away from the shared memory,
   computation of oldestFullXidHavingUndo (see the PROC_HDR structure) would
   be affected. It might seem that a transaction which only writes undo log
   for temporary relations does not need to affect oldestFullXidHavingUndo,
   but it needs to be analyzed thoroughly. Since oldestFullXidHavingUndo
   prevents transactions from being truncated from the CLOG too early, I
   wonder if the following is possible (this scenario is only applicable to
   the zheap storage engine [1], which is not included in this patch, but
   should already be considered):

   A transaction creates a temporary table, does some (many) changes and then
   gets rolled back. The undo records are being applied and it takes some
   time. Since the XID of the transaction did not affect
   oldestFullXidHavingUndo, the XID can disappear from the CLOG due to
   truncation. However, zundo.c in
   [1] indicates that the transaction status *is* checked during undo
   execution, so we might have a problem.

   Or am I missing something? UndoDiscard() in zheap seems to ignore temporary
   undo:

        /* We can't process temporary undo logs. */
        if (log->meta.persistence == UNDO_TEMP)
            continue;

2. Do not load the temporary undo into local buffers. If it's always in the
   shared buffers, we should never see incomplete data when trying to discard
   undo. In this case, persistence levels UNDOPERSISTENCE_UNLOGGED and
   UNDOPERSISTENCE_TEMP could be merged into a single level.

3. Implement the discarding in another way, but I don't have a new idea right
   now.
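
To make the concern in option 1 concrete, here is a toy model of the horizon
computation. It is only a sketch, not code from the patch or from zheap: the
ToyUndoSlot structure, the toy_* functions and the "horizon" rule are all
hypothetical simplifications. It assumes that the oldest XID having undo is the
minimum over all slots with undiscarded undo, and that the CLOG status of an
XID can be looked up only while the XID is not older than that minimum.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; not the patch's actual definitions. */
typedef enum
{
    TOY_UNDO_PERMANENT,
    TOY_UNDO_UNLOGGED,
    TOY_UNDO_TEMP
} ToyPersistence;

typedef struct
{
    uint64_t       full_xid;     /* transaction that wrote the undo */
    ToyPersistence persistence;  /* persistence level of the undo log */
    bool           has_undo;     /* undo not yet applied/discarded */
} ToyUndoSlot;

/*
 * Return the oldest XID that still has undo. If no slot qualifies,
 * return next_xid, i.e. nothing holds the horizon back.
 *
 * include_temp = false models option 1, where temporary undo slots are
 * no longer visible in shared memory.
 */
static uint64_t
toy_oldest_xid_having_undo(const ToyUndoSlot *slots, size_t nslots,
                           bool include_temp, uint64_t next_xid)
{
    uint64_t oldest = next_xid;

    for (size_t i = 0; i < nslots; i++)
    {
        if (!slots[i].has_undo)
            continue;
        if (!include_temp && slots[i].persistence == TOY_UNDO_TEMP)
            continue;
        if (slots[i].full_xid < oldest)
            oldest = slots[i].full_xid;
    }
    return oldest;
}

/* In this model, CLOG may be truncated for every XID below the horizon. */
static bool
toy_clog_lookup_safe(uint64_t xid, uint64_t horizon)
{
    return xid >= horizon;
}
```

With one temporary slot for XID 100 that is still applying undo and one
permanent slot for XID 150, the model gives a horizon of 100 if temporary
slots are counted and 150 if they are not. In the latter case a status lookup
for XID 100 during undo execution is no longer safe, which is the situation
described in option 1.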

Suggestions are welcome.

[1] https://github.com/EnterpriseDB/zheap/tree/master

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com
