Re: POC: Cleaning up orphaned files using undo logs - Mailing list pgsql-hackers

From Kuntal Ghosh
Subject Re: POC: Cleaning up orphaned files using undo logs
Msg-id CAGz5QC+sW_JBsFwizgV3Jvh-366FXPByH=u_uU9W+1T6fj65Lw@mail.gmail.com
In response to Re: POC: Cleaning up orphaned files using undo logs  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: POC: Cleaning up orphaned files using undo logs
List pgsql-hackers
Hello Thomas,

In the pg_buffercache contrib module, the file pg_buffercache--1.3--1.4.sql is missing. AFAICS, this file should have been added as part of the following commit:
Add SmgrId to smgropen() and BufferTag

Without it, I'm not able to compile the contrib modules. I've attached a patch to fix this.


On Fri, May 10, 2019 at 11:48 AM Thomas Munro <thomas.munro@gmail.com> wrote:
On Thu, May 9, 2019 at 6:34 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> Patches can be applied on top of undo branch [1] commit:
> (cb777466d008e656f03771cf16ec7ef9d6f2778b)

Hello all,

Here is a new patch set which includes all of the patches discussed in
this thread in one go, rebased on today's master.  To summarise the
main layers, from the top down we have:

 0013:       undo-based orphaned file clean-up ($SUBJECT, a demo of
             undo technology)
 0009-0010:  undo processing (execution of undo actions when rolling back)
 0008:       undo records
 0001-0007:  undo storage

The main changes to the storage layer since the last time I posted the
full patch stack:

* pg_upgrade support: you can't have any live undo logs (much like 2PC
transactions, we want to be free to change the format), but some work
was required to make sure that all "discarded" undo record pointers
from the old cluster still appear as discarded in the new cluster, as
well as any from the new cluster

* tweaks to various other src/bin tools that are aware of files under
pgdata and were confused by undo segment files

* the fsync of undo log segment files when they're created or recycled
is now handed off to the checkpointer (this was identified as a
performance problem for zheap)

* code tidy-up, removing dead code (undo log rewind, prevlen, prevlog
were no longer needed by patches higher up in the stack), removing
global variables, noisy LOG messages about undo segment files now
reduced to DEBUG1

* new extension contrib/undoinspect, for developer use, showing what
will be undone if you abort:

postgres=# begin;
BEGIN
postgres=# create table t();
CREATE TABLE
postgres=# select * from undoinspect();
     urecptr      |  rmgr   | flags | xid |                 description
------------------+---------+-------+-----+---------------------------------------------
 00000000000032FA | Storage | P,T   | 487 | CREATE dbid=12934, tsid=1663, relfile=16393
(1 row)
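
For readers who want to interpret the urecptr column, here is a minimal sketch of splitting such a value into an undo log number and a byte offset. The layout used (log number in the high 24 bits, offset in the low 40 bits of a 64-bit value) is an assumption and is not stated in this message; check the patch set's headers before relying on it. The function name is hypothetical.

```python
# Assumed layout (not confirmed by this message): a 64-bit undo record
# pointer with the undo log number in the high 24 bits and the byte
# offset within that log in the low 40 bits.
UNDO_LOG_OFFSET_BITS = 40  # assumption


def split_undo_rec_ptr(urecptr_hex):
    """Split a hex urecptr, as printed by undoinspect(), into (log_number, offset)."""
    value = int(urecptr_hex, 16)
    log_number = value >> UNDO_LOG_OFFSET_BITS
    offset = value & ((1 << UNDO_LOG_OFFSET_BITS) - 1)
    return log_number, offset


log_number, offset = split_undo_rec_ptr("00000000000032FA")
print(f"log={log_number} offset={offset}")  # prints "log=0 offset=13050"
```

Under that assumption, the record shown above would sit at byte offset 13050 of undo log 0.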

One silly detail: I had to change the default max_worker_processes
from 8 to 12, because otherwise a couple of tests run with fewer
parallel workers than they expect, due to undo worker processes using
up slots.  There is probably a better solution to that problem.
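
To illustrate the slot arithmetic behind that change, here is a deliberately simplified model. It assumes every background worker, including undo workers and parallel query workers, takes one slot out of max_worker_processes, and it ignores other limits such as max_parallel_workers. The function name and the number of slots already in use are hypothetical, chosen only for illustration.

```python
def parallel_workers_granted(max_worker_processes, slots_in_use, requested):
    """Return how many parallel workers a query gets in this simplified model.

    slots_in_use counts background workers that already hold a slot
    (for example undo workers); the value used below is hypothetical.
    """
    free_slots = max(0, max_worker_processes - slots_in_use)
    return min(requested, free_slots)


# Hypothetical scenario: 5 slots already taken, a test expects 4 workers.
with_old_default = parallel_workers_granted(8, slots_in_use=5, requested=4)
with_new_default = parallel_workers_granted(12, slots_in_use=5, requested=4)
print(with_old_default, with_new_default)  # prints "3 4"
```

In this model the test gets one worker fewer than it asked for with 8 slots, and the full 4 with 12.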

I put the patches in a tarball here, but they are also available from
https://github.com/EnterpriseDB/zheap/tree/undo.

--
Thomas Munro
https://enterprisedb.com


--
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com