Re: PoC: Simplify recovery after dropping a table by LOGGING the restore LSN - Mailing list pgsql-hackers

From Jim Jones
Subject Re: PoC: Simplify recovery after dropping a table by LOGGING the restore LSN
Date
Msg-id 3bda6b56-16bd-48fe-8e23-5ef58a6a4e34@uni-muenster.de
Whole thread Raw
List pgsql-hackers
Hi Andrey, Nikolay, and Kirk

On 08.11.24 04:46, Kirk Wolak wrote:
> Hackers,
>   The concept was driven by an all too common support request.  A user
> accidentally dropped the wrong table.  (this could also be applied to
> dropping a database, etc).
>
>   If we had the LSN before the drop, this would be easier.  So we
> actually log the LSN when the lock is required so that we have an
> accurate LSN and the recovery is much simpler.
>

I've tested the patch it it works as described.

postgres=# CREATE TABLE t();
postgres=# SELECT pg_current_wal_lsn( );
 pg_current_wal_lsn
--------------------
 0/2A0677D8
(1 row)
postgres=# DROP TABLE t;

2024-11-26 14:12:28.915 CET [338209] LOG:  Aquired drop table lock on
table t. Restore at 0/2A0677D8
2024-11-26 14:12:28.915 CET [338209] STATEMENT:  DROP TABLE t;


A few comments:

1) non "permanent" tables

Since TEMPORARY TABLES are automatically dropped in the end of a
session, simply leaving a session that contains a TEMPORARY TABLE also
logs the LSN:

postgres=# CREATE TEMPORARY TABLE t_temporary AS SELECT
generate_series(1,100);
SELECT 100
postgres=# quit

2024-11-26 14:34:26.596 CET [386777] LOG:  Aquired drop table lock on
table t_temporary. Restore at 0/2A149EC0

Dropping the TEMPORARY TABLE also triggers this log message

postgres=# CREATE TEMPORARY TABLE t_temporary AS SELECT
generate_series(1,100);
postgres=# DROP TABLE t_temporary;

2024-11-26 14:35:13.174 CET [387378] LOG:  Aquired drop table lock on
table t_temporary. Restore at 0/2A1664E8
2024-11-26 14:35:13.174 CET [387378] STATEMENT:  DROP TABLE t_temporary;

The same applies for UNLOGGED TABLES:

postgres=# CREATE UNLOGGED TABLE t_unlogged AS SELECT
generate_series(1,100);
postgres=# SELECT * FROM page_header(get_raw_page('public.t_unlogged',0));
 lsn | checksum | flags | lower | upper | special | pagesize | version |
prune_xid
-----+----------+-------+-------+-------+---------+----------+---------+-----------
 0/0 |        0 |     0 |   424 |  4992 |    8192 |     8192 |       4
|         0
(1 row)
postgres=# DROP TABLE t_unlogged;

2024-11-26 14:33:00.397 CET [338209] LOG:  Aquired drop table lock on
table t_unlogged. Restore at 0/2A143DC8
2024-11-26 14:33:00.397 CET [338209] STATEMENT:  DROP TABLE t_unlogged;

2) log message readability

For better readability, perhaps the table name should be logged with
double quotes, as in :

2024-11-26 13:25:35.525 CET [338209] ERROR:  table "x" does not exist
2024-11-26 13:25:35.525 CET [338209] STATEMENT:  DROP TABLE x;

3) format of existing log messages containing LSN

I haven't looked too deep into it, but the log format used by checkpoint
might be helpful for applications that need to parse log files. For
instance, a log entry triggered after a DROP DATABASE:

postgres=# CREATE DATABASE db;
postgres=# DROP DATABASE db;

2024-11-26 13:30:08.166 CET [329713] LOG:  checkpoint starting:
immediate force wait
2024-11-26 13:30:08.177 CET [329713] LOG:  checkpoint complete: wrote 3
buffers (0.0%), wrote 1 SLRU buffers; 0 WAL file(s) added, 0 removed, 1
recycled; write=0.002 s, sync=0.002 s, total=0.011 s; sync files=4,
longest=0.001 s, average=0.001 s; distance=4306 kB, estimate=4308 kB;
lsn=0/318AFD8, redo lsn=0/318AF80

>   This is a rough patch, very simple and effective.  We are looking
> for feedback.
>
>   Comments are appreciated!
>
>   Should we ALSO consider this for:
>      - DROP DATABASE
>      - TRUNCATE TABLE
>      - DELETE (only when it is without a WHERE clause?)
>      - UPDATE (only when it is without a WHERE clause?)   
>
+1

Thanks!

Best, Jim



pgsql-hackers by date:

Previous
From: Japin Li
Date:
Subject: Re: Auto Vacuum optimisation
Next
From: "David E. Wheeler"
Date:
Subject: Re: Potential ABI breakage in upcoming minor releases