Re: Why copy_relation_data only use wal when WALarchiving is enabled - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Why copy_relation_data only use wal when WALarchiving is enabled
Msg-id 471602A3.8010604@enterprisedb.com
In response to Re: Why copy_relation_data only use wal when WALarchiving is enabled  (Heikki Linnakangas <heikki@enterprisedb.com>)
Responses Re: Why copy_relation_data only use wal when WALarchiving is enabled
List pgsql-hackers
I wrote:
> Unfortunately I don't see any easy way to fix it. One approach would be
> to avoid reusing the relfilenodes until next checkpoint, but I don't see
> any nice place to keep track of OIDs that have been dropped since last
> checkpoint.

Ok, here's one idea:

Instead of deleting the file immediately on commit of DROP TABLE, the
file is truncated to release the space, but not unlink()ed, so that the
relfilenode cannot be reused. The truncated file can be deleted after
the next checkpoint.

Now, how does the checkpoint know what to delete? We can use the fsync
request mechanism for that. When a file is truncated, a new kind of
fsync request, a "deletion request", is sent to the bgwriter, which
collects all such requests into a list. Before the checkpoint calculates
the new RedoRecPtr, the list is swapped with an empty one, and after
writing the new checkpoint record, all the files that were in the list
are deleted.
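
To make the ordering concrete, here is a rough sketch of the list swap.
It is only a simulation under my own assumptions: PendingDeletes,
request_deletion(), checkpoint_begin(), checkpoint_finish() and
delete_file() are hypothetical names, not existing backend functions,
and delete_file() merely records the relfilenode where the real code
would unlink() the truncated file.

```c
#include <assert.h>

/* All names here are hypothetical; none of them are PostgreSQL APIs. */
#define MAX_PENDING 64
#define MAX_DELETED 256

typedef struct
{
    unsigned int relfilenodes[MAX_PENDING];
    int          count;
} PendingDeletes;

/* Deletion requests received since the last checkpoint began. */
PendingDeletes pending;

/* Log of "deleted" relfilenodes, standing in for real unlink() calls. */
unsigned int deleted[MAX_DELETED];
int          ndeleted;

/*
 * Called after DROP TABLE commits and the file has been truncated:
 * remember the relfilenode instead of unlinking the file right away.
 * Returns 0 on success, -1 if the list is full.
 */
int
request_deletion(unsigned int relfilenode)
{
    if (pending.count >= MAX_PENDING)
        return -1;
    pending.relfilenodes[pending.count++] = relfilenode;
    return 0;
}

/* Stand-in for unlink(): only records which file would be removed. */
void
delete_file(unsigned int relfilenode)
{
    assert(ndeleted < MAX_DELETED);
    deleted[ndeleted++] = relfilenode;
}

/*
 * Step 1, before the new RedoRecPtr is calculated: swap the list with
 * an empty one. Requests arriving later wait for the next checkpoint.
 */
PendingDeletes
checkpoint_begin(void)
{
    PendingDeletes snapshot = pending;

    pending.count = 0;
    return snapshot;
}

/*
 * Step 2, after the checkpoint record has been written: delete only
 * the files that were in the list when the checkpoint began.
 */
void
checkpoint_finish(const PendingDeletes *snapshot)
{
    for (int i = 0; i < snapshot->count; i++)
        delete_file(snapshot->relfilenodes[i]);
}
```

The point of swapping before RedoRecPtr is calculated is that a file
dropped while the checkpoint is in progress stays on disk (empty) until
the following checkpoint, so its relfilenode cannot be reused inside the
window that WAL replay might still cover.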

We would leak empty files on crashes, but we leak files on crashes
anyway, so that shouldn't be an issue. This scheme wouldn't require
catalog changes, so it would be suitable for backpatching.

Any better ideas?

Do we care enough about this to fix it? Enough to backpatch? The
probability of this happening is pretty small, but the consequences are
really bad, so my vote is "yes" and "yes".

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

