Re: In-placre persistance change of a relation - Mailing list pgsql-hackers
From | Heikki Linnakangas |
---|---|
Subject | Re: In-placre persistance change of a relation |
Date | |
Msg-id | 6f9828f5-e582-4eeb-a781-2d8dc92dff04@iki.fi |
In response to | Re: In-placre persistance change of a relation (Thom Brown <thom@linux.com>) |
List | pgsql-hackers |
On 05/04/2025 00:29, Thom Brown wrote:
> On Fri, 27 Dec 2024 at 08:26, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
>>
>> Hello. This is the updated version.
>>
>> (Sorry for the delay; I've been a little swamped.)
>>
>> - Undo logs are primarily stored in a fixed number of fixed-length
>>   slots and are spilled into files under some conditions.
>>
>>   The number of slots is 32 (ULOG_SLOT_NUM), and the buffer length is
>>   1024 (ULOG_SLOT_BUF_LEN). Both are currently non-configurable.
>>
>> - Undo logs are now used only during recovery and no longer involved
>>   in transaction ends for normal backends. Pending deletes for aborts
>>   have been restored.
>>
>> - Undo logs are stored on a per-Top-XID basis.
>>
>> - RelationPreserverStorate() is no longer modified.
>>
>> In this version, in the part following the introduction of orphan
>> storage prevention, the restriction on prepared transactions
>> persisting beyond server crashes (i.e., the prohibition) has been
>> removed. This is because handling for such cases has been reverted to
>> pendingDeletes.
>>
>> Let me know if you have any questions or concerns.
>
> I just went to give this a test drive, but HEAD has drifted too far,
> at least for 0017 to apply. Could you please rebase and make the
> necessary modifications?

I had a quick look at this latest version now, up to
"v36-0005-Prevent-orphan-storage-files-after-server-crash.patch"
(because I'm very interested in that, but not in the rest of the
patches). Sorry I haven't gotten around to it earlier.

Overall I'm pretty happy with the design. The main thing that's now
missing is documentation. The main SGML docs should surely have a
section on the UNDO log. A new README to describe how modules should use
the undo log etc. would probably also be in order.

Off the top of my head, some subtle high-level things that should be
explained somewhere:

- The UNDO log is only used to clean up after a crash during relation
  creation. It is *not* used for aborting or crash recovery of data,
  like on most systems. As a result, it's not as performance critical as
  you might think.

- The UNDO log is not a single sequential log like on many other
  systems. One way to think about it is that it's a per-transaction
  file, with a cache in shared memory for performance.

- The UNDO log is not used to handle controlled aborts, only for cleanup
  after a crash.

- What happens if you fail to process the UNDO log for some reason? Some
  storage files are leaked. Is that still considered OK, i.e. is the
  UNDO log a nice-to-have, or are there some more serious consequences?

- The interaction between REDO and UNDO. Every record inserted into the
  UNDO log of a transaction is WAL-logged in the REDO log. The undo log
  is like a data file in that sense. Writing to the undo log follows the
  usual "WAL-before-write" rule: the WAL is flushed before the
  corresponding undo log entry is written to disk. (Is that true? I'm
  not 100% sure)

- When a new relation is created, do you flush the WAL before creating
  the file? Or is there still a small window where it can leak, if the
  file creation makes it to disk before a crash but the undo log (or the
  WAL record of the undo log entry) does not?

Have you done any performance testing of this? By "this" I mean the
overhead of the undo-logging on create/drop table.

-- 
Heikki Linnakangas
Neon (https://neon.tech)
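
The ordering question in the last bullet of the message can be illustrated with a toy model. This is only a sketch and does not reflect how the patch actually behaves: `ToyDisk`, `create_relation`, `recover`, `leaked_after_crash` and the file name `rel_16384` are all hypothetical. It assumes an uncommitted transaction, a crash that can happen between any two steps, and a recovery pass that removes every file named by a flushed undo record.

```python
# Toy model only: none of these names come from the patch or from PostgreSQL.
from dataclasses import dataclass, field


@dataclass
class ToyDisk:
    """State that survives a crash in this toy model."""
    wal: list = field(default_factory=list)   # flushed WAL records
    files: set = field(default_factory=set)   # storage files present on disk


def create_relation(disk, relfile, flush_first, crash_after):
    """Run the creation steps, "crashing" after `crash_after` of them.

    With flush_first=True the flushed undo record precedes file creation;
    with flush_first=False the two steps are swapped.
    """
    steps = [
        lambda: disk.wal.append(("undo-create", relfile)),  # flush WAL record
        lambda: disk.files.add(relfile),                    # create the file
    ]
    if not flush_first:
        steps.reverse()
    for step in steps[:crash_after]:
        step()


def recover(disk):
    """Undo every logged creation (the transaction is assumed uncommitted).

    Returns the files still on disk afterwards, i.e. the leaked ones.
    """
    for kind, relfile in disk.wal:
        if kind == "undo-create":
            disk.files.discard(relfile)
    return set(disk.files)


def leaked_after_crash(flush_first, crash_after):
    disk = ToyDisk()
    create_relation(disk, "rel_16384", flush_first, crash_after)
    return recover(disk)


if __name__ == "__main__":
    for flush_first in (True, False):
        for crash_after in (0, 1, 2):
            print(flush_first, crash_after, leaked_after_crash(flush_first, crash_after))
```

In this model a file leaks only when it is created before the undo record is flushed and the crash falls between the two steps, which is the window the question asks about.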