Re: Combine Prune and Freeze records emitted by vacuum - Mailing list pgsql-hackers
From | Melanie Plageman |
---|---|
Subject | Re: Combine Prune and Freeze records emitted by vacuum |
Date | |
Msg-id | 20240320191710.y6b4hterii5qf76y@liskov Whole thread Raw |
In response to | Re: Combine Prune and Freeze records emitted by vacuum (Heikki Linnakangas <hlinnaka@iki.fi>) |
Responses |
Re: Combine Prune and Freeze records emitted by vacuum
|
List | pgsql-hackers |
On Wed, Mar 20, 2024 at 03:15:49PM +0200, Heikki Linnakangas wrote: > On 20/03/2024 03:36, Melanie Plageman wrote: > > On Mon, Mar 18, 2024 at 01:15:21AM +0200, Heikki Linnakangas wrote: > > > On 15/03/2024 02:56, Melanie Plageman wrote: > > > > Okay, so I was going to start using xl_heap_prune for vacuum here too, > > > > but I realized it would be bigger because of the > > > > snapshotConflictHorizon. Do you think there is a non-terrible way to > > > > make the snapshotConflictHorizon optional? Like with a flag? > > > > > > Yeah, another flag would do the trick. > > > > Okay, I've done this in attached v4 (including removing > > XLOG_HEAP2_VACUUM). I had to put the snapshot conflict horizon in the > > "main chunk" of data available at replay regardless of whether or not > > the record ended up including an FPI. > > > > I made it its own sub-record (xlhp_conflict_horizon) less to help with > > alignment (though we can use all the help we can get there) and more to > > keep it from getting lost. When you look at heapam_xlog.h, you can see > > what a XLOG_HEAP2_PRUNE record will contain starting with the > > xl_heap_prune struct and then all the sub-record types. > > Ok, now that I look at this, I wonder if we're being overly cautious about > the WAL size. We probably could just always include the snapshot field, and > set it to InvalidTransactionId and waste 4 bytes when it's not needed. For > the sake of simplicity. I don't feel strongly either way though, the flag is > pretty simple too. That will mean that all vacuum records are at least 3 bytes bigger than before -- which makes it somewhat less defensible to get rid of xl_heap_vacuum. That being said, I ended up doing an unaligned access when I packed it and made it optional, so maybe it is less user-friendly. But I also think that making it optional is more clear for vacuum which will never use it. > I realized that the WAL record format changes are pretty independent from > the rest of the patches. They could be applied before the rest. Without the > rest of the changes, we'll still write two WAL records per page in vacuum, > one to prune and another one to freeze, but it's another meaningful > incremental step. So I reshuffled the patches, so that the WAL format is > changed first, before the rest of the changes. Ah, great idea! That eliminates the issue of preliminary commits having larger WAL records that then get streamlined. > 0001-0008: These are the WAL format changes. There's some comment cleanup > needed, but as far as the code goes, I think these are pretty much ready to > be squashed & committed. My review in this email is *only* for 0001-0008. I have not looked at the rest yet. > From 06d5ff5349a8aa95cbfd06a8043fe503b7b1bf7b Mon Sep 17 00:00:00 2001 > From: Heikki Linnakangas <heikki.linnakangas@iki.fi> > Date: Wed, 20 Mar 2024 14:50:14 +0200 > Subject: [PATCH v5 01/26] Merge prune, freeze and vacuum WAL record formats > > The new combined WAL record is now used for pruning, freezing and 2nd > pass of vacuum. This is in preparation of changing vacuuming to write > a combined prune+freeze record per page, instead of separate two > records. The new WAL record format now supports that, but the code > still always writes separate records for pruning and freezing. Attached patch changes-for-0001.patch has a bunch of updated comments -- especially for heapam_xlog.h (including my promised diagram). And a few suggestions (mostly changes that I should have made before). > > XXX I tried to lift-and-shift the code from v4 patch set as unchanged > as possible, for easier review, but some noteworthy changes: In the final commit message, I think it is worth calling out that all of those freeze logging functions heap_log_freeze_eq/cmp/etc are lifted and shifted from one file to another. When I am reviewing a big diff, it is always helpful to know where I need to review line-by-line. > > - Instead of passing PruneState and PageFreezeResult to > log_heap_prune_and_freeze(), pass the arrays of frozen, redirected > et al offsets directly. That way, it can be called from other places. good idea. > - moved heap_xlog_deserialize_prune_and_freeze() from xactdesc.c to > heapdesc.c. (Because that's clearly where it belongs) :) > From cd6cdaebb362b014733e99ecd868896caf0fb3aa Mon Sep 17 00:00:00 2001 > From: Heikki Linnakangas <heikki.linnakangas@iki.fi> > Date: Wed, 20 Mar 2024 13:45:01 +0200 > Subject: [PATCH v5 02/26] Keep the original numbers for existing WAL records > > Doesn't matter much because the WAL format is not compatible across > major versions anyway. But still seems nice to keep the identifiers > unchanged when we can. (There's some precedence for this if you search > the git history for "is free, was"). sounds good. > From d3207bb557aa1d2868a50d357a06318a6c0cb5cd Mon Sep 17 00:00:00 2001 > From: Heikki Linnakangas <heikki.linnakangas@iki.fi> > Date: Wed, 20 Mar 2024 13:48:29 +0200 > Subject: [PATCH v5 03/26] Rename record to XLOG_HEAP2_PRUNE_FREEZE > > To clarify that it also freezes now, and to make it clear that it's > significantly different from the old XLOG_HEAP2_PRUNE format. +1 > From 5d6fc2ffbdd839e0b69242af16446a46cf6a2dc7 Mon Sep 17 00:00:00 2001 > From: Heikki Linnakangas <heikki.linnakangas@iki.fi> > Date: Wed, 20 Mar 2024 13:49:59 +0200 > Subject: [PATCH v5 04/26] 'nplans' is a pointer > > I'm surprised the compiler didn't warn about this oops. while looking at this, I noticed that the asserts I added that nredirected > 0, ndead > 0, and nunused > 0 have the same problem. > --- > src/backend/access/rmgrdesc/heapdesc.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/src/backend/access/rmgrdesc/heapdesc.c b/src/backend/access/rmgrdesc/heapdesc.c > index 8b94c869faf..9ef8a745982 100644 > --- a/src/backend/access/rmgrdesc/heapdesc.c > +++ b/src/backend/access/rmgrdesc/heapdesc.c > @@ -155,8 +155,7 @@ heap_xlog_deserialize_prune_and_freeze(char *cursor, uint8 flags, > cursor += sizeof(OffsetNumber) * *nunused; > } > > - if (nplans > 0) > From 59f3f80f82ed7a63d86c991d0cb025e4cde2caec Mon Sep 17 00:00:00 2001 > From: Heikki Linnakangas <heikki.linnakangas@iki.fi> > Date: Wed, 20 Mar 2024 13:36:41 +0200 > Subject: [PATCH v5 06/26] Fix logging snapshot conflict horizon. > > - it was accessed without proper alignment, which won't work on > architectures that are strict about alignment. Use memcpy. wow, oops. thanks for fixing this! > - in heap_xlog_prune_freeze, the code tried to access the xid with > "(xlhp_conflict_horizon *) (xlrec + SizeOfHeapPrune);" But 'xlrec' > was "xl_heap_prune *" rather than "char *". That happened to work, > because sizeof(xl_heap_prune) == 1, but make it more robust by > adding a cast to char *. good catch. > - remove xlhp_conflict_horizon and store a TransactionId directly. A > separate struct would make sense if we needed to store anything else > there, but for now it just seems like more code. makes sense. I just want to make sure heapam_xlog.h makes it extra clear that this is happening. I see your comment addition. If you combine it with my comment additions in the attached patch for 0001, hopefully that makes it clear enough. > From 8af186ee9dd8c7dc20f37a69b34cab7b95faa43b Mon Sep 17 00:00:00 2001 > From: Heikki Linnakangas <heikki.linnakangas@iki.fi> > Date: Wed, 20 Mar 2024 14:03:06 +0200 > Subject: [PATCH v5 07/26] Add comment to log_heap_prune_and_freeze(). > > XXX: This should be rewritten, but I tried to at least list some > important points. Are you thinking that it needs to mention more things or that the things it mentions need more detail? > From b26e36ba8614d907a6e15810ed4f684f8f628dd2 Mon Sep 17 00:00:00 2001 > From: Heikki Linnakangas <heikki.linnakangas@iki.fi> > Date: Wed, 20 Mar 2024 14:53:31 +0200 > Subject: [PATCH v5 08/26] minor refactoring in log_heap_prune_and_freeze() > > Mostly to make local variables more tightly-scoped. So, I don't think you can move those sub-records into the tighter scope. If you run tests with this applied, you'll see it crashes and a lot of the data in the record is wrong. If you move the sub-record declarations out to a wider scope, the tests pass. The WAL record data isn't actually copied into the buffer until recptr = XLogInsert(RM_HEAP2_ID, XLOG_HEAP2_PRUNE_FREEZE); after registering everything. So all of those sub-records you made are out of scope the time it tries to copy from them. I spent the better part of a day last week trying to figure out what was happening after I did the exact same thing. I must say that I found the xloginsert API incredibly unhelpful on this point. I would like to review the rest of the suggested changes in this patch after we fix the issue I just mentioned. - Melanie
Attachment
pgsql-hackers by date: