Re: Show various offset arrays for heap WAL records - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Re: Show various offset arrays for heap WAL records |
Date | |
Msg-id | CAH2-Wz=pyNmfMnjeKpOcSqai9jiTzS3yZ+df=Nj1XGEUJuTRww@mail.gmail.com |
In response to | Re: Show various offset arrays for heap WAL records (Melanie Plageman <melanieplageman@gmail.com>) |
Responses | Re: Show various offset arrays for heap WAL records |
List | pgsql-hackers |
On Mon, Mar 13, 2023 at 4:01 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
> On Fri, Jan 27, 2023 at 3:02 PM Robert Haas <robertmhaas@gmail.com> wrote:
> > I'm not sure what's best in terms of formatting details but I
> > definitely like the idea of making pg_waldump show more details.
> If I'm not mistaken, this would be quite difficult without changing
> rm_desc to return some kind of self-describing data type.

I'd say that it would depend on how far you went with it. Basic
information about the tuple wouldn't require any of that. I suggest
leaving this part out for now, though.

> So, we can scrap any README or big comment, but are there other changes
> to the code or structure you think would avoid it being seen as an
> API?

I think that it would be good to try to build something that looks
like an API, while making zero promises about its stability -- at
least until further notice. Kind of like how there are no guarantees
about the stability of internal interfaces within the Linux kernel.

There is no reason to not take a firm position on some things now.
Things like punctuation, and symbol names for generic cross-record
symbols like snapshotConflictHorizon. Many of the differences that
exist now are wholly gratuitous -- just accidents. It would make sense
to standardize-away these clearly unnecessary variations. And to
document the new standard.

I'd be surprised if anybody disagreed with me on this point.

> I have added detail to xl_btree_delete and xl_btree_vacuum. I have added
> the updated/deleted target offset numbers and the updated tuples
> metadata.
>
> I wondered if there was any reason to do xl_btree_dedup deduplication
> intervals.

No reason. It wouldn't be hard to cover xl_btree_dedup deduplication
intervals -- each element is a page offset number, and a corresponding
count of index tuples to merge together in the REDO routine.
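As a rough illustration of how little is involved, here is a minimal
sketch of rendering such an array. It is written in Python for brevity
rather than in the C of the rmgrdesc routines. The DedupInterval class,
the field names baseoff and nitems, and the output punctuation are all
assumptions made for the example, not a format that any patch has
settled on.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DedupInterval:
    """One interval: a page offset number plus a count of tuples to merge."""
    baseoff: int  # page offset number of the first tuple in the interval
    nitems: int   # number of index tuples the REDO routine merges together


def desc_dedup_intervals(intervals: List[DedupInterval]) -> str:
    """Render the intervals as a bracketed list (illustrative punctuation)."""
    body = ", ".join(
        f"{{baseoff: {iv.baseoff}, nitems: {iv.nitems}}}" for iv in intervals
    )
    return f"nintervals: {len(intervals)}, intervals: [{body}]"
```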
That format is slightly different to anything else, but not in a way
that seems like it requires very much additional effort.

> I wanted to include at least a minimal example for those following along
> with this thread that would cause creation of one of the record types
> which I have enhanced, but I had a little trouble making a reliable
> example.
>
> Below is my strategy for getting a Heap PRUNE record with redirects, but
> it occasionally doesn't end up working and I wasn't sure why (I can do
> more investigation if we think that having some kind of test for this is
> useful).

I'm not sure, but offhand I think that there could be a number of
annoying little implementation details that make it hard to come up
with a perfectly reliable test case. Perhaps try it while using VACUUM
VERBOSE, with the proviso that we should only expect the revised
example workflow to show a redirect record as intended when the
VERBOSE output confirms that VACUUM actually ran as expected, in
whatever way.

For example, VACUUM can't have failed to acquire a cleanup lock on a
heap page due to the current phase of the moon. VACUUM shouldn't have
its "removable cutoff" held back by who-knows-what when the test case
is run, either. Some of the tests for VACUUM use a temp table, since
they conveniently cannot have their "removable cutoff" held back --
not since commit a7212be8. Of course, that strategy won't help you
here. Getting VACUUM to behave very predictably for testing purposes
has proven tricky at times.

> > I agree, in general, though long term the best approach is one that
> > has a configurable level of verbosity, with some kind of roughly
> > uniform definition of verbosity (kinda like DEBUG1 - DEBUG5, though
> > probably with only 2 or 3 distinct levels).
>
> Given this comment and Robert's concern quoted below, I am wondering if
> the consensus is that a lack of verbosity control is a dealbreaker for
> adding offsets or not.

There are several different things that seem important to me
personally. These are in tension with each other, to a degree. These
are:

1. Like Andres, I'd really like to have some way of inspecting things
like heapam PRUNE, VACUUM, and FREEZE_PAGE records in significant
detail. These record types happen to be very important in general, and
the ability to see detailed information about the WAL record would
definitely help with some debugging scenarios. I've really missed
stuff like this while debugging serious issues under time pressure.

2. To a lesser extent I would like to see similar detailed information
for nbtree's DELETE, VACUUM, and possibly DEDUPLICATE record types.
They might also come in handy for debugging, in about the same way.

3. More manageable verbosity.

I think that it would be okay to put off coming up with a solution to
3, for now. 1 and 2 seem more important than 3.

> I think if there was a more structured output of rmgrdesc, then this
> would also solve the verbosity level problem. Consumers could decide on
> their verbosity level -- in various pg_walinspect function outputs, that
> would probably just be column selection. For pg_waldump, I imagine that
> some kind of parameter or flag would work.
>
> Unless you are suggesting that we add a verbosity parameter to the
> rmgrdesc function API now?

The verbosity problem will get somewhat worse if we do just my items 1
and 2, so it would be nice if we at least had a strategy in mind that
delivers on item 3/verbosity -- though the implementation can appear
in a later release.

Maybe something simple would work, like promising to output (say) 30
characters or less in terse mode, and making no such promise
otherwise. Terse mode wouldn't just truncate the output of verbose
mode -- it would never display information that could in principle
exceed the 30 character allowance, even with records that happen to
fall under the limit.

I can't feel too bad about putting this part off.
A pager like pspg is already table stakes when using pg_walinspect in
any sort of serious way. As I said upthread, absurdly wide output is
already reasonably common in most cases.

--
Peter Geoghegan
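
To make the terse-mode idea above a little more concrete, here is a
minimal sketch, in Python rather than C, of choosing terse-mode fields
by their worst-case width instead of truncating verbose output. The
Field class, the field labels (nredir, ndead), the worst-case widths,
and the dictionary-shaped record are all hypothetical. Nothing here
reflects an existing rmgrdesc interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

TERSE_BUDGET = 30  # the "(say) 30 characters" allowance


@dataclass(frozen=True)
class Field:
    name: str
    render: Callable[[Dict], str]
    max_width: Optional[int]  # worst-case rendered width; None = unbounded


def terse_fields(fields: List[Field], budget: int = TERSE_BUDGET) -> List[Field]:
    """Select fields by worst-case width only, never by the record at hand."""
    chosen: List[Field] = []
    used = 0
    for f in fields:
        if f.max_width is None:
            continue  # e.g. offset arrays: could always exceed the budget
        cost = f.max_width + (2 if chosen else 0)  # 2 for the ", " separator
        if used + cost <= budget:
            chosen.append(f)
            used += cost
    return chosen


def describe(record: Dict, fields: List[Field], terse: bool) -> str:
    """Verbose mode shows every field; terse mode shows the bounded subset."""
    selected = terse_fields(fields) if terse else fields
    return ", ".join(f.render(record) for f in selected)


# Hypothetical field list for a prune-like record. The widths assume
# counts of at most five digits: len("nredir: ") + 5 and len("ndead: ") + 5.
PRUNE_FIELDS = [
    Field("nredir", lambda r: f"nredir: {len(r['redirected'])}", 13),
    Field("ndead", lambda r: f"ndead: {len(r['dead'])}", 12),
    Field("redirected", lambda r: f"redirected: {r['redirected']}", None),
    Field("dead", lambda r: f"dead: {r['dead']}", None),
]
```

With this arrangement a record with a single dead item still shows
only the two counts in terse mode, even though its array would happen
to fit: the array fields are excluded because of what they could
contain, not because of what they do contain.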