Re: Problems with the FSM, heap fillfactor, and temporal locality - Mailing list pgsql-hackers
From | John Naylor
---|---
Subject | Re: Problems with the FSM, heap fillfactor, and temporal locality
Date |
Msg-id | CACPNZCvii_nZNL13QuzFAGQjH1OSnbeH+M1JHszWRSQEcXpWaA@mail.gmail.com
In response to | Problems with the FSM, heap fillfactor, and temporal locality (Peter Geoghegan <pg@bowt.ie>)
Responses | Re: Problems with the FSM, heap fillfactor, and temporal locality
List | pgsql-hackers
On Fri, Aug 21, 2020 at 2:48 AM Peter Geoghegan <pg@bowt.ie> wrote:
>
> I'm concerned about how the FSM gives out pages to heapam. Disabling
> the FSM entirely helps TPC-C/BenchmarkSQL, which uses non-default heap
> fillfactors for most tables [1].

Hi Peter,

Interesting stuff. Is a lower-than-default fillfactor important for the behavior you see?

> located on the same blocks (or at least on neighboring blocks). It's
> possible that one orderline will span two neighboring blocks here, but
> it will never span three or more blocks. Each order has 5 - 15 order
> lines, and so I was surprised to see that a small minority of order
> line tuples end up occupying as many as 5 or 7 heap pages on the
> master branch (i.e. with the current FSM intact during bulk loading).

Hmm. You focus on the FSM, but I'm also thinking multiple VM bits could get cleared (maybe not in this benchmark, but in other scenarios).

> If the FSM tried to get free space from a close-by block, then we
> might at least see related updates that cannot fit a successor tuple
> on the same block behave in a coordinated fashion. We might at least
> have both updates relocate the successor tuple to the same
> mostly-empty block -- they both notice the same nearby free block, so
> both sets of successor tuples end up going on the same mostly-empty
> block. The two updating backends don't actually coordinate, of course
> -- this just happens as a consequence of looking for nearby free
> space.

I'm not sure I follow the "close-by" criterion. It doesn't seem immediately relevant to the main problem you found, which is free space. In other words, if we currently write to 5 blocks, but smarter FSM logic could find a single block with enough space, shouldn't that be preferred over close-by blocks that each have less? Or are you envisioning scoring by both free space and distance from the original block?

> supposed to be reserved. It should not be enough for the space used on
> the page to shrink by just 1% (to 89%). Maybe it should have to reach
> as low as 50% or less before we flip it back to "free to take space
> from for new unrelated tuples". The idea is that fill factor space is
> truly reserved for updates -- that should be "sticky" for all of this
> to work well.

Makes sense. If we go this route, I wonder if we should keep the current behavior and use any free space when the fillfactor is 100%, since that's in line with that setting's intention. Also, the 50% (or whatever) figure could be scaled depending on fillfactor. As in, check if freespace > (100% - fillfactor * 0.6), or something.

I'm not sure how to distinguish blocks that have never reached fillfactor from ones that did but haven't yet gained back enough free space to be considered again. Naively, we could start by assuming the last block can always be filled up to fillfactor, while earlier blocks must use the stricter rule. That's easy, since we already try the last block anyway before extending the relation.
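To make that concrete, here's a rough sketch of the kind of check I have in mind. Nothing here is existing code: BLCKSZ is real, but the function, its signature, and the 0.6 multiplier are all made up for illustration:

```c
#include <stdbool.h>

#define BLCKSZ 8192				/* PostgreSQL's default block size */

/*
 * Hypothetical check: may a page that previously filled up to its
 * fillfactor hand out space to new, unrelated tuples again?
 * "freespace" is in bytes; "fillfactor" is the table's setting as a
 * percentage (10..100).
 */
static bool
page_open_for_unrelated_inserts(int freespace, int fillfactor)
{
	double		free_pct = (double) freespace * 100.0 / BLCKSZ;

	/* fillfactor = 100%: keep the current behavior and use any free space */
	if (fillfactor >= 100)
		return freespace > 0;

	/* otherwise demand free space well beyond the reserved fraction */
	return free_pct > 100.0 - fillfactor * 0.6;
}
```

With the default fillfactor of 90, a page would flip back to general use only once it's at least 46% free, which lands in the neighborhood of your 50% figure.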
The flip side of this is: why should vacuum be in a hurry to dirty a page, emit WAL, and update the FSM if it only removes one dead tuple? This presentation [1] (pages 35-43) from Masahiko Sawada had the idea of a "garbage map", which keeps track of which parts of the heap have the most dead space, and focuses I/O efforts on those blocks first.

It may or may not be worth the extra complexity by itself, but it seems it would work synergistically with your proposal: updating related tuples would concentrate dead tuples on fewer pages, and vacuum would more quickly free up space that can actually be used to store multiple new tuples.
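For illustration, a garbage map could be as simple as a per-region counter of dead tuples, with vacuum visiting the dirtiest region first. This sketch is my own invention rather than anything from the slides, and none of the names exist in PostgreSQL:

```c
#include <stdint.h>

#define REGION_BLOCKS 4096		/* e.g. 32MB regions of 8kB heap blocks */

typedef struct GarbageMap
{
	uint32_t	nregions;		/* number of regions in the heap */
	uint32_t   *dead_tuples;	/* per-region dead tuple counts */
} GarbageMap;

/* Record a dead tuple observed in the given heap block. */
static void
garbagemap_note_dead(GarbageMap *map, uint32_t blkno)
{
	map->dead_tuples[blkno / REGION_BLOCKS]++;
}

/* Pick the region with the most recorded garbage to vacuum first. */
static uint32_t
garbagemap_dirtiest_region(const GarbageMap *map)
{
	uint32_t	best = 0;

	for (uint32_t i = 1; i < map->nregions; i++)
	{
		if (map->dead_tuples[i] > map->dead_tuples[best])
			best = i;
	}
	return best;
}
```

Vacuum could then work through regions in descending order of garbage rather than strictly front to back, and defer regions that wouldn't yet repay the I/O.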
[1] https://www.slideshare.net/masahikosawada98/vacuum-more-efficient-than-ever-99916671

--
John Naylor
https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services