hint bit i/o reduction - Mailing list pgsql-hackers
From | Merlin Moncure |
---|---|
Subject | hint bit i/o reduction |
Date | |
Msg-id | CAHyXU0y3qSR2i86cL3-rHKrJjtf-eh4sQ2YitFnQ84GeWjX8FA@mail.gmail.com |
Responses | Re: hint bit i/o reduction |
List | pgsql-hackers |
hackers,

A while back I made an effort implementing a backend local hint bit cache with the intention of mitigating i/o impact for scan heavy workloads that involve moving a lot of records around per transaction. The basic concept was to keep some backend private memory that tracked the resolution of recently seen transactions and worked in a similar fashion to the hint bits:

    if (!commit_hint_bit_set && hint_bit_cache(xid) == committed) {

The other interesting aspect was that, if the bit was found in the cache, the bit was set on the tuple but the page was not dirtied. If the xid was not found in the cache, the bit was set and the page was dirtied normally. From a strictly performance standpoint, limited testing seemed to indicate it more or less fixed the hint bit i/o problems.

Still, injecting extra logic (especially a cache lookup) into the visibility routines is not to be taken lightly, and maybe there's a better way. There simply has to be a way to ameliorate hint bit i/o without spending a lot of cycles and, hopefully, without impacting other workloads too much. Unfortunately, assuming the CRC patch makes it in, any of the tricks along the lines of 'set the bit but avoid dirtying the page' aren't going to fly. I think, at the end of the day, we have to avoid setting the bit, but only in cases where we are fairly sure we would prefer not to do that.

I'm thinking something fairly simple would get some good mileage:

1) Keep track of the number of times the last transaction id was repeatedly seen in tqual.c (resetting as soon as a new xid is touched). We can do this just for xmin, or separately for both xmin and xmax.

2) Right after checking the hint bit (assuming it wasn't set), test to see if our xid is the 'last xid', and iff #times_seen_same > #some_number (say, 100 or 1000), use it as if the hint bit was set.
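To make the two steps above concrete, here is a rough standalone sketch of the idea, not actual tqual.c code. Every name in it (LastXidTracker, xid_is_committed, demo_clog_lookup, REPEAT_THRESHOLD) is made up for illustration, the threshold of 100 is just one of the values floated above, and it only handles the "committed" case for a single xid field. In-progress and aborted transactions simply reset the counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;     /* stand-in for the real typedef */

#define INVALID_XID      ((TransactionId) 0)
#define REPEAT_THRESHOLD 100        /* "#some_number"; value is an assumption */

/* Hypothetical per-backend state: the last xid seen and its run length. */
typedef struct LastXidTracker
{
    TransactionId last_xid;
    unsigned      times_seen;
} LastXidTracker;

/* Hypothetical slow path standing in for the clog lookup. */
typedef bool (*ClogLookup) (TransactionId xid);

static unsigned demo_lookup_calls = 0;

/* Demo lookup only: counts calls and treats every valid xid as committed. */
static bool
demo_clog_lookup(TransactionId xid)
{
    demo_lookup_calls++;
    return xid != INVALID_XID;
}

/*
 * Returns true if xid should be treated as committed.  *set_hint tells the
 * caller whether to set the hint bit (and so dirty the page) as usual.
 */
static bool
xid_is_committed(LastXidTracker *t, TransactionId xid, bool hint_bit_set,
                 ClogLookup lookup, bool *set_hint)
{
    assert(t != NULL && lookup != NULL && set_hint != NULL);
    *set_hint = false;

    if (hint_bit_set)
        return true;                /* normal hint bit fast path */

    if (xid == t->last_xid)
    {
        /* Step 2: same xid seen often enough, act as if the bit was set. */
        if (++t->times_seen > REPEAT_THRESHOLD)
            return true;            /* no lookup, no hint bit, no dirty page */
    }
    else
    {
        /* Step 1: a new xid was touched, so the run is over. */
        t->last_xid = INVALID_XID;
        t->times_seen = 0;
    }

    if (!lookup(xid))
    {
        /* Not known committed: do not track it. */
        t->last_xid = INVALID_XID;
        t->times_seen = 0;
        return false;
    }

    if (t->last_xid != xid)
    {
        t->last_xid = xid;          /* start a new run for this xid */
        t->times_seen = 1;
    }
    *set_hint = true;               /* below the threshold: behave as today */
    return true;
}
```

With this shape, the first REPEAT_THRESHOLD sightings of an xid behave exactly as they do now (lookup, set the bit, dirty the page), and only the sightings after that skip both the lookup and the page write. Tracking xmin and xmax separately would just mean keeping two trackers.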
#some_number can be set to some fairly conservative fixed value, or perhaps we can be sneaky and try to adjust it based on things like how many unforced clog faults we're doing -- maybe even keeping some accounting at the page level. We can also try to be smart and disable the 'avoid setting the hint bit' logic if the page is already dirty.

The presumption here is that if you're seeing the same xid over and over, there simply isn't a lot of value in recording it in page after page after page. It's tempting to suggest that there is already an 'is this xid the same as the last one' check (AKA last_transaction_id) at the clog visibility checking level, but that doesn't help for the following reasons:

1) while it does avoid the jump out to clog, it occurs at the wrong place in the sequence of visibility checking (after you've wasted cycles in TransactionIdIsInProgress() etc)

2) by being at the clog level, it can't influence the visibility-level decision of whether or not to set the bit

3) it's not inline

I see two potential downsides here:

1) maybe the extra logic in visibility is too expensive (but I doubt it)

2) you're missing early opportunities to set the all visible bit, which in turn will influence index only scans

The above might be a small price to pay for saving all the i/o, and sooner or later, if the records are long lived, vacuum will swing around and tag them.

merlin