Thread: hint bit i/o reduction
hackers, A while back I made an effort implementing a backend local hint bit cache with the intention of mitigating i/o impact for scan heavy workloads that involve moving a lot of records around per transaction.The basic concept was to keep some backend privatememory that tracked the resolution of recently seen transactions and worked in a similar fashion to the hint bits: if (!commit_hint_bit_set && hint_bit_cache(xid) == committed) { The other interesting aspect was that, if the bit was found in the cache, the bit was set on the tuple but the page was not dirtied. If the xid was not found in the cache, the bit was set and the page was dirtied normally. From a strictly performance standpoint, limited testing seemed to indicate it more or less fixed the hint bit i/o problems. Still, injecting extra logic (especially a cache lookup) into the visibility routines is not to be taken lightly, and maybe there's a better way. There simply has to be a way to ameliorate hint bit i/o without spending a lot of cycles and hopefully not too much impacting other workloads. Unfortunately, assuming the CRC patch makes it in, any of the branch of tricks in the line of 'set the bit but avoid dirtying the page' aren't going to fly. I think, at the end of the day, we have to avoid setting the bit, but only in cases where we are fairly sure we would prefer not to do that. I'm thinking something fairly simple would get some good mileage: 1) Keep track # times the last transaction id was repeatedly seen in tqual.c (resetting as soon as a new xid was touched. We can do this just for xmin, or separately for both xmin and xmax. 2) right after checking the hint bit (assuming it wasn't set), test to see if our xid is the 'last xid', and iff #times_seen_same > #some_number (say, 100 or 1000), use it as if the hint bit was set. #some_number can be chosen to some fairly conservative defined value, or perhaps we can be sneaky and try and adjust it based on things like how many unforced clog faults we're doing -- maybe even keeping some accounting at the page level. We can also try to be smart and disable the 'avoid setting the hint bit' logic if the page is already dirty. The presumption here is that if you're seeing the same xid over and over, there simply isn't a lot of value in recording it in page after page after page. It's tempting to suggest that there is already an 'is this xid the same as the last one AKA last_transaction_id' at the clog visibility checking level, but that doesn't help for the following reasons: 1) while it does avoid the jump out to clog, it occurs at the wrong place in the sequence of visibility checking (after you've wasted cycles in TransactionIdIsInProgress() etc) and 2) by being in the clog level can't influence visibility behaviors of whether or not to set the bit. 3) it's not inline I see two potential downsides here: 1) maybe the extra logic in visibility is too expensive (but i doubt it) 2) you're missing early opportunities to set the all visible bit which in turn will influence index only scans. The above might be a small price to pay for saving all the i/o and sooner or later if the records are long lived vacuum will swing around and tag them. merlin
> 1) Keep track # times the last transaction id was repeatedly seen in > tqual.c (resetting as soon as a new xid was touched. We can do this > just for xmin, or separately for both xmin and xmax. Will this be done when we see a new xid duricg scan of tuple, if yes then Won't this logic impact incase there are very few records per transaction in database. As in that case the new optimization won't help much, but it adds the new instructions. > We can also try to be smart and disable > the 'avoid setting the hint bit' logic if the page is already dirty. Whats the benefit for not setting hint bit for a already dirty page. -----Original Message----- From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Merlin Moncure Sent: Wednesday, June 13, 2012 4:28 AM To: PostgreSQL-development Subject: [HACKERS] hint bit i/o reduction hackers, A while back I made an effort implementing a backend local hint bit cache with the intention of mitigating i/o impact for scan heavy workloads that involve moving a lot of records around per transaction.The basic concept was to keep some backend privatememory that tracked the resolution of recently seen transactions and worked in a similar fashion to the hint bits: if (!commit_hint_bit_set && hint_bit_cache(xid) == committed) { The other interesting aspect was that, if the bit was found in the cache, the bit was set on the tuple but the page was not dirtied. If the xid was not found in the cache, the bit was set and the page was dirtied normally. From a strictly performance standpoint, limited testing seemed to indicate it more or less fixed the hint bit i/o problems. Still, injecting extra logic (especially a cache lookup) into the visibility routines is not to be taken lightly, and maybe there's a better way. There simply has to be a way to ameliorate hint bit i/o without spending a lot of cycles and hopefully not too much impacting other workloads. Unfortunately, assuming the CRC patch makes it in, any of the branch of tricks in the line of 'set the bit but avoid dirtying the page' aren't going to fly. I think, at the end of the day, we have to avoid setting the bit, but only in cases where we are fairly sure we would prefer not to do that. I'm thinking something fairly simple would get some good mileage: 1) Keep track # times the last transaction id was repeatedly seen in tqual.c (resetting as soon as a new xid was touched. We can do this just for xmin, or separately for both xmin and xmax. 2) right after checking the hint bit (assuming it wasn't set), test to see if our xid is the 'last xid', and iff #times_seen_same > #some_number (say, 100 or 1000), use it as if the hint bit was set. #some_number can be chosen to some fairly conservative defined value, or perhaps we can be sneaky and try and adjust it based on things like how many unforced clog faults we're doing -- maybe even keeping some accounting at the page level. We can also try to be smart and disable the 'avoid setting the hint bit' logic if the page is already dirty. The presumption here is that if you're seeing the same xid over and over, there simply isn't a lot of value in recording it in page after page after page. It's tempting to suggest that there is already an 'is this xid the same as the last one AKA last_transaction_id' at the clog visibility checking level, but that doesn't help for the following reasons: 1) while it does avoid the jump out to clog, it occurs at the wrong place in the sequence of visibility checking (after you've wasted cycles in TransactionIdIsInProgress() etc) and 2) by being in the clog level can't influence visibility behaviors of whether or not to set the bit. 3) it's not inline I see two potential downsides here: 1) maybe the extra logic in visibility is too expensive (but i doubt it) 2) you're missing early opportunities to set the all visible bit which in turn will influence index only scans. The above might be a small price to pay for saving all the i/o and sooner or later if the records are long lived vacuum will swing around and tag them. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Jun 13, 2012 at 3:57 AM, Amit Kapila <amit.kapila@huawei.com> wrote: >> 1) Keep track # times the last transaction id was repeatedly seen in >> tqual.c (resetting as soon as a new xid was touched. We can do this >> just for xmin, or separately for both xmin and xmax. > > Will this be done when we see a new xid duricg scan of tuple, if yes then > Won't this logic impact incase there are very few records per transaction in > database. > As in that case the new optimization won't help much, but it adds the new > instructions. Yes, but only in the unhinted case -- in oltp workloads tuples get hinted fairly quickly so I doubt this would be a huge impact. Hinted scans will work exactly as they do now. In the unhinted case for OLTP a few tests are added but that's noise compared to the other stuff going on. >> We can also try to be smart and disable >> the 'avoid setting the hint bit' logic if the page is already dirty. > > Whats the benefit for not setting hint bit for a already dirty page. Nothing -- we are disabling the '*avoid* setting the hint bit' logic (double negative there). In other words we would be disabling the setting of hint bits if we see the same transaction many times in a row, but only if the page isn't dirty. merlin
> Yes, but only in the unhinted case -- in oltp workloads tuples get > hinted fairly quickly so I doubt this would be a huge impact. Hinted > scans will work exactly as they do now. In the unhinted case for OLTP > a few tests are added but that's noise compared to the other stuff > going on. I believe the design you have purposed is for the unhinted cases only, means if the tuple has already hint set then it will not go to new logic. Is that right? In unhinted cases, there can be 2 ways hint bit can be set 1. It gets the transaction status from Clog memory buffers 2. It needs to do I/O to get the transaction status from Clog. So, I think it will add overhead in case-1 where the processing is fast, but now we will add some new instructions to it. However I think if you have already tested the scenario for same then there should not be any problem. -----Original Message----- From: Merlin Moncure [mailto:mmoncure@gmail.com] Sent: Wednesday, June 13, 2012 6:50 PM To: Amit Kapila Cc: PostgreSQL-development Subject: Re: [HACKERS] hint bit i/o reduction On Wed, Jun 13, 2012 at 3:57 AM, Amit Kapila <amit.kapila@huawei.com> wrote: >> 1) Keep track # times the last transaction id was repeatedly seen in >> tqual.c (resetting as soon as a new xid was touched. We can do this >> just for xmin, or separately for both xmin and xmax. > > Will this be done when we see a new xid duricg scan of tuple, if yes then > Won't this logic impact incase there are very few records per transaction in > database. > As in that case the new optimization won't help much, but it adds the new > instructions. Yes, but only in the unhinted case -- in oltp workloads tuples get hinted fairly quickly so I doubt this would be a huge impact. Hinted scans will work exactly as they do now. In the unhinted case for OLTP a few tests are added but that's noise compared to the other stuff going on. >> We can also try to be smart and disable >> the 'avoid setting the hint bit' logic if the page is already dirty. > > Whats the benefit for not setting hint bit for a already dirty page. Nothing -- we are disabling the '*avoid* setting the hint bit' logic (double negative there). In other words we would be disabling the setting of hint bits if we see the same transaction many times in a row, but only if the page isn't dirty. merlin
On Wed, Jun 13, 2012 at 9:02 AM, Amit Kapila <amit.kapila@huawei.com> wrote: >> Yes, but only in the unhinted case -- in oltp workloads tuples get >> hinted fairly quickly so I doubt this would be a huge impact. Hinted >> scans will work exactly as they do now. In the unhinted case for OLTP >> a few tests are added but that's noise compared to the other stuff >> going on. > > I believe the design you have purposed is for the unhinted cases only, means > if the tuple has already hint set then it will not go to new logic. Is that > right? > > In unhinted cases, there can be 2 ways hint bit can be set > 1. It gets the transaction status from Clog memory buffers > 2. It needs to do I/O to get the transaction status from Clog. > > So, I think it will add overhead in case-1 where the processing is fast, but > now we will add some new instructions to it. yeah -- but you still have to call: *) TransactionIdIsCurrentTransactionId(), which adds a couple of tests (and a bsearch in the presence of subtransactions) *) TransactionIdIsInProgress() which adds a few tests and a out of line call in the typical case *) TransactionIdDidCommit() which adds a few tests and two out of line calls in the typical case and, while setting the hint it: *) Several more tests plus: *) TransactionIdGetCommitLSN() another out of line call and a test *) XLogNeedsFlush() two more out of line calls plus a few tests *) SetBufferCommitInfoNeedsSave out of line call, several more tests adding a few instructions to the above probably won't be measurable (naturally it would be tested to confirm that). merlin
> yeah -- but you still have to call: > *) TransactionIdIsCurrentTransactionId(), which adds a couple of tests > (and a bsearch in the presence of subtransactions) > *) TransactionIdIsInProgress() which adds a few tests and a out of line call in the typical case > *) TransactionIdDidCommit() which adds a few tests and two out of line calls in the typical case and, while setting the hint it: > *) Several more tests plus: > *) TransactionIdGetCommitLSN() another out of line call and a test > *) XLogNeedsFlush() two more out of line calls plus a few tests > *) SetBufferCommitInfoNeedsSave out of line call, several more tests >adding a few instructions to the above probably won't be measurable You are right, adding in above path should not impact any performance and moreover all the checks and assignment you are purposing is backend local so no contention also. Just an observation that in some cases, it will just do only point-1 or point-1 and point-2. For example 1. xmin is committed and hint bit is already set. 2. xmax is hint bit is not set, and its still not visible to current transaction. In this case it will just do point-1 and point-2. -----Original Message----- From: Merlin Moncure [mailto:mmoncure@gmail.com] Sent: Wednesday, June 13, 2012 8:36 PM To: Amit Kapila Cc: PostgreSQL-development Subject: Re: [HACKERS] hint bit i/o reduction On Wed, Jun 13, 2012 at 9:02 AM, Amit Kapila <amit.kapila@huawei.com> wrote: >> Yes, but only in the unhinted case -- in oltp workloads tuples get >> hinted fairly quickly so I doubt this would be a huge impact. Hinted >> scans will work exactly as they do now. In the unhinted case for OLTP >> a few tests are added but that's noise compared to the other stuff >> going on. > > I believe the design you have purposed is for the unhinted cases only, means > if the tuple has already hint set then it will not go to new logic. Is that > right? > > In unhinted cases, there can be 2 ways hint bit can be set > 1. It gets the transaction status from Clog memory buffers > 2. It needs to do I/O to get the transaction status from Clog. > > So, I think it will add overhead in case-1 where the processing is fast, but > now we will add some new instructions to it. yeah -- but you still have to call: *) TransactionIdIsCurrentTransactionId(), which adds a couple of tests (and a bsearch in the presence of subtransactions) *) TransactionIdIsInProgress() which adds a few tests and a out of line call in the typical case *) TransactionIdDidCommit() which adds a few tests and two out of line calls in the typical case and, while setting the hint it: *) Several more tests plus: *) TransactionIdGetCommitLSN() another out of line call and a test *) XLogNeedsFlush() two more out of line calls plus a few tests *) SetBufferCommitInfoNeedsSave out of line call, several more tests adding a few instructions to the above probably won't be measurable (naturally it would be tested to confirm that). merlin