Thread: hint bit i/o reduction

hint bit i/o reduction

From

Merlin Moncure

Date:

12 June 2012, 19:58:18

hackers,

A while back I made an effort implementing a backend local hint bit
cache with the intention of mitigating i/o impact for scan heavy
workloads that involve moving a lot of records around per transaction.The basic concept was to keep some backend
privatememory that
 
tracked the resolution of recently seen transactions and worked in a
similar fashion to the hint bits:

if (!commit_hint_bit_set && hint_bit_cache(xid) == committed)
{

The other interesting aspect was that, if the bit was found in the
cache, the bit was set on the tuple but the page was not dirtied.  If
the xid was not found in the cache, the bit was set and the page was
dirtied normally.  From a strictly performance standpoint, limited
testing seemed to indicate it more or less fixed the hint bit i/o
problems.  Still, injecting extra logic (especially a cache lookup)
into the visibility routines is not to be taken lightly, and maybe
there's a better way.  There simply has to be a way to ameliorate hint
bit i/o without spending a lot of cycles and hopefully not too much
impacting other workloads.  Unfortunately, assuming the CRC patch
makes it in, any of the branch of tricks in the line of 'set the bit
but avoid dirtying the page' aren't going to fly.  I think, at the end
of the day, we have to avoid setting the bit, but only in cases where
we are fairly sure we would prefer not to do that.

I'm thinking something fairly simple would get some good mileage:

1) Keep track # times the last transaction id was repeatedly seen in
tqual.c (resetting as soon as a new xid was touched.  We can do this
just for xmin, or separately for both xmin and xmax.
2) right after checking the hint bit (assuming it wasn't set), test to
see if our xid is the 'last xid', and iff #times_seen_same >
#some_number (say, 100 or 1000), use it as if the hint bit was set.
#some_number can be chosen to some fairly conservative defined value,
or perhaps we can be sneaky and try and adjust it based on things like
how many unforced clog faults we're doing -- maybe even keeping some
accounting at the page level.  We can also try to be smart and disable
the 'avoid setting the hint bit' logic if the page is already dirty.

The presumption here is that if you're seeing the same xid over and
over, there simply isn't a lot of value in recording it in page after
page after page.  It's tempting to suggest that there is already an
'is this xid the same as the last one AKA last_transaction_id' at the
clog visibility checking level, but that doesn't help for the
following reasons:

1) while it does avoid the jump out to clog, it occurs at the wrong
place in the sequence of visibility checking (after you've wasted
cycles in TransactionIdIsInProgress() etc) and
2) by being in the clog level can't influence visibility behaviors of
whether or not to set the bit.
3) it's not inline

I see two potential downsides here:
1) maybe the extra logic in visibility is too expensive (but i doubt it)
2) you're missing early opportunities to set the all visible bit which
in turn will influence index only scans.

The above might be a small price to pay for saving all the i/o and
sooner or later if the records are long lived vacuum will swing around
and tag them.

merlin

Re: hint bit i/o reduction

From

Amit Kapila

Date:

13 June 2012, 06:00:14

> 1) Keep track # times the last transaction id was repeatedly seen in
> tqual.c (resetting as soon as a new xid was touched.  We can do this
> just for xmin, or separately for both xmin and xmax.

Will this be done when we see a new xid duricg scan of tuple, if yes then
Won't this logic impact incase there are very few records per transaction in
database.
As in that case the new optimization won't help much, but it adds the new
instructions.

> We can also try to be smart and disable
> the 'avoid setting the hint bit' logic if the page is already dirty.

Whats the benefit for not setting hint bit for a already dirty page. 

-----Original Message-----
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Merlin Moncure
Sent: Wednesday, June 13, 2012 4:28 AM
To: PostgreSQL-development
Subject: [HACKERS] hint bit i/o reduction

hackers,

A while back I made an effort implementing a backend local hint bit
cache with the intention of mitigating i/o impact for scan heavy
workloads that involve moving a lot of records around per transaction.The basic concept was to keep some backend
privatememory that
 
tracked the resolution of recently seen transactions and worked in a
similar fashion to the hint bits:

if (!commit_hint_bit_set && hint_bit_cache(xid) == committed)
{

The other interesting aspect was that, if the bit was found in the
cache, the bit was set on the tuple but the page was not dirtied.  If
the xid was not found in the cache, the bit was set and the page was
dirtied normally.  From a strictly performance standpoint, limited
testing seemed to indicate it more or less fixed the hint bit i/o
problems.  Still, injecting extra logic (especially a cache lookup)
into the visibility routines is not to be taken lightly, and maybe
there's a better way.  There simply has to be a way to ameliorate hint
bit i/o without spending a lot of cycles and hopefully not too much
impacting other workloads.  Unfortunately, assuming the CRC patch
makes it in, any of the branch of tricks in the line of 'set the bit
but avoid dirtying the page' aren't going to fly.  I think, at the end
of the day, we have to avoid setting the bit, but only in cases where
we are fairly sure we would prefer not to do that.

I'm thinking something fairly simple would get some good mileage:

1) Keep track # times the last transaction id was repeatedly seen in
tqual.c (resetting as soon as a new xid was touched.  We can do this
just for xmin, or separately for both xmin and xmax.
2) right after checking the hint bit (assuming it wasn't set), test to
see if our xid is the 'last xid', and iff #times_seen_same >
#some_number (say, 100 or 1000), use it as if the hint bit was set.
#some_number can be chosen to some fairly conservative defined value,
or perhaps we can be sneaky and try and adjust it based on things like
how many unforced clog faults we're doing -- maybe even keeping some
accounting at the page level.  We can also try to be smart and disable
the 'avoid setting the hint bit' logic if the page is already dirty.

The presumption here is that if you're seeing the same xid over and
over, there simply isn't a lot of value in recording it in page after
page after page.  It's tempting to suggest that there is already an
'is this xid the same as the last one AKA last_transaction_id' at the
clog visibility checking level, but that doesn't help for the
following reasons:

1) while it does avoid the jump out to clog, it occurs at the wrong
place in the sequence of visibility checking (after you've wasted
cycles in TransactionIdIsInProgress() etc) and
2) by being in the clog level can't influence visibility behaviors of
whether or not to set the bit.
3) it's not inline

I see two potential downsides here:
1) maybe the extra logic in visibility is too expensive (but i doubt it)
2) you're missing early opportunities to set the all visible bit which
in turn will influence index only scans.

The above might be a small price to pay for saving all the i/o and
sooner or later if the records are long lived vacuum will swing around
and tag them.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: hint bit i/o reduction

From

Merlin Moncure

Date:

13 June 2012, 10:20:03

On Wed, Jun 13, 2012 at 3:57 AM, Amit Kapila <amit.kapila@huawei.com> wrote:
>> 1) Keep track # times the last transaction id was repeatedly seen in
>> tqual.c (resetting as soon as a new xid was touched.  We can do this
>> just for xmin, or separately for both xmin and xmax.
>
> Will this be done when we see a new xid duricg scan of tuple, if yes then
> Won't this logic impact incase there are very few records per transaction in
> database.
> As in that case the new optimization won't help much, but it adds the new
> instructions.

Yes, but only in the unhinted case -- in oltp workloads tuples get
hinted fairly quickly so I doubt this would be a huge impact.  Hinted
scans will work exactly as they do now.  In the unhinted case for OLTP
a  few tests are added but that's noise compared to the other stuff
going on.

>> We can also try to be smart and disable
>> the 'avoid setting the hint bit' logic if the page is already dirty.
>
> Whats the benefit for not setting hint bit for a already dirty page.

Nothing -- we are disabling the '*avoid* setting the hint bit' logic
(double negative there).  In other words we would be disabling the
setting of hint bits if we see the same transaction many times in a
row, but only if the page isn't dirty.

merlin

Re: hint bit i/o reduction

From

Amit Kapila

Date:

13 June 2012, 11:03:14

> Yes, but only in the unhinted case -- in oltp workloads tuples get
> hinted fairly quickly so I doubt this would be a huge impact.  Hinted
> scans will work exactly as they do now.  In the unhinted case for OLTP
> a  few tests are added but that's noise compared to the other stuff
> going on.

I believe the design you have purposed is for the unhinted cases only, means
if the tuple has already hint set then it will not go to new logic. Is that
right?

In unhinted cases, there can be 2 ways hint bit can be set
1. It gets the transaction status from Clog memory buffers
2. It needs to do I/O to get the transaction status from Clog.

So, I think it will add overhead in case-1 where the processing is fast, but
now we will add some new instructions to it.
However I think if you have already tested the scenario for same then there
should not be any problem.

-----Original Message-----
From: Merlin Moncure [mailto:mmoncure@gmail.com]
Sent: Wednesday, June 13, 2012 6:50 PM
To: Amit Kapila
Cc: PostgreSQL-development
Subject: Re: [HACKERS] hint bit i/o reduction

On Wed, Jun 13, 2012 at 3:57 AM, Amit Kapila <amit.kapila@huawei.com> wrote:
>> 1) Keep track # times the last transaction id was repeatedly seen in
>> tqual.c (resetting as soon as a new xid was touched.  We can do this
>> just for xmin, or separately for both xmin and xmax.
>
> Will this be done when we see a new xid duricg scan of tuple, if yes then
> Won't this logic impact incase there are very few records per transaction
in
> database.
> As in that case the new optimization won't help much, but it adds the new
> instructions.

Yes, but only in the unhinted case -- in oltp workloads tuples get
hinted fairly quickly so I doubt this would be a huge impact.  Hinted
scans will work exactly as they do now.  In the unhinted case for OLTP
a  few tests are added but that's noise compared to the other stuff
going on.

>> We can also try to be smart and disable
>> the 'avoid setting the hint bit' logic if the page is already dirty.
>
> Whats the benefit for not setting hint bit for a already dirty page.

Nothing -- we are disabling the '*avoid* setting the hint bit' logic
(double negative there).  In other words we would be disabling the
setting of hint bits if we see the same transaction many times in a
row, but only if the page isn't dirty.

merlin

Re: hint bit i/o reduction

From

Merlin Moncure

Date:

13 June 2012, 12:06:16

On Wed, Jun 13, 2012 at 9:02 AM, Amit Kapila <amit.kapila@huawei.com> wrote:
>> Yes, but only in the unhinted case -- in oltp workloads tuples get
>> hinted fairly quickly so I doubt this would be a huge impact.  Hinted
>> scans will work exactly as they do now.  In the unhinted case for OLTP
>> a  few tests are added but that's noise compared to the other stuff
>> going on.
>
> I believe the design you have purposed is for the unhinted cases only, means
> if the tuple has already hint set then it will not go to new logic. Is that
> right?
>
> In unhinted cases, there can be 2 ways hint bit can be set
> 1. It gets the transaction status from Clog memory buffers
> 2. It needs to do I/O to get the transaction status from Clog.
>
> So, I think it will add overhead in case-1 where the processing is fast, but
> now we will add some new instructions to it.

yeah -- but you still have to call:
*) TransactionIdIsCurrentTransactionId(), which adds a couple of tests
(and a bsearch in the presence of subtransactions)
*) TransactionIdIsInProgress() which adds a few tests and a out of
line call in the typical case
*) TransactionIdDidCommit() which adds a few tests and two out of line
calls in the typical case
and, while setting the hint it:
*) Several more tests plus:
*) TransactionIdGetCommitLSN()  another out of line call and a test
*) XLogNeedsFlush() two more out of line calls plus a few tests
*) SetBufferCommitInfoNeedsSave out of line call, several more tests

adding a few instructions to the above probably won't be measurable
(naturally it would be tested to confirm that).

merlin

Re: hint bit i/o reduction

From

Amit Kapila

Date:

14 June 2012, 04:00:22

> yeah -- but you still have to call:
> *) TransactionIdIsCurrentTransactionId(), which adds a couple of tests
> (and a bsearch in the presence of subtransactions)
> *) TransactionIdIsInProgress() which adds a few tests and a out of  line call in the typical case
> *) TransactionIdDidCommit() which adds a few tests and two out of line calls in the typical case
and, while setting the hint it:
> *) Several more tests plus:
> *) TransactionIdGetCommitLSN()  another out of line call and a test
> *) XLogNeedsFlush() two more out of line calls plus a few tests
> *) SetBufferCommitInfoNeedsSave out of line call, several more tests

>adding a few instructions to the above probably won't be measurable

You are right, adding in above path should not impact any performance and
moreover all the checks and assignment you are purposing is backend local so
no contention also.

Just an observation that in some cases, it will just do only point-1 or
point-1 and point-2.
For example
1. xmin is committed and hint bit is already set.
2. xmax is hint bit is not set, and its still not visible to current
transaction.
In this case it will just do point-1 and point-2.


-----Original Message-----
From: Merlin Moncure [mailto:mmoncure@gmail.com]
Sent: Wednesday, June 13, 2012 8:36 PM
To: Amit Kapila
Cc: PostgreSQL-development
Subject: Re: [HACKERS] hint bit i/o reduction

On Wed, Jun 13, 2012 at 9:02 AM, Amit Kapila <amit.kapila@huawei.com> wrote:
>> Yes, but only in the unhinted case -- in oltp workloads tuples get
>> hinted fairly quickly so I doubt this would be a huge impact.  Hinted
>> scans will work exactly as they do now.  In the unhinted case for OLTP
>> a  few tests are added but that's noise compared to the other stuff
>> going on.
>
> I believe the design you have purposed is for the unhinted cases only,
means
> if the tuple has already hint set then it will not go to new logic. Is
that
> right?
>
> In unhinted cases, there can be 2 ways hint bit can be set
> 1. It gets the transaction status from Clog memory buffers
> 2. It needs to do I/O to get the transaction status from Clog.
>
> So, I think it will add overhead in case-1 where the processing is fast,
but
> now we will add some new instructions to it.

yeah -- but you still have to call:
*) TransactionIdIsCurrentTransactionId(), which adds a couple of tests
(and a bsearch in the presence of subtransactions)
*) TransactionIdIsInProgress() which adds a few tests and a out of
line call in the typical case
*) TransactionIdDidCommit() which adds a few tests and two out of line
calls in the typical case
and, while setting the hint it:
*) Several more tests plus:
*) TransactionIdGetCommitLSN()  another out of line call and a test
*) XLogNeedsFlush() two more out of line calls plus a few tests
*) SetBufferCommitInfoNeedsSave out of line call, several more tests

adding a few instructions to the above probably won't be measurable
(naturally it would be tested to confirm that).

merlin