
From Merlin Moncure
Subject Re: Process local hint bit cache
Date
Msg-id AANLkTi=fUKMYRkEES1SO=iyC+ynX92pv+q2yKfTSLzHq@mail.gmail.com
In response to Re: Process local hint bit cache  (Merlin Moncure <mmoncure@gmail.com>)
Responses Re: Process local hint bit cache
List pgsql-hackers
On Wed, Mar 30, 2011 at 2:35 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Wed, Mar 30, 2011 at 11:23 AM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com> wrote:
>> On 30.03.2011 18:02, Robert Haas wrote:
>>>
>>> On Wed, Mar 30, 2011 at 10:40 AM, Greg Stark<gsstark@mit.edu>  wrote:
>>>>
>>>> But one way or another the hint bits have to get set sometime. The
>>>> sooner that happens the less clog i/o has to happen in the meantime.
>>>
>>> I talked about this with Merlin a bit yesterday.  I think that his
>>> thought is that most transactions will access a small enough number of
>>> distinct CLOG pages, and the cache accesses might be fast enough, that
>>> we really wouldn't need to get the hint bits set, or at least that
>>> vacuum/freeze time would be soon enough.  I'm not sure if that's
>>> actually true, though - I think the overhead of the cache might be
>>> higher than he's imagining.  However, there's a sure-fire way to find
>>> out... code it up and see how it plays.
>>
>> I did a little experiment: I hacked SetHintBits() to do nothing, so that
>> hint bits are never set. I then created a table with 100000 rows in it:
>>
>> postgres=# \d foo
>>      Table "public.foo"
>>  Column |  Type   | Modifiers
>> --------+---------+-----------
>>  a      | integer |
>>
>> postgres=# INSERT INTO foo SELECT generate_series(1, 100000);
>> INSERT 0 100000
>>
>> And ran queries on the table:
>>
>> postgres=# do $$
>> declare
>>  i int4;
>> begin
>>  loop
>>    perform COUNT(*) FROM foo;
>>  end loop;
>> end;
>> $$;
>>
>> This test case is designed to exercise the visibility tests as much as
>> possible. However, all the tuples are loaded in one transaction, so the
>> one-element cache in TransactionLogFetch is 100% effective.
>>
>> I ran oprofile on that. It shows that about 15% of the time is spent in
>> HeapTupleSatisfiesMVCC and its subroutines. 6.6% is spent in
>> HeapTupleSatisfiesMVCC itself. Here's the breakdown of that:
>>
>> $ opreport  -c --global-percent
>>
>> CPU: Intel Architectural Perfmon, speed 2266 MHz (estimated)
>> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit
>> mask of 0x00 (No unit mask) count 100000
>> samples  %        app name                 symbol name
>> ...
>> -------------------------------------------------------------------------------
>>  2143      0.4419  postgres                 postgres  heapgettup_pagemode
>>  73277    15.1091  postgres                 postgres  heapgetpage
>>  31858     6.5688  postgres                 postgres  HeapTupleSatisfiesMVCC
>>  31858     6.5688  postgres                 postgres  HeapTupleSatisfiesMVCC [self]
>>  12809     2.6411  postgres                 postgres  TransactionIdIsInProgress
>>  12091     2.4931  postgres                 postgres  XidInMVCCSnapshot
>>  7150      1.4743  postgres                 postgres  TransactionIdIsCurrentTransactionId
>>  7056      1.4549  postgres                 postgres  TransactionIdDidCommit
>>  1839      0.3792  postgres                 postgres  TransactionIdPrecedes
>>  1467      0.3025  postgres                 postgres  SetHintBits
>>  1155      0.2382  postgres                 postgres  TransactionLogFetch
>> -------------------------------------------------------------------------------
>> ...
>>
>> I then ran the same test again with an unpatched version, to set the hint
>> bits. After the hint bits were set, I ran oprofile again:
>>
>> -------------------------------------------------------------------------------
>>  275       0.4986  postgres                 heapgettup_pagemode
>>  4459      8.0851  postgres                 heapgetpage
>>  3005      5.4487  postgres                 HeapTupleSatisfiesMVCC
>>  3005      5.4487  postgres                 HeapTupleSatisfiesMVCC [self]
>>  1620      2.9374  postgres                 XidInMVCCSnapshot
>>  110       0.1995  postgres                 TransactionIdPrecedes
>> -------------------------------------------------------------------------------
>>
>> So with hint bits set, HeapTupleSatisfiesMVCC accounts for 8% of the total
>> CPU time, and without hint bits, 15%.
>>
>> Even if clog access was free, hint bits still give a significant speedup
>> thanks to skipping all the other overhead like TransactionIdIsInProgress()
>> and TransactionIdIsCurrentTransactionId(). Speeding up clog access is
>> important; when the one-element cache isn't saving you the clog access
>> becomes a dominant factor. But you have to address that other overhead too
>> before you can get rid of hint bits.
>
> Here is a patch demonstrating the caching action, but without the
> cache table, which isn't done yet -- It only works using the very last
> transaction id fetched.  I used macros so I could keep the changes
> quite localized.

While playing with the data structure to implement an xid cache inside
TransactionLogFetch, I learned a nasty lesson.
HeapTupleSatisfiesMVCC is super sensitive to any changes, and even
jumping through a couple of calls to TransactionLogFetch was
measurable in a couple of performance test cases I came up with.  I
was haunted by Tom's warnings to beware unexpected performance
downsides.

In short, regardless of implementation, I unfortunately don't think
any caching at or under the clog abstraction is going to work without
impacting at least some cases.

OTOH, maybe some micro-optimization of HeapTupleSatisfiesMVCC could
provide the same benefit I'm going after here.  If you save off the
last xid that came back committed in a static variable, you can
check it at the same level as if it were a hint bit.  From my
testing, this is well and truly 'free' in the sense that it can
defer/eliminate clog checks and hint bit i/o.

The patch attached gives all the advantages I was after in the
previous patch without the downsides I found (extra calls to
TransactionIdIsInProgress and inefficient calls out to
TransactionLogFetch).

It remains to be determined if it's possible to make a structure that
is fast enough to expand the above to more than 1 xid.  For starters,
all the calls should be inline.  Considering we are only after the
commit bits, a super tight hashing/bucketing algorithm might fit the
bill.

merlin


diff -x '*.sql' -x '*.o' -x '*.txt' -rupN postgresql-9.1alpha4/src/backend/utils/time/tqual.c postgresql-9.1alpha4_hbcache/src/backend/utils/time/tqual.c
--- postgresql-9.1alpha4/src/backend/utils/time/tqual.c 2011-03-09 08:19:24.000000000 -0600
+++ postgresql-9.1alpha4_hbcache/src/backend/utils/time/tqual.c 2011-03-31 17:03:43.135195002 -0500
@@ -912,7 +912,10 @@ bool
 HeapTupleSatisfiesMVCC(HeapTupleHeader tuple, Snapshot snapshot,
                        Buffer buffer)
 {
-       if (!(tuple->t_infomask & HEAP_XMIN_COMMITTED))
+       static TransactionId LastCommitXid = InvalidTransactionId;
+
+       if (!(tuple->t_infomask & HEAP_XMIN_COMMITTED)
+               && LastCommitXid != HeapTupleHeaderGetXmin(tuple))
        {
                if (tuple->t_infomask & HEAP_XMIN_INVALID)
                        return false;
@@ -985,8 +988,11 @@ HeapTupleSatisfiesMVCC(HeapTupleHeader t
                else if (TransactionIdIsInProgress(HeapTupleHeaderGetXmin(tuple)))
                        return false;
                else if (TransactionIdDidCommit(HeapTupleHeaderGetXmin(tuple)))
+               {
+                       LastCommitXid = HeapTupleHeaderGetXmin(tuple);
                        SetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED,
                                                HeapTupleHeaderGetXmin(tuple));
+               }
                else
                {
                        /* it must have aborted or crashed */

