Thread: HOT pgbench results

HOT pgbench results

From
Heikki Linnakangas
Date:
I ran some CPU intensive pgbench tests on HOT. Results are not
surprising, HOT makes practically no difference on the total transaction
rate, but reduces the need to vacuum:

                unpatched       HOT
tps             3680            3790
WAL written(MB) 5386            4804
checkpoints     10              9
autovacuums     116             43
autoanalyzes    139             60

I believe the small gain in tps is due to the reduction in WAL volume.
WAL is checksummed, and calculating the CRC uses some CPU. The tps
difference is almost within the margin of error, though.
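
To put the table above in relative terms, here is a small Python sketch that computes the percentage change for each metric. The figures are copied from the table; the helper name `pct_change` is only for illustration.

```python
# Figures copied from the results table: (unpatched, HOT).
results = {
    "tps": (3680, 3790),
    "WAL written (MB)": (5386, 4804),
    "checkpoints": (10, 9),
    "autovacuums": (116, 43),
    "autoanalyzes": (139, 60),
}


def pct_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


for name, (unpatched, hot) in results.items():
    print(f"{name:<18}{pct_change(unpatched, hot):+7.1f}%")
```

On these numbers the tps gain is about 3%, while WAL volume falls by about 11% and autovacuum runs by about 63%.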

HOT greatly reduces the number of vacuums needed. That's good, that's
where the gains in throughput in longer I/O bound runs come from.


The tests were run with fsync=off, with following commands:

pgbench -i -s 10 postgres
pgbench -c 5 -t 1000000 postgres -l

The version used was CVS HEAD, with Simple-HOT-v2.patch applied in the
HOT run. The cluster was initdb'd and created from scratch before each
test run. Attached is the full postgresql.conf and test script used.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com


Re: HOT pgbench results

From
Tom Lane
Date:
Heikki Linnakangas <heikki@enterprisedb.com> writes:
>         unpatched    HOT    
> autovacuums    116        43
> autoanalyzes    139        60

> HOT greatly reduces the number of vacuums needed. That's good, that's
> where the gains in throughput in longer I/O bound runs comes from.

But surely failing to auto-analyze after a HOT update is a bad thing.
        regards, tom lane


Re: HOT pgbench results

From
Heikki Linnakangas
Date:
Tom Lane wrote:
> Heikki Linnakangas <heikki@enterprisedb.com> writes:
>>         unpatched    HOT    
>> autovacuums    116        43
>> autoanalyzes    139        60
> 
>> HOT greatly reduces the number of vacuums needed. That's good, that's
>> where the gains in throughput in longer I/O bound runs comes from.
> 
> But surely failing to auto-analyze after a HOT update is a bad thing.

Hmm, I suppose. I don't think we've spent any time thinking about how to
factor HOT updates into the autovacuum and autoanalyze formulas yet.

I'd argue that HOT updates are not as significant as cold ones from
statistics point of view, though, because they don't change indexed
columns. HOT-updated fields are not likely used as primary search quals.
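
For context, a minimal sketch of the kind of trigger formula under discussion: autovacuum compares a per-table count of changed tuples against a base threshold plus a scale factor times the table's row count. The default values, the `hot_weight` knob, and the update counts below are illustrative assumptions, not what the patch does.

```python
def analyze_threshold(reltuples, base_threshold=50, scale_factor=0.1):
    """Changed-tuple count above which an auto-analyze is triggered.

    The defaults are placeholders; the real values come from the
    autovacuum_analyze_threshold and autovacuum_analyze_scale_factor
    settings.
    """
    return base_threshold + scale_factor * reltuples


def needs_analyze(reltuples, cold_updates, hot_updates, hot_weight=1.0):
    """Decide whether to analyze, weighting HOT updates by hot_weight.

    hot_weight is a hypothetical knob: 1.0 counts a HOT update like any
    other update, 0.0 ignores HOT updates entirely.
    """
    changed = cold_updates + hot_weight * hot_updates
    return changed > analyze_threshold(reltuples)


# pgbench -s 10 creates 1,000,000 rows in the accounts table.
rows = 1_000_000
print(needs_analyze(rows, cold_updates=20_000, hot_updates=90_000))
print(needs_analyze(rows, cold_updates=20_000, hot_updates=90_000,
                    hot_weight=0.0))
```

With these made-up counts, the table is analyzed when HOT updates count in full and is skipped when they are ignored, which is the trade-off being debated here.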

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


Re: HOT pgbench results

From
"Zeugswetter Andreas ADI SD"
Date:
> >         unpatched    HOT
> > autovacuums    116        43
> > autoanalyzes    139        60
>
> > HOT greatly reduces the number of vacuums needed. That's
> good, that's
> > where the gains in throughput in longer I/O bound runs comes from.
>
> But surely failing to auto-analyze after a HOT update is a bad thing.

Well, the definition is that no index columns changed, so this seems
debatable. It seems that for OLTP you should not need an analyze, but
for DSS filtering or joining on non-indexed columns you would. And that
would also only be relevant if you created custom statistics on
non-indexed columns.

Andreas


Re: HOT pgbench results

From
Mark Mielke
Date:
Heikki Linnakangas wrote:
> Tom Lane wrote:
>> Heikki Linnakangas <heikki@enterprisedb.com> writes:
>>> HOT greatly reduces the number of vacuums needed. That's good, that's
>>> where the gains in throughput in longer I/O bound runs comes from.
>>
>> But surely failing to auto-analyze after a HOT update is a bad thing.
>
> Hmm, I suppose. I don't think we've spend any time thinking about how to
> factor in HOT updates into the autovacuum and autoanalyze formulas yet.
>
> I'd argue that HOT updates are not as significant as cold ones from
> statistics point of view, though, because they don't change indexed
> columns. HOT-updated fields are not likely used as primary search quals.

Even for fields that are used in primary searches, HOT updates avoid
changing the disk block layout, and as reading from the disk is usually
the most expensive operation, the decisions shouldn't change much before
and after a HOT update compared to before and after a regular update.

Cheers,
mark

-- 
Mark Mielke <mark@mielke.cc>

Re: HOT pgbench results

From
"Simon Riggs"
Date:
On Tue, 2007-08-07 at 15:40 +0100, Heikki Linnakangas wrote:
> Tom Lane wrote:
> > Heikki Linnakangas <heikki@enterprisedb.com> writes:
> >>         unpatched    HOT    
> >> autovacuums    116        43
> >> autoanalyzes    139        60
> > 
> >> HOT greatly reduces the number of vacuums needed. That's good, that's
> >> where the gains in throughput in longer I/O bound runs comes from.
> > 
> > But surely failing to auto-analyze after a HOT update is a bad thing.
> 
> Hmm, I suppose. I don't think we've spend any time thinking about how to
> factor in HOT updates into the autovacuum and autoanalyze formulas yet.

> I'd argue that HOT updates are not as significant as cold ones from
> statistics point of view, though, because they don't change indexed
> columns. HOT-updated fields are not likely used as primary search quals.

I agree with that thought, but the changes to unindexed fields are just
as important for selectivity calculations so we should ANALYZE just as
frequently. ANALYZE is cheap, so we aren't saving anything by avoiding
them.

--  Simon Riggs EnterpriseDB  http://www.enterprisedb.com



Re: HOT pgbench results

From
"Simon Riggs"
Date:
On Tue, 2007-08-07 at 13:16 +0100, Heikki Linnakangas wrote:
> I ran some CPU intensive pgbench tests on HOT. Results are not
> surprising, HOT makes practically no difference on the total transaction
> rate, but reduces the need to vacuum:
> 
>         unpatched    HOT    
> tps        3680        3790
> WAL written(MB)    5386        4804
> checkpoints    10        9
> autovacuums    116        43
> autoanalyzes    139        60

Nor would I expect anything else, on this test.

The pgbench database has 4 tables, of which 3 have one index and 1 has
no indexes at all. 

A table without indexes is uncommon and most major entities such as
accounts have 2-3 indexes if not more. So I would be inclined to add a
PK to HISTORY and add two additional indexes to ACCOUNTS and then repeat
the test to see what difference it makes.

--  Simon Riggs EnterpriseDB  http://www.enterprisedb.com



Re: HOT pgbench results

From
Gregory Stark
Date:
"Simon Riggs" <simon@2ndquadrant.com> writes:

> On Tue, 2007-08-07 at 13:16 +0100, Heikki Linnakangas wrote:
>> I ran some CPU intensive pgbench tests on HOT. Results are not
>> surprising, HOT makes practically no difference on the total transaction
>> rate, but reduces the need to vacuum:
>> ...
> Nor would I expect anything else, on this test.

I think the surprising thing was that it wasn't slower due to the extra cpu
spent pruning tuples.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com



Re: HOT pgbench results

From
"Merlin Moncure"
Date:
On 8/7/07, Heikki Linnakangas <heikki@enterprisedb.com> wrote:
> I ran some CPU intensive pgbench tests on HOT. Results are not
> surprising, HOT makes practically no difference on the total transaction
> rate, but reduces the need to vacuum:
>
>                 unpatched       HOT
> tps             3680            3790
> WAL written(MB) 5386            4804
> checkpoints     10              9
> autovacuums     116             43
> autoanalyzes    139             60

Here are some more results...all stock except for partial writes, 24
segments (fsync on).  Hardware is four 15k SAS in a RAID 10.  I am
seeing very good results in other real world scenarios outside of
pgbench....if anyone is interested, drop me a line.  Note I cut the
transaction runs down to 100k from 1M.

*** HOT ***
[postgres@efsd-main root]$ time pgbench -c 5 -t 100000
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 10
number of clients: 5
number of transactions per client: 100000
number of transactions actually processed: 500000/500000
tps = 1156.605130 (including connections establishing)
tps = 1156.637464 (excluding connections establishing)

real    7m12.311s
user    0m26.784s
sys     0m25.429s

*** cvs, HOT ***
[postgres@efsd-main pgsql]$ time pgbench -c 5 -t 100000
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 10
number of clients: 5
number of transactions per client: 100000
number of transactions actually processed: 500000/500000
tps = 630.510918 (including connections establishing)
tps = 630.520485 (excluding connections establishing)

real    13m13.019s
user    0m27.278s
sys     0m26.092s


Re: HOT pgbench results

From
"Simon Riggs"
Date:
On Tue, 2007-08-07 at 20:27 +0100, Gregory Stark wrote:
> "Simon Riggs" <simon@2ndquadrant.com> writes:
> 
> > On Tue, 2007-08-07 at 13:16 +0100, Heikki Linnakangas wrote:
> >> I ran some CPU intensive pgbench tests on HOT. Results are not
> >> surprising, HOT makes practically no difference on the total transaction
> >> rate, but reduces the need to vacuum:
> >> ...
> > Nor would I expect anything else, on this test.
> 
> I think the surprising thing was that it wasn't slower due to the extra cpu
> spent pruning tuples.

...balanced by the extra time spent adding new blocks and doing
block-spanning updates without HOT.

For CPU bound situations, the real-world difference lies in the logical
I/O we avoid by not doing index insertions. Larger tables have deeper
index trees, so cause more block accesses to locate the block into which
to insert. Small tables with few indexes aren't a real test of that,
even if it does illustrate the basic CPU balance that HOT now offers in
its latest incarnation (well done Heikki and Pavan). 
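
As a rough back-of-the-envelope sketch of why larger tables cost more block accesses per index insertion, the following assumes an idealised B-tree in which every page holds the same number of entries. The fanout of 200 is an illustrative guess, not a measured PostgreSQL figure.

```python
def btree_levels(n_entries, fanout=200):
    """Levels in an idealised B-tree holding n_entries keys.

    Every page, leaf or internal, is assumed to hold `fanout` entries,
    so each extra level multiplies the capacity by `fanout`.
    """
    levels = 1
    capacity = fanout
    while capacity < n_entries:
        capacity *= fanout
        levels += 1
    return levels


for rows in (100_000, 1_000_000, 100_000_000, 10_000_000_000):
    print(f"{rows:>14,} rows -> {btree_levels(rows)} pages per descent")
```

Under this model each index insertion that HOT avoids saves one descent of that many pages per index, so the saving grows with both table size and the number of indexes.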

--  Simon Riggs EnterpriseDB  http://www.enterprisedb.com



Re: HOT pgbench results

From
"Merlin Moncure"
Date:
On 8/8/07, Merlin Moncure <mmoncure@gmail.com> wrote:
> On 8/7/07, Heikki Linnakangas <heikki@enterprisedb.com> wrote:
> > I ran some CPU intensive pgbench tests on HOT. Results are not
> > surprising, HOT makes practically no difference on the total transaction
> > rate, but reduces the need to vacuum:
> >
> >                 unpatched       HOT
> > tps             3680            3790
> > WAL written(MB) 5386            4804
> > checkpoints     10              9
> > autovacuums     116             43
> > autoanalyzes    139             60
>
> Here are some more results...all stock except for partial writes, 24
> segments (fsync on).  hardware is four 15k sas in a raid 10.  I am
> seeing very good results in other real wold scenarios outside of
> pgbench....anyone is interested drop me a line.  Note I cut the
> transaction runs down to 100k from 1M.
>
> *** HOT ***
> [postgres@efsd-main root]$ time pgbench -c 5 -t 100000
> starting vacuum...end.
> transaction type: TPC-B (sort of)
> scaling factor: 10
> number of clients: 5
> number of transactions per client: 100000
> number of transactions actually processed: 500000/500000
> tps = 1156.605130 (including connections establishing)
> tps = 1156.637464 (excluding connections establishing)
>
> real    7m12.311s
> user    0m26.784s
> sys     0m25.429s
>
> *** cvs, HOT ***
> [postgres@efsd-main pgsql]$ time pgbench -c 5 -t 100000
> starting vacuum...end.
> transaction type: TPC-B (sort of)
> scaling factor: 10
> number of clients: 5
> number of transactions per client: 100000
> number of transactions actually processed: 500000/500000
> tps = 630.510918 (including connections establishing)
> tps = 630.520485 (excluding connections establishing)
>
> real    13m13.019s
> user    0m27.278s
> sys     0m26.092s

oops! second case was w/o HOT patch applied (but we knew that) :D
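
As a quick sanity check on the two runs quoted above, here is a sketch that uses only the numbers printed by pgbench and `time`. It compares the reported tps with transactions divided by wall-clock time, then computes the ratio between the runs. The second run is labelled as unpatched, per the correction above.

```python
def to_seconds(minutes, seconds):
    """Convert the 'real' time printed by `time` into seconds."""
    return minutes * 60 + seconds


transactions = 5 * 100_000  # pgbench -c 5 -t 100000

runs = {
    "HOT": {"tps": 1156.605130, "elapsed": to_seconds(7, 12.311)},
    "unpatched": {"tps": 630.510918, "elapsed": to_seconds(13, 13.019)},
}

for name, run in runs.items():
    # Wall-clock time also covers the initial vacuum and connection
    # setup, so the derived figure can differ slightly from the report.
    derived = transactions / run["elapsed"]
    print(f"{name:<10} reported {run['tps']:7.1f} tps, derived {derived:7.1f} tps")

speedup = runs["HOT"]["tps"] / runs["unpatched"]["tps"]
print(f"HOT / unpatched = {speedup:.2f}x")
```

The derived and reported rates agree to within a fraction of a tps, and the HOT run comes out at roughly 1.8 times the unpatched rate.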

merlin


Re: HOT pgbench results

From
ITAGAKI Takahiro
Date:
Heikki Linnakangas <heikki@enterprisedb.com> wrote:

> I ran some CPU intensive pgbench tests on HOT. Results are not
> surprising, HOT makes practically no difference on the total transaction
> rate, but reduces the need to vacuum:
> 
>                 unpatched     HOT
> tps             3680          3790
> WAL written(MB) 5386          4804
> checkpoints     10            9
> autovacuums     116           43
> autoanalyzes    139           60

I also ran pgbench with and without HOT using a slightly different
configuration (pgbench -s10 -c10 -t500000). There was a performance win
of about 8% with HOT, although the test was CPU intensive and run with
FILLFACTOR=100%.

                unpatched     HOT
tps             3366          3634
WAL written(MB) 4969          4374
checkpoints     9             8
autovacuums     126           42
autoanalyzes    146           59


I gathered oprofile logs. There were 4 HOT-related functions, that didn't
appear in the unpatched test. But it is probably not so serious.
 - heap_page_prune           1.84%
 - PageRepairFragmentation   0.94%
 - pg_qsort                  0.44% (called from PageRepairFragmentation)

On the other hand, the share of samples in _bt_compare and _bt_checkkeys
was reduced by HOT, because we avoid most index insertions.
It looks like LWLockAcquire/Release were also reduced, but I cannot
tell whether that is a benefit of HOT or a measurement deviation.

unpatched HOT %     symbol name
4.0867    4.2314    AllocSetAlloc
2.7839    2.8022    base_yyparse
          1.8392    heap_page_prune
1.8459    1.6659    SearchCatCache
1.7405    1.6087    MemoryContextAllocZeroAligned
1.6936    1.5743    hash_search_with_hash_value
1.0672    1.1822    base_yylex
1.2430    1.1570    XLogInsert
          0.9356    PageRepairFragmentation
1.3549    0.8911    LWLockAcquire
1.0977    0.8663    LWLockRelease
0.8018    0.7284    nocachegetattr
0.7568    0.7124    FunctionCall2
0.5264    0.6795    ScanKeywordLookup
0.7115    0.6462    hash_any
0.7399    0.5963    AllocSetFree
0.6650    0.5925    GetSnapshotData
0.5536    0.5789    MemoryContextAlloc
0.5643    0.5547    hash_seq_search
0.4660    0.5005    expression_tree_walker
0.5293    0.4777    ExecInitExpr
          0.4381    pg_qsort
0.4376    0.4321    hash_uint32
0.4160    0.4268    expression_tree_mutator
0.4322    0.4183    LockAcquire
0.6933    0.3911    _bt_compare
0.5270    0.3828    PinBuffer
0.4025    0.3798    fmgr_info_cxt_security
0.4458    0.3758    MemoryContextAllocZero
0.5101              _bt_checkkeys
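
To see at a glance where the two profiles differ most, here is a small sketch that ranks a few of the symbols above by the change in their share. The rows are copied from the listing. Treating a symbol that is missing from one listing as 0% is a simplifying assumption, since it may only have fallen below the cutoff.

```python
# (unpatched %, HOT %) for a few symbols copied from the profile above.
# None marks a symbol that did not appear in that run's listing.
profile = {
    "heap_page_prune": (None, 1.8392),
    "PageRepairFragmentation": (None, 0.9356),
    "pg_qsort": (None, 0.4381),
    "LWLockAcquire": (1.3549, 0.8911),
    "LWLockRelease": (1.0977, 0.8663),
    "_bt_compare": (0.6933, 0.3911),
    "_bt_checkkeys": (0.5101, None),
}


def delta(unpatched, hot):
    """HOT share minus unpatched share; a missing entry counts as 0."""
    return (hot or 0.0) - (unpatched or 0.0)


# Most reduced symbols first, most increased last.
ranked = sorted(profile.items(), key=lambda item: delta(*item[1]))
for symbol, (unpatched, hot) in ranked:
    print(f"{delta(unpatched, hot):+.4f}  {symbol}")
```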

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center




Re: HOT pgbench results

From
Heikki Linnakangas
Date:
Thanks for the testing,

ITAGAKI Takahiro wrote:
> I gathered oprofile logs. There were 4 HOT-related functions, that didn't
> appear in the unpatched test. But it is probably not so serious.
>  - heap_page_prune           1.84%
>  - PageRepairFragmentation   0.94%
>  - pg_qsort                  0.44% (called from PageRepairFragmentation)

That's expected. Those functions are involved in removing the dead HOT
tuples, replacing VACUUMs. Maybe we could make them cheaper, but it's
not too bad as it is.

> On the other hand, the number of _bt_compare and _bt_checkkeys were
> reduced by HOT, because we avoid the most part of index insertions.
> It looks like LWLockAcquire/Release were also reduced, but I cannot
> assure it is a benefits of HOT or a measurement deviation.

It could very well be real. Because of the reduction of index
insertions, there's less locking of the index pages.

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


Re: HOT pgbench results

From
"Merlin Moncure"
Date:
On 8/14/07, ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> wrote:
> Heikki Linnakangas <heikki@enterprisedb.com> wrote:
>
> > I ran some CPU intensive pgbench tests on HOT. Results are not
> > surprising, HOT makes practically no difference on the total transaction
> > rate, but reduces the need to vacuum:
> >
> >                 unpatched     HOT
> > tps             3680          3790
> > WAL written(MB) 5386          4804
> > checkpoints     10            9
> > autovacuums     116           43
> > autoanalyzes    139           60
>
> I also ran pgbench with/without HOT using a bit different configurations
> (pgbench -s10 -c10 -t500000). There were 10% performance win on HOT,
> although the test was CPU intensive and with FILLFACTOR=100%.

I'm curious why I am seeing results so different from everybody else
(I had almost double tps with HOT).  Are you running fsync on/off?
Any other changes to postgresql.conf?

merlin