Thread: HOT pgbench results
I ran some CPU intensive pgbench tests on HOT. Results are not surprising, HOT makes practically no difference on the total transaction rate, but reduces the need to vacuum:

                  unpatched   HOT
tps               3680        3790
WAL written (MB)  5386        4804
checkpoints       10          9
autovacuums       116         43
autoanalyzes      139         60

I believe the small gain in tps is due to the reduction in WAL volume. WAL is checksummed, and calculating the CRC uses some CPU. The tps difference is almost within the margin of error, though.

HOT greatly reduces the number of vacuums needed. That's good, that's where the gains in throughput in longer I/O bound runs come from.

The tests were run with fsync=off, with the following commands:

pgbench -i -s 10 postgres
pgbench -c 5 -t 1000000 postgres -l

The version used was CVS HEAD, with Simple-HOT-v2.patch applied in the HOT run. The cluster was initdb'd and created from scratch before each test run.

Attached is the full postgresql.conf and test script used.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
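To put the table above in relative terms, here is a minimal Python sketch that computes the percentage change of the HOT run against the unpatched run. The figures are copied from the message; the helper name `percent_change` is only illustrative and is not part of the test script that was attached.

```python
# Figures copied from the results table: (unpatched, HOT).
results = {
    "tps": (3680, 3790),
    "WAL written (MB)": (5386, 4804),
    "checkpoints": (10, 9),
    "autovacuums": (116, 43),
    "autoanalyzes": (139, 60),
}


def percent_change(unpatched, hot):
    """Relative change of the HOT run versus the unpatched run, in percent."""
    return (hot - unpatched) / unpatched * 100.0


for metric, (unpatched, hot) in results.items():
    print(f"{metric:18s} {percent_change(unpatched, hot):+6.1f}%")
```

Read this way, the tps gain is a few percent while the autovacuum count drops by well over half, which matches the summary in the message.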
Heikki Linnakangas <heikki@enterprisedb.com> writes:
>               unpatched   HOT
> autovacuums   116         43
> autoanalyzes  139         60

> HOT greatly reduces the number of vacuums needed. That's good, that's
> where the gains in throughput in longer I/O bound runs comes from.

But surely failing to auto-analyze after a HOT update is a bad thing.

			regards, tom lane
Tom Lane wrote:
> Heikki Linnakangas <heikki@enterprisedb.com> writes:
>>               unpatched   HOT
>> autovacuums   116         43
>> autoanalyzes  139         60
>
>> HOT greatly reduces the number of vacuums needed. That's good, that's
>> where the gains in throughput in longer I/O bound runs comes from.
>
> But surely failing to auto-analyze after a HOT update is a bad thing.

Hmm, I suppose. I don't think we've spent any time thinking about how to factor HOT updates into the autovacuum and autoanalyze formulas yet. I'd argue that HOT updates are not as significant as cold ones from a statistics point of view, though, because they don't change indexed columns. HOT-updated fields are not likely to be used as primary search quals.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
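For context on the formulas mentioned above: autovacuum decides whether to auto-analyze a table by comparing the number of tuples changed since the last ANALYZE against a threshold derived from the table size. The Python sketch below only illustrates that comparison; it is not PostgreSQL source code. The base threshold (50) and scale factor (0.1) are the documented defaults of later PostgreSQL releases and may not match the CVS HEAD used in this thread. The update counts, and the idea of leaving HOT updates out of the counter, are purely hypothetical.

```python
def analyze_threshold(reltuples, base_threshold=50, scale_factor=0.1):
    """Documented formula: base threshold + scale factor * number of tuples."""
    return base_threshold + scale_factor * reltuples


def needs_analyze(changed_tuples, reltuples):
    """Autoanalyze triggers once the changed-tuple count exceeds the threshold."""
    return changed_tuples > analyze_threshold(reltuples)


# pgbench at scale factor 10 creates 1,000,000 rows in the accounts table.
reltuples = 1_000_000

# Hypothetical workload since the last ANALYZE (not measured in this thread).
total_updates = 150_000
hot_updates = 135_000  # assumed number of those updates that were HOT

print(needs_analyze(total_updates, reltuples))                # every update counted
print(needs_analyze(total_updates - hot_updates, reltuples))  # HOT updates ignored
```

Under these assumed numbers the table is analyzed when every update counts and is skipped when HOT updates are ignored, which is the kind of behavior change being debated here.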
> >               unpatched   HOT
> > autovacuums   116         43
> > autoanalyzes  139         60
> >
> > HOT greatly reduces the number of vacuums needed. That's good, that's
> > where the gains in throughput in longer I/O bound runs comes from.
>
> But surely failing to auto-analyze after a HOT update is a bad thing.

Well, the definition is that no index columns changed, so this seems debatable. It seems that for OLTP you should not need an analyze, but for DSS filtering or joining on non-indexed columns you would. And that would also only be relevant if you created custom statistics on non-indexed columns.

Andreas
Heikki Linnakangas wrote:
> Tom Lane wrote:
>> Heikki Linnakangas <heikki@enterprisedb.com> writes:
>>> HOT greatly reduces the number of vacuums needed. That's good, that's
>>> where the gains in throughput in longer I/O bound runs comes from.
>> But surely failing to auto-analyze after a HOT update is a bad thing.
>
> Hmm, I suppose. I don't think we've spend any time thinking about how to
> factor in HOT updates into the autovacuum and autoanalyze formulas yet.
> I'd argue that HOT updates are not as significant as cold ones from
> statistics point of view, though, because they don't change indexed
> columns. HOT-updated fields are not likely used as primary search quals.

Even for fields that are used in primary searches, HOT updates avoid changing the disk block layout, and as reading from the disk is usually the most expensive operation, the decisions shouldn't change much before and after a HOT update compared to before and after a regular update.

Cheers,
mark

--
Mark Mielke <mark@mielke.cc>
On Tue, 2007-08-07 at 15:40 +0100, Heikki Linnakangas wrote:
> Tom Lane wrote:
> > Heikki Linnakangas <heikki@enterprisedb.com> writes:
> >>               unpatched   HOT
> >> autovacuums   116         43
> >> autoanalyzes  139         60
> >
> >> HOT greatly reduces the number of vacuums needed. That's good, that's
> >> where the gains in throughput in longer I/O bound runs comes from.
> >
> > But surely failing to auto-analyze after a HOT update is a bad thing.
>
> Hmm, I suppose. I don't think we've spend any time thinking about how to
> factor in HOT updates into the autovacuum and autoanalyze formulas yet.
> I'd argue that HOT updates are not as significant as cold ones from
> statistics point of view, though, because they don't change indexed
> columns. HOT-updated fields are not likely used as primary search quals.

I agree with that thought, but the changes to unindexed fields are just as important for selectivity calculations so we should ANALYZE just as frequently. ANALYZE is cheap, so we aren't saving anything by avoiding them.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
On Tue, 2007-08-07 at 13:16 +0100, Heikki Linnakangas wrote:
> I ran some CPU intensive pgbench tests on HOT. Results are not
> surprising, HOT makes practically no difference on the total transaction
> rate, but reduces the need to vacuum:
>
>                   unpatched   HOT
> tps               3680        3790
> WAL written(MB)   5386        4804
> checkpoints       10          9
> autovacuums       116         43
> autoanalyzes      139         60

Nor would I expect anything else, on this test.

The pgbench database has 4 tables, of which 3 have one index and 1 has no indexes at all. A table without indexes is uncommon and most major entities such as accounts have 2-3 indexes if not more.

So I would be inclined to add a PK to HISTORY and add two additional indexes to ACCOUNTS and then repeat the test to see what difference it makes.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
"Simon Riggs" <simon@2ndquadrant.com> writes:
> On Tue, 2007-08-07 at 13:16 +0100, Heikki Linnakangas wrote:
>> I ran some CPU intensive pgbench tests on HOT. Results are not
>> surprising, HOT makes practically no difference on the total transaction
>> rate, but reduces the need to vacuum:
>> ...
> Nor would I expect anything else, on this test.

I think the surprising thing was that it wasn't slower due to the extra cpu spent pruning tuples.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
On 8/7/07, Heikki Linnakangas <heikki@enterprisedb.com> wrote:
> I ran some CPU intensive pgbench tests on HOT. Results are not
> surprising, HOT makes practically no difference on the total transaction
> rate, but reduces the need to vacuum:
>
>                   unpatched   HOT
> tps               3680        3790
> WAL written(MB)   5386        4804
> checkpoints       10          9
> autovacuums       116         43
> autoanalyzes      139         60

Here are some more results. Everything is stock except for partial writes and 24 segments (fsync on). The hardware is four 15k SAS drives in a RAID 10. I am seeing very good results in other real world scenarios outside of pgbench; if anyone is interested, drop me a line. Note I cut the transaction runs down to 100k from 1M.

*** HOT ***
[postgres@efsd-main root]$ time pgbench -c 5 -t 100000
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 10
number of clients: 5
number of transactions per client: 100000
number of transactions actually processed: 500000/500000
tps = 1156.605130 (including connections establishing)
tps = 1156.637464 (excluding connections establishing)

real    7m12.311s
user    0m26.784s
sys     0m25.429s

*** cvs, HOT ***
[postgres@efsd-main pgsql]$ time pgbench -c 5 -t 100000
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 10
number of clients: 5
number of transactions per client: 100000
number of transactions actually processed: 500000/500000
tps = 630.510918 (including connections establishing)
tps = 630.520485 (excluding connections establishing)

real    13m13.019s
user    0m27.278s
sys     0m26.092s
On Tue, 2007-08-07 at 20:27 +0100, Gregory Stark wrote:
> "Simon Riggs" <simon@2ndquadrant.com> writes:
>
> > On Tue, 2007-08-07 at 13:16 +0100, Heikki Linnakangas wrote:
> >> I ran some CPU intensive pgbench tests on HOT. Results are not
> >> surprising, HOT makes practically no difference on the total transaction
> >> rate, but reduces the need to vacuum:
> >> ...
> > Nor would I expect anything else, on this test.
>
> I think the surprising thing was that it wasn't slower due to the extra cpu
> spent pruning tuples.

...balanced by the extra time spent adding new blocks and doing block-spanning updates without HOT.

For CPU bound situations, the real-world difference lies in the logical I/O we avoid by not doing index insertions. Larger tables have deeper index trees, so cause more block accesses to locate the block into which to insert. Small tables with few indexes aren't a real test of that, even if it does illustrate the basic CPU balance that HOT now offers in its latest incarnation (well done Heikki and Pavan).

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
On 8/8/07, Merlin Moncure <mmoncure@gmail.com> wrote:
> On 8/7/07, Heikki Linnakangas <heikki@enterprisedb.com> wrote:
> > I ran some CPU intensive pgbench tests on HOT. Results are not
> > surprising, HOT makes practically no difference on the total transaction
> > rate, but reduces the need to vacuum:
> >
> >                   unpatched   HOT
> > tps               3680        3790
> > WAL written(MB)   5386        4804
> > checkpoints       10          9
> > autovacuums       116         43
> > autoanalyzes      139         60
>
> Here are some more results...all stock except for partial writes, 24
> segments (fsync on). hardware is four 15k sas in a raid 10. I am
> seeing very good results in other real wold scenarios outside of
> pgbench....anyone is interested drop me a line. Note I cut the
> transaction runs down to 100k from 1M.
>
> *** HOT ***
> [postgres@efsd-main root]$ time pgbench -c 5 -t 100000
> starting vacuum...end.
> transaction type: TPC-B (sort of)
> scaling factor: 10
> number of clients: 5
> number of transactions per client: 100000
> number of transactions actually processed: 500000/500000
> tps = 1156.605130 (including connections establishing)
> tps = 1156.637464 (excluding connections establishing)
>
> real    7m12.311s
> user    0m26.784s
> sys     0m25.429s
>
> *** cvs, HOT ***
> [postgres@efsd-main pgsql]$ time pgbench -c 5 -t 100000
> starting vacuum...end.
> transaction type: TPC-B (sort of)
> scaling factor: 10
> number of clients: 5
> number of transactions per client: 100000
> number of transactions actually processed: 500000/500000
> tps = 630.510918 (including connections establishing)
> tps = 630.520485 (excluding connections establishing)
>
> real    13m13.019s
> user    0m27.278s
> sys     0m26.092s

oops! second case was w/o HOT patch applied (but we knew that) :D

merlin
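As a quick sanity check on the two runs quoted above, the following Python sketch derives tps from the `time` wall-clock figures and compares it with what pgbench reported, then computes the ratio between the runs. The second run is labelled "unpatched" following the correction in this message; the dictionary layout and names are only illustrative.

```python
def seconds(minutes, secs):
    """Convert a `time`-style 'XmY.ZZZs' reading into seconds."""
    return minutes * 60 + secs


# 5 clients x 100000 transactions per client, all processed in both runs.
transactions = 5 * 100_000

# Elapsed time and reported tps (including connections establishing).
runs = {
    "HOT": {"elapsed": seconds(7, 12.311), "reported_tps": 1156.605130},
    "unpatched": {"elapsed": seconds(13, 13.019), "reported_tps": 630.510918},
}

for name, run in runs.items():
    derived = transactions / run["elapsed"]
    print(f"{name:10s} derived {derived:8.1f} tps, reported {run['reported_tps']:8.1f} tps")

speedup = runs["HOT"]["reported_tps"] / runs["unpatched"]["reported_tps"]
print(f"HOT / unpatched = {speedup:.2f}x")
```

The derived and reported rates agree closely, and the ratio comes out a little above 1.8x, so the wall-clock times and the tps lines tell the same story.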
Heikki Linnakangas <heikki@enterprisedb.com> wrote:

> I ran some CPU intensive pgbench tests on HOT. Results are not
> surprising, HOT makes practically no difference on the total transaction
> rate, but reduces the need to vacuum:
>
>                   unpatched   HOT
> tps               3680        3790
> WAL written(MB)   5386        4804
> checkpoints       10          9
> autovacuums       116         43
> autoanalyzes      139         60

I also ran pgbench with and without HOT using a slightly different configuration (pgbench -s10 -c10 -t500000). There was a 10% performance win with HOT, although the test was CPU intensive and ran with FILLFACTOR=100%.

                  unpatched   HOT
tps               3366        3634
WAL written (MB)  4969        4374
checkpoints       9           8
autovacuums       126         42
autoanalyzes      146         59

I gathered oprofile logs. There were 4 HOT-related functions that didn't appear in the unpatched test, but that is probably not so serious.
- heap_page_prune 1.84%
- PageRepairFragmentation 0.94%
- pg_qsort 0.44% (called from PageRepairFragmentation)

On the other hand, the _bt_compare and _bt_checkkeys numbers were reduced by HOT, because we avoid most of the index insertions. It looks like LWLockAcquire/Release were also reduced, but I cannot tell whether that is a benefit of HOT or a measurement deviation.
unpatched %  HOT %    symbol name
4.0867       4.2314   AllocSetAlloc
2.7839       2.8022   base_yyparse
             1.8392   heap_page_prune
1.8459       1.6659   SearchCatCache
1.7405       1.6087   MemoryContextAllocZeroAligned
1.6936       1.5743   hash_search_with_hash_value
1.0672       1.1822   base_yylex
1.2430       1.1570   XLogInsert
             0.9356   PageRepairFragmentation
1.3549       0.8911   LWLockAcquire
1.0977       0.8663   LWLockRelease
0.8018       0.7284   nocachegetattr
0.7568       0.7124   FunctionCall2
0.5264       0.6795   ScanKeywordLookup
0.7115       0.6462   hash_any
0.7399       0.5963   AllocSetFree
0.6650       0.5925   GetSnapshotData
0.5536       0.5789   MemoryContextAlloc
0.5643       0.5547   hash_seq_search
0.4660       0.5005   expression_tree_walker
0.5293       0.4777   ExecInitExpr
             0.4381   pg_qsort
0.4376       0.4321   hash_uint32
0.4160       0.4268   expression_tree_mutator
0.4322       0.4183   LockAcquire
0.6933       0.3911   _bt_compare
0.5270       0.3828   PinBuffer
0.4025       0.3798   fmgr_info_cxt_security
0.4458       0.3758   MemoryContextAllocZero
0.5101                _bt_checkkeys

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
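One rough way to read the profile above is to add up the share taken by the symbols that appear only in the HOT run and set it against the drop in the symbols mentioned as reduced. The Python sketch below does that arithmetic with figures copied from the listing. A missing entry (`None`) means the symbol was not listed for that run, which most likely means it fell below the listing cutoff rather than being zero, so the comparison is an approximation under that assumption.

```python
# (unpatched %, HOT %); None means the symbol is not listed for that run.
profile = {
    "heap_page_prune": (None, 1.8392),
    "PageRepairFragmentation": (None, 0.9356),
    "pg_qsort": (None, 0.4381),
    "LWLockAcquire": (1.3549, 0.8911),
    "LWLockRelease": (1.0977, 0.8663),
    "_bt_compare": (0.6933, 0.3911),
    "_bt_checkkeys": (0.5101, None),
}

# Share of samples spent in symbols that only show up with HOT.
new_cost = sum(hot for unpatched, hot in profile.values() if unpatched is None)

# Reduction for symbols listed in both runs (entries with a missing value are skipped).
reduction = sum(
    unpatched - hot
    for unpatched, hot in profile.values()
    if unpatched is not None and hot is not None
)

print(f"HOT-only symbols:        {new_cost:.4f}%")
print(f"Reduction in shared set: {reduction:.4f}%")
```

Percentages from two separate profiles are shares of different totals, so this only indicates where time moved; it does not measure the net gain.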
Thanks for the testing,

ITAGAKI Takahiro wrote:
> I gathered oprofile logs. There were 4 HOT-related functions, that didn't
> appear in the unpatched test. But it is probably not so serious.
> - heap_page_prune 1.84%
> - PageRepairFragmentation 0.94%
> - pg_qsort 0.44% (called from PageRepairFragmentation)

That's expected. Those functions are involved in removing the dead HOT tuples, replacing VACUUMs. Maybe we could make them cheaper, but it's not too bad as it is.

> On the other hand, the number of _bt_compare and _bt_checkkeys were
> reduced by HOT, because we avoid the most part of index insertions.
> It looks like LWLockAcquire/Release were also reduced, but I cannot
> assure it is a benefits of HOT or a measurement deviation.

It could very well be real. Because of the reduction of index insertions, there's less locking of the index pages.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On 8/14/07, ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> wrote:
> Heikki Linnakangas <heikki@enterprisedb.com> wrote:
>
> > I ran some CPU intensive pgbench tests on HOT. Results are not
> > surprising, HOT makes practically no difference on the total transaction
> > rate, but reduces the need to vacuum:
> >
> >                   unpatched   HOT
> > tps               3680        3790
> > WAL written(MB)   5386        4804
> > checkpoints       10          9
> > autovacuums       116         43
> > autoanalyzes      139         60
>
> I also ran pgbench with/without HOT using a bit different configurations
> (pgbench -s10 -c10 -t500000). There were 10% performance win on HOT,
> although the test was CPU intensive and with FILLFACTOR=100%.

I'm curious why I am seeing results so different from everybody else (I had almost double tps with HOT). Are you running fsync on/off? Any other changes to postgresql.conf?

merlin