In order to simulate real-world clog contention, we need benchmarks
that reflect real-world loading patterns.
Currently, pgbench pre-loads data using COPY and then executes a
VACUUM, so that all hint bits are set on every row of every page of
every table. As pgbench runs it therefore sees zero clog accesses for
historical data; clog access is minimised and the effects of
real-world clog contention go unnoticed.
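To see the difference at the SQL level, compare rows loaded inside one
transaction with rows loaded one transaction apiece (a minimal psql
sketch; the table and values are purely illustrative):

-- rows loaded in a single transaction all share one xmin, so
-- visibility checks consult at most one clog entry
CREATE TABLE t (x int);
BEGIN;
INSERT INTO t VALUES (1);
INSERT INTO t VALUES (2);
COMMIT;
SELECT DISTINCT xmin::text FROM t;   -- one transaction id

-- rows loaded as separate transactions each carry their own xmin,
-- so visibility checks fan out across many clog entries until the
-- hint bits are set
INSERT INTO t VALUES (3);
INSERT INTO t VALUES (4);
SELECT x, xmin FROM t;               -- rows 3 and 4 differ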
The following patch adds a pgbench option -I to load data using
INSERTs, so that we can begin benchmark testing with rows that carry
large numbers of distinct, un-hinted transaction ids. With a database
pre-created this way we will be better able to simulate, and thus
more easily measure, clog contention. Note that the current clog has
space for 1 million xids, so a scale factor greater than 10 is
required to really stress the clog.
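To put numbers on that: since each row is loaded in its own
transaction (see the run below, where 100000 transactions load 100000
rows), the xid count tracks the row count directly:

  scale factor  1  ->   100,000 accounts rows  ->  ~100,000 xids
  scale factor 10  -> 1,000,000 accounts rows  -> ~1,000,000 xids

so somewhere above scale factor 10 the loaded xids stop fitting in
the clog.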
The patch uses multiple connections to load data using a predefined
script similar to the -N or -S logic.
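For illustration only, the per-transaction work is conceptually
something like this (a sketch, not the patch's actual script; the
sequence pgbench_load_seq is hypothetical):

-- one row per transaction, so every row gets its own xmin
WITH n AS (SELECT nextval('pgbench_load_seq') AS aid)
INSERT INTO pgbench_accounts (aid, bid, abalance, filler)
SELECT aid, 1 + (aid - 1) / 100000, 0, '' FROM n;

Each client runs this repeatedly, committing after every row, until
scale * 100000 rows have been inserted.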
$ pgbench --help
pgbench is a benchmarking tool for PostgreSQL.
Usage:
pgbench [OPTIONS]... [DBNAME]
Initialization options:
-i invokes initialization mode using COPY
-I invokes initialization mode using INSERTs
...
$ pgbench -I -c 4 -t 10000
creating tables...
filling accounts table with 100000 rows using inserts
set primary key...
NOTICE: ALTER TABLE / ADD PRIMARY KEY will create implicit index "pgbench_branches_pkey" for table "pgbench_branches"
NOTICE: ALTER TABLE / ADD PRIMARY KEY will create implicit index "pgbench_tellers_pkey" for table "pgbench_tellers"
NOTICE: ALTER TABLE / ADD PRIMARY KEY will create implicit index "pgbench_accounts_pkey" for table "pgbench_accounts"
done.
transactions option ignored
transaction type: Load pgbench_accounts using INSERTs
scaling factor: 1
query mode: simple
number of clients: 4
number of threads: 1
number of transactions per client: 25000
number of transactions actually processed: 100000/100000
tps = 828.194854 (including connections establishing)
tps = 828.440330 (excluding connections establishing)
Yes, my laptop really is that slow. Contributions to improve that
situation gratefully received.
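Incidentally, a quick sanity check that the load did what we want
(xmin is cast to text because type xid has no default ordering
operator for DISTINCT):

SELECT count(DISTINCT xmin::text) AS distinct_xids
  FROM pgbench_accounts;

After a plain -i load this should report a single transaction id;
after -I it should be close to the row count.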
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services