Re: Deleting older versions in unique indexes to avoid page splits - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Deleting older versions in unique indexes to avoid page splits
Msg-id CAH2-Wz=z1kiHW4Z8YjOn84tiMjP5gt=LQH8=ti2NxBF=F5N1RQ@mail.gmail.com
In response to Re: Deleting older versions in unique indexes to avoid page splits  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On Thu, Nov 12, 2020 at 3:00 PM Peter Geoghegan <pg@bowt.ie> wrote:
> Attached is v8, which has the enhancements for low cardinality data
> that I mentioned earlier today. It also simplifies the logic for
> dealing with posting lists that we need to delete some TIDs from.
> These posting list simplifications also make the code a bit more
> efficient, which might be noticeable during benchmarking.

One more thing: I repeated a pgbench test similar to my earlier low
cardinality tests -- same indexes (fiver, tenner, score,
aid_pkey_include_abalance) and same queries, but with longer runs (4
hours each), a larger DB (scale 2,500), and a rate limit of 5,000 TPS.
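A setup like the one described corresponds roughly to the following
pgbench invocations (a sketch only -- the database name and custom
script file name are assumptions, not from this message; the clients,
threads, duration, and rate limit are taken from the report):

```shell
# Initial data load at scale 2,500 (the loaded DB is reused by later runs)
pgbench -i -s 2500 testdb

# One 4-hour, rate-limited run (32-client case shown): prepared
# statements, 8 threads, 5,000 TPS target, custom transaction script
pgbench -T 14400 -c 32 -j 8 -R 5000 -M prepared -f lowcard.sql testdb
```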

Here is the high level report, with 4 runs -- one pair with 16
clients, another pair with 32 clients:

2020-11-11 19:03:26 -0800 - Start of initial data load for run "patch.r1c16" (DB is also used by later runs)
2020-11-11 19:18:16 -0800 - End of initial data load for run "patch.r1c16"
2020-11-11 19:18:16 -0800 - Start of pgbench run "patch.r1c16"
2020-11-11 23:18:43 -0800 - End of pgbench run "patch.r1c16": patch.r1c16.bench.out: "tps = 4999.100006 (including connections establishing)" "latency average = 3.355 ms" "latency stddev = 58.455 ms"
2020-11-11 23:19:12 -0800 - Start of initial data load for run "master.r1c16" (DB is also used by later runs)
2020-11-11 23:34:33 -0800 - End of initial data load for run "master.r1c16"
2020-11-11 23:34:33 -0800 - Start of pgbench run "master.r1c16"
2020-11-12 03:35:01 -0800 - End of pgbench run "master.r1c16": master.r1c16.bench.out: "tps = 5000.061623 (including connections establishing)" "latency average = 8.591 ms" "latency stddev = 64.851 ms"
2020-11-12 03:35:41 -0800 - Start of pgbench run "patch.r1c32"
2020-11-12 07:36:10 -0800 - End of pgbench run "patch.r1c32": patch.r1c32.bench.out: "tps = 5000.141420 (including connections establishing)" "latency average = 1.253 ms" "latency stddev = 9.935 ms"
2020-11-12 07:36:40 -0800 - Start of pgbench run "master.r1c32"
2020-11-12 11:37:19 -0800 - End of pgbench run "master.r1c32": master.r1c32.bench.out: "tps = 5000.542942 (including connections establishing)" "latency average = 3.069 ms" "latency stddev = 24.640 ms"
2020-11-12 11:38:18 -0800 - Start of pgbench run "patch.r2c16"

We see a very significant latency advantage for the patch here. Here
is the breakdown on query latency from the final patch run,
patch.r1c32:

scaling factor: 2500
query mode: prepared
number of clients: 32
number of threads: 8
duration: 14400 s
number of transactions actually processed: 72002280
latency average = 1.253 ms
latency stddev = 9.935 ms
rate limit schedule lag: avg 0.406 (max 694.645) ms
tps = 5000.141420 (including connections establishing)
tps = 5000.142503 (excluding connections establishing)
statement latencies in milliseconds:
         0.002  \set aid1 random_gaussian(1, 100000 * :scale, 4.0)
         0.001  \set aid2 random_gaussian(1, 100000 * :scale, 4.5)
         0.001  \set aid3 random_gaussian(1, 100000 * :scale, 4.2)
         0.001  \set bid random(1, 1 * :scale)
         0.001  \set tid random(1, 10 * :scale)
         0.001  \set delta random(-5000, 5000)
         0.063  BEGIN;
         0.361  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid1;
         0.171  SELECT abalance FROM pgbench_accounts WHERE aid = :aid2;
         0.172  SELECT abalance FROM pgbench_accounts WHERE aid = :aid3;
         0.074  END;

Here is the equivalent for master:

scaling factor: 2500
query mode: prepared
number of clients: 32
number of threads: 8
duration: 14400 s
number of transactions actually processed: 72008125
latency average = 3.069 ms
latency stddev = 24.640 ms
rate limit schedule lag: avg 1.695 (max 1097.628) ms
tps = 5000.542942 (including connections establishing)
tps = 5000.544213 (excluding connections establishing)
statement latencies in milliseconds:
         0.002  \set aid1 random_gaussian(1, 100000 * :scale, 4.0)
         0.001  \set aid2 random_gaussian(1, 100000 * :scale, 4.5)
         0.001  \set aid3 random_gaussian(1, 100000 * :scale, 4.2)
         0.001  \set bid random(1, 1 * :scale)
         0.001  \set tid random(1, 10 * :scale)
         0.001  \set delta random(-5000, 5000)
         0.078  BEGIN;
         0.560  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid1;
         0.320  SELECT abalance FROM pgbench_accounts WHERE aid = :aid2;
         0.308  SELECT abalance FROM pgbench_accounts WHERE aid = :aid3;
         0.102  END;

So even the UPDATE is much faster here.

We also see this in pg_statio_user_tables, which looked like this by
the end of the patch run:

-[ RECORD 1 ]---+-----------------
schemaname      | public
relname         | pgbench_accounts
heap_blks_read  | 117,384,599
heap_blks_hit   | 1,051,175,835
idx_blks_read   | 24,761,513
idx_blks_hit    | 4,024,776,723

And here it is for master:

-[ RECORD 1 ]---+-----------------
schemaname      | public
relname         | pgbench_accounts
heap_blks_read  | 191,947,522
heap_blks_hit   | 904,536,584
idx_blks_read   | 65,653,885
idx_blks_hit    | 4,002,061,803
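The counters above can be pulled with a query against the per-table
I/O statistics view; a sketch, with the database name assumed:

```shell
# Expanded display (-x) matches the record layout quoted above
psql -d testdb -x -c "
SELECT schemaname, relname,
       heap_blks_read, heap_blks_hit,
       idx_blks_read, idx_blks_hit
FROM pg_statio_user_tables
WHERE relname = 'pgbench_accounts';"
```

(The view reports plain integers; the comma grouping in the records
above was added for readability.)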

Notice that heap_blks_read is down from 191,947,522 on master to
117,384,599 with the patch -- about 0.611x. A reduction this large is
possible because the patch effectively amortizes the cost of accessing
heap blocks to find garbage to clean up: it accesses the garbage
tuples in heap blocks close together in time for all indexes, at a
point when those blocks are still likely to be found in
shared_buffers. "Nipping the [index bloat] problem in the bud" is much
cheaper than letting it get out of hand for many reasons; locality in
shared_buffers is just one more.

Also notice that idx_blks_read is ~0.38x with the patch. That's less
important, but still significant.
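As a quick sanity check on those ratios, using the counters quoted
above:

```shell
# Ratios of patch to master block reads, from the pg_statio output
awk 'BEGIN {
  printf "heap_blks_read: %.4fx\n", 117384599 / 191947522
  printf "idx_blks_read:  %.4fx\n", 24761513 / 65653885
}'
```

which prints 0.6115x and 0.3772x, matching the ~0.611x and ~0.38x
figures.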

-- 
Peter Geoghegan


